BMC Medical Research Methodology

Table 2 Simulation study results

From: Variable selection in social-environmental data: sparse regression and tree ensemble machine learning approaches

A. Binary outcome	Strict			Relaxed^a
Method	TP (N/10)	FP (N/990)	F2	TP (N/10)	FP (N/953)	F2
UNIV-BFN	4.09	32.49	0.267	5.13	12.70	0.443
LASSO-1SE	3.84	6.05	0.383	5.53	3.71	0.559
LASSO-MIN	4.25	9.01	0.399	5.98	6.53	0.569
ELNET-1SE	5.26	20.51	0.405	6.21	9.33	0.560
ELNET-MIN	5.53	26.11	0.393	6.61	14.40	0.548
HCLST-CORR-SGL	5.40	17.91	0.428	6.39	7.09	0.597
HCLST-BOOT-SGL	5.20	16.66	0.420	6.35	7.07	0.594
RF	3.53	18.41	0.281	4.91	7.68	0.462
BAGGING	3.56	13.94	0.308	4.73	6.70	0.456
BART-LOCAL	4.68	15.66	0.387	6.32	7.13	0.591
BART-GLOBALSE	1.96	0.53	0.228	2.24	0.22	0.261
BART-GLOBALMAX	0.01	0.00	0.001	0.01	0.00	0.001
B. Continuous outcome	Strict			Relaxed^a
Method	TP (N/10)	FP (N/990)	F2	TP (N/10)	FP (N/953)	F2
UNIV-BFN	4.83	39.57	0.286	5.90	17.47	0.468
LASSO-1SE	2.88	4.49	0.298	4.33	2.42	0.454
LASSO-MIN	4.61	10.52	0.419	6.47	7.60	0.599
ELNET-1SE	3.88	8.27	0.366	4.87	3.29	0.500
ELNET-MIN	5.18	14.79	0.433	6.61	8.88	0.598
HCLST-CORR-SGL	5.82	19.86	0.444	6.81	8.51	0.617
HCLST-BOOT-SGL	5.52	17.03	0.441	6.72	8.45	0.610
RF	4.63	28.46	0.316	5.93	14.26	0.493
BAGGING	4.40	25.73	0.314	5.81	12.96	0.494
BART-LOCAL	5.14	18.35	0.404	6.80	8.34	0.617
BART-GLOBALSE	2.40	0.93	0.274	2.85	0.41	0.326
BART-GLOBALMAX	0.01	0.00	0.002	0.01	0.00	0.002

TP: true positive; FP: false positive. Boldface denotes the best-performing method by F2 statistic.
^a Under the relaxed definition, a true variable was considered identified if either it or its surrogate was selected. Surrogates are therefore no longer in the pool of potential false positives (953 rather than 990 candidates), but the maximum number of true positive variables remains 10.
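
For readers who want to relate the F2 column to the TP and FP columns, the following is a minimal sketch using the standard F-beta formula with beta = 2 (recall weighted more heavily than precision). The function name `f_beta` and its parameters are illustrative and do not come from the article. It also assumes that false negatives equal 10 minus TP. Applying it to the averaged counts in a row only approximates the reported F2, presumably because the table averages per-simulation F2 values rather than computing F2 from averaged counts.

```python
def f_beta(tp, fp, n_true=10, beta=2.0):
    """Standard F-beta score from true-positive and false-positive counts.

    n_true is the number of truly associated variables (10 in this table),
    so false negatives are assumed to be n_true - tp.
    """
    fn = n_true - tp
    b2 = beta ** 2
    denom = (1 + b2) * tp + b2 * fn + fp
    # Guard against an empty denominator (only possible if n_true is 0).
    return 0.0 if denom == 0 else (1 + b2) * tp / denom


# Averaged counts from the UNIV-BFN row, binary outcome, strict definition.
print(round(f_beta(tp=4.09, fp=32.49), 3))
```

For the UNIV-BFN row this reproduces the tabulated 0.267; for other rows the result is close to, but not always identical to, the reported value (for example, roughly 0.385 versus 0.383 for LASSO-1SE under the strict definition).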


ISSN: 1471-2288
