Table 2 Simulation study results

From: Variable selection in social-environmental data: sparse regression and tree ensemble machine learning approaches

A. Binary outcome

| Method | Strict: TP (N/10) | Strict: FP (N/990) | Strict: F2 | Relaxed^a: TP (N/10) | Relaxed^a: FP (N/953) | Relaxed^a: F2 |
|---|---|---|---|---|---|---|
| UNIV-BFN | 4.09 | 32.49 | 0.267 | 5.13 | 12.70 | 0.443 |
| LASSO-1SE | 3.84 | 6.05 | 0.383 | 5.53 | 3.71 | 0.559 |
| LASSO-MIN | 4.25 | 9.01 | 0.399 | 5.98 | 6.53 | 0.569 |
| ELNET-1SE | 5.26 | 20.51 | 0.405 | 6.21 | 9.33 | 0.560 |
| ELNET-MIN | 5.53 | 26.11 | 0.393 | 6.61 | 14.40 | 0.548 |
| HCLST-CORR-SGL | 5.40 | 17.91 | 0.428 | 6.39 | 7.09 | 0.597 |
| HCLST-BOOT-SGL | 5.20 | 16.66 | 0.420 | 6.35 | 7.07 | 0.594 |
| RF | 3.53 | 18.41 | 0.281 | 4.91 | 7.68 | 0.462 |
| BAGGING | 3.56 | 13.94 | 0.308 | 4.73 | 6.70 | 0.456 |
| BART-LOCAL | 4.68 | 15.66 | 0.387 | 6.32 | 7.13 | 0.591 |
| BART-GLOBALSE | 1.96 | 0.53 | 0.228 | 2.24 | 0.22 | 0.261 |
| BART-GLOBALMAX | 0.01 | 0.00 | 0.001 | 0.01 | 0.00 | 0.001 |

B. Continuous outcome

| Method | Strict: TP (N/10) | Strict: FP (N/990) | Strict: F2 | Relaxed^a: TP (N/10) | Relaxed^a: FP (N/953) | Relaxed^a: F2 |
|---|---|---|---|---|---|---|
| UNIV-BFN | 4.83 | 39.57 | 0.286 | 5.90 | 17.47 | 0.468 |
| LASSO-1SE | 2.88 | 4.49 | 0.298 | 4.33 | 2.42 | 0.454 |
| LASSO-MIN | 4.61 | 10.52 | 0.419 | 6.47 | 7.60 | 0.599 |
| ELNET-1SE | 3.88 | 8.27 | 0.366 | 4.87 | 3.29 | 0.500 |
| ELNET-MIN | 5.18 | 14.79 | 0.433 | 6.61 | 8.88 | 0.598 |
| HCLST-CORR-SGL | 5.82 | 19.86 | 0.444 | 6.81 | 8.51 | 0.617 |
| HCLST-BOOT-SGL | 5.52 | 17.03 | 0.441 | 6.72 | 8.45 | 0.610 |
| RF | 4.63 | 28.46 | 0.316 | 5.93 | 14.26 | 0.493 |
| BAGGING | 4.40 | 25.73 | 0.314 | 5.81 | 12.96 | 0.494 |
| BART-LOCAL | 5.14 | 18.35 | 0.404 | 6.80 | 8.34 | 0.617 |
| BART-GLOBALSE | 2.40 | 0.93 | 0.274 | 2.85 | 0.41 | 0.326 |
| BART-GLOBALMAX | 0.01 | 0.00 | 0.002 | 0.01 | 0.00 | 0.002 |

  1. TP: true positive; FP: false positive. Boldface denotes the best-performing method by F2 statistic
  2. ^a Under the relaxed definition, if a true variable or its surrogate was selected, that variable was considered to be identified. Surrogates are therefore no longer in the pool of potential false positives, but the maximum number of true positive variables remains 10
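
The strict and relaxed counting rules in the footnotes, and the F2 column, can be illustrated with a minimal Python sketch. It rests on assumptions that the table itself does not state: F2 is taken to be the standard F-beta score with beta = 2, and the function names, variable names, and surrogate mapping are hypothetical. The fractional counts in the table look like averages over simulation replicates, so feeding averaged TP and FP into the formula can only approximate a reported F2.

```python
def f_beta(tp, fp, fn, beta=2.0):
    """Standard F-beta score from counts (assumed definition of F2).

    With beta = 2, recall is weighted more heavily than precision.
    """
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom > 0 else 0.0


def score_selection(selected, true_vars, surrogates=None):
    """Return (TP, FP, FN) for one set of selected variables.

    surrogates is a hypothetical mapping from a true variable to the set of
    variables treated as its surrogates. Pass None for the strict definition.
    """
    selected = set(selected)
    true_vars = list(true_vars)
    surrogates = surrogates or {}

    # A true variable counts as identified if it, or (relaxed definition)
    # any of its surrogates, was selected.
    tp = sum(
        1
        for v in true_vars
        if v in selected or selected & set(surrogates.get(v, ()))
    )

    # Under the relaxed definition, surrogates leave the false-positive pool.
    surrogate_pool = set().union(*surrogates.values()) if surrogates else set()
    fp = len(selected - set(true_vars) - surrogate_pool)
    fn = len(true_vars) - tp  # the maximum TP stays len(true_vars)
    return tp, fp, fn


# Toy example with made-up variable names.
true_vars = ["x1", "x2", "x3"]
surrogates = {"x1": {"s1"}, "x2": {"s2"}}
selected = ["x1", "s2", "n7"]

strict = score_selection(selected, true_vars)
relaxed = score_selection(selected, true_vars, surrogates)
print(strict, round(f_beta(*strict), 3))    # (1, 2, 2) 0.333
print(relaxed, round(f_beta(*relaxed), 3))  # (2, 1, 1) 0.667
```

As a rough check of the assumed formula, `f_beta(4.09, 32.49, 10 - 4.09)` gives about 0.267, which agrees with the strict F2 reported for UNIV-BFN in panel A. Other rows differ slightly in the third decimal, as expected if F2 was averaged per replicate rather than computed from averaged counts.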