Fig. 1 | BMC Medical Research Methodology

Fig. 1

From: A comparative study of forest methods for time-to-event data: variable selection and predictive performance

Correct variable selection frequency for datasets A-D with RSF, CIF and MSR-RF. In subplots A-B, the sample size was fixed at 200. Dataset A used a linear form, whereas dataset B included an interaction term. The unordered categorical covariate associated with the outcome had 2 categories (A1, B1), 4 categories (A2, B2) or 8 categories (A3, B3). Dataset C used a linear form with all ten variables generated from the multivariate normal distribution MVN(0, Σ). Subplot C(ρ) was fixed at N = 100 and 25% censoring; C(N) was fixed at ρ = 0 and 25% censoring; C(censoring) was fixed at N = 100 and ρ = 0. Dataset D used a linear form with all variables generated from the standard normal distribution for D1 and from the binomial distribution with probability 0.5 for D2. Subplots D were fixed at N = 100 and 25% censoring with varying ratios M/N, where M/N is the ratio of the number of covariates M to the sample size N.
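
As a rough illustration of the dataset C design described in the caption, the following sketch draws ten covariates from MVN(0, Σ) and produces right-censored survival times. The caption does not give the structure of Σ, the regression coefficients, the baseline hazard or the censoring mechanism, so everything beyond "ten MVN covariates, N = 100, 25% censoring" is an assumption made for illustration: the function name `simulate_dataset_c` is hypothetical, Σ is taken to be exchangeable with off-diagonal ρ, only the first two covariates are given non-zero effects, event times are exponential, and a fixed fraction of subjects is censored at a uniform point before their event time. It should not be read as the authors' simulation code.

```python
import numpy as np


def simulate_dataset_c(n=100, rho=0.0, censoring=0.25, n_vars=10, seed=0):
    """Hypothetical sketch of a 'dataset C'-style simulation (not the authors' code)."""
    rng = np.random.default_rng(seed)

    # Assumption: exchangeable correlation matrix (1 on the diagonal, rho elsewhere).
    sigma = np.full((n_vars, n_vars), rho)
    np.fill_diagonal(sigma, 1.0)
    x = rng.multivariate_normal(np.zeros(n_vars), sigma, size=n)

    # Assumption: linear predictor in which only the first two covariates matter.
    beta = np.zeros(n_vars)
    beta[:2] = 1.0

    # Assumption: exponential event times with hazard exp(x @ beta).
    event_time = rng.exponential(scale=1.0 / np.exp(x @ beta))

    # Assumption: censor a fixed fraction of subjects at a uniform time
    # between 0 and their (unobserved) event time.
    n_cens = int(round(censoring * n))
    censored = rng.choice(n, size=n_cens, replace=False)
    time = event_time.copy()
    time[censored] = rng.uniform(0.0, event_time[censored])

    event = np.ones(n, dtype=bool)
    event[censored] = False  # False marks a censored observation
    return x, time, event


if __name__ == "__main__":
    x, time, event = simulate_dataset_c(n=100, rho=0.5, censoring=0.25)
    print(x.shape, float((~event).mean()))
```

Fixing the censored fraction exactly, rather than drawing independent censoring times, keeps the censoring rate at the nominal value in every replicate; a study could equally tune the rate of an independent censoring distribution to reach 25% on average.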
