 Research
 Open Access
 Published:
Joint latent class model: Simulation study of model properties and application to amyotrophic lateral sclerosis disease
BMC Medical Research Methodology volume 21, Article number: 198 (2021)
Abstract
Background
In many clinical applications, evolution of a longitudinal marker is censored by an event occurrence, and, symmetrically, event occurrence can be influenced by the longitudinal marker evolution. In such frameworks joint modeling is of high interest. The Joint Latent Class Model (JLCM) allows to stratify the population into groups (classes) of patients that are homogeneous both with respect to the evolution of a longitudinal marker and to the occurrence of an event; this model is widely employed in reallife applications. However, the finite samplesize properties of this model remain poorly explored.
Methods
In the present paper, a simulation study is carried out to assess the impact of the number of individuals, of the censoring rate and of the degree of class separation on the finite sample size properties of the JLCM. A reallife application from the neurology domain is also presented. This study assesses the precision of class membership prediction and the impact of covariates omission on the model parameter estimates.
Results
Simulation study reveals some departures from normality of the model for survival submodel parameters. The censoring rate and the number of individuals impact the relative bias of parameters, especially when the classes are weakly distinguished. In realdata application the observed heterogeneity on individual profiles in terms of a longitudinal marker evolution and of the event occurrence remains after adjusting to clinically relevant and available covariates;
Conclusion
The JLCM properties have been evaluated. We have illustrated the discovery in practice and highlights the usefulness of the joint models with latent classes in this kind of data even with prespecified factors. We made some recommendations for the use of this model and for future research.
Background
Joint models for longitudinal and timetoevent data are now widespread due to large cohort studies allowing collection of repeated measures of biomarkers and clinical events times [1]. The most popular way to analyze this kind of combined data are the shared random effects models, proposed by Wulfsohn and Tsiatis [2], where a function of random effects, issued from the model for longitudinal marker, is included as a covariate into the survival model. This approach allows to explain the relation between a longitudinal parameter and a timetoevent, assuming a homogeneous population. However, for certain diseases, the homogeneity assumption is not met and existence of different profiles of biomarker progression and/or of the time toevent should be accounted for in the model.
Mixture models are widely used in medical research. Different extensions allowing to account for the potential heterogeneity in population were proposed. Verbeke and Lesaffre [3] extended the mixture model to longitudinal data, assuming a latent profile of the biomarker progression (growth mixture model GMM). Muthén and Shedden [4] jointly studied longitudinal data with a binary outcome. Lin et al. [5] developed the joint latent class model (JLCM) replacing the binary outcome by a timetoevent. The JLCM allows firstly to account for the dependency between a longitudinal biomarker and a timetoevent by distinguishing between different profiles of biomarker progression associated with the risk of event. Secondly, it allows to analyze different profiles of longitudinal biomarker process censored by the event occurrence. Finally, the JLCM provides predictions for the risk of event conditional on the biomarker progression.
Very flexible, the JLCM remains quite complex. Indeed, it is composed of 3 submodels (a multinomial logistic regression for latent classes, a linear mixed model for longitudinal process and a survival model for the timetoevent) and each of these submodels can include covariates with effects specific or common to the latent classes.
To our knowledge, very few papers deal with studying the properties and the behaviour of the JLCM, for example ProustLima et al. [6], therefore it is rarely used in published clinical studies. Using a literature search of MEDLINE and WOS until december 2020, we found only 8 medical papers published since the model development in 2002 [5]. These papers appeared following a comprehensive methodology review concerning the JLCM [7] and have different objectives. These objectives can be summarized as follows: 1) to study the relationship between a longitudinal biomarker and the risk of event [8–11]; 2) to identify subgroups of longitudinal biomarker progression censored by the event occurrence [12]; 3) to study the impact of different factors on the longitudinal biomarker progression censored by the event occurrence [13]; 4) to predict the risk of an event based on the longitudinal biomarker progression [14, 15]. Different implementations of the model were proposed to achieve a same objective. For example, for the first objective, Syrjälä et al. [8] search for the relation between childhood food consumption and the risk of advanced islet autoimmunity using a JLCM without covariates; Brilleman et al. [9] explore the relationship between the changes in body mass index and the risk of death and/or transplant in hemodialysis patients by means of the JLCM for competing risks, including the prespecified covariates with a common effect on latent classes only in the survival submodel; Ogata et al. [10] and Portegies et al. [11] analyze the association between fasting plasma glucose progression and the risk of cardiovascular disease and the association between the blood pressure trajectories and the risk of stroke respectively by including the prespecified covariates with a latent classspecific effect into the linear mixed submodel and into the survival submodel. As other examples, for the fourth objective, [14] search to prevent Alzeihmer disease using MMSE (MiniMental State Examination) score progression and creating a predictive risk model with classspecific covariates in both linear mixed submodel and in the survival submodel; Stamenic et al. [15] defined latent classes to assess the impact of serum creatinine on graft failure risk with no covariates in JLCM, and performed a multivariable multinomial logistic analysis after defining these latent classes in order to analyze the factors associated to the classes.
A few simulation studies concerning the JLCM and its extensions (competing risks, interval censoring, multistate survival submodel) were carried out [6, 16–18]. However, these simulations focus on the model usability and aim at validating the estimation procedure rather than exploring the general properties of the model and its finitesample properties.
Thus the usage of the model is heterogeneous and its properties in terms of sample size and censoring rate are not comprehensively studied.
In this context, the objective of this paper is to empirically, by a simulation study, explore the asymptotic properties of the JLCM model, namely, the impact of the censoring rate and of the number of individuals on bias and normality of parameter estimates as well as on the quality of latent class identification. A real data application will also be carried out. Within this application, the impact of covariates omission and inclusion in the model on estimations and class membership prediction will be investigated.
Methods
Joint latent class model
The joint latent class model is composed of three submodels: a multinomial logistic regression defining the probability of belonging to a latent class, a mixed linear model for each latent class describing the evolution of the longitudinal marker, and a survival model accounting for the timetoevent for each class. The submodels are detailed as follows.

The multinomial logistic regression is defined by π_{i}g, the probability of individual i to belong to a given latent class g, conditional on a covariate vector X_{i}:
$${} \pi_{ig}=P({ c_{i}}=g\pmb{X}_{i})=\frac{\mathrm{e}^{\xi_{0g}+{\pmb{X}_{i}^{T}}\pmb{\xi}_{1g}}}{\sum_{l=1}^{{G}}\mathrm{e}^{\xi_{0l}+{\pmb{X}_{i}^{T}}\pmb{\xi}_{1l}}}, $$(1)where c_{i} is the latent class for patient i, \(c_{i} \in (1, \cdots, G), \pmb {X}_{i}^{T}\) is a vector of explanatory variables for i necessarily independent of time, ξ_{1g} the vector of coefficients associated to the covariates effects within class g. Note that ξ_{0G}=0 and ξ_{1G}=0 to assure the model identifiability. If no prior information about the latent class is available, it is possible to use the marginal probability of the class g, \( \frac {e^{\pmb {\xi }_{0g}}}{\sum _{l=1}^{G}e^{\pmb {\xi }_{0l}}}\) in Eq. (1).

The mixed linear model for a trajectory of a longitudinal marker of an individual i over time points t_{ij},Y_{ij} in a latent class g is defined as:
$${}Y_{ij}(c_{i}=g)={\pmb{X}_{1ij}^{T}}\pmb{\gamma}+{\pmb{X}_{2ij}^{T}}\pmb{\beta}_{g}+{\pmb{Z}_{ij}^{T}}\pmb{b}_{ig}+\epsilon_{ij}, $$(2)where \(\pmb {X}_{1ij}^{T}\) is the vector of explanatory variables common to all latent classes and γ the corresponding vector of coefficients, \(\pmb {X}_{2ij}^{T}\) is the vector of classspecific explanatory variables with β_{g} the corresponding vector of coefficients, and Z_{ij} is the vector of explanatory variables associated with the random effects \(\pmb {b}_{ig} \sim \mathcal {N} (\pmb {\mu }_{g}, \pmb {B}_{g})\) (μ_{g} is a mean of random effects, B_{g} is a variancecovariance matrix of random effects, both of which can be common or specific to latent classes). Note that \(\pmb {X}_{1ij}^{T}\) and \(\pmb {X}_{2ij}^{T}\) have no variables in common.

The survival model for an individual i over time is defined by its hazard function, α_{i}(t), within each latent class as:
$$ \alpha_{i} (t)\left(c_{i}=g\right) = {\alpha_{0}}\left(t,\pmb{\zeta}_{g}\right)\exp\left(\pmb{X}_{1i}^{T}\pmb{\vartheta}+{\pmb{X}_{2i}^{T}}\pmb{\eta}_{g}\right) $$(3)with α_{0}(·) the baseline risk function in latent class g, parametrized by vector \(\pmb {\zeta }_{g}, \pmb {X}_{1i}^{T}\) is the vector of explanatory variables and 𝜗 the associated parameters common to all latent classes, \(\pmb {X}_{2i}^{T}\) is the vector of classspecific explanatory variables and η_{g} the corresponding classspecific parameters of the model.
We denote by T_{i} the observed time to a clinical event of interest for individual i. In the framework of JLCM, it is important to note that the measures of the longitudinal marker after T_{i}, if there exist, are excluded from the observed data. Indeed, the objective is to describe the link between the risk of the event and the marker change over time preceding the event. The observed duration \(T_{i}= \min (T^{\star }_{i},C_{i})\), where \(T^{\star }_{i}\) corresponds to the real timetoevent (possibly not observed) and C_{i} corresponds to the rightcensored duration. The survival function corresponding to the hazard of Eq. (3), is defined as:
$${} S(t)=\exp \left(\int_{0}^{t} \alpha(u) du\right) $$(4)
Note that the individual covariate vectors \(\pmb {X}_{i}^{T}\) can be different in each of the three submodels (Eqs. (1)(3)), but have same notations for simplicity.
Likelihood
The parameters of the model can be estimated by the maximum likelihood method. The loglikelihood of the model defined for G latent classes is defined by Commenges and JacqminGadda [19] as:
where π_{ig} is the probability of belonging to class g (Eq. (1)), \(\phantom {\dot {i}\!}{f_{\pmb {y}_{i}c_{i}}(\pmb {Y_{i}}c_{i}=g)}\) is the probability density function of the longitudinal marker data in class g, defined in Eq. (2), α_{i}(T_{i}c_{i}=g) is the hazard function defined in Eq. (3), S_{i}(T_{i}c_{i}=g) is the corresponding survival function. The event indicator δ_{i} for each individual is defined as:
The model parameters are estimated using the maximum likelihood estimator (MLE); the loglikelihood function is maximized by NewtonRaphsonlike algorithm [20].
The optimal number of latent classes, G, is defined following Tofighi and Enders [21] by the BIC (Bayesian information criterion): the number of classes corresponding to the minimum value of BIC is preferred. However, the choice of G is also based on the number of patients per class and the concordance between the a posteriori classification derived from the model and expert opinion.
Class prediction and goodnessoffit
Model goodnessoffit can be assessed by a measure of class prediction accuracy. The class membership can be identified by computing the posterior probability of belonging to a class g for each subject, based on the estimated model parameters. This probability is conditional on the observed covariate vector, i.e. the longitudinal data Y and the event times T, and is defined in Eq. (7):
The subject i is assigned to a class g corresponding to the maximum estimated a posteriori probability π_{ig}.
Other approaches to goodnessoffit can be employed, in particular those based on different types of residuals corresponding to different submodels. These approaches will not be developed in the present paper.
Results
Simulation study
In the present study the properties of the JLCM are assessed by MonteCarlo simulations. Simulations focus on the general model properties, on the model robustness to the number of individuals and the number of events, and on the quality of class separation.
The general framework for the simulation study is presented below.
Simulations design
The simulations are carried out for different settings in terms of the number of individuals n, n={100,500,1000,5000}, and in terms of the censoring rate τ,τ={0.05,0.10,0.15,0.25,0.50}, allowing to explore both possible asymptotic directions: the number of individuals and number of observed events [22]. The capacity of the model to distinguish between the latent classes is investigated by considering two different settings in terms of class separation: high separation (the classes are very different in terms of longitudinal marker evolution) and low separation (the classes are quite similar). The censoring mechanism was independent from the event process and no covariates were included in simulated models. Given the complex likelihood function, the optimisation algorithm may not always converge. That’s why for each setting in terms of n, τ and class separation, 120 datasets were generated to assure obtaining at least 100 results in each setting. The distribution of each of the estimated parameters was then analyzed in terms of normality, relative bias and coverage rate. The normality was assessed graphically by quantilequantile plots. Indeed, normality tests would often reject the null hypothesis due to outliers in parameter estimations (this situation is probable due to the likelihood complexity; it results in local maxima, but is rare in practice) and/or to high test power. The relative bias for a parameter θ is calculated as:
with \(\frac {1}{K}\sum _{h=1}^{K}\hat {\theta }_{n,h}\) the average parameter estimation from the sample of n individuals over K MonteCarlo runs, and θ the real parameter value. The absolute value will be considered.
The coverage rate was calculated for each model parameter as the percentage of coverage of the real value by the estimated confidence interval.
The capacity of the model to distinguish the latent classes is assessed by the percentage of correctly predicted class memberships.
Data generation
The real parameters were chosen to mimic the real data, described in Stamenic et al. paper [15], dealing with a prognostic tool for individualized prediction of graft failure risk within ten years after kidney transplantation, using serum creatinine progression as a longitudinal marker. Following Eqs. (1  4), the generated data were governed by the following general model:
M(t) being the survival function of the censoring distribution and C the censoring time. Note that the fact that there is no covariate in logistic model for class membership implies constant probability for each class membership. The considered longitudinal model is a random intercept mixed model and it implies that in Eq. (2), \(\pmb {X}_{1ij}^{T}\) is a zero matrix (no common covariates) and \(\pmb {X}_{2ij}^{T} = \left (1 \quad t_{ij}\right)\). The considered survival and censoring distributions imply that the survival and censoring times are Weibull random variables. The parameters of the censoring distribution were chosen empirically to meet the required censoring rate given the corresponding survival distribution. These nuisance parameters are not presented in the article.
The time points for repeated measures of the longitudinal marker are fixed to 1, 3, 6, 12, 18 and 24 months, following Stamenic et al. [15]. The parameters vector for a 2classes model, with class common random effect and error variance of mixed submodel is as follows:
The real values for the parameters were chosen as follows:

1
High separation framework.
This setting is directly derived from Stamenic et al. [15], resulted in β_{01}=170,β_{02}=100,β_{11}=88 by year, β_{12}=1.2 by year, \( \sigma ^{2}_{b,1}= \sigma ^{2}_{b,2}=50\) and \(\sigma ^{2}_{\epsilon,1}=\sigma ^{2}_{\epsilon,2}=60, \zeta _{11}=4.5, \zeta _{21}=2, \zeta _{12}=50, \zeta _{22}=1.01\).

2
Low separation framework.
In this setting the values of the mixed model from high separation are divided by 2 to obtain quiet similar classes in terms of longitudinal marker evolution; survival model as well as random parameters were not modified, resulting in β_{01}=135,β_{02}=100,β_{11}=44 by year, β_{12}=1.2 by year.
In both settings, the shape parameter for the Weibull distribution for censoring was fixed to 1.5, inspired from real life, where more censoring occurs with time. The scale parameter for this distribution was empirically derived to meet the required censoring rate. The probability of class 1 membership was set to 0.3 in both settings, resulting in the logistic model parameter from Eq. (1) ξ_{01}=−0.84. The examples of simulated trajectories for the high separation and low separation settings are illustrated in Fig. 1; the observed longitudinal trajectories are rather confounded in the low separation setting in comparison with the high separation.
Normality assessment
The normality of the estimated parameters is assessed by plotting quantilequantile plots for each setting in terms of classes, the number of individuals n and of the censoring rate τ.
Figure 2 illustrates the results for the mixed and the survival submodels, for 100 individuals, censoring rate 0.05 and 0.5 in the high separation setting. For small censoring rate (0.05) the normality of all the parameters is globally respected; heavy censoring (0.5) implies deviations from normality for the parameters of the survival submodel.
Similar trends are observed for the other settings in terms of n and τ (results not presented). Note that the normality of the longitudinal submodel parameters is not heavily impacted by small sample size and/or heavy censoring. Also, the MLE’s normality is not considerably influenced by the degree of class separation according to the present simulation study (results not presented). However, this conclusion should be considered with caution, since it can be different for different separation degrees.
As expected, departures from normality decrease with increasing number of individuals (see Fig. 3 for the Weibull scale and shape parameters, heavy censoring) regardless of heavy censoring. Note that most of papers dealing with asymptotic properties of survival models are focused on the regression coefficients.Very few papers focus on the Weibull distribution parameters. Sirvanci and Yang [23] derives the asymptotic normality of the Weibull model parameters for Type I censoring data (fixed length of followup). However, in our study, empirically the departures from normality are reported for small sample size in terms of the number of events and/or the number of individuals (simulation results not presented here); in this sense, the normality problem is not specific to the joint latent class model, but is rather inherited from survival analysis.
Relative bias assessment
The relative bias (RB) of classspecific parameters estimates is illustrated in Figs. 4 and 5 for the high separation setting and in Figs. 6 and 7 for the low separation setting. The detailed numerical results are provided in Tables 1 and 2 for the high and low separation settings respectively.
The general trends for the RB range and for its evolution according to the sample size and to the censoring rate depend on model parameter and on degree of class separation. Concerning the variance parameters (the variance of error and of the random effect in the mixed submodel) there is no clear trend in their RB evolution; the following trends are revealed for the remaining parameters:

As for the absolute values, in the high separation setting (Fig. 6 and Table 1), the RB is the most important for two parameters of class 2: 1) the survival submodel Weibull shape parameter (RB over 10% for small number of individuals) and 2) the mixed submodel slope parameter (RB varies from 10% to 120% depending on number of individuals and on the censoring rate, the mean number of longitudinal markers in the worse case (100 patients and a censor of 50%) is 5.1). For the remaining parameters the RB does not exceed 10%. The trend is quite similar for the low separation setting (Fig. 6 and Table 2), but to a higher extent: the RB varies from over 30% to 530% in the worst setting (small n and high τ).

As for the impact of the censoring rate, the RB increases linearly for a given number of individuals according to the decreasing number of events (increasing censoring rate). This trend is the same for both settings in terms of degree of class separation, but, in the same manner that the RB absolute values, in a higher extent for the low separation setting. Precisely, in the high separation setting the RB decreases by around 1% for the parameters of class 1 (28% in the low separation case) and for around 35% (215% in the low separation case) for the parameters of class 2, for the exception of the mixed model slope: 100% decrease in the RB in the high separation (respectively 400% in the low separation setting) for τ decreasing from 50% to 5%). Note that the linear trend for RB evolution in terms of τ is not always respected for small n.

As for the impact of the number of individuals, the increasing n does not seem to strongly impact the RB. Moreover, the Weibull shape parameter is more influenced than the Weibull scale. Also, the low separation setting is more influenced than the high separation setting.
Note that class 2 has the least number of patients with a lower risk of death; therefore the parameters of this class are more affected by the censoring rate. Also, the high bias for the class 2 slope parameter is explained by the small theoretical value for this parameter (β_{12}=1.2).
Coverage rate assessment
The coverage rate is globally satisfactory (refer to Tables 3 and 4 for the 95% coverage rates in the high and the low separation settings respectively). However, the large sample size in terms of the number of individuals results in smaller confidence intervals, entailing lower empirical coverage rate. This trend is especially visible for heavy censoring. Departures from normality already mentioned for these settings can also be a cause of this phenomenon.
Class membership prediction assessment
The quality of the class membership prediction is globally satisfactory (Table 5): it is over 90% for the majority of settings in terms of n and τ. However, this quality is globally weaker for the low separation setting (less than 95% comparing to a rate higher than 95% for the high separation setting) and for heavy censoring (8385% for the low separation setting, censoring rate 0.5). A decreasing censoring rate results in a 1% to 3% of the class identification improvement for all n, for the exception of heavy censoring τ. The sample size n does not considerably influence the quality of predictions, and in the low separation setting the prediction accuracy is around 36% weaker compared to the high separation setting, for the exception of heavy censoring cases.
The obtained simulations results can be summarized as follows: in general the MLE properties of the model parameters are impacted by the number of individuals as well as by the number of observed events and the number of longitudinal observations, which are both governed by the censoring rate. Note that the frequency of longitudinal marker observations also determines the number of observed measures, although this parameter is left fixed in the present study.
The quality of class membership identification depends on the number of observed events rather than on the number of observed individuals. The degree of class separation, determined by the classspecific slope of the longitudinal model, influences the bias and the normality of the MLE as well as the class identification accuracy. The assessment of the model properties was carried out after removing simulations with estimation convergence problems. The convergence problems are principally due to initial parameter values used in numerical estimation procedure. Such situations are quite rare : 1/120 (0.8%) for the setting n=100 in high separation case and 9/120 times (7.5%) for the setting n=100 in low separation case. Other settings were not impacted.
Real data application
In the present section, the analysis of the Amyotrophic Lateral Sclerosis (ALS) progression using a joint latent class model is presented.
ALS is a rapidly progressive and ultimately fatal neurodegenerative disease with an average life expectancy of 3–5 years from symptoms onset. However, longer than 10years survival has been reported in 5–10% of patients [24, 25]. Despite numerous clinical trials dealing with treatments aimed at survival increase, only riluzole exhibited moderate efficacy [26]. One of the reasons which can explain the negative results of clinical trials is a strong heterogeneity of ALS patients in terms of the disease progression. The disease progression is generally measured at specific time points, resulting in a longitudinal marker. In this context, the joint latent class model, allowing to capture the patients heterogeneity and to simultaneously account for a longitudinal marker and a survival time, is better suited to analyze the ALS data.
The objective of our application is twofold. Firstly, it is focused on capturing and describing the profiles of ALS patients in terms of the survival probability, the disease progression and clinical characteristics, described by covariates. Secondly, it aims at exploring the results in the light of model properties revealed by the simulation study.
Data collection
The data were collected in the framework of the Trophos prospective cohort study (TRO19622), a multicenter, randomized, placebo controlled, phase II/III clinical trial, which showed no efficacy of olesoxime in ALS [27]. The cohort consisted of 512 patients recruited across 15 European centres during the threeyears period (2009–2011). The study time scale is the time since inclusion. The mean age of patients was 56 (sd=11.2) years at inclusion and 55 (sd=11.2) years at symptoms onset, with 331 (64.6%) men and 181 (35.4%) women. The diagnosis was definite in 107 patients (20.9%) and probable in 404 patients (79.1%) [28]; 101 (19.8%) patients suffered from bulbar form. The disease duration spanned between 6 and 36 months. Patients were treated with 50mgriluzole twice a day for at least one month and had a baseline slow vital capacity (SVC) of 70%.
All patients were examined at inclusion and every 3 months thereafter for a maximum of 18 months for clinical, biochemical and hematological parameters. The diseasespecific functional rating scale, revised ALSFRS (ALSFRSR), was also assessed 1 month postinclusion and then every 3 months until 18 months maximum. Survival time was defined as the duration between the date of disease onset and the date of a composite endpoint: ALSrelated death, tracheotomy, beginning of the noninvasive positive pressure ventilation (NIPPV) over 23 hours per day for 14 consecutive days or the date when last known to be alive.
Model construction
In terms of class identification, from 1 to 4 latent classes were considered. A quadratic trend for the longitudinal marker evolution was specified, and the corresponding mixed model was specific to each class, meaning that the quadratic terms were eliminated if not significantly different from 0, leading to a linear trend. The model performance in terms of class identification was assessed by the BIC.
To assess the impact of the sample size on parameter estimations, the estimations were carried out for the whole sample (512 patients) and for a subset of 100 randomly chosen patients. The results from the reduced sample appeared to be slightly different (results not presented here), reflecting the potential bias, revealed by the simulation study.
To better understanding of latent classes, modeling with and without covariates was performed. The covariates were included into the survival and the mixed submodels, whereas the logistic regression, describing the probability of belonging to a class, was defined without covariates in all settings.

A model without covariates (Eq. 9) includes a randomintercept mixed model with a classspecific quadratic function of time specified for the longitudinal marker evolution Y_{ij}; the variances of the random effect (\(\sigma ^{2}_{b}\)) and of the error (\(\sigma ^{2}_{\epsilon }\)) were considered common to all classes. Survival curves are also considered as classspecific.
The originally intervalcensored survival times, collected at baseline and at months 1, 3, 6, 9, 12, 15 and 18, were imputed from a Weibull distribution of these intervalcensored dates to obtain the exact event times. The imputation was carried out in order to obtain the setting close to that used in simulations. Specifically, a Weibull distribution was first fitted to the intervalcensored dates, and then the exact event times were sampled from this distribution truncated by the limits of the observed intervals for each patient.
$$ \left\{\begin{array}{ll} \pi_{ig}=\frac{\mathrm{e}^{\xi_{0g}}}{\sum_{l=1}^{{G}}\mathrm{e}^{\xi_{0l}}} \text{, from Eq.~(1)}\\ \\ Y_{ij}(c_{i}=g)= \beta_{0g} + \beta_{1g} t_{ij} + \beta_{2g} t_{ij}^{2}+ b_{0i}+b_{1i} t_{ij}\\+b_{2i} t_{ij}^{2}+\epsilon_{i},& \\ \quad \quad \pmb{b}_{i} \sim \mathcal{N}\left(0, \pmb{B}\right), \epsilon_{ij} \sim \mathcal{N}\left(0, \sigma^{2}_{\epsilon}\right) \text{, from Eq.~(2)} \\ \\ S(t_{i})(c_{i}=g) = \exp\left(\left(\frac{t_{i}}{\zeta_{1g}}\right)^{\zeta_{2g}}\right),& \\ \quad \quad T^{\star}\sim \mathcal{W}eibull \left(\zeta_{1g}, \zeta_{2g}\right), \text{ from Eq.~(4)} \end{array} \right. $$(9)with B covariance matrix of random effects.

A model with covariates (Eq. 10, the hazard function is specified for easier interpretation) was specified based on clinical expertise and a preliminary unpublished study. This model includes baseline individual characteristics in the random intercept mixed submodel and in the survival submodel; the impact of these characteristics is specified common to all classes, following the clinical considerations. The quadratic term of time for the mixed submodel appeared to be not significantly different from 0 for this model and is thus removed. Baseline covariates and their interactions with time were as well chosen from clinical expertise.
The following abbreviations are used: AO (Age at onset), SO (Symptom Onset), BMI (Body Mass Index), MUSC (Muscular capacity), SVC (Slow vital capacity), MCV (Mean corposcular volume).
$$ {}\left\{ \begin{array}{lll} \pi_{ig}&=\frac{\mathrm{e}^{\xi_{0g}}}{\sum_{l=1}^{{G}}\mathrm{e}^{\xi_{0l}}} \text{, from Eq.~(1)}\\ \\ Y_{ij}(c_{i}=g)&= \beta_{0g} + \beta_{1g}t_{ij} + \gamma_{1}SO_{i}+ \gamma_{2}BMI_{i} + &\\& \gamma_{3}MUSC_{i}+ \gamma_{4}SVC_{i}+ \gamma_{5}MCV_{i} + &\\& t_{ij} \times \left(\gamma_{6}SO_{i}+ \gamma_{7}MUSC_{i} + \gamma_{8}SVC_{i} \right)+ &\\& b_{0i}+b_{1i} t_{ij}+\epsilon_{ij}, &\\& b_{i} \sim \mathcal{N}\left(0, \sigma^{2}_{b}\right), &\\& \epsilon_{ij} \sim \mathcal{N}\left(0, \sigma^{2}_{\epsilon}\right) \text{from Eq.~(2)} \\ \\ \alpha_{i}(t)(c_{i}=g) &=\underbrace{\zeta_{1g}^{\zeta_{2g}}\zeta_{2g} t^{\zeta_{2g}  1}}_{\alpha_{0} (t)} \exp(\vartheta_{1}{SO}_{i} +\vartheta_{2}{BMI}_{i} + &\\& \vartheta_{3}{MUSC}_{i} + \vartheta_{4}SVC_{i} + &\\&\vartheta_{5}AO_{i}) \text{, from Eq.~(3)} \\ \end{array} \right. $$(10)
Real data analysis results
According to the BIC, 4 latent classes were retained for the model without covariates (BIC=15110 for 1 latent class, 14974 for 2 classes, 14911 for 3 latent classes and 14901 for 4 latent classes)and 2 latent classes for the model with covariates (BIC=14517 for 1 latent class, 14408 for 2 classes, 14410 for 3 latent classes and 14420 for 4 latent classes). Estimation results are presented in Table 6 and in Table 7 for the two models respectively. Models without and with covariates using the complete cases sample included 511 and 497 patients respectively. The difference in the number of patients is caused by missing covariates. Estimated survival curves and predicted ALSFRS evolution profiles are illustrated in Figs. 8 and 9 for the two considered models respectively.
The resulting latent classes are quite distinct both for the 4classes no covariate model and for the 2classes model including the covariates. The classes are characterized by a degree of ALSFRS decline and by the survival probability: a more rapid ALSFRS evolution is associated to a worse survival prognosis (refer to Figs. 8 and 9). In particular, the latent classes identified within the model without covariates can be interpreted in the following manner (refer to Fig. 8 for illustration).

Classes 1 and 4 from the model without covariates are each composed of 5.1% of population. They represent patients with the most rapid decrease of ALSFRS and the highest risk of death, with a median survival around 7 months and 14 months for class 1 and 4 respectively.

Class 2 is the largest (68.5% of patients) and is characterized by the slowest evolution of ALSFRS and the highest survival rate (median survival over 20 months).

Class 3 is composed of 21.3% of population and represents an “average” class with an ALSFRS progression similar to that in class 1 but with a lower baseline value: from Table 6 we observe the baseline value of 37 in class 3 vs 39 for class 2. The survival probability in class 3 is lower than that in class 2, with a median survival around 15 months.
The latent classes identified within the model with covariates can be interpreted in the following manner (refer to Fig. 9 for illustration).

Class 1 is the largest (92.6% of patients), is characterized by a moderate ALSFRS progression (2.3 point by months) and by a better survival prognosis (over 20 months median survival compared to around 8 months for class 2, for a patient with the average covariates vector).

Class 2 is composed only of 37 patients (7.4%) and describes a specific patient profile, worsening and dying very quickly.
Note that after adjustment on the prespecified factors from literature, known to be associated to ALSFRS progression and survival, two latent patient profiles are identified by the model, indicating a lack of explanatory capacity of these factors and motivating the use of the latent class model. This remaining latency in the model with covariates confirms the interest of using the JLCM to analyze this kind of data, and suggests a need for further clinical analysis of the disease progression.
Discussion
Several general considerations and recommendations concerning the use of the joint latent class model can be derived from the results of simulations.
To summarize, the departures from normality are particularly present for the survival submodel parameters, and these departures disappear for a large enough number of observed events (small censoring rate) and/or large enough sample size (from 500 individuals normality is generally respected even for heavy censoring).
In terms of the relative bias, the trends are more complex. The parameters of the survival submodel are also more impacted, especially for a small sample size n. The large number of individuals does not compensate for heavy censoring, as it was the case for normality. There is no particular trend in terms of n, except for the survival submodel parameters, whose bias is considerably increased for n=100. The bias decreases quasi linearly for almost all parameters with increasing number of observed events (decreasing censoring rate). The estimations in the low separation case are less robust to the sample size and to the censoring rate than in the high separation case.
Finally, the class identification accuracy is slightly higher for the high separation setting and for smaller censoring, but not considerably influenced by the number of individuals, except for the case of heavy censoring; in the low separation setting the class identification accuracy is quite poor.
In the light of the obtained results, several remarks can be formulated concerning the general model usability.
Concerning implementation, the low separation setting, i.e., the small difference in the longitudinal model slopes, the likelihood optimization procedure is more likely to converge to a local maxima. Thus, several estimations with different initial parameter values should be carried out to assure that the obtained estimation is the global maxima.
Concerning the general model properties, the following should be accounted for.

1
Small sample size in terms of the number of individuals results in deviations from normality, especially for the survival model parameters. The provided confidence intervals may not be valid.

2
Heavy censoring implies bias in parameter estimation, especially in case of weak separation between latent classes. This bias is not compensated by large sample size.

3
Heavy censoring gives poor class identification accuracy, especially in the case of weak separation between latent classes.

4
The model parameters are generally more sensible to censoring rate than to the number of individuals in terms of bias, thus, increasing the time of observation is more beneficial for the accuracy of estimates than increasing the sample size in terms of the number of individuals.

5
In case of poor separation between latent classes, the bias increases and the class predictions accuracy decreases, the results should be interpreted with caution.

6
Small latent groups with few events (heavy censoring) should be characterized with caution, since the parameter estimations can be considerably biased.
As for the real data application results, using the joint latent class model for the described data is beneficial. Indeed, the latency remains in data after adjustment on covariates known from clinical expertise. Note however that the observed ALSFRS profiles are rather distinguished, i.e. the observed data are close to the high separation setting, implying better general results. As shown by simulations, in case of lower separation, it could be more difficult to obtain and interpret the latent classes. Moreover, the results obtained from the whole and reduced samples differ (results not presented). Thus, care should be taken when interpreting the parameters derived from small samples due to possible bias and inference problems resulting from departures from normality.
In the present paper, we focus on JLCM as the approach to account for unobserved heterogeneity when modelling censored longitudinal outcomes. Other alternatives to this approach exist as the mixed latent Markov models proposed by Bartolucci et al.
Conclusion
The JLCM properties have been evaluated. We have illustrated the discovery in practice and highlights the usefulness of the joint models with latent classes in this kind of data even with prespecified factors. We made some recommendations for the use of this model and for future research. Further work is needed to assess the role of covariates, their place in different submodels of the JLCM, and the impact of their omission on parameter estimations and class membership identification. Also, precise recommendations concerning a minimum number of events or individuals needed to obtain satisfactory results within the JLCM can be formulated. Impact of longitudinal observation frequency on parameter estimations and latent classes identification can also be study considered in further work.
Availability of data and materials
The TROPHOS dataset analysed during the current study is publically available from https://pubmed.ncbi.nlm.nih.gov/27713255/
Abbreviations
 JLCM:

Joint latent class model
 MLE:

Maximum likelihood estimations
 GMM:

growth mixture model
 MMSE:

Minimental state examination
 ALS:

amyotrophic lateral sclerosis
 SVC:

slow vital capacity
 ALSFRSr:

amyotrophic lateral sclerosis functional rating scale revised, NIPPV noninvasive positive pressure ventilation
 BIC:

bayesian information criterion
References
 1
Rizopoulos D. Joint models for longitudinal and timetoevent data: With applications in R. London: Chapman & Hall; 2012.
 2
Wulfsohn MS, Tsiatis AA. A joint model for survival and longitudinal data measured with error. Biometrics. 1997; 53:330–9.
 3
Verbeke G, Lesaffre E. A linear mixedeffects model with heterogeneity in the randomeffects population. J Am Stat Assoc. 1996; 91(433):217–21.
 4
Muthén B, Shedden K. Finite mixture modeling with mixture outcomes using the em algorithm. Biometrics. 1999; 55(2):463–9.
 5
Lin H, Turnbull BW, McCulloch CE, Slate EH. Latent class models for joint analysis of longitudinal biomarker and event process data: application to longitudinal prostatespecific antigen readings and prostate cancer. J Am Stat Assoc. 2002; 97(457):53–65.
 6
ProustLima C, Dartigues JF, JacqminGadda H. Joint modeling of repeated multivariate cognitive measures and competing risks of dementia and death: a latent process and latent class approach. Stat Med. 2016; 35(3):382–98.
 7
ProustLima C, Séne M, Taylor JM, JacqminGadda H. Joint latent class models for longitudinal and timetoevent data: A review. Stat Methods Med Res. 2014; 23(1):74–90.
 8
Syrjälä E, Nevalainen J, Peltonen J, Takkinen HM, Hakola L, Åkerlund M, Veijola R, Ilonen J, Toppari J, Knip M, Virtanen SM. A joint modeling approach for childhood meat, fish and egg consumption and the risk of advanced islet autoimmunity. Sci Rep. 2019; 9(1):7760. https://0doiorg.brum.beds.ac.uk/10.1038/s41598019441961. Accessed 02 Jul 2019.
 9
Brilleman SL, MorenoBetancur M, Polkinghorne KR, McDonald SP, Crowther MJ, Thomson J, Wolfe R. Changes in body mass index and rates of death and transplant in hemodialysis patients: a latent class joint modeling approach. Epidemiology. 2019; 30(1):38–47. https://0doiorg.brum.beds.ac.uk/10.1097/EDE.0000000000000931. Accessed 10 May 2019.
 10
Ogata S, Watanabe M, Kokubo Y, Higashiyama A, Nakao YM, Takegami M, Nishimura K, Nakai M, Kiyoshige E, Hosoda K, Okamura T, Miyamoto Y. Longitudinal trajectories of fasting plasma glucose and risks of cardiovascular diseases in middle age to elderly people within the general Japanese population: the Suita Study. J Am Heart Assoc. 2019; 8(3):010628. https://0doiorg.brum.beds.ac.uk/10.1161/JAHA.118.010628.
 11
Portegies MLP, Mirza SS, Verlinden VJA, Hofman A, Koudstaal PJ, Swanson SA, Ikram MA. Midto latelife trajectories of blood pressure and the risk of stroke: the Rotterdam Study. Hypertension (Dallas, Tex.: 1979). 2016; 67(6):1126–32. https://0doiorg.brum.beds.ac.uk/10.1161/HYPERTENSIONAHA.116.07098.
 12
Jiang G, Luk AOY, Tam CHT, Xie F, Carstensen B, Lau ESH, Lim CKP, Lee HM, Ng ACW, Ng MCY, Ozaki R, Kong APS, Chow CC, Yang X, Lan HY, Tsui SKW, Fan X, Szeto CC, So WY, Chan JCN, Ma RCW, Hong Kong Diabetes Register TRS Study Group. Progression of diabetic kidney disease and trajectory of kidney function decline in Chinese patients with Type 2 diabetes. Kidney Int. 2019; 95(1):178–87. https://0doiorg.brum.beds.ac.uk/10.1016/j.kint.2018.08.026.
 13
Marioni RE, ProustLima C, Amieva H, Brayne C, Matthews FE, Dartigues JF, JacqminGadda H. Cognitive lifestyle jointly predicts longitudinal cognitive decline and mortality risk. Eur J Epidemiol. 2014; 29(3):211–9. https://0doiorg.brum.beds.ac.uk/10.1007/s1065401498818.
 14
Qin Y, Tian Y, Han H, Liu L, Ge X, Xue H, Wang T, Zhou L, Liang R, Yu H. Risk classification for conversion from mild cognitive impairment to Alzheimer’s disease in primary care. Psychiatry Res. 2019; 278:19–26. https://0doiorg.brum.beds.ac.uk/10.1016/j.psychres.2019.05.027. Accessed 02 Jul 2019.
 15
Stamenic D, Rousseau A, Essig M, Gatault P, Buchler M, Filloux M, Marquet P, Prémaud A. A prognostic tool for individualized prediction of graft failure risk within ten years after kidney transplantation. J Transplant. 2019; 2019:7245142. https://0doiorg.brum.beds.ac.uk/10.1155/2019/7245142.
 16
ProustLima C, Joly P, Dartigues JF, JacqminGadda H. Joint modelling of multivariate longitudinal outcomes and a timetoevent: a nonlinear latent class approach. Comput Stat Data Anal. 2009; 53(4):1142–54.
 17
Ferrer L, Rondeau V, Dignam J, Pickles T, JacqminGadda H, ProustLima C. Joint modelling of longitudinal and multistate processes: application to clinical progressions in prostate cancer. Stat Med. 2016; 35(22):3933–48.
 18
Rouanet A, Joly P, Dartigues JF, ProustLima C, JacqminGadda H. Joint latent class model for longitudinal data and intervalcensored semicompeting events: Application to dementia. Biometrics. 2016; 72(4):1123–35.
 19
Commenges D, JacqminGadda H. Modèles Biostatistiques Pour L’épidémiologie. France: De Boeck Superieur; 2015.
 20
Proust C, JacqminGadda H. Estimation of linear mixed models with a mixture of distribution for the random effects. Comput Methods Prog Biomed. 2005; 78(2):165–73.
 21
Tofighi D, Enders CK. Identifying the correct number of classes in growth mixture models. Adv Latent Variable Mixture Model. 2008; 2007:317–41.
 22
Babykina G, Couallier V. Empirical assessment of the maximum likelihood estimator quality in a parametric counting process model for recurrent events. Comput Stat Data Anal. 2012; 56(2):297–315.
 23
Sirvanci M, Yang G. Estimation of the weibull parameters under type i censoring. J Am Stat Assoc. 1984; 79(385):183–7.
 24
Yates E, Rafiq M. Prognostic factors for survival in patients with amyotrophic lateral sclerosis: analysis of a multicentre clinical trial. J Clin Neurosci. 2016; 32:51–6.
 25
Chio A, Logroscino G, Hardiman O, Swingler R, Mitchell D, Beghi E, Traynor BG, Consortium E, et al. Prognostic factors in ALS: a critical review. Amyotroph Lateral Scler. 2009; 10(56):310–23.
 26
Zinman L, Cudkowicz M. Emerging targets and treatments in amyotrophic lateral sclerosis. Lancet Neurol. 2011; 10(5):481–90.
 27
Lenglet T, Lacomblez L, Abitbol J, Ludolph A, Mora J, Robberecht W, Shaw P, Pruss R, Cuvier V, Meininger V, et al. A phase II III trial of olesoxime in subjects with amyotrophic lateral sclerosis. Eur J Neurol. 2014; 21(3):529–36.
 28
Brooks BR, Miller RG, Swash M, Munsat TL. El escorial revisited: revised criteria for the diagnosis of amyotrophic lateral sclerosis. Amyotroph Lateral Scler Other Motor Neuron Disorders. 2000; 1(5):293–9.
Acknowledgements
The authors wish to acknowledge support from the ARSLA charity (Association pour la Recherche sur la Sclérose Latérale Amyotrophique et autres maladies du motoneurones). We thank Valerie Cuvier, PierreFrançois Pradat, Vincent Meininger for the of data of the Trophos prospective cohort study (TRO19622). The study has been funded by ARLSA charity.
Funding
Not applicable.
Author information
Affiliations
Contributions
MK, AD, GB, JL participated in the design and conduct of the study. MK and GB writing the manuscript. MK, GB performed the statistical analysis and revised the manuscript. CT helped the statistical simulation and analysis. DD helped in the design of application method. MK, AD, GB, JL, CT and DD conceived of the design and coordination of the study and helped revising the manuscript. All authors read and approved the final manuscript.
Corresponding author
Ethics declarations
Ethics approval and consent to participate
Not applicable
Consent for publication
Not applicable
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Kyheng, M., Babykina, G., Ternynck, C. et al. Joint latent class model: Simulation study of model properties and application to amyotrophic lateral sclerosis disease. BMC Med Res Methodol 21, 198 (2021). https://0doiorg.brum.beds.ac.uk/10.1186/s12874021013779
Received:
Accepted:
Published:
DOI: https://0doiorg.brum.beds.ac.uk/10.1186/s12874021013779
Keywords
 Joint model
 Latent classes
 Survival analysis
 Linear mixed model
 MLE properties
 Monte Carlo simulations
 Amyotrophic lateral sclerosis