Fig. 2 | BMC Medical Research Methodology

Fig. 2

From: Multimorbidity in middle-aged women and COVID-19: binary data clustering for unsupervised binning of rare multimorbidity features and predictive modeling

Unsupervised binning of rare features and generation of the Feature Matrix (FM) from the newly engineered features together with the other features. First, the data for the prevalent features are sliced out. Clustering is then applied to the remaining data, which contain the non-prevalent features. The process involves both feature-level clustering, in which features are grouped into clusters using the BMD algorithm, and data-level clustering, in which patients' records are grouped into clusters. The two tasks are interconnected: the features within each feature cluster are used to create FBMs, and data-level clustering is then performed on these FBMs to assign patients' records to clusters. The cluster assignments obtained from data-level clustering thus act as new features (bins) that replace the original sparse data. The ultimate objective is to construct an engineered FM by combining these new bins with the prevalent features, so that both prevalent features and combinations of non-prevalent features are considered in predictive modeling.
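
The flow described in the caption can be illustrated with a minimal Python sketch. It is not the authors' implementation. The BMD algorithm is not reproduced here: feature-level clustering is replaced by a simple greedy grouping on Jaccard similarity, and data-level clustering by labeling identical row patterns, both as explicit stand-ins. The function names, the prevalence threshold of 0.3, the similarity cutoff of 0.5, and the toy records are all assumptions made up for illustration.

```python
def split_by_prevalence(records, threshold):
    """Return column indices of prevalent and non-prevalent (rare) features."""
    n = len(records)
    n_features = len(records[0])
    prevalence = [sum(r[j] for r in records) / n for j in range(n_features)]
    prevalent = [j for j, p in enumerate(prevalence) if p >= threshold]
    rare = [j for j, p in enumerate(prevalence) if p < threshold]
    return prevalent, rare


def jaccard(a, b):
    """Jaccard similarity between two binary columns."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


def cluster_features(records, rare, min_similarity=0.5):
    """Stand-in for BMD (NOT the BMD algorithm): greedily put each rare
    feature into the first cluster whose first member is similar enough."""
    columns = {j: [r[j] for r in records] for j in rare}
    clusters = []
    for j in rare:
        for cluster in clusters:
            if jaccard(columns[j], columns[cluster[0]]) >= min_similarity:
                cluster.append(j)
                break
        else:
            clusters.append([j])  # no similar cluster found: start a new one
    return clusters


def cluster_records(fbm):
    """Stand-in data-level clustering: records with identical patterns
    in the sub-matrix share a label (numbered by first appearance)."""
    labels = {}
    return [labels.setdefault(tuple(row), len(labels)) for row in fbm]


def build_feature_matrix(records, threshold=0.3):
    """Combine prevalent features with one new bin column per feature cluster."""
    prevalent, rare = split_by_prevalence(records, threshold)
    bins = []
    for cluster in cluster_features(records, rare):
        fbm = [[r[j] for j in cluster] for r in records]  # sub-matrix for this cluster
        bins.append(cluster_records(fbm))
    return [
        [r[j] for j in prevalent] + [b[i] for b in bins]
        for i, r in enumerate(records)
    ]


# Hypothetical toy data: 8 patient records, 6 binary features.
# Columns 0-1 are prevalent; columns 2-5 are rare (prevalence 0.25 each).
records = [
    [1, 1, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 0],
    [1, 1, 0, 0, 1, 1],
    [0, 1, 0, 0, 1, 1],
    [1, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0],
]
fm = build_feature_matrix(records)
```

In this toy case the four rare columns collapse into two bin columns, so each row of `fm` has four entries: the two prevalent features followed by one cluster label per group of rare features. A real pipeline would substitute the BMD algorithm and a proper clustering method for the two stand-ins.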
