Benchmarking Variants of Recursive Feature Elimination: Insights from Predictive Tasks in Education and Healthcare
Abstract
This study systematically examines Recursive Feature Elimination (RFE) and its variants as feature selection methods for educational data mining (EDM) and healthcare analytics. Feature selection is crucial in predictive modeling because of the challenges posed by high dimensionality and data complexity. The paper first reviews the foundational principles of the RFE algorithm, highlighting its iterative mechanism for ranking features and removing the less informative ones. It then provides a comprehensive narrative review of RFE modifications, including integration with different machine learning models, use of multiple feature importance metrics, algorithmic enhancements through cross-validation and local search techniques, and hybrid approaches that combine RFE with other dimensionality reduction methods. An empirical evaluation then compares four RFE variants (standard RFE, RF-RFE, Enhanced RFE, and RFE with local search) across two distinct predictive tasks: regression, predicting students’ mathematics achievement from a digital assessment dataset, and classification, predicting chronic heart failure from a clinical dataset. Results indicate that RF-RFE captured complex feature interactions and slightly improved predictive performance, whereas Enhanced RFE provided substantial dimensionality reduction with minimal loss in accuracy, offering a balanced approach suited to practical applications. The study concludes by discussing implications, limitations, and recommendations for applying RFE effectively in educational and clinical contexts.
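The iterative mechanism mentioned in the abstract (fit a model, rank the features, drop the weakest, repeat) can be illustrated with a minimal sketch. This is not the implementation evaluated in the study: it assumes a plain least-squares linear model with absolute standardized coefficients as the importance score, and the function name `rfe_linear` and the synthetic data are hypothetical, introduced only for illustration.

```python
import numpy as np

def rfe_linear(X, y, n_keep):
    """Sketch of RFE: refit, rank by |coefficient|, drop the weakest feature."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Standardize columns so coefficient magnitudes are comparable.
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    remaining = list(range(X.shape[1]))  # indices of features still in play
    eliminated = []                      # indices in the order they were removed
    while len(remaining) > n_keep:
        # Refit on the surviving features, with an intercept column appended.
        A = np.column_stack([X[:, remaining], np.ones(len(y))])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        # Least important feature = smallest |coefficient| (intercept ignored).
        weakest = int(np.argmin(np.abs(coef[:-1])))
        eliminated.append(remaining.pop(weakest))
    return remaining, eliminated

# Hypothetical synthetic data: only columns 0 and 3 carry signal.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 5))
y_demo = 3.0 * X_demo[:, 0] - 2.0 * X_demo[:, 3] + 0.1 * rng.normal(size=200)

selected, dropped = rfe_linear(X_demo, y_demo, n_keep=2)
print("kept:", selected, "removed in order:", dropped)
```

On this toy data the two signal-carrying columns should survive elimination. Under the same loop, a variant such as RF-RFE would presumably replace the linear fit with a random forest and its importance scores, but that substitution is not shown here.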