HELA: A novel hybrid ensemble learning algorithm for predicting academic performance of students


Keser S., Aghalarova S.

EDUCATION AND INFORMATION TECHNOLOGIES, vol.27, no.4, pp.4521-4552, 2022 (SSCI)

  • Publication Type: Article / Full Article
  • Volume: 27 Issue: 4
  • Publication Date: 2022
  • DOI Number: 10.1007/s10639-021-10780-0
  • Journal Name: EDUCATION AND INFORMATION TECHNOLOGIES
  • Journal Indexes: Social Sciences Citation Index (SSCI), Scopus, Communication Abstracts, EBSCO Education Source, Educational research abstracts (ERA), ERIC (Education Resources Information Center), INSPEC
  • Pages: pp.4521-4552
  • Keywords: Educational data mining, Ensemble learning algorithm, Prediction of academic performance, Classification
  • Eskişehir Osmangazi University Affiliated: Yes

Abstract

Education plays a major role in shaping the consciousness of society as a whole, and analyzing educational data on students' academic performance has helped to improve it. By applying data mining techniques and algorithms to data from the educational environment, students' performance can be predicted. In this study, a novel Hybrid Ensemble Learning Algorithm (HELA) is proposed to predict the academic performance of students. The predictions of the base classifiers, namely Gradient Boosting, Extreme Gradient Boosting, Light Gradient Boosting Machine, and different combinations of these algorithms, are given as input to the Super Learner algorithm. The hyper-parameters of the base classifiers are optimized with a Random Search algorithm. The proposed algorithm is used to predict students' performance in Mathematics and Portuguese courses. In the experiments, accuracies of 96.6% and 91.2% are obtained for the Mathematics and Portuguese courses, respectively. To our knowledge, this is the first study to integrate boosting and stacking-based ensemble learning for the prediction of students' academic performance, and the approach gives better predictive results with high efficiency.