BMC Pregnancy and Childbirth, vol. 26, no. 1, 2026 (SCI-Expanded, Scopus)
Background: Small for gestational age (SGA) is a significant concern in obstetrics, with implications for stillbirth, neonatal mortality, and long-term health outcomes. Early detection of SGA is crucial for prevention and treatment, but current methods have limitations. This study aimed to develop a machine learning (ML)-based algorithm to predict SGA using sociodemographic and obstetric features available in the preconception period. Methods: We retrospectively analyzed first-trimester attendees (1 Jan 2022–31 Dec 2023) and developed parity-stratified prediction models (nulliparous vs. primiparous) using sociodemographic and obstetric variables routinely available at the first prenatal visit. Five algorithms (logistic regression, random forest, XGBoost, LightGBM, and extra trees) were trained using an 80/20 stratified train–test split with 5-fold cross-validation. Model performance was assessed using AUC-ROC, accuracy, sensitivity, and specificity. Reporting was guided by the TRIPOD+AI recommendations for prediction model development and validation. Results: Among nulliparous women, logistic regression achieved an accuracy of 72.7% (95% CI 0.464–0.990) and an AUC of 0.733. Among primiparous women, XGBoost achieved an accuracy of 80% (95% CI 0.552–1.000) and an AUC of 0.92. Anthropometric variables (weight, BMI, height) and, in primiparous women, previous birth weight were the most influential predictors. Conclusion: An ML model built from basic maternal sociodemographic findings and obstetric history may serve as an early prediction tool for SGA during the preconception period, particularly in resource-constrained settings, although broader validation is required.
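The width of the reported accuracy intervals is easier to judge with a short worked calculation. The sketch below computes a normal-approximation (Wald) 95% confidence interval for a proportion, clipped to the range 0 to 1. The abstract does not state the interval method or the test-set sizes. The helper `wald_interval` and the counts 8/11 and 8/10 are assumptions, chosen only because they match the reported accuracies.

```python
import math


def wald_interval(p_hat, n, z=1.96):
    """Normal-approximation (Wald) CI for a proportion, clipped to [0, 1]."""
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)


# ASSUMPTION: test-set sizes are not given in the abstract. The counts
# below (8 of 11 and 8 of 10 correct) are hypothetical values that
# happen to match the reported accuracies of 72.7% and 80%.
nulliparous_ci = wald_interval(8 / 11, 11)
primiparous_ci = wald_interval(8 / 10, 10)

print("nulliparous:", [round(b, 3) for b in nulliparous_ci])
print("primiparous:", [round(b, 3) for b in primiparous_ci])
```

Under these assumed counts the bounds are about 0.464 to 0.990 and 0.552 to 1.000, the same as the reported intervals. If the held-out test sets were indeed this small, each misclassified case would shift accuracy by roughly 9 to 10 percentage points, which fits the abstract's own caution that broader validation is required.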