Morphological characterization of the Polatli sheep in terms of live weight using data mining algorithms


Delialioglu R. A., Pehlivan E., Altay Y.

Tropical Animal Health and Production, vol.55, no.6, 2023 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 55 Issue: 6
  • Publication Date: 2023
  • DOI: 10.1007/s11250-023-03811-0
  • Journal Name: Tropical Animal Health and Production
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Agricultural & Environmental Science Database, Aquatic Science & Fisheries Abstracts (ASFA), BIOSIS, CAB Abstracts, Environment Index, Veterinary Science Database
  • Keywords: C&RT, CHAID, Data mining, MARS, Morphological characterization, Polatli sheep
  • Affiliated with Eskişehir Osmangazi University: Yes

Abstract

The aim of this research is to estimate the live weight (LW) of Polatli sheep (Ile de France × Akkaraman (G1)) from body measurements (withers height (WH), rump height (RH), body length (BL), chest depth (CD), chest width (CW), chest girth (CG), and cannon bone circumference (CBC)), age, and sex as independent variables, using the C&RT (Classification and Regression Tree), CHAID (Chi-square Automatic Interaction Detector), and MARS (Multivariate Adaptive Regression Splines) algorithms, and to determine which independent variables are significant in that estimation. For this purpose, a total of 210 sheep, comprising 180 females and 30 males of different ages, were used. The Pearson correlation coefficients between LW and WH, RH, BL, CD, CW, CG, and CBC are 0.897, 0.896, 0.853, 0.948, 0.550, 0.914, and 0.798, respectively (p < 0.05). The data mining algorithms were applied as prediction models with 10-fold cross-validation; for the tree-based algorithms, the minimum parent node size was set to 10 and the minimum child node size to 5. The CHAID and C&RT algorithms each used 8 independent variables to explain the variation observed in LW, whereas the MARS algorithm used 9. In the CHAID algorithm, the highest predicted live weight (93.571 kg) was found in the node defined by age > 3 and the cut point CD > 36 cm. In the C&RT algorithm, the highest predicted live weight was 91.316 kg, for age > 0, CD > 36.5 cm, and CBC > 9.5 cm.
Under commonly used criteria, the prediction performances of the CHAID, C&RT, and MARS algorithms were, respectively: RMSE (root mean square error) 5.788, 5.103, and 4.005; SDR (standard deviation ratio) 0.254, 0.224, and 0.176; MAPE (mean absolute percentage error) 7.555, 6.675, and 5.682; Adj-Rsq (adjusted R-squared) 0.935, 0.950, and 0.969; and AIC (Akaike information criterion) 741.436, 688.489, and 582.792. Between the tree-based algorithms (CHAID and C&RT), C&RT performed better, and across all performance measures the MARS algorithm performed best. Consequently, the C&RT and MARS algorithms can be used reliably in morphological characterization studies to identify indirect selection criteria and to form elite herds in terms of LW, which supports their use in breeding programs focused on live weight.
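
The abstract names five performance criteria but does not give their formulas. The sketch below shows one way they could be computed from observed and predicted live weights, using definitions that are common in the literature; they are an assumption here and may differ in detail from what the authors used (for example, in how the number of model parameters enters Adj-Rsq and AIC). The function name `goodness_of_fit`, the argument `n_params`, and the sample numbers are hypothetical and are not taken from the study.

```python
import math


def goodness_of_fit(observed, predicted, n_params):
    """Return RMSE, SDR, MAPE, adjusted R-squared and AIC for one model.

    Assumed definitions (not confirmed by the abstract):
      RMSE   = sqrt(SSE / n)
      SDR    = sd(residuals) / sd(observed), both with n - 1 in the denominator
      MAPE   = 100 / n * sum(|residual / observed|)
      Adj-R2 = 1 - (SSE / (n - k - 1)) / (SST / (n - 1))
      AIC    = n * ln(SSE / n) + 2 * k
    where k = n_params is the number of terms counted for the model.
    """
    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]

    sse = sum(r * r for r in residuals)                 # error sum of squares
    mean_obs = sum(observed) / n
    sst = sum((o - mean_obs) ** 2 for o in observed)    # total sum of squares
    mean_res = sum(residuals) / n

    sd_res = math.sqrt(sum((r - mean_res) ** 2 for r in residuals) / (n - 1))
    sd_obs = math.sqrt(sst / (n - 1))

    return {
        "RMSE": math.sqrt(sse / n),
        "SDR": sd_res / sd_obs,
        "MAPE": 100.0 / n * sum(abs(r / o) for r, o in zip(residuals, observed)),
        "AdjR2": 1.0 - (sse / (n - n_params - 1)) / (sst / (n - 1)),
        "AIC": n * math.log(sse / n) + 2 * n_params,
    }


if __name__ == "__main__":
    # Made-up live weights in kg, for illustration only (not study data).
    observed = [50.0, 60.0, 70.0, 80.0]
    predicted = [52.0, 58.0, 71.0, 79.0]
    for name, value in goodness_of_fit(observed, predicted, n_params=1).items():
        print(f"{name}: {value:.3f}")
```

Under these definitions, lower RMSE, SDR, MAPE, and AIC and a higher Adj-Rsq indicate a better fit, which matches the ordering CHAID, C&RT, MARS reported above.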