Addressing Imbalanced Data for Health-related Travel Mode Choice



Akalın K. B., Guler S. I.

Annual Postdoctoral Research Symposium at Penn State, Pennsylvania, United States, 08 December 2023, p. 1

  • Publication Type: Conference Paper / Abstract
  • DOI: 10.13140/rg.2.2.14611.66083
  • City of Publication: Pennsylvania
  • Country of Publication: United States
  • Pages: p. 1
  • Affiliated with Eskişehir Osmangazi Üniversitesi: Yes

Abstract

This study uses health-related travel data, encompassing information on individuals, households, and regions, to construct a model for predicting travel mode choices. Although the dataset is rich, consisting of 12,480 trips, the distribution of travel mode choices is highly imbalanced: 97% of trips use private transport and only 3% use public transport. This imbalance hinders the direct application of a logistic model [1, 2], which is commonly used for mode choice modeling, to this dataset. During testing, the model may assign every instance to private transport, predicting 100% private and 0% public transport. The resulting accuracy of 97% is misleading, because it reflects only the share of the majority class and not any ability to identify public transport trips. This discrepancy necessitates the exploration of alternative data regularization, manipulation, or modeling methods.
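
The misleading-accuracy argument can be illustrated with a minimal Python sketch. It is not the authors' model: it builds a synthetic label vector from the abstract's figures (12,480 trips, about 3% public transport, with the exact count rounded as an assumption) and scores a hypothetical classifier that always predicts private transport. The helper names `accuracy`, `recall`, and `balanced_accuracy` are introduced here purely for illustration.

```python
# Synthetic illustration only: counts are approximated from the abstract's
# figures (12,480 trips, about 97% private and 3% public). The exact split
# is an assumption, not data from the study.
N_TRIPS = 12_480
n_public = round(0.03 * N_TRIPS)
n_private = N_TRIPS - n_public

y_true = ["private"] * n_private + ["public"] * n_public
# Hypothetical degenerate classifier: always predicts the majority class.
y_pred = ["private"] * N_TRIPS


def accuracy(y_true, y_pred):
    """Share of trips whose predicted mode matches the observed mode."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, label):
    """Share of trips observed as `label` that were also predicted as `label`."""
    preds_for_label = [p for t, p in zip(y_true, y_pred) if t == label]
    return sum(p == label for p in preds_for_label) / len(preds_for_label)


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of per-class recall, so each mode counts equally."""
    labels = sorted(set(y_true))
    return sum(recall(y_true, y_pred, lab) for lab in labels) / len(labels)


acc = accuracy(y_true, y_pred)
public_recall = recall(y_true, y_pred, "public")
bal_acc = balanced_accuracy(y_true, y_pred)

print(f"accuracy:          {acc:.3f}")
print(f"public recall:     {public_recall:.3f}")
print(f"balanced accuracy: {bal_acc:.3f}")
```

Under these assumed counts, accuracy is about 0.97 while recall for public transport is 0 and balanced accuracy is 0.5, the same value a coin flip would achieve. Class-sensitive metrics of this kind expose the failure that plain accuracy hides.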