International Journal of Data Mining, Modelling and Management, vol.16, no.2, pp.196-208, 2024 (ESCI)
Classification is the problem of determining which of a set of categories a new observation belongs to, using known features. Examples include categorising e-mails as necessary or unnecessary, or diagnosing a disease from a patient’s characteristics (such as gender, blood pressure, and the presence of various symptoms). Various methods are used for classification. This study investigates the classification performance of ordinal logistic regression, a statistical method, and examines how the classification success of the method changes as the properties of the data set change. To this end, a simulation study was carried out in which data sets with different properties were generated in R. The simulation results show that the correlation structure in the data set, the sample size, and the number and distribution of the response variable categories affect the classification performance of the method. Suggestions are made for improving the classification performance of ordinal logistic regression.