Diagnostics of multiple group influential observations for logistic regression models

Coskun B., ALPU Ö.

JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, vol.89, no.16, pp.3118-3136, 2019 (SCI-Expanded)


This paper introduces two new methods, GCD.GSPR and mCD*, for detecting multiple influential observations in logistic regression. The proposed diagnostic measures are compared with two existing multiple-influence diagnostics: the generalized difference in fits (GDFFITS) and the generalized squared difference in beta (GSDFBETA). A simulation study is conducted with logistic regression models containing one, two, and five independent variables. For each model, the performance of the diagnostic measures is examined both when a single independent variable is contaminated and when all independent variables are contaminated, at specified contamination rates and intensities. In addition, the diagnostic measures are compared in terms of correct identification rate and swamping rate on a data set frequently cited in the literature.
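
The two evaluation criteria named in the abstract can be illustrated with a minimal sketch. It assumes the usual definitions: the correct identification rate is the share of truly influential observations that a diagnostic flags, and the swamping rate is the share of clean observations that it flags by mistake. The abstract does not give the paper's exact definitions, so the function name `identification_rates`, its arguments, and the example indices are all hypothetical.

```python
def identification_rates(true_influential, flagged, n):
    """Return (correct identification rate, swamping rate).

    Assumed definitions (not taken from the paper):
      - correct identification rate: flagged influential / all influential
      - swamping rate: flagged clean / all clean
    Observations are indexed 0..n-1.
    """
    true_set = set(true_influential)
    flagged_set = set(flagged)
    clean_set = set(range(n)) - true_set

    # Guard against empty groups so the ratios are always defined.
    correct = (len(true_set & flagged_set) / len(true_set)
               if true_set else float("nan"))
    swamping = (len(clean_set & flagged_set) / len(clean_set)
                if clean_set else float("nan"))
    return correct, swamping


# Hypothetical example: 20 observations, the first 4 are contaminated,
# and a diagnostic flags observations 0, 1, 2 and (wrongly) 10.
cir, swamp = identification_rates([0, 1, 2, 3], [0, 1, 2, 10], n=20)
print(cir, swamp)
```

In this made-up example three of the four contaminated observations are flagged and one of the sixteen clean observations is swamped. In a simulation study such rates would typically be averaged over many replications.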