Evaluating the Performances of Robust Logistic Regression Models in the Presence of Outliers.
Publication Date: 02/12/2024
Author(s): Hafiza Inusa Idris, Abdulmalik Mohammed, Umar Faruk Salisu, Kamalu Ibrahim Balansana, Danjuma Abdulazeez, Nurudden Hassan Danrimi.
Volume/Issue: Volume 7, Issue 4 (2024)
Abstract:
Logistic regression models are widely used in the medical and behavioral sciences to describe the effect of explanatory variables on a binary response variable. The maximum likelihood estimator (MLE) is commonly used to estimate their parameters because of its efficiency under the assumed parametric model. However, the MLE is highly sensitive to outlying observations, which can distort the parameter estimates. Robust methods have been proposed to address this problem. This paper investigates the robustness of the GM-Mallows and GM-Schweppes estimators as alternatives to the commonly used ordinary logistic regression model in the presence of outliers. The study used a Monte Carlo simulation, generating data from a logistic regression model with five independent, normally distributed covariates. The data were contaminated with 5% outliers at sample sizes of 50, 200 and 400. The results showed that the GM-Mallows estimator performed best across all metrics, with the lowest AIC, BIC, MSE and MAE, except at n = 50. This suggests that the robust methods, especially GM-Mallows, provide more reliable estimates in the presence of outliers. When outliers are present, GM-Mallows appears to be the top choice, while GM-Schweppes offers a middle ground, providing some robustness with perhaps less extreme adjustments. Ordinary logistic regression might be preferred if simplicity and interpretability are prioritized and there is confidence that outliers are not a significant issue in the data.
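The simulation design described in the abstract can be illustrated with a minimal sketch. It is not the authors' code, and several details are assumptions because the abstract does not state them: the true coefficients (`BETA`), the way outliers are generated (covariates shifted by 5, response set against the model), the robust-distance weights with cutoff `c = 11.07` (the 0.95 quantile of a chi-square with 5 degrees of freedom), and all function names. The weighted fit is a simplified Mallows-type estimator that downweights observations by covariate leverage only; GM-Schweppes, AIC, BIC and MAE are not implemented here.

```python
import numpy as np

BETA = np.array([0.0, 1.0, 1.0, 1.0, 1.0, 1.0])  # assumed true coefficients (intercept first)

def simulate(n, beta, contamination=0.05, rng=None):
    """Draw five N(0,1) covariates and a binary response, then contaminate a fraction."""
    rng = np.random.default_rng(rng)
    X = rng.standard_normal((n, len(beta) - 1))
    prob = 1.0 / (1.0 + np.exp(-(beta[0] + X @ beta[1:])))
    y = rng.binomial(1, prob)
    m = int(round(contamination * n))
    idx = rng.choice(n, size=m, replace=False)
    X[idx] += 5.0                                # assumed outlier mechanism: shift covariates
    eta_out = beta[0] + X[idx] @ beta[1:]
    y[idx] = (eta_out < 0).astype(int)           # response contradicts the model
    return X, y, idx

def mallows_weights(X, c=11.07):
    """Leverage weights min(1, c / d^2) from coordinate-wise median/MAD distances."""
    med = np.median(X, axis=0)
    mad = 1.4826 * np.median(np.abs(X - med), axis=0)
    d2 = np.sum(((X - med) / mad) ** 2, axis=1)
    return np.minimum(1.0, c / np.maximum(d2, 1e-12))

def fit_logistic(X, y, weights=None, n_iter=100, tol=1e-8):
    """Weighted logistic fit by Newton's method; weights=None gives the ordinary MLE."""
    n = X.shape[0]
    Z = np.column_stack([np.ones(n), X])
    w = np.ones(n) if weights is None else np.asarray(weights, dtype=float)
    b = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-np.clip(Z @ b, -30, 30)))
        grad = Z.T @ (w * (y - mu))
        hess = (Z * (w * mu * (1 - mu))[:, None]).T @ Z + 1e-8 * np.eye(Z.shape[1])
        step = np.linalg.solve(hess, grad)
        b += step
        if np.max(np.abs(step)) < tol:
            break
    return b

def monte_carlo(n, reps=100, seed=0):
    """Average squared estimation error of the MLE and the weighted fit over reps."""
    rng = np.random.default_rng(seed)
    err = {"mle": [], "mallows": []}
    for _ in range(reps):
        X, y, _ = simulate(n, BETA, rng=rng)
        err["mle"].append(np.mean((fit_logistic(X, y) - BETA) ** 2))
        err["mallows"].append(np.mean((fit_logistic(X, y, mallows_weights(X)) - BETA) ** 2))
    return {k: float(np.mean(v)) for k, v in err.items()}
```

Calling `monte_carlo(n)` for `n` in 50, 200 and 400 mirrors the sample sizes in the abstract. The numbers it produces depend on the assumed outlier mechanism and should not be read as a reproduction of the reported results.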
Keywords:
Robust, logistic regression, GM-Mallows, GM-Schweppes, Outliers.