On Remedying the Presence of Heteroscedasticity in a Multiple Linear Regression Modelling
Publication Date: 24/06/2024
Author(s): Emmanuel Uchenna Ohaegbulem, Victor Chijindu Iheaka.
Volume/Issue: Volume 7, Issue 2 (2024)
Abstract:
This study demonstrates the importance of remedying heteroscedasticity, where it exists, in regression modelling. Two hypothetical datasets, Data A (the Original) and Data B (the Original), were used for illustration. Both satisfied the normality, multicollinearity and autocorrelation assumptions, but the Breusch-Pagan test and the White test established the presence of heteroscedasticity in each. The estimated multiple linear regression model for Data A (the Original) was statistically significant, with an R-square value of 0.976, an AIC value of 332.5929, and an SBC value of 347.2533; the model for Data B (the Original) was also statistically significant, with an R-square value of 0.553, an AIC value of 69.89669, and an SBC value of 82.15499. A log transformation was applied to the variables in Data A (the Original) and Data B (the Original) to give two new datasets, Data A (Now with Heteroscedasticity Remedied) and Data B (Now with Heteroscedasticity Remedied). These also satisfied the normality, multicollinearity and autocorrelation assumptions, and showed no evidence of heteroscedasticity. The estimated multiple linear regression model for Data A (Now with Heteroscedasticity Remedied) was statistically significant, with an R-square value of 0.986, an AIC value of -135.021, and an SBC value of -120.361; the estimated model for Data B (Now with Heteroscedasticity Remedied) was statistically significant, with an R-square value of 0.624, an AIC value of -32.0801, and an SBC value of -19.8218.
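The diagnose-then-transform workflow described above can be illustrated with a minimal sketch. It is not the authors' code or data: the dataset is synthetic (a multiplicative-error model chosen so that the raw-scale fit is heteroscedastic), the function names `solve`, `ols_residuals` and `breusch_pagan_lm` are hypothetical, and the statistic computed is the studentized (Koenker) form of the Breusch-Pagan test, n times the R-square of the auxiliary regression of squared residuals on the regressors. Only the Python standard library is assumed.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols_residuals(X, y):
    """Fit y on X (an intercept is added) by least squares and return the residuals."""
    Z = [[1.0] + list(row) for row in X]
    p = len(Z[0])
    xtx = [[sum(z[i] * z[j] for z in Z) for j in range(p)] for i in range(p)]
    xty = [sum(z[i] * yi for z, yi in zip(Z, y)) for i in range(p)]
    beta = solve(xtx, xty)
    return [yi - sum(b * zi for b, zi in zip(beta, z)) for z, yi in zip(Z, y)]


def breusch_pagan_lm(X, y):
    """Studentized Breusch-Pagan LM statistic: n * R^2 of squared residuals regressed on X."""
    e2 = [e * e for e in ols_residuals(X, y)]
    aux_resid = ols_residuals(X, e2)
    mean_e2 = sum(e2) / len(e2)
    tss = sum((v - mean_e2) ** 2 for v in e2)
    rss = sum(u * u for u in aux_resid)
    return len(e2) * (1.0 - rss / tss)


# Synthetic data (an assumption for illustration): multiplicative errors make the
# spread of y grow with its mean on the raw scale, but not on the log scale.
rng = random.Random(0)
X_raw, y_raw = [], []
for _ in range(300):
    x1, x2 = rng.uniform(1, 50), rng.uniform(1, 50)
    X_raw.append([x1, x2])
    y_raw.append(2.0 * x1 ** 0.7 * x2 ** 0.3 * math.exp(rng.gauss(0, 0.2)))

# Log-transform every variable, mirroring the remedy described in the abstract.
X_log = [[math.log(v) for v in row] for row in X_raw]
y_log = [math.log(v) for v in y_raw]

lm_raw = breusch_pagan_lm(X_raw, y_raw)
lm_log = breusch_pagan_lm(X_log, y_log)
print(f"LM statistic, raw scale: {lm_raw:.2f}")
print(f"LM statistic, log scale: {lm_log:.2f}")
```

Under the null hypothesis of constant error variance, the statistic is approximately chi-square with as many degrees of freedom as there are regressors; with two regressors the 5% critical value is 5.991. A log transformation requires strictly positive values, which the synthetic data guarantee by construction.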
Judged by the values of R-square (0.986 > 0.976 and 0.624 > 0.553), AIC (-135.021 < 332.5929 and -32.0801 < 69.89669) and SBC (-120.361 < 347.2533 and -19.8218 < 82.15499), the estimated regression models for Data A (Now with Heteroscedasticity Remedied) and Data B (Now with Heteroscedasticity Remedied) were better than the corresponding models for Data A (the Original) and Data B (the Original).
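For readers unfamiliar with the two information criteria, the sketch below shows one common way of computing them for a least-squares fit: AIC = n ln(RSS/n) + 2k and SBC = n ln(RSS/n) + k ln(n), where n is the number of observations, RSS the residual sum of squares and k the number of estimated coefficients. The abstract does not state which formula or software produced its values, so this form is an assumption, and the function name `aic_sbc` and the inputs are hypothetical.

```python
import math


def aic_sbc(n, rss, k):
    """Return (AIC, SBC) for a least-squares fit.

    n   -- number of observations
    rss -- residual sum of squares
    k   -- number of estimated coefficients, including the intercept
    """
    base = n * math.log(rss / n)
    return base + 2 * k, base + k * math.log(n)


# Hypothetical inputs, not taken from the study.
aic_a, sbc_a = aic_sbc(n=100, rss=100.0, k=3)  # RSS/n = 1, so the log term is zero
aic_b, sbc_b = aic_sbc(n=100, rss=4.0, k=3)    # RSS/n < 1, so the log term is negative

print(f"RSS/n = 1.00: AIC = {aic_a:.3f}, SBC = {sbc_a:.3f}")
print(f"RSS/n = 0.04: AIC = {aic_b:.3f}, SBC = {sbc_b:.3f}")
```

Under this form, both criteria turn negative once RSS/n falls below 1 by enough to outweigh the penalty term, which is one way values such as those reported for the log-transformed data can arise.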
Keywords:
Multiple linear regression analysis, Correlation analysis, Heteroscedasticity, Autocorrelation, Multicollinearity, Remedying.