My previous post covered the basics of logistic regression. We must now examine the model to understand how well it fits the data and how well it generalizes to other observations. The evaluation process involves assessing three distinct areas (goodness of fit, tests of individual predictors, and validation of predicted values) in order to produce the most useful model. While the following content isn’t exhaustive, it should provide a compact ‘cheat sheet’ and guide for the modeling process.

Goodness of Fit

Likelihood Ratio Test

A logistic regression model is said to provide a better fit to the data if it demonstrates an improvement over a model with fewer predictors. This is assessed with the likelihood ratio test, which compares the likelihood of the data under the full model against the likelihood of the data under a model with fewer predictors. The null hypothesis, $H_0$, holds that the reduced model is true, so a p-value for the overall model fit statistic that is less than the chosen significance level $\alpha$ (conventionally 0.05) would compel us to reject $H_0$.
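The comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the workflow used in this post: it assumes you already have the maximized log-likelihoods of the reduced and full models from whatever fitting routine you use, and that the two models differ by `df` parameters. The function names `chi2_sf` and `likelihood_ratio_test`, and the log-likelihood values in the usage lines, are hypothetical and chosen only for illustration. The test statistic is $-2(\ell_{reduced} - \ell_{full})$, which is compared against a chi-square distribution with `df` degrees of freedom.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) times a finite sum of (x/2)^j / j!.
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd df: two-sided normal tail plus a finite correction sum.
    tail = math.erfc(math.sqrt(half))
    term = math.sqrt(x)  # first term of the sum: x^(1/2) / 1
    total = 0.0
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)  # next term: x^(r + 1/2) / (1 * 3 * ... * (2r + 1))
    return tail + 2.0 * math.exp(-half) / math.sqrt(2.0 * math.pi) * total


def likelihood_ratio_test(loglik_reduced, loglik_full, df):
    """Return (test statistic, p-value) for nested models differing by df parameters."""
    statistic = -2.0 * (loglik_reduced - loglik_full)
    return statistic, chi2_sf(statistic, df)


# Hypothetical log-likelihoods: the full model adds two predictors.
statistic, p_value = likelihood_ratio_test(-120.5, -112.3, df=2)
print(f"LR statistic = {statistic:.3f}, p-value = {p_value:.6f}")
print("Reject H0" if p_value < 0.05 else "Fail to reject H0")
```

The chi-square tail probability is written out by hand only to keep the sketch free of dependencies; in practice a statistics library's chi-square survival function would do the same job.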
Logistic regression is used to analyze the relationship between a dichotomous dependent variable and one or more categorical or continuous independent variables. It specifies the likelihood of the response variable as a function of various predictors. The model is expressed as $\log(\text{odds}) = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k$, where $\beta$ refers to the parameters and $x_i$ represents the independent variables. The $\log(\text{odds})$, or log of the odds, is defined as $\ln\left[\frac{p}{1-p}\right]$. It expresses the natural logarithm of the ratio of the probability that an event will occur, $p(Y=1)$, to the probability that it will not occur, $p(Y=0)$. The model's estimates, $\beta$, express the relationship between the independent and dependent variables on a log-odds scale. A coefficient of $\beta_1$ would indicate that a one-unit difference in $x_1$ is associated with a change of $\beta_1$ in the log-odds of the occurrence of $Y$. To get a clearer understanding of the constant effect of a predictor on the likelihood that an outcome will occur, odds ratios can be calculated. This can be expressed as $\text{odds}(Y) = \exp(\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k)$, which is the exponential of