If you get this warning message when using the generalized linear model function, glm(), then you have a predictor variable that perfectly predicts the response variable. If this matches your expectation of the data, the warning can be safely ignored. Otherwise, you can change the family argument of the glm() function to ensure your model isn't effectively cheating. Perfect prediction is very unlikely in real data: it usually means an independent variable is sneaking a peek at the dependent variable.
Description of the warning – algorithm did not converge
The glm() function fits generalized linear models; with the binomial family and its default logit link, it fits a logistic regression. It produces a model that predicts the response variable from the explanatory variable. If the fitting algorithm does not converge, you get this warning message. With the binomial family, if the data are perfectly separated, the maximum likelihood estimates push the fitted probabilities to 0 or 1: the prediction is too perfect, and the algorithm never settles on finite coefficients. The message depends on that separation being perfect, because changing a single value so that the two classes overlap prevents it from occurring. However, you normally do not have the luxury of simply changing the values.
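One way to see whether a single numeric predictor perfectly separates a 0/1 response is to compare the range of the predictor within each class. The following is a minimal sketch, not something glm() does for you; the names toy, x_ranges and separated are made up for illustration, and the data mirror the example used later in this article.

```r
# Hypothetical toy data: y is 0 for small x and 1 for large x.
toy <- data.frame(x = 1:10,
                  y = c(0, 0, 0, 0, 0, 1, 1, 1, 1, 1))

# Range of x within each class of y.
x_ranges <- tapply(toy$x, toy$y, range)
print(x_ranges)

# The classes are perfectly separated (in the increasing direction) when
# every x with y == 0 lies below every x with y == 1.
separated <- max(toy$x[toy$y == 0]) < min(toy$x[toy$y == 1])
print(separated)
```

This check only covers one predictor and one direction; with several predictors, separation can occur along a combination of them and would not be caught by this simple comparison.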
Explanation of the warning
Here is an example of code that creates this warning message:
> df = data.frame(x=c(1,2,3,4,5,6,7,8,9,10),
+ y=c(0,0,0,0,0,1,1,1,1,1))
>
> glm(y~x, data=df, family="binomial")
Call: glm(formula = y ~ x, family = "binomial", data = df)
Coefficients:
(Intercept)            x
     -245.8         44.7
Degrees of Freedom: 9 Total (i.e. Null); 8 Residual
Null Deviance: 13.86
Residual Deviance: 7.865e-10 AIC: 4
Warning messages:
1: glm.fit: algorithm did not converge
2: glm.fit: fitted probabilities numerically 0 or 1 occurred
In this example, the glm() function fits a model that assigns a probability of essentially one hundred percent to y being zero when x is less than six, and to y being one when x is greater than five. The warning depends on this separation between the zeros and the ones being complete. All that is necessary is to change a single value so that the two groups overlap, and the warning goes away. Unfortunately, in real-life situations, we do not usually have the option of arbitrarily changing the data points.
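To illustrate the point about changing a single value, here is a small sketch that flips one response so that the zeros and ones overlap. The choice of which value to flip is an assumption made for illustration: flipping y at x = 5 or x = 6 would only move the boundary and leave the data perfectly separated, so the sketch flips y at x = 4 instead. The names df_overlap and fit_overlap are hypothetical.

```r
df <- data.frame(x = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
                 y = c(0, 0, 0, 0, 0, 1, 1, 1, 1, 1))

# Hypothetical edit: flip one response so that x = 4 has y = 1 while
# x = 5 still has y = 0. The two classes now overlap.
df_overlap <- df
df_overlap$y[4] <- 1

fit_overlap <- glm(y ~ x, data = df_overlap, family = "binomial")

# The converged component is TRUE when the fitting algorithm converged.
print(fit_overlap$converged)

# With overlapping classes the coefficients stay finite and moderate.
print(coef(fit_overlap))
```

This is only a demonstration of why the warning appears; editing real data to silence a warning is not a fix.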
How to fix the warning – algorithm did not converge
In this example we do not need to change any of the values to fix this warning.
> df = data.frame(x=c(1,2,3,4,5,6,7,8,9,10),
+ y=c(0,0,0,0,0,1,1,1,1,1))
>
> glm(y~x, data=df, family="gaussian")
Call: glm(formula = y ~ x, family = "gaussian", data = df)
Coefficients:
(Intercept)            x
    -0.3333       0.1515
Degrees of Freedom: 9 Total (i.e. Null); 8 Residual
Null Deviance: 2.5
Residual Deviance: 0.6061 AIC: 6.345
All that we need to do is change the family argument of the glm() function from "binomial" to "gaussian". The model is then an ordinary linear regression of y on x, so there are no fitted probabilities to be pushed to zero or one, and the warning message no longer appears. There is no need to change values; it is just a simple matter of changing a single argument of the glm() function. Keep in mind that the result is no longer a logistic regression: the fitted values are not constrained to lie between 0 and 1.
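Before relying on the gaussian fit, it can help to look at the values it actually predicts. The sketch below refits the same model and inspects the range of the fitted values; the names fit_gauss and fitted_vals are made up for illustration.

```r
df <- data.frame(x = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
                 y = c(0, 0, 0, 0, 0, 1, 1, 1, 1, 1))

# Same model as above, fitted with the gaussian family (identity link).
fit_gauss <- glm(y ~ x, data = df, family = "gaussian")

# Fitted values are plain linear predictions, not probabilities.
fitted_vals <- fitted(fit_gauss)
print(round(range(fitted_vals), 3))

# TRUE if any fitted value falls outside the interval [0, 1].
print(any(fitted_vals < 0 | fitted_vals > 1))
```

With the coefficients shown above, the fitted values run from roughly -0.18 at x = 1 to roughly 1.18 at x = 10, so they cannot be read directly as probabilities at the extremes.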
In this example, the glm() function seems to produce this warning message only when the family argument is set to "binomial". Because of this, all that is needed is to change that one argument. Changing the family argument to "gaussian" works, and other families may work as well, depending on the data. This warning message is an easy one to get when you do not have control over the input. It is also an easy one to fix when it comes up.