Generalized Linear Model

What is GLM?

A generalized linear model (GLM) is a statistical modeling framework that extends ordinary linear regression to response variables that are not normally distributed. A GLM connects the expected value of the response to a linear combination of predictors through a link function, so a single framework covers outcomes as different as continuous measurements, counts, and proportions. Common types include linear regression, Poisson regression, logistic (binomial) regression, and custom generalized linear models. The normal distribution is typically paired with the identity link function, the Poisson distribution with the log link function, and the binomial distribution with the logit link function. Because these models share the same structure, GLM uses a common training technique for all of them.
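The three link functions named above can be written out directly. The following is a minimal sketch using only the Python standard library; the function names and the LINKS table are illustrative assumptions, not part of any particular GLM package.

```python
import math

# Each link maps the mean of the response (mu) onto the scale of the
# linear predictor (eta); each inverse link maps eta back to a mean.

def identity(mu):
    return mu

def log_link(mu):
    return math.log(mu)              # requires mu > 0, as for counts

def logit(mu):
    return math.log(mu / (1 - mu))   # requires 0 < mu < 1, as for probabilities

def inv_identity(eta):
    return eta

def inv_log(eta):
    return math.exp(eta)

def inv_logit(eta):
    return 1 / (1 + math.exp(-eta))

# Typical distribution/link pairings described in the text (illustrative table).
LINKS = {
    "normal": (identity, inv_identity),
    "poisson": (log_link, inv_log),
    "binomial": (logit, inv_logit),
}

for family, (link, inverse) in LINKS.items():
    mu = 0.25
    eta = link(mu)
    print(family, "eta =", eta, "back to mu =", inverse(eta))
```

Applying a link and then its inverse returns the original mean, which is what lets the model work on an unrestricted linear scale while still predicting valid counts or probabilities.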


Examples of GLM

Types of generalized linear models include: 

Classical Linear Regression: Also referred to as linear regression models, these are used on real-valued responses, including negative values. This is often the first model chosen before moving to more complex options. These models are highly interpretable and straightforward to fit to a data set. However, they are appropriate only when the predictors have an additive relationship with the response.

Poisson and Negative Binomial Regression Models: These models explain event counts, such as the number of incidents likely to occur in different locations and situations. Poisson models use the log link function; negative binomial models are commonly used when the counts vary more than a Poisson model allows.

Logit Models: These are used for proportions (ratios of counts) and binary outcomes, and predict quantities such as the odds of winning or the probability of a mechanical object failing. Logit models, also known as logistic regression, use the logit link function.

Other Models: Other kinds of models are used to predict the time to the next failure of parts and people, and to anticipate the lifespan of living and non-living things.
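To make the count models in the list above concrete, here is a rough sketch of fitting a Poisson regression with a log link to one predictor. It uses Newton-Raphson steps on the log-likelihood, a standard way to fit such a model, but the function name fit_poisson and the small data set are hypothetical and chosen only for illustration.

```python
import math

def fit_poisson(xs, ys, iterations=25):
    """Fit log(mu) = b0 + b1 * x to count data by Newton-Raphson."""
    # Start from the intercept-only model: every mean equals the sample mean.
    b0, b1 = math.log(sum(ys) / len(ys)), 0.0
    for _ in range(iterations):
        mus = [math.exp(b0 + b1 * x) for x in xs]
        # Gradient of the Poisson log-likelihood with respect to (b0, b1).
        g0 = sum(y - mu for y, mu in zip(ys, mus))
        g1 = sum((y - mu) * x for x, y, mu in zip(xs, ys, mus))
        # Fisher information matrix entries (2 x 2, symmetric).
        h00 = sum(mus)
        h01 = sum(mu * x for x, mu in zip(xs, mus))
        h11 = sum(mu * x * x for x, mu in zip(xs, mus))
        det = h00 * h11 - h01 * h01
        # Newton step: add (information matrix)^-1 times the gradient.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical data: x could be exposure level, y the number of incidents.
xs = [0, 1, 2, 3, 4, 5]
ys = [1, 2, 4, 7, 12, 20]

b0, b1 = fit_poisson(xs, ys)
fitted = [math.exp(b0 + b1 * x) for x in xs]
print("intercept:", b0, "slope:", b1)
print("fitted means:", fitted)
```

Because the model is linear on the log scale, each one-unit increase in x multiplies the expected count by exp(b1) rather than adding a fixed amount.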


Why is GLM important?

GLM is important because it provides a single way to create models from many kinds of data sets. For it to work properly, three components need to be present: a systematic component (the linear predictor), a link function, and a random component (the probability distribution of the response). Once these are specified, a generalized linear model can represent responses that follow other distributions from the exponential family, such as the Poisson and binomial distributions, which ordinary least-squares regression is not designed to handle. The closely related general linear model, which GLM extends, is often described as the foundation for statistical procedures like the t-test, canonical correlation, discriminant function analysis, and cluster analysis.
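The three components can be written compactly as follows. The notation is an assumption introduced here for illustration: y_i is the response for observation i, x_{i1}, ..., x_{ip} are its predictors, the beta terms are coefficients, and g is the link function.

```latex
\begin{align*}
\eta_i &= \beta_0 + \beta_1 x_{i1} + \dots + \beta_p x_{ip}
  && \text{(systematic component: linear predictor)} \\
g(\mu_i) &= \eta_i, \qquad \mu_i = \mathbb{E}[y_i]
  && \text{(link function)} \\
y_i &\sim \text{an exponential-family distribution with mean } \mu_i
  && \text{(random component)}
\end{align*}
```

Taking g as the identity with a normal distribution gives linear regression, g as the log with a Poisson distribution gives Poisson regression, and g as the logit with a binomial distribution gives logistic regression.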


How is GLM used?

GLMs can be used on various data sets to create models, and different model options can be chosen based on the information given. However, standard generalized linear models are not suited to modeling autocorrelated time series data, because they assume the observations are independent. As with other modeling techniques, GLM rests on assumptions. For best results, the data should be independent and randomly sampled; the response variable does not need to be normally distributed, but its distribution needs to come from the exponential family; and the link-transformed mean of the response needs to depend linearly on the independent variables.
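One rough way to probe the independence assumption is to look at the lag-1 autocorrelation of a model's residuals taken in time order. The sketch below is an informal diagnostic, not a formal test; the function name and both residual series are hypothetical.

```python
def lag1_autocorrelation(values):
    """Correlation between each value and the one that follows it."""
    n = len(values)
    mean = sum(values) / n
    numerator = sum(
        (values[i] - mean) * (values[i + 1] - mean) for i in range(n - 1)
    )
    denominator = sum((v - mean) ** 2 for v in values)
    return numerator / denominator

# Hypothetical residuals that drift steadily upward over time:
# neighbouring values are very similar, so they are not independent.
trending = [0.1 * i for i in range(20)]

# Hypothetical residuals that flip sign at every step:
# also dependent, but with negative autocorrelation.
alternating = [(-1) ** i for i in range(20)]

print("trending:", lag1_autocorrelation(trending))
print("alternating:", lag1_autocorrelation(alternating))
```

Values far from zero in either direction suggest the observations are not independent, in which case a standard GLM is likely a poor fit for the data.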