Return to page WIKI

Model Fitting

What is Model Fitting?

Model Fitting is a measurement of how well a machine learning model adapts to data that is similar to the data on which it was trained. The fitting process is generally built-in to models and is automatic. A well-fit model will accurately approximate the output when given new data, producing more precise results. A model is fitted by adjusting the parameters within the model, leading to improvements in accuracy. During the fitting process, the algorithm is run on test data, otherwise known as “labeled” data. Once the algorithm has finished running, the results need to be compared to real and observed values of the target variable, in order to identify the accuracy of the model. Using the results, the parameters of the algorithm can be further adjusted to better uncover relationships and patterns between the inputs, outputs, and targets. The process can be repeated until valid and accurate insights are given. 


What Does a Well-Fit Model Look Like? 

A well-fit model should closely match the available data while also closely matching the general curves of the model. No model will be able to perfectly match the input data, but a well-fit model will be able to closely match the data and general shapes. In the image below, it is important to note that the line does not match every individual data point, but does capture the general curve.


Why is Model Fitting Important?

As previously mentioned, a well-fit model does not match every data point given but follows the overall curves. This shows that the model is neither underfit nor overfit. If a model is not properly fit, it will produce incorrect insights and should not be used for making decisions. 


Underfitting occurs when a model oversimplifies the data and fails to capture enough information on the relationships within it. This is frequently a result of insufficient model training time. An underfit model can be identified when a model performs poorly on the training data. 


Overfitting is the opposite of underfitting. It transpires when a model is overly sensitive to the data within, which results in an over-analysis of the patterns within. Overfitting is generally a result of overtraining on training data sets. It can be identified when a model performs well on the data used for training, but does not adapt to new data and performs poorly. 

A Well-Fit Model

A Well-fit model performs well on training data and on evaluation data, due to correct hyperparameters that capture the relationships between the variables and target variables. Generally, fitting is an automatic process where the hyperparameters are adjusted individually and automatically to best suit the data inputted. The use of well-fit models enables users to make better decisions and draw accurate insights




Model Fitting Resources with H2O

An In-Depth Guide to Generalized Linear Modeling with H2O