Overfitting is a common problem in machine learning and statistics.
It means that the model has been fit so tightly to the training data that it fails miserably when new, previously unseen data is presented to it.
When overfitting has occurred, the noise in the data has been included in the model as well. The best way of thinking about overfitting is that the model has started to memorize the data rather than learn a general model from it, one that is also valid for new, unseen data.
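A minimal sketch of this effect, assuming scikit-learn and its built-in digits dataset (chosen here purely for illustration): an unconstrained decision tree can memorize the training set perfectly while scoring noticeably worse on held-out data.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# An unrestricted tree can grow until it fits every training point,
# noise included.
tree = DecisionTreeClassifier(random_state=0)
tree.fit(X_train, y_train)

print("train accuracy:", tree.score(X_train, y_train))  # typically 1.0
print("test accuracy: ", tree.score(X_test, y_test))    # noticeably lower
```

The gap between the two scores is the signature of overfitting: the model has memorized the training points rather than learned something that transfers.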
There are various steps one can take to avoid overfitting; a few of them are sketched in code below the list:
- cross-validation
- regularization
- early stopping
- pruning
- Bayesian priors on parameters
- model comparison
- dropout
- restricting the complexity of the model
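For example, k-fold cross-validation estimates how well a model generalizes by averaging scores over several train/validation splits, so an overfit model is caught before deployment. A minimal sketch, again assuming scikit-learn (the dataset and model are arbitrary illustrations):

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_digits(return_X_y=True)
model = LogisticRegression(max_iter=2000)

# 5-fold cross-validation: each fold serves once as held-out data.
scores = cross_val_score(model, X, y, cv=5)
print("mean accuracy: %.3f (+/- %.3f)" % (scores.mean(), scores.std()))
```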
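Regularization penalizes large parameter values so the model cannot bend itself around the noise. A sketch using ridge regression (L2 penalty); the synthetic data and the alpha value are assumptions for illustration, and in practice alpha would be tuned, e.g. with the cross-validation above:

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 20))
y = X[:, 0] + 0.1 * rng.normal(size=50)  # only the first feature matters

# alpha controls the strength of the L2 penalty on the coefficients;
# a larger alpha shrinks coefficients toward zero and reduces variance.
model = Ridge(alpha=1.0).fit(X, y)
print(model.coef_)
```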
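Early stopping halts training once performance on a validation split stops improving, instead of letting the model keep fitting the training noise. scikit-learn's SGDClassifier supports this directly; the hyperparameter values below are illustrative:

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import SGDClassifier

X, y = load_digits(return_X_y=True)

# Hold out 10% of the training data internally; stop when the
# validation score has not improved for 5 consecutive epochs.
model = SGDClassifier(
    early_stopping=True,
    validation_fraction=0.1,
    n_iter_no_change=5,
    random_state=0,
)
model.fit(X, y)
print("epochs actually run:", model.n_iter_)
```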
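Dropout randomly zeroes a fraction of activations during training so the network cannot rely on any single unit. A minimal sketch, assuming PyTorch (the layer sizes and dropout rate are arbitrary):

```python
import torch
import torch.nn as nn

# During training, each hidden activation is dropped with probability 0.5;
# at evaluation time (model.eval()), dropout is disabled automatically.
model = nn.Sequential(
    nn.Linear(64, 128),
    nn.ReLU(),
    nn.Dropout(p=0.5),
    nn.Linear(128, 10),
)

x = torch.randn(8, 64)  # a batch of 8 dummy inputs
model.train()           # dropout active in training mode
print(model(x).shape)   # torch.Size([8, 10])
```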