In machine learning, we often face questions like:
- How do we decide which model to use?
- Which model will be more efficient?
- Which parameters are the best for a model?
Cross validation is a technique we can use to evaluate different models and find the best tuning parameters.
How does cross validation work?
- First, we divide our dataset into k partitions called folds
- Then we use one of the folds as the testing set and the remaining folds as the training set
- We calculate the testing accuracy and repeat these steps until each fold has been used once as the testing set, with the others as the training set
- After that, we take the mean of these accuracies as the overall estimate of the model's accuracy
- We do the same for multiple models and pick the best model for our use case
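
The steps above can be sketched in plain Python. This is only a minimal illustration under stated assumptions, not a reference implementation: the helper names (`k_fold_indices`, `cross_val_accuracy`, `fit_majority`, `fit_threshold`) and the tiny one-feature dataset are made up for this example, and the two "models" are deliberately trivial so that the fold logic stays visible.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the indices 0..n_samples-1 and split them into k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Taking every k-th index gives k folds of (nearly) equal size.
    return [indices[i::k] for i in range(k)]


def cross_val_accuracy(fit, X, y, k=5, seed=0):
    """Return the mean testing accuracy of a model over k folds.

    `fit(X_train, y_train)` is assumed to return a function
    `predict(x) -> label` (a convention invented for this sketch).
    """
    scores = []
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        # Train on the remaining folds only.
        predict = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        # Testing accuracy on the held-out fold.
        correct = sum(predict(X[i]) == y[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    # Combine the per-fold accuracies by taking their mean.
    return mean(scores)


def fit_majority(X_train, y_train):
    """Toy baseline: always predict the most common training label."""
    majority = max(set(y_train), key=y_train.count)
    return lambda x: majority


def fit_threshold(X_train, y_train):
    """Toy model: predict 1 above the midpoint of the two class means.

    Assumes labels are 0 and 1 and that class 1 has the larger mean.
    """
    mean0 = mean(x for x, label in zip(X_train, y_train) if label == 0)
    mean1 = mean(x for x, label in zip(X_train, y_train) if label == 1)
    cut = (mean0 + mean1) / 2
    return lambda x: 1 if x > cut else 0


# Hypothetical dataset: one feature, label 1 for the larger values.
X = [float(v) for v in range(20)]
y = [0] * 10 + [1] * 10

# Run the same procedure for each candidate model and pick the best one.
results = {
    "majority": cross_val_accuracy(fit_majority, X, y, k=5),
    "threshold": cross_val_accuracy(fit_threshold, X, y, k=5),
}
best = max(results, key=results.get)
print(results, "best:", best)
```

In practice you would rarely write this by hand. Libraries such as scikit-learn provide the same procedure (for example `cross_val_score` in `sklearn.model_selection`), but the hand-written version shows where each step of the list fits.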