Important Concepts

<aside> 🔴

Data Splitting

</aside>

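A train-test split is a way to check how well a machine learning model generalizes: part of the data is used to train the model, and the rest is held back to test it on examples it has never seen.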

Example: after learning from 80 maths problems, we test ourselves on 20 other problems we have not seen before.

Train data: Used to teach the model.
Test data: Used to see how the model performs on unseen data.
Validation data: Used to check the model's performance during training and to tune it (for example, to choose hyperparameters) without touching the test data.
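
To make the three roles concrete, here is a minimal sketch that uses only the Python standard library. The helper `three_way_split`, the 60/20/20 ratio, and the seed are assumptions chosen for illustration; they are not part of scikit-learn.

```python
import random


def three_way_split(samples, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle samples and cut them into train, validation, and test lists."""
    shuffled = list(samples)               # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # seeded shuffle for repeatable splits
    n_test = int(len(shuffled) * test_fraction)
    n_val = int(len(shuffled) * val_fraction)
    test = shuffled[:n_test]                # held out until the very end
    val = shuffled[n_test:n_test + n_val]   # used to tune the model during training
    train = shuffled[n_test + n_val:]       # used to teach the model
    return train, val, test


problems = list(range(100))  # stand-in for 100 maths problems
train, val, test = three_way_split(problems)
print(len(train), len(val), len(test))  # 60 20 20
```

Every sample lands in exactly one of the three lists, so the model is never evaluated on data it was trained on.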


In Python, you can do this with train_test_split() from scikit-learn.

from sklearn.model_selection import train_test_split

# x holds the features and y the labels; test_size=0.2 holds out 20% of the samples for testing.
xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size=0.2)
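
The call above can be made repeatable and class-balanced with two optional parameters of `train_test_split`: `random_state` and `stratify`. The sketch below uses made-up toy data (100 samples, 20% positive labels), which is an assumption for illustration only.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data (hypothetical): 100 samples with 2 features, 80 negatives and 20 positives.
x = np.arange(200).reshape(100, 2)
y = np.array([0] * 80 + [1] * 20)

xtrain, xtest, ytrain, ytest = train_test_split(
    x,
    y,
    test_size=0.2,    # hold out 20% of the samples
    random_state=42,  # fixed seed so the same split is produced on every run
    stratify=y,       # keep the 80/20 class ratio in both parts
)

print(len(xtrain), len(xtest))             # 80 20
print(int(ytrain.sum()), int(ytest.sum()))  # 16 4
```

Stratifying tends to matter most when one class is rare, since a purely random split could leave the test set with very few examples of it.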

<aside> 🔴

</aside>

Ans:

Limited training data

Model complexity