How do you know whether you have collected enough data to train your machine learning model?
There is no easy way to know in advance how many samples you need to collect. However, for a typical ML problem you can follow these steps:
- Build a dataset with a few samples. How many depends on the kind of problem you have; don't spend a lot of time on this at first.
- Split your dataset into training, cross-validation, and test sets, and build your model.
- Now that you've built the ML model, evaluate how good it is by calculating its test error.
- If the test error is worse than you are willing to accept, collect more data and repeat steps 1-3 until you reach a test error rate you are comfortable with.
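This method works only if your model is not suffering from "high bias" (underfitting). A high-bias model is too simple for the problem, so its error stays high no matter how much data you add.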
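The loop described above can be sketched in plain Python. This is only an illustration under stated assumptions: the data source `make_samples`, the straight-line model, the 60/20/20 split, and the batch size are all hypothetical choices made for the example, not something the steps prescribe. In a real project, "collecting data" means gathering new labelled samples rather than generating them.

```python
import random

def make_samples(n, rng):
    """Hypothetical data source: y = 2x + 1 plus Gaussian noise (std 0.5)."""
    samples = []
    for _ in range(n):
        x = rng.uniform(-3.0, 3.0)
        samples.append((x, 2.0 * x + 1.0 + rng.gauss(0.0, 0.5)))
    return samples

def split(data, rng):
    """Shuffle, then split 60/20/20 into train, cross-validation and test sets."""
    data = data[:]
    rng.shuffle(data)
    a, b = int(0.6 * len(data)), int(0.8 * len(data))
    return data[:a], data[a:b], data[b:]

def fit_line(train):
    """Ordinary least squares fit of y = w*x + b (assumed model for this sketch)."""
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in train)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    w = sxy / sxx
    return w, mean_y - w * mean_x

def mse(model, data):
    """Mean squared error of the fitted line on a set of (x, y) pairs."""
    w, b = model
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

def collect_until_good(target_error, batch=20, max_rounds=50, seed=0):
    """Repeat build/split/evaluate, adding a batch of data after each failed round.

    Returns one (dataset_size, train_error, cv_error, test_error) tuple per round.
    """
    rng = random.Random(seed)
    data = make_samples(batch, rng)               # step 1: small initial dataset
    history = []
    for _ in range(max_rounds):
        train, cross, test = split(data, rng)     # step 2: split and build
        model = fit_line(train)
        row = (len(data), mse(model, train), mse(model, cross), mse(model, test))
        history.append(row)                       # step 3: evaluate
        if row[3] <= target_error:                # step 4: good enough, stop
            break
        data += make_samples(batch, rng)          # otherwise collect more data
    return history

if __name__ == "__main__":
    for size, train_err, cv_err, test_err in collect_until_good(target_error=0.3):
        print(f"n={size:4d} train={train_err:.3f} cv={cv_err:.3f} test={test_err:.3f}")
```

The training error is recorded alongside the test error as a rough check for high bias: if the training error itself is already above your target, the model is probably too simple, and collecting more data is unlikely to help. The cross-validation error is only reported here because this toy model has no settings to tune; normally it is the set you would use to choose between model variants.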