Use stratified sampling with train_test_split

Published: 05 August 2021
on channel: Data School

Are you using train_test_split with a classification problem?
Be sure to set "stratify=y" so that class proportions are preserved when splitting.
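This is especially important if you have class imbalance, since a purely random split can leave a rare class over- or under-represented in one of the sets!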
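
Here is a minimal sketch of the tip. The toy dataset (90 samples of class 0, 10 of class 1) and the variable names are made up for illustration and are not taken from the video:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical imbalanced data: 90 samples of class 0, 10 samples of class 1
X = np.arange(100).reshape(-1, 1)
y = np.array([0] * 90 + [1] * 10)

# stratify=y keeps the 90/10 class ratio in both the training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

train_counts = np.bincount(y_train)  # per-class counts in the training set
test_counts = np.bincount(y_test)    # per-class counts in the testing set
print(train_counts, test_counts)
```

With this toy data, both sets keep the original 9:1 ratio. If you drop "stratify=y", the number of class 1 samples that land in the testing set depends on the random shuffle.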

👉 New tips every TUESDAY and THURSDAY! 👈

🎥 Watch all tips: scikit-learn tips
🗒️ Code for all tips: https://github.com/justmarkham/scikit...
💌 Get tips via email: https://scikit-learn.tips


=== WANT TO GET BETTER AT MACHINE LEARNING? ===

1) LEARN THE FUNDAMENTALS in my intro course (free!): https://courses.dataschool.io/introdu...

2) BUILD YOUR ML CONFIDENCE in my intermediate course: https://courses.dataschool.io/buildin...

3) LET'S CONNECT!
Newsletter: https://www.dataschool.io/subscribe/
Twitter: justmarkham
Facebook: datascienceschool
LinkedIn: justmarkham