Data School
  • Videos: 148
  • Views: 11,610,885
Course outline: "Master Machine Learning with scikit-learn"
This is the outline of my NEW course, "Master Machine Learning with scikit-learn." You can enroll here: courses.dataschool.io/master-machine-learning-with-scikit-learn
For all paid courses, I offer location-based discounts (up to 75%) to people in 160+ countries. Check your discount here: courses.dataschool.io/discounts
Enroll in a FREE Data Science course here: courses.dataschool.io/free-courses
Views: 2,205

Videos

Course overview: "Master Machine Learning with scikit-learn"
1K views · 2 months ago
This is the overview of my NEW course, "Master Machine Learning with scikit-learn." You can enroll here: courses.dataschool.io/master-machine-learning-with-scikit-learn For all paid courses, I offer location-based discounts (up to 75%) to people in 160 countries. Check your discount here: courses.dataschool.io/discounts Enroll in a FREE Data Science course here: courses.dataschool.io/free-courses
Introduction to model ensembling
695 views · 2 months ago
Learn the how & why of "ensembling", the surprisingly simple way to make better Machine Learning predictions! P.S. This is a lesson from my NEW course, "Master Machine Learning with scikit-learn." You can enroll here: courses.dataschool.io/master-machine-learning-with-scikit-learn For all paid courses, I offer location-based discounts (up to 75%) to people in 160 countries. Check your discount ...
How to save a scikit-learn Pipeline with custom transformers
1.1K views · 2 months ago
If you need to save a Pipeline with custom transformers, you’ll have to define the functions it depends upon in the new environment. In this lesson, you’ll learn how to avoid that burden by using the cloudpickle library.
Should I shuffle samples with cross-validation?
840 views · 2 months ago
By default, the cross_val_score function in scikit-learn does not shuffle samples. In this lesson, you’ll learn when you might need to shuffle and how to do it.
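As a minimal sketch of the shuffling idea (the iris dataset and LogisticRegression are stand-ins chosen for illustration, not taken from the lesson), shuffling is typically requested by passing a splitter object with shuffle=True instead of an integer:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# An integer cv uses StratifiedKFold for classifiers, without shuffling.
unshuffled_scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")

# A splitter with shuffle=True shuffles the samples before splitting;
# random_state makes the shuffle reproducible.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)
shuffled_scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")

print(unshuffled_scores.mean(), shuffled_scores.mean())
```

For regression problems, KFold(shuffle=True) plays the same role.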
Cost-sensitive learning in scikit-learn
947 views · 2 months ago
If your dataset has significant class imbalance, the "cost" may differ between the two types of prediction errors. In this lesson, you’ll learn how to use cost-sensitive learning to adjust the model to better match your priorities.
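One common way to express unequal error costs in scikit-learn is the class_weight parameter. The sketch below assumes a synthetic imbalanced dataset and an arbitrary 10x weight on the minority class; both are illustrative choices, not values from the lesson:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split

# Synthetic data: roughly 95% of samples in class 0, 5% in class 1.
X, y = make_classification(n_samples=2000, weights=[0.95, 0.05], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

# Baseline: every training error costs the same.
plain = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Cost-sensitive: errors on class 1 are penalized 10 times more heavily.
weighted = LogisticRegression(max_iter=1000, class_weight={0: 1, 1: 10})
weighted.fit(X_train, y_train)

pred_plain = plain.predict(X_test)
pred_weighted = weighted.predict(X_test)
print("recall (plain):   ", recall_score(y_test, pred_plain))
print("recall (weighted):", recall_score(y_test, pred_weighted))
```

Raising the weight on the minority class usually trades some precision for recall, so the weight is itself worth tuning against the metric you care about.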
scikit-learn vs Deep Learning
1.3K views · 3 months ago
In an age of Deep Learning, I think scikit-learn is still well worth mastering. In this lesson, you'll find out why!
How to read the scikit-learn documentation
3.2K views · 4 months ago
In order to become truly proficient with scikit-learn, you need to be able to read the documentation. In this video, I'll walk you through the five main pages and page types that you need to be familiar with: - API reference: List of classes and functions in each module - Class documentation: Detailed view of a class - User Guide: Advice for proper usage of a class or function - Examples: More ...
My top 50 scikit-learn tips
12K views · 1 year ago
If you already know the basics of scikit-learn, but you want to be more efficient and get up-to-date with the latest features, then THIS is the video for you. My name is Kevin Markham, and I've been teaching Machine Learning in Python with scikit-learn for more than 8 years. Over the next 3 hours, I'm going to share with you my top 50 scikit-learn tips. Each tip ranges from 2 to 8 minutes, and ...
21 more pandas tricks
48K views · 2 years ago
You're about to learn 21 tricks that will help you to work faster, write better pandas code, and impress your friends. These are the BEST tricks that I couldn't fit into my FIRST tricks video! 📔 JUPYTER NOTEBOOK: nbviewer.org/github/justmarkham/pandas-videos/blob/master/21_more_pandas_tricks.ipynb 🔥 MY TOP 25 PANDAS TRICKS: ruclips.net/video/RlIiVeig3hc/видео.html 🐼 MORE PANDAS VIDEOS: ruclips....
Adapt this pattern to solve many Machine Learning problems
12K views · 2 years ago
Here's a simple pattern that can be adapted to solve many ML problems. It has plenty of shortcomings, but can work surprisingly well as-is! Shortcomings include: - Assumes all columns have proper data types - May include irrelevant or improper features - Does not handle text or date columns well - Does not include feature engineering - Ordinal encoding may be better - Other imputation strategie...
Tune multiple models simultaneously with GridSearchCV
7K views · 2 years ago
You can tune 2 models using the same grid search! Here's how: 1. Create multiple parameter dictionaries 2. Specify the model within each dictionary 3. Put the dictionaries in a list 👉 New tips every TUESDAY and THURSDAY! 👈 🎥 Watch all tips: ruclips.net/p/PL5-da3qGB5ID7YYAqireYEew2mWVvgmj6 🗒️ Code for all tips: github.com/justmarkham/scikit-learn-tips 💌 Get tips via email: scikit-learn.tips WANT...
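The three steps above can be sketched as follows. The dataset, the step name "model", and the parameter values are assumptions made for illustration:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# The "model" step is a placeholder that the grid search will swap out.
pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("model", LogisticRegression(max_iter=1000)),
])

# 1. One parameter dictionary per model.
# 2. Each dictionary names the model itself under the "model" key.
params_lr = {
    "model": [LogisticRegression(max_iter=1000)],
    "model__C": [0.1, 1, 10],
}
params_rf = {
    "model": [RandomForestClassifier(random_state=1)],
    "model__n_estimators": [50, 100],
}

# 3. Put the dictionaries in a list.
param_grid = [params_lr, params_rf]

grid = GridSearchCV(pipe, param_grid, cv=5, scoring="accuracy")
grid.fit(X, y)
print(grid.best_params_)
```

Because each dictionary is searched separately, parameters of one model are never combined with the other model.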
Access part of a Pipeline using slicing
2.7K views · 2 years ago
Want to operate on part of a Pipeline (instead of the whole thing)? Slice it using Python's slicing notation!
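A short sketch of Pipeline slicing, assuming a three-step pipeline invented for this example:

```python
from sklearn.datasets import load_iris
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

pipe = Pipeline([
    ("imputer", SimpleImputer()),
    ("scaler", StandardScaler()),
    ("model", LogisticRegression(max_iter=1000)),
])
pipe.fit(X, y)

# A slice returns a new Pipeline containing the selected (already fitted) steps.
preprocessing = pipe[:-1]
X_transformed = preprocessing.transform(X)

# An integer index returns the step's estimator itself.
final_model = pipe[-1]
print(preprocessing.steps, final_model)
```

This is handy for inspecting the data exactly as the final model sees it.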
Tune the parameters of a VotingClassifier or VotingRegressor
4.9K views · 2 years ago
Want to improve the accuracy of your VotingClassifier? Try tuning the 'voting' and 'weights' parameters to change how predictions are combined! P.S. If you're using VotingRegressor, just tune the 'weights' parameter.
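A hedged sketch of tuning 'voting' and 'weights' with a grid search. The three base models and the candidate weights are arbitrary choices for illustration:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)

vc = VotingClassifier([
    ("lr", LogisticRegression(max_iter=1000)),
    ("rf", RandomForestClassifier(n_estimators=50, random_state=1)),
    ("nb", GaussianNB()),
])

# 'voting' controls whether class labels (hard) or predicted
# probabilities (soft) are combined; 'weights' sets each model's influence.
params = {
    "voting": ["hard", "soft"],
    "weights": [(1, 1, 1), (2, 1, 1), (1, 2, 1), (1, 1, 2)],
}

grid = GridSearchCV(vc, params, cv=5, scoring="accuracy")
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)
```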
Ensemble multiple models using VotingClassifier or VotingRegressor
10K views · 2 years ago
Want to improve your classifier's accuracy? Create multiple models and ensemble them using VotingClassifier! P.S. VotingRegressor is also available.
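A minimal ensembling sketch, assuming two base models and soft voting (both picked for illustration only):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

lr = LogisticRegression(max_iter=1000)
rf = RandomForestClassifier(n_estimators=50, random_state=1)

# Soft voting averages the predicted probabilities of the base models.
ensemble = VotingClassifier([("lr", lr), ("rf", rf)], voting="soft")

# Compare each model and the ensemble under the same cross-validation.
mean_scores = {}
for name, model in [("lr", lr), ("rf", rf), ("ensemble", ensemble)]:
    mean_scores[name] = cross_val_score(model, X, y, cv=5).mean()
print(mean_scores)
```

An ensemble is not guaranteed to beat its best member, so it is worth checking the comparison on your own data.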
Create feature interactions using PolynomialFeatures
7K views · 2 years ago
Create feature interactions using PolynomialFeatures
Speed up GridSearchCV using parallel processing
4.9K views · 2 years ago
Speed up GridSearchCV using parallel processing
Use OrdinalEncoder instead of OneHotEncoder with tree-based models
4.2K views · 2 years ago
Use OrdinalEncoder instead of OneHotEncoder with tree-based models
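A sketch of the idea using a tiny made-up DataFrame (column names and values are hypothetical). OrdinalEncoder produces one integer-coded column per categorical feature, which tree-based models can split on directly:

```python
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical data for illustration only.
df = pd.DataFrame({
    "color": ["red", "blue", "green", "blue", "red", "green", "red", "blue"],
    "size": ["S", "M", "L", "M", "S", "L", "L", "S"],
    "price": [10, 12, 30, 11, 9, 28, 31, 10],
    "bought": [0, 0, 1, 0, 0, 1, 1, 0],
})
X = df[["color", "size", "price"]]
y = df["bought"]

# Encode the two categorical columns; pass the numeric column through.
ct = make_column_transformer(
    (OrdinalEncoder(), ["color", "size"]),
    remainder="passthrough",
)
X_encoded = ct.fit_transform(X)

pipe = make_pipeline(ct, RandomForestClassifier(n_estimators=50, random_state=1))
pipe.fit(X, y)
print(X_encoded.shape)
```

With OneHotEncoder the same two columns would expand to six, which is the growth this tip avoids.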
Passthrough some columns and drop others in a ColumnTransformer
4.4K views · 2 years ago
Passthrough some columns and drop others in a ColumnTransformer
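A minimal sketch with hypothetical columns A to D: the strings "passthrough" and "drop" can be used in place of a transformer, and remainder decides what happens to unlisted columns:

```python
import numpy as np
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.impute import SimpleImputer

# Hypothetical data for illustration only.
X = pd.DataFrame({
    "A": [1.0, np.nan, 3.0],
    "B": [10, 20, 30],
    "C": ["x", "y", "z"],
    "D": [100, 200, 300],
})

ct = make_column_transformer(
    (SimpleImputer(), ["A"]),   # transform A (mean imputation)
    ("passthrough", ["B"]),     # keep B unchanged
    ("drop", ["C"]),            # drop C explicitly
    remainder="drop",           # unlisted columns (D) are dropped as well
)
out = ct.fit_transform(X)
print(out)
```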
Drop the first category from binary features (only) with OneHotEncoder
3K views · 2 years ago
Drop the first category from binary features (only) with OneHotEncoder
Estimators only print parameters that have been changed
1.9K views · 2 years ago
Estimators only print parameters that have been changed
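A quick illustration, assuming a reasonably recent scikit-learn where print_changed_only defaults to True:

```python
from sklearn import config_context
from sklearn.linear_model import LogisticRegression

clf = LogisticRegression(C=0.1)

# Only parameters that differ from their defaults are printed.
short_repr = repr(clf)
print(short_repr)

# get_params() still returns every parameter.
all_params = clf.get_params()

# The full listing can be restored temporarily with config_context.
with config_context(print_changed_only=False):
    full_repr = repr(clf)
print(full_repr)
```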
Load a toy dataset into a DataFrame
3.2K views · 2 years ago
Load a toy dataset into a DataFrame
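For example, with the iris dataset (any of the bundled toy datasets that support as_frame should behave similarly):

```python
from sklearn.datasets import load_iris

# as_frame=True returns pandas objects; "frame" holds features plus target.
df = load_iris(as_frame=True)["frame"]

# Features and target can also be requested separately.
X, y = load_iris(as_frame=True, return_X_y=True)

print(df.head())
```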
Get the feature names output by a ColumnTransformer
9K views · 2 years ago
Get the feature names output by a ColumnTransformer
Create an interactive diagram of a Pipeline in Jupyter
4.7K views · 2 years ago
Create an interactive diagram of a Pipeline in Jupyter
Most parameters should be passed as keyword arguments
2.8K views · 2 years ago
Most parameters should be passed as keyword arguments
Don't use .values when passing a pandas object to scikit-learn
3K views · 2 years ago
Don't use .values when passing a pandas object to scikit-learn
Add feature selection to a Pipeline
8K views · 2 years ago
Add feature selection to a Pipeline
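One possible shape for this, assuming SelectKBest with f_classif as the selector and the breast cancer dataset as a stand-in:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# Selection sits after preprocessing and before the model, so it is
# re-fit on the training portion of every cross-validation fold.
pipe = make_pipeline(
    StandardScaler(),
    SelectKBest(f_classif, k=10),
    LogisticRegression(max_iter=1000),
)
scores = cross_val_score(pipe, X, y, cv=5, scoring="accuracy")

pipe.fit(X, y)
n_selected = int(pipe[1].get_support().sum())
print(scores.mean(), n_selected)
```

Keeping the selector inside the Pipeline avoids leaking information from the held-out folds into the selection step.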
Use FunctionTransformer to convert functions into transformers
7K views · 2 years ago
Use FunctionTransformer to convert functions into transformers
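A minimal sketch: wrapping an existing function (here NumPy's log1p, chosen for illustration) so it can be used like any other transformer:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer, StandardScaler

X = np.array([[0.0, 1.0],
              [9.0, 99.0]])

# Wrap the function; the resulting object has fit and transform methods.
log_transformer = FunctionTransformer(np.log1p)
X_log = log_transformer.fit_transform(X)

# Being a transformer, it can be chained with other steps in a Pipeline.
pipe = make_pipeline(log_transformer, StandardScaler())
X_scaled = pipe.fit_transform(X)
print(X_log)
```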
Use AUC to evaluate multiclass problems
8K views · 2 years ago
Use AUC to evaluate multiclass problems
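A sketch of multiclass AUC on the three-class iris dataset (the model and split are illustrative). roc_auc_score needs predicted probabilities plus a multi_class strategy:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_score, train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=1
)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
proba = clf.predict_proba(X_test)

# "ovr" is one-vs-rest, "ovo" is one-vs-one.
auc_ovr = roc_auc_score(y_test, proba, multi_class="ovr")
auc_ovo = roc_auc_score(y_test, proba, multi_class="ovo")

# The same strategies are available as scoring strings.
cv_auc = cross_val_score(clf, X, y, cv=5, scoring="roc_auc_ovr")
print(auc_ovr, auc_ovo, cv_auc.mean())
```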
Shuffle your dataset when using cross_val_score
7K views · 2 years ago
Shuffle your dataset when using cross_val_score

Comments

  • @Universe4mi
    @Universe4mi 5 days ago

    Thanks, very clear and insightful!!

  • @ibrahimmamo9459
    @ibrahimmamo9459 6 days ago

    thanks

  • @queryhsje7514
    @queryhsje7514 7 days ago

    Hi, thank you for the video series! Do you have the most up to date Pandas Video series or they are still valid for current situation? Are there any big changes in Pandas for the past 6 years since your videos were made? Thank you.

  • @fargin1133
    @fargin1133 15 days ago

    can someone explain why this---drinks.groupby('continent').mean() is not working

  • @shreyaparthiban5593
    @shreyaparthiban5593 20 days ago

    i love when he goes okay??🤨

  • @naman_goyal.04
    @naman_goyal.04 23 days ago

    Great👍

  • @theraizadatalks14
    @theraizadatalks14 26 days ago

    Crystal and clear explanations, thanks for creating this series. This Pandas series helped in my daily work life. Much love ❤️

  • @Thanos-v1v
    @Thanos-v1v 29 days ago

    Should I create a GitHub account after watching this? or should I complete the series for better understanding and install later? (I am actually new to Data Science and have no idea)

    • @dataschool
      @dataschool 28 days ago

      I think creating a GitHub account now is a great idea!

  • @AmirTaghavey
    @AmirTaghavey 1 month ago

    Very insightful indeed -- thank you.

    • @dataschool
      @dataschool 28 days ago

      You're very welcome!

  • @partspieces8165
    @partspieces8165 1 month ago

    Your videos are so useful, the subscribe button is too clickable

  • @mitchellyula4447
    @mitchellyula4447 1 month ago

    Thank you so much! I never used the groupby function until now and this helped me complete my pandas assignment! very clear explanation and your videos have helped me a ton!

    • @dataschool
      @dataschool 28 days ago

      Glad it helped! 🙌

  • @raneshmitra8156
    @raneshmitra8156 1 month ago

    drinks.groupby('continent').mean(numeric_only = True)

  • @raneshmitra8156
    @raneshmitra8156 1 month ago

    orders.choice_description.str.replace(r'[\[\]]', '', regex=True)
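A runnable version of the call above, using a tiny stand-in for the orders DataFrame (the sample values are invented). A raw string keeps the backslashes in the pattern from being read as string escapes:

```python
import pandas as pd

# Hypothetical stand-in for the "orders" data used in the video.
orders = pd.DataFrame({
    "choice_description": [
        "[Clementine]",
        "[Fresh Tomato Salsa, [Rice, Cheese]]",
        None,
    ]
})

# regex=True is required for the pattern to be treated as a regular
# expression; the character class matches "[" and "]".
cleaned = orders.choice_description.str.replace(r"[\[\]]", "", regex=True)
print(cleaned)
```

Missing values are left as missing rather than raising an error.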

  • @xandrviking1113
    @xandrviking1113 1 month ago

    Thanks Kevin 👍🤝 . In 2024 it still relevant to learn .

  • @brianwaweru9089
    @brianwaweru9089 2 months ago

    One thing about this guy is that he gives very deep insights which you'll get nowhere else. As much as possible he'll give best practises, I have observed this from way back in the pandas course. Thanks so much Kevin. Please do deep learning and in-depth feature engineering tricks in a future video.

    • @dataschool
      @dataschool 28 days ago

      Thank you so much for your kind words! 🙏 And thanks also for your suggestions, I'll keep them in mind!

  • @dataschool
    @dataschool 2 months ago

    Is the mean() method not working for you? You need to include the argument numeric_only=True, for example: drinks.mean(numeric_only=True). This is a new requirement in pandas for cases in which you want to calculate the mean of numeric rows or columns and the DataFrame contains non-numeric data. Hope that helps!

    • @raneshmitra8156
      @raneshmitra8156 1 month ago

      Thank you for your update...... Your explanation is truly awesome.......
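The numeric_only fix discussed in this thread can be tried on a tiny stand-in for the drinks DataFrame (the values below are invented for illustration):

```python
import pandas as pd

# Hypothetical stand-in for the "drinks" data used in the videos.
drinks = pd.DataFrame({
    "country": ["A", "B", "C", "D"],
    "beer_servings": [10, 20, 30, 40],
    "wine_servings": [1, 3, 5, 7],
    "continent": ["Asia", "Asia", "Europe", "Europe"],
})

# numeric_only=True skips the string columns instead of raising an error.
overall = drinks.mean(numeric_only=True)
by_continent = drinks.groupby("continent").mean(numeric_only=True)

# Alternative: select the numeric columns explicitly before aggregating.
selected = drinks.groupby("continent")[["beer_servings", "wine_servings"]].mean()

print(by_continent)
```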

  • @guruprakashsoma9143
    @guruprakashsoma9143 2 months ago

    sir the mean function is not working for me

    • @dataschool
      @dataschool 2 months ago

      You need to include the argument numeric_only=True, for example: drinks.mean(numeric_only=True). This is a new requirement in pandas for cases in which you want to calculate the mean of numeric rows or columns and the DataFrame contains non-numeric data. Hope that helps!

  • @119busanovic
    @119busanovic 2 months ago

    best pandas tutorial

  • @aditimohapatra312
    @aditimohapatra312 2 months ago

    sir why in the last 2 cases where we didn't specify, in there with mean it is not executed but with min, max and count, it is being executed without showing any error? same for the visual form also?? help

  • @atifdai313
    @atifdai313 2 months ago

    I am using the yearly data....Suppose my data is showing 33 rows and 20 columns (20 columns also including the years (1999 to 2022) in my summary stat analysis. How can I exclude the year's column from my whole analysis? OR I should delete the year's column. Please guide us further regarding any data shape command.

  • @bilalahmad9177
    @bilalahmad9177 2 months ago

    You are a great instructor. I have learned a lot from you regarding pandas. The video with title "How do I merge DataFrames in pandas?" has left some queries in my mind. I would be thankful to you if you clear those too. What type of join is used here movie_ratings = pd.merge(movies , ratings)? if it is inner join it should result in 1682 rows in total in movie_ratings dataframe, as movies dataframe has 1682 rows. But in video i have observed that movie_ratings results in 100,000 rows of data.
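A small sketch may help with the row-count question above. It assumes, as the names suggest, that each movie appears once in movies and many times in ratings; the data here is invented. pd.merge defaults to an inner join on the shared column, and in a one-to-many join the result has one row per matching rating rather than one row per movie:

```python
import pandas as pd

# Hypothetical miniature versions of the two DataFrames.
movies = pd.DataFrame({
    "movie_id": [1, 2, 3],
    "title": ["Toy Story", "GoldenEye", "Four Rooms"],
})
ratings = pd.DataFrame({
    "user_id": [10, 11, 10, 12, 13, 11],
    "movie_id": [1, 1, 2, 2, 2, 3],
    "rating": [5, 4, 3, 4, 2, 5],
})

# Default: how="inner", joining on the column both frames share (movie_id).
# validate raises an error if movie_id is not unique in "movies".
movie_ratings = pd.merge(movies, ratings, validate="one_to_many")

print(len(movies), len(ratings), len(movie_ratings))
```

Under that assumption, 1,682 movies joined to 100,000 ratings would be expected to give 100,000 rows.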

  • @y_limit_yourself
    @y_limit_yourself 2 months ago

    Sir, you are the GOAT 🐐🐐

    • @dataschool
      @dataschool 2 months ago

      You are too kind! 🙌

  • @monotonous_0
    @monotonous_0 2 months ago

    If mean is not working for you: We first have to drop 'country' and 'continent' columns, these columns contain strings so we can't do mean with them. drinks = drinks.drop(['continent','country'],axis = 1)

    • @ujan_saheli
      @ujan_saheli 2 months ago

      Thanks

    • @dataschool
      @dataschool 2 months ago

      Alternatively, you can include the argument numeric_only=True, for example: drinks.mean(numeric_only=True). That way, you can still perform the mean operation without dropping data that you might want to keep. Hope that helps!

  • @testtest-ws7uc
    @testtest-ws7uc 2 months ago

    Hello, for a dataframe with 5000 rows and 13 columns how do we impute multiple entries. Some are numeric and some are categorical

  • @Astute_
    @Astute_ 2 months ago

    while performing the mean operation, it shows that it could not convert the country's name to numeric , its an error. What to do?

    • @dataschool
      @dataschool 2 months ago

      You need to include the argument numeric_only=True, for example: drinks.mean(numeric_only=True). This is a new requirement in pandas for cases in which you want to calculate the mean of numeric rows or columns and the DataFrame contains non-numeric data. Hope that helps!

  • @mospher9253
    @mospher9253 2 months ago

    Could you do PyTorch tips like you did with sklearn?

    • @dataschool
      @dataschool 2 months ago

      Thanks for the suggestion, I'll consider it for the future!

  • @Astute_
    @Astute_ 2 months ago

    The shift tab trick is not working (I have windows and I am operating on vs code , jupyter notebook)

  • @soumyadeepsarkar2119
    @soumyadeepsarkar2119 2 months ago

    6:50

  • @crigar001
    @crigar001 2 months ago

    Is it possible to get a bigger discount? I only have $50 for this course. We are in Colombia and things are difficult here.

    • @dataschool
      @dataschool 2 months ago

      Thanks for your interest! I automatically offer a 65% discount to people in Colombia, bringing the cost down from $299 to $105. However, I'm willing to offer greater discounts on a case-by-case basis. Please email me to follow up: kevin at dataschool dot io. Thanks!

  • @bellanatrisha1201
    @bellanatrisha1201 2 months ago

    omg...thank you so muchhhh

    • @dataschool
      @dataschool 2 months ago

      You're welcome! I'm glad it was helpful to you!

  • @shanthidinakaran5574
    @shanthidinakaran5574 2 months ago

    Thank you so much for all your Pandas sessions, it was very detailed and covered almost all required basics.. !!!

    • @dataschool
      @dataschool 2 months ago

      You're very welcome!

  • @anikaverma9667
    @anikaverma9667 2 months ago

    just found your channel a few days ago , thanks for helping and Happy marriage ( a lil too late but... 😁)

    • @dataschool
      @dataschool 2 months ago

      Thank you so much! 🙌

  • @sedighehnadaei1895
    @sedighehnadaei1895 2 months ago

    As always you did great.thank you so much ❤

    • @dataschool
      @dataschool 2 months ago

      You are so welcome!

  • @Induraj11
    @Induraj11 2 months ago

    wow.. much appreciate ur efforts sir.. i learned Pandas 3 years before purely from ur videos.. it helped me to get job as well.. i am very thankful to you. ❤

    • @dataschool
      @dataschool 2 months ago

      That is excellent to hear, thanks so much for letting me know! 🙌

  • @samderrty123
    @samderrty123 2 months ago

    What about the math concept that comes with this?

    • @dataschool
      @dataschool 2 months ago

      Great question! I touch on mathematical concepts when they are relevant to the course, but the course is highly practical, and most of the underlying math does not have to be deeply understood in order for you to be effective with Machine Learning. Hope that helps!

  • @dataschool
    @dataschool 2 months ago

    This is the outline of my NEW course, "Master Machine Learning with scikit-learn." You can enroll here: courses.dataschool.io/master-machine-learning-with-scikit-learn

  • @vikasingle3972
    @vikasingle3972 2 months ago

    Very excited..!

    • @dataschool
      @dataschool 2 months ago

      Thanks! I hope you enjoy the course!

  • @HARSHRAJ-2023
    @HARSHRAJ-2023 2 months ago

    I am from India and your course is way too costly.

    • @dataschool
      @dataschool 2 months ago

      Thanks for sharing! Actually, I offer a 75% discount to people living in India. You can visit this page to access your discount code - courses.dataschool.io/discounts - or you can email me at kevin@dataschool.io

    • @HARSHRAJ-2023
      @HARSHRAJ-2023 2 months ago

      @dataschool That's a great discount. Thanks Kevin.

    • @dataschool
      @dataschool 2 months ago

      You're very welcome! I hope you enjoy the course!

    • @hazmashahidchoudrychoudry1693
      @hazmashahidchoudrychoudry1693 2 months ago

      ​@@dataschool hey Kevin what's about Pakistani peoples .... I'm from Pakistan

  • @dataschool
    @dataschool 2 months ago

    This is the overview of my NEW course, "Master Machine Learning with scikit-learn." You can enroll here: courses.dataschool.io/master-machine-learning-with-scikit-learn

  • @ranatanzeel1053
    @ranatanzeel1053 2 months ago

    Thanks ❤

  • @dataschool
    @dataschool 2 months ago

    This is a lesson from my NEW course, "Master Machine Learning with scikit-learn." You can enroll here: courses.dataschool.io/master-machine-learning-with-scikit-learn

  • @Malbao14
    @Malbao14 2 months ago

    Amazing! Thank you for the tip!

    • @dataschool
      @dataschool 2 months ago

      You’re very welcome! Glad it’s helpful!

  • @dataschool
    @dataschool 2 months ago

    This is a lesson from my NEW course, "Master Machine Learning with scikit-learn." You can enroll here: courses.dataschool.io/master-machine-learning-with-scikit-learn

  • @freenrg888
    @freenrg888 2 months ago

    7 years later, this helped me. Thank you.

  • @SodaPy_dot_com
    @SodaPy_dot_com 3 months ago

    so far so good

  • @aleksandartta
    @aleksandartta 3 months ago

    Hello Kevin, thank you very much... I have two questions: 1) after hyper parameters tunning and cross validation, the final model should be some that is trained on the whole dataset (meaning train + validation set)? Am I right? 2) do we need cross validation if the dataset is very big (and how to know how big :) ? i.e. when cross validation is not necessary?

    • @dataschool
      @dataschool 3 months ago

      Great questions! 1. Yes, re-train the tuned model on the entire dataset (meaning all samples for which you know the target value). 2. Yes, cross-validation is a useful model evaluation procedure with any size dataset, with the possible exception of a very tiny dataset. (Below a certain number of samples, no model evaluation procedure is particularly useful.) Hope that helps!
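One way to get the "re-train on the entire dataset" step from the first answer, sketched with an illustrative model and grid: GridSearchCV with its default refit=True re-fits the best parameter combination on all of the data passed to fit:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X, y = load_iris(return_X_y=True)

# Cross-validation is used only to choose the parameters.
grid = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.1, 1, 10]},
    cv=5,
    scoring="accuracy",
)
grid.fit(X, y)

# With refit=True (the default), best_estimator_ has been re-trained on
# all of X and y using the best parameters found.
final_model = grid.best_estimator_
predictions = grid.predict(X[:3])  # delegates to best_estimator_
print(grid.best_params_, predictions)
```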

  • @dataschool
    @dataschool 3 months ago

    This is a lesson from my NEW course, "Master Machine Learning with scikit-learn." You can enroll here: courses.dataschool.io/master-machine-learning-with-scikit-learn

  • @dataschool
    @dataschool 3 months ago

    This is a lesson from my NEW course, "Master Machine Learning with scikit-learn." You can enroll here: courses.dataschool.io/master-machine-learning-with-scikit-learn