title
How to find the best model parameters in scikit-learn

description
In this video, you'll learn how to efficiently search for the optimal tuning parameters (or "hyperparameters") for your machine learning model in order to maximize its performance. I'll start by demonstrating an exhaustive "grid search" process using scikit-learn's GridSearchCV class, and then I'll compare it with RandomizedSearchCV, which can often achieve similar results in far less time.

Download the notebook: https://github.com/justmarkham/scikit-learn-videos
Grid search user guide: http://scikit-learn.org/stable/modules/grid_search.html
GridSearchCV documentation: http://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GridSearchCV.html
RandomizedSearchCV documentation: http://scikit-learn.org/stable/modules/generated/sklearn.model_selection.RandomizedSearchCV.html
Comparing randomized search and grid search: http://scikit-learn.org/stable/auto_examples/model_selection/plot_randomized_search.html
Randomized search video: https://youtu.be/0wUF_Ov8b0A?t=17m38s
Randomized search notebook: https://github.com/amueller/pydata-nyc-advanced-sklearn/blob/master/Chapter%203%20-%20Randomized%20Hyper%20Parameter%20Search.ipynb
Random Search for Hyper-Parameter Optimization: http://www.jmlr.org/papers/volume13/bergstra12a/bergstra12a.pdf

WANT TO GET BETTER AT MACHINE LEARNING? HERE ARE YOUR NEXT STEPS:
1) WATCH my scikit-learn video series: https://www.youtube.com/playlist?list=PL5-da3qGB5ICeMbQuqbbCOQWcS6OYBr5A
2) SUBSCRIBE for more videos: https://www.youtube.com/dataschool?sub_confirmation=1
3) JOIN "Data School Insiders" to access bonus content: https://www.patreon.com/dataschool
4) ENROLL in my Machine Learning course: https://www.dataschool.io/learn/
5) LET'S CONNECT!
- Newsletter: https://www.dataschool.io/subscribe/
- Twitter: https://twitter.com/justmarkham
- Facebook: https://www.facebook.com/DataScienceSchool/
- LinkedIn: https://www.linkedin.com/in/justmarkham/

detail
{'title': 'How to find the best model parameters in scikit-learn', 'heatmap': [{'end': 454.134, 'start': 414.204, 'weight': 0.765}, {'end': 525.481, 'start': 475.83, 'weight': 0.704}, {'end': 688.99, 'start': 634.01, 'weight': 1}, {'end': 747.9, 'start': 707.782, 'weight': 0.711}, {'end': 843.27, 'start': 794.065, 'weight': 0.712}, {'end': 1104.617, 'start': 1082.847, 'weight': 0.775}, {'end': 1185.511, 'start': 1141.158, 'weight': 0.716}, {'end': 1320.687, 'start': 1266.567, 'weight': 0.725}], 'summary': 'Covers k-fold cross-validation for tuning parameters, gridsearchcv for finding optimal k value, instantiation of a grid for cross-validation on a knn model, model optimization, comparing gridsearchcv and randomizedsearchcv to achieve high scores in less time, and the recommendation of random search for parameter tuning.', 'chapters': [{'end': 287.844, 'segs': [{'end': 119.709, 'src': 'embed', 'start': 59.1, 'weight': 0, 'content': [{'end': 68.223, 'text': 'First, we choose a number for k, often 10, and split the dataset into k partitions or folds of equal size.', 'start': 59.1, 'duration': 9.123}, {'end': 80.647, 'text': 'The model is trained on all of the folds except one, then tested on the remaining fold, and evaluated using the chosen evaluation metric.', 'start': 69.343, 'duration': 11.304}, {'end': 93.481, 'text': 'That process is repeated k-1 more times, such that each fold is the testing set once and the training set all other times.', 'start': 81.934, 'duration': 11.547}, {'end': 100.325, 'text': 'The average testing performance, also known as the cross-validated performance,', 'start': 94.541, 'duration': 5.784}, {'end': 104.547, 'text': "is used as the estimate of the model's performance on out-of-sample data.", 'start': 100.325, 'duration': 4.222}, {'end': 112.787, 'text': 'That performance estimate is more reliable than the estimate provided by train test split,', 'start': 106.445, 'duration': 6.342}, {'end': 119.709, 'text': 'because 
cross-validation reduces the variance associated with a single trial of train test split.', 'start': 112.787, 'duration': 6.922}], 'summary': "Using k-fold cross-validation, model's performance on out-of-sample data is estimated with reduced variance.", 'duration': 60.609, 'max_score': 59.1, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/Gol_qOgRqfA/pics/Gol_qOgRqfA59100.jpg'}, {'end': 253.095, 'src': 'embed', 'start': 224.689, 'weight': 2, 'content': [{'end': 230.651, 'text': 'the number of cross-validation folds to use and the evaluation metric of our choice.', 'start': 224.689, 'duration': 5.962}, {'end': 238.995, 'text': "We're choosing to use tenfold cross-validation with classification accuracy as the evaluation metric.", 'start': 231.872, 'duration': 7.123}, {'end': 253.095, 'text': 'Cross-val score returned 10 scores, namely the testing accuracy for each of the 10 folds used during cross-validation.', 'start': 244.028, 'duration': 9.067}], 'summary': 'Using tenfold cross-validation, testing accuracy scores were obtained for each of the 10 folds.', 'duration': 28.406, 'max_score': 224.689, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/Gol_qOgRqfA/pics/Gol_qOgRqfA224689.jpg'}], 'start': 0.569, 'title': 'K-fold cross-validation', 'summary': 'Discusses the use of k-fold cross-validation for tuning parameters, efficiency improvement, searching for multiple tuning parameters, and reducing computational expense, using the example of selecting the best value for k for the knn model on the iris dataset.', 'chapters': [{'end': 287.844, 'start': 0.569, 'title': 'K-fold cross-validation for model optimization', 'summary': 'Discusses the use of k-fold cross-validation for tuning parameters, efficiency improvement, searching for multiple tuning parameters, and reducing computational expense, using the example of selecting the best value for k for the knn model on the iris dataset.', 'duration': 287.275, 
'highlights': ["k-fold cross-validation splits the dataset into k partitions, trains the model on all folds except one, and tests it on the remaining fold, repeating this process k-1 times to estimate the model's performance on out-of-sample data. This process involves training the model on all folds except one, testing it on the remaining fold, and repeating the process k-1 times, providing a more reliable estimate of the model's performance on out-of-sample data.", 'Cross-validation reduces the variance associated with a single trial of train test split and can be used for selecting tuning parameters, choosing between models, and selecting features. Cross-validation reduces variance associated with a single trial of train test split, providing more reliable estimates, and it can be used for various tasks such as selecting tuning parameters, choosing between models, and selecting features.', 'The cross_val_score function is used to select the best tuning parameters, providing testing accuracy for each of the 10 folds used during cross-validation. 
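The cross-validation procedure summarized here can be sketched in a few lines. This is a minimal sketch using the current `sklearn.model_selection` import path (the video predates that module), with the iris dataset and the KNN model used throughout the series:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Iris dataset, as used in the video
X, y = load_iris(return_X_y=True)

# 10-fold cross-validation of a KNN model, scored by classification accuracy
knn = KNeighborsClassifier(n_neighbors=5)
scores = cross_val_score(knn, X, y, cv=10, scoring='accuracy')

print(scores)         # one testing accuracy per fold (10 scores for cv=10)
print(scores.mean())  # the cross-validated performance estimate
```

The mean of the 10 fold scores is the single number used to compare candidate values of k.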
The cross_val_score function returns testing accuracy for each of the 10 folds used during cross-validation, allowing the selection of the best tuning parameters.']}], 'duration': 287.275, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/Gol_qOgRqfA/pics/Gol_qOgRqfA569.jpg', 'highlights': ['Cross-validation reduces variance associated with a single trial of train test split, providing more reliable estimates, and it can be used for various tasks such as selecting tuning parameters, choosing between models, and selecting features.', "k-fold cross-validation splits the dataset into k partitions, trains the model on all folds except one, and tests it on the remaining fold, repeating this process k-1 times to estimate the model's performance on out-of-sample data.", 'The cross_val_score function returns testing accuracy for each of the 10 folds used during cross-validation, allowing the selection of the best tuning parameters.']}, {'end': 511.894, 'segs': [{'end': 470.266, 'src': 'heatmap', 'start': 414.204, 'weight': 0, 'content': [{'end': 422.592, 'text': 'Even though the code above is not particularly difficult to write, you could imagine that this is something you might want to do very often,', 'start': 414.204, 'duration': 8.388}, {'end': 427.857, 'text': 'and thus it would be nice if there was a function that could automate this process for you.', 'start': 422.592, 'duration': 5.265}, {'end': 433.102, 'text': 'That is exactly why GridSearchCV was created.', 'start': 428.998, 'duration': 4.104}, {'end': 446.17, 'text': 'GridSearchCV allows you to define a set of parameters that you want to try with a given model,', 'start': 438.407, 'duration': 7.763}, {'end': 454.134, 'text': 'and it will automatically run cross-validation using each of those parameters, keeping track of the resulting scores.', 'start': 446.17, 'duration': 7.964}, {'end': 461.779, 'text': 'Essentially, it replaces the for loop above, as well as providing some additional 
functionality.', 'start': 455.314, 'duration': 6.465}, {'end': 470.266, 'text': 'To get started with GridSearchCV, we first import the class from sklearn.model_selection.', 'start': 463.16, 'duration': 7.106}], 'summary': 'Gridsearchcv automates parameter testing and cross-validation in machine learning.', 'duration': 171.822, 'max_score': 414.204, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/Gol_qOgRqfA/pics/Gol_qOgRqfA414204.jpg'}], 'start': 298.444, 'title': 'Grid search cv for model parameter tuning', 'summary': 'introduces using grid search cv to find the optimal k value for a knn model by iterating through a range of k values from 1 to 30, performing tenfold cross-validation and plotting the accuracy results. it also discusses the benefits of gridsearchcv for automating the process of trying different parameters with a given model and how it simplifies the process by running cross-validation and tracking resulting scores.', 'chapters': [{'end': 397.989, 'start': 298.444, 'title': 'Grid search cv for knn model', 'summary': 'Introduces the process of using grid search cv to find the optimal k value for a knn model by iterating through a range of k values from 1 to 30, performing tenfold cross-validation and plotting the accuracy results.', 'duration': 99.545, 'highlights': ['Performing tenfold cross-validation to find the mean cross-validated accuracy for each of the 30 iterations, resulting in the fifth score being 0.966 for n neighbors equals 5.', 'Iterating through a range of K values from 1 to 30 to find the optimal K value for generalization to out-of-sample data.', "Plotting the K values against the accuracy to visualize the relationship between K values and the model's accuracy.", 'Using grid search CV to replace the for loop process for choosing the optimal K value for the KNN model.']}, {'end': 511.894, 'start': 405.331, 'title': 'Automating model parameter tuning with gridsearchcv', 'summary': 'Discusses the benefits of 
gridsearchcv for automating the process of trying different parameters with a given model, using the example of specifying k values for a model, and how it simplifies the process by running cross-validation and tracking resulting scores.', 'duration': 106.563, 'highlights': ['GridSearchCV allows you to define a set of parameters and automatically run cross-validation using each of those parameters, replacing the need for manual for loops and providing additional functionality.', 'Values 13, 18, and 20 for k appear to be the best, demonstrating the effectiveness of GridSearchCV in identifying optimal parameters for the model.', 'Creating a parameter grid involves specifying a Python dictionary with the parameter name as the key and a list of values to be searched for that parameter, providing an organized and efficient way to explore different parameter combinations.']}], 'duration': 213.45, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/Gol_qOgRqfA/pics/Gol_qOgRqfA298444.jpg', 'highlights': ['Performing tenfold cross-validation to find the mean cross-validated accuracy for each of the 30 iterations, resulting in the fifth score being 0.966 for n neighbors equals 5.', 'Values 13, 18, and 20 for k appear to be the best, demonstrating the effectiveness of GridSearchCV in identifying optimal parameters for the model.', 'Using grid search CV to replace the for loop process for choosing the optimal K value for the KNN model.', 'GridSearchCV allows you to define a set of parameters and automatically run cross-validation using each of those parameters, replacing the need for manual for loops and providing additional functionality.', 'Iterating through a range of K values from 1 to 30 to find the optimal K value for generalization to out-of-sample data.']}, {'end': 658.887, 'segs': [{'end': 658.887, 'src': 'embed', 'start': 548.664, 'weight': 0, 'content': [{'end': 560.916, 'text': "It's an object that is ready to do tenfold cross-validation on 
a KNN model using classification accuracy as the evaluation metric.", 'start': 548.664, 'duration': 12.252}, {'end': 575.436, 'text': 'But, in addition, it has been given this parameter grid so that it knows that it should repeat the tenfold cross-validation process 30 times,', 'start': 562.217, 'duration': 13.219}, {'end': 583.3, 'text': 'and each time the n_neighbors parameter should be given a different value from the list.', 'start': 575.436, 'duration': 7.864}, {'end': 592.766, 'text': 'Hopefully this helps you to understand why the parameter grid is specified using key-value pairs.', 'start': 585.581, 'duration': 7.185}, {'end': 602, 'text': "We can't just give GridSearchCV a list of the numbers 1 through 30 because it won't know what to do with those numbers.", 'start': 593.891, 'duration': 8.109}, {'end': 614.125, 'text': 'Instead, we need to specify which model parameter, in this case n_neighbors, should take on the values 1 through 30.', 'start': 603.041, 'duration': 11.084}, {'end': 616.786, 'text': 'One final note about instantiating the grid.', 'start': 614.125, 'duration': 2.661}, {'end': 623.187, 'text': 'If your computer and operating system support parallel processing,', 'start': 617.926, 'duration': 5.261}, {'end': 631.749, 'text': 'you can set the n_jobs parameter to -1 to instruct scikit-learn to use all available processors.', 'start': 623.187, 'duration': 8.562}, {'end': 639.671, 'text': 'Finally, we fit the grid with data by just passing it the x and y objects.', 'start': 634.01, 'duration': 5.661}, {'end': 658.887, 'text': 'This step may take quite a while depending upon the model and the data and the number of parameters being searched.', 'start': 651.064, 'duration': 7.823}], 'summary': 'Using gridsearchcv to perform tenfold cross-validation with 30 repetitions and different values for the n_neighbors parameter on a knn model.', 'duration': 110.223, 'max_score': 548.664, 'thumbnail': 
'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/Gol_qOgRqfA/pics/Gol_qOgRqfA548664.jpg'}], 'start': 511.894, 'title': 'Gridsearchcv for knn model', 'summary': 'Discusses the instantiation of a grid for tenfold cross-validation on a knn model using a parameter grid to repeat the process 30 times with different values from a list, as well as setting the n_jobs parameter for parallel processing and fitting the grid with data.', 'chapters': [{'end': 658.887, 'start': 511.894, 'title': 'Gridsearchcv for knn model', 'summary': 'Discusses the instantiation of a grid for tenfold cross-validation on a knn model using a parameter grid to repeat the process 30 times with different values from a list, as well as setting the n_jobs parameter for parallel processing and fitting the grid with data.', 'duration': 146.993, 'highlights': ['The grid object is ready to do tenfold cross-validation on a KNN model using classification accuracy as the evaluation metric and repeat the process 30 times with different values for the n_neighbors parameter from the list.', 'The parameter grid is specified using key-value pairs to ensure that the model parameter n_neighbors takes on the values 1 through 30.', 'Setting the n_jobs parameter to -1 instructs scikit-learn to use all available processors for parallel processing.']}], 'duration': 146.993, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/Gol_qOgRqfA/pics/Gol_qOgRqfA511894.jpg', 'highlights': ['The grid object is ready to do tenfold cross-validation on a KNN model using classification accuracy as the evaluation metric and repeat the process 30 times with different values for the n_neighbors parameter from the list.', 'The parameter grid is specified using key-value pairs to ensure that the model parameter n_neighbors takes on the values 1 through 30.', 'Setting the n_jobs parameter to -1 instructs scikit-learn to use all available processors for parallel processing.']}, 
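The grid instantiation and fitting steps described in this chapter can be sketched as follows. This uses today's `sklearn.model_selection` import path rather than the older module shown in the video; the parameter values (k from 1 to 30, tenfold cross-validation, accuracy scoring) match the video's example:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Parameter grid: the key is the parameter name, the value is the
# list of values that should be searched for that parameter
k_range = list(range(1, 31))
param_grid = {'n_neighbors': k_range}

# 10-fold cross-validation for each of the 30 candidate values;
# n_jobs=-1 tells scikit-learn to use all available processors
grid = GridSearchCV(KNeighborsClassifier(), param_grid, cv=10,
                    scoring='accuracy', n_jobs=-1)

# Fitting runs 30 x 10 = 300 model fits, so it may take a while
grid.fit(X, y)
```

Passing the estimator a dictionary keyed by parameter name is what lets GridSearchCV know which model parameter each list of values belongs to.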
{'end': 963.617, 'segs': [{'end': 747.9, 'src': 'heatmap', 'start': 659.888, 'weight': 0, 'content': [{'end': 671.592, 'text': 'Remember that this is running tenfold cross-validation 30 times and thus the KNN model is being fit and predictions are being made 300 times.', 'start': 659.888, 'duration': 11.704}, {'end': 680.676, 'text': "Now that grid search is done, let's take a look at the results which are stored in the grid scores attribute.", 'start': 673.673, 'duration': 7.003}, {'end': 688.99, 'text': 'This is actually a list of 30 named tuples.', 'start': 685.868, 'duration': 3.122}, {'end': 696.133, 'text': 'The first tuple indicates that when the nNeighbors parameter was set to 1,', 'start': 689.89, 'duration': 6.243}, {'end': 707.782, 'text': 'the mean cross-validated accuracy was 0.96 and the standard deviation of the accuracy scores was 0.05..', 'start': 696.133, 'duration': 11.649}, {'end': 718.727, 'text': 'While the mean is usually what we pay attention to, the standard deviation can be useful to keep in mind, because if the standard deviation is high,', 'start': 707.782, 'duration': 10.945}, {'end': 724.349, 'text': 'that means that the cross-validated estimate of the accuracy might not be as reliable.', 'start': 718.727, 'duration': 5.622}, {'end': 732.573, 'text': 'Anyway, you can see that there is one tuple for each of the 30 trials of cross-validation.', 'start': 725.89, 'duration': 6.683}, {'end': 747.9, 'text': "Next, I'll just show you how you can examine the individual tuples, just in case you need to do so in the future.", 'start': 740.535, 'duration': 7.365}], 'summary': 'Knn model ran 10-fold cv 30 times, yielding a mean accuracy of 0.96 with a std deviation of 0.05.', 'duration': 64.461, 'max_score': 659.888, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/Gol_qOgRqfA/pics/Gol_qOgRqfA659888.jpg'}, {'end': 883.642, 'src': 'heatmap', 'start': 794.065, 'weight': 3, 'content': [{'end': 800.071, 'text': "It's easy 
to collect the mean scores across the 30 runs and plot them like we did above.', 'start': 794.065, 'duration': 6.006}, {'end': 802.173, 'text': "So let's just quickly do that.", 'start': 800.711, 'duration': 1.462}, {'end': 811.701, 'text': "We'll use a list comprehension to loop through grid.grid_scores_, pulling out only the mean score.", 'start': 804.114, 'duration': 7.587}, {'end': 830.548, 'text': 'And when we plot the results, we can see that the plot is identical to the one we generated above.', 'start': 815.905, 'duration': 14.643}, {'end': 843.27, 'text': "Now you might be thinking that writing a list comprehension and then making a plot can't possibly be the most efficient way to view the results of the grid search.", 'start': 833.208, 'duration': 10.062}, {'end': 845.171, 'text': "You're exactly right.", 'start': 844.15, 'duration': 1.021}, {'end': 852.232, 'text': 'Once a grid has been fit with data, it exposes three attributes that are quite useful.', 'start': 846.491, 'duration': 5.741}, {'end': 865.85, 'text': 'best_score_ is the single best score achieved across all of the parameters.', 'start': 858.72, 'duration': 7.13}, {'end': 873.94, 'text': 'best_params_ is a dictionary containing the parameters used to generate that score.', 'start': 865.85, 'duration': 8.09}, {'end': 883.642, 'text': 'and finally, best_estimator_ is the actual model object fit with those best parameters,', 'start': 873.94, 'duration': 9.702}
might have occurred to you already that sometimes you'll want to search multiple different parameters simultaneously for the same model.", 'start': 921.368, 'duration': 7.885}, {'end': 938.718, 'text': "For example, let's pretend you're using a decision tree classifier, which is a model we haven't yet covered in the series.", 'start': 930.553, 'duration': 8.165}, {'end': 944.942, 'text': 'Two important tuning parameters are max depth and min samples leaf.', 'start': 939.999, 'duration': 4.943}, {'end': 957.369, 'text': 'You could tune those parameters independently, meaning that you try different values for max depth while leaving minSamplesLeaf at its default value,', 'start': 946.175, 'duration': 11.194}, {'end': 963.617, 'text': 'and then you try different values for minSamplesLeaf while leaving max depth at its default value.', 'start': 957.369, 'duration': 6.248}], 'summary': 'Testing indicates it picks first parameter achieving score. need to search multiple parameters simultaneously.', 'duration': 57.199, 'max_score': 906.418, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/Gol_qOgRqfA/pics/Gol_qOgRqfA906418.jpg'}], 'start': 659.888, 'title': 'Knn model cross-validation and grid search for model optimization', 'summary': "Details the results of running a knn model with tenfold cross-validation 30 times, resulting in 300 fits and predictions, with a mean cross-validated accuracy of 0.96 and a standard deviation of 0.05, providing insights into the model's reliability. 
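The simultaneous search described for the decision tree example might look like the sketch below. The specific candidate values for max_depth and min_samples_leaf are illustrative assumptions, not taken from the video; the point is that GridSearchCV tries every combination rather than tuning each parameter independently at the other's default value:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Two parameters searched together: GridSearchCV evaluates the full
# cross-product, here a 4 x 3 = 12-point grid
param_grid = {'max_depth': [1, 3, 5, None],
              'min_samples_leaf': [1, 2, 5]}

grid = GridSearchCV(DecisionTreeClassifier(random_state=1), param_grid,
                    cv=10, scoring='accuracy')
grid.fit(X, y)

print(grid.best_params_)  # best combination of the two parameters
```

Because the cross-product grows multiplicatively, adding parameters makes an exhaustive search expensive quickly, which is the motivation for the randomized search discussed later.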
it also explains the process of grid search for model optimization, discussing the examination of individual tuples, mean validation scores, efficient viewing of results using grid attributes, and the simultaneous tuning of multiple parameters.", 'chapters': [{'end': 724.349, 'start': 659.888, 'title': 'Knn model cross-validation results', 'summary': "Details the results of running a knn model with tenfold cross-validation 30 times, resulting in 300 fits and predictions, with a mean cross-validated accuracy of 0.96 and a standard deviation of 0.05, providing insights into the model's reliability.", 'duration': 64.461, 'highlights': ['The KNN model was fit and predictions were made 300 times, running tenfold cross-validation 30 times.', 'The mean cross-validated accuracy of the KNN model was 0.96.', 'The standard deviation of the accuracy scores was 0.05, indicating the reliability of the cross-validated estimate of the accuracy.']}, {'end': 963.617, 'start': 725.89, 'title': 'Grid search for model optimization', 'summary': 'Explains the process of grid search for model optimization, discussing the examination of individual tuples, mean validation scores, efficient viewing of results using grid attributes, and the simultaneous tuning of multiple parameters.', 'duration': 237.727, 'highlights': ['The chapter explains the process of grid search for model optimization. It covers the examination of individual tuples, mean validation scores, efficient viewing of results using grid attributes, and the simultaneous tuning of multiple parameters.', 'The grid exposes three attributes after fitting with data: best score, best parameters, and best estimator. It includes best score achieved across all parameters, a dictionary of best parameters, and the model object fit with best parameters.', 'The plot generated using the mean scores across the 30 runs is identical to the one generated previously. 
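The result-inspection steps summarized here can be sketched as follows. Note this is written against the current scikit-learn API: the grid_scores_ attribute shown in the video has since been replaced by cv_results_, and because refit=True by default, the grid object itself exposes a predict method backed by the best model retrained on all of X and y:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
param_grid = {'n_neighbors': list(range(1, 31))}
grid = GridSearchCV(KNeighborsClassifier(), param_grid, cv=10,
                    scoring='accuracy')
grid.fit(X, y)

# Mean cross-validated accuracy for each of the 30 candidates
mean_scores = grid.cv_results_['mean_test_score']

print(grid.best_score_)      # single best score across all parameters
print(grid.best_params_)     # dictionary of the parameters that produced it
print(grid.best_estimator_)  # model object refit with the best parameters

# The refit best model can predict directly on new observations,
# using the video's example measurement
prediction = grid.predict([[3, 5, 4, 2]])
```

Training the final model on all of X and y (rather than only a training split) is what the video stresses before making out-of-sample predictions, and the default refit behavior does exactly that.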
It demonstrates the consistency and reliability of the grid search results.', 'GridSearch may select the first parameter that achieved the best score in case of ties in scores. It indicates a potential criterion used by GridSearch to select the best parameter in case of tied scores.', 'The chapter discusses simultaneous tuning of multiple parameters for the same model, using the example of a decision tree classifier. It presents the scenario of tuning max depth and min samples leaf independently for a decision tree classifier.']}], 'duration': 303.729, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/Gol_qOgRqfA/pics/Gol_qOgRqfA659888.jpg', 'highlights': ['The KNN model was fit and predictions were made 300 times, running tenfold cross-validation 30 times.', 'The mean cross-validated accuracy of the KNN model was 0.96.', 'The standard deviation of the accuracy scores was 0.05, indicating the reliability of the cross-validated estimate of the accuracy.', 'The grid exposes three attributes after fitting with data: best score, best parameters, and best estimator. It includes best score achieved across all parameters, a dictionary of best parameters, and the model object fit with best parameters.', 'The plot generated using the mean scores across the 30 runs is identical to the one generated previously. It demonstrates the consistency and reliability of the grid search results.', 'The chapter discusses simultaneous tuning of multiple parameters for the same model, using the example of a decision tree classifier. It presents the scenario of tuning max depth and min samples leaf independently for a decision tree classifier.', 'The chapter explains the process of grid search for model optimization. 
It covers the examination of individual tuples, mean validation scores, efficient viewing of results using grid attributes, and the simultaneous tuning of multiple parameters.', 'GridSearch may select the first parameter that achieved the best score in case of ties in scores. It indicates a potential criterion used by GridSearch to select the best parameter in case of tied scores.']}, {'end': 1431.194, 'segs': [{'end': 1015.926, 'src': 'embed', 'start': 964.943, 'weight': 0, 'content': [{'end': 974.071, 'text': 'The problem with that approach is that the best model performance might be achieved when neither of those parameters are at their default values.', 'start': 964.943, 'duration': 9.128}, {'end': 983.179, 'text': "Thus, you need to search those two parameters simultaneously, which is exactly what we're about to do with GridSearchCV.", 'start': 975.332, 'duration': 7.847}, {'end': 990.071, 'text': 'In the case of K and N.', 'start': 987.99, 'duration': 2.081}, {'end': 996.115, 'text': 'another parameter that might be worth tuning other than K is the weights parameter,', 'start': 990.071, 'duration': 6.044}, {'end': 1001.417, 'text': 'which controls how the K nearest neighbors are weighted when making a prediction.', 'start': 996.115, 'duration': 5.302}, {'end': 1008.281, 'text': 'The default option is uniform, which means that all points in the neighborhood are weighted equally.', 'start': 1002.498, 'duration': 5.783}, {'end': 1015.926, 'text': 'But another option is distance, which weights closer neighbors more heavily than further neighbors.', 'start': 1009.342, 'duration': 6.584}], 'summary': 'Optimizing model parameters with gridsearchcv for k and n, including weights parameter tuning.', 'duration': 50.983, 'max_score': 964.943, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/Gol_qOgRqfA/pics/Gol_qOgRqfA964943.jpg'}, {'end': 1185.511, 'src': 'heatmap', 'start': 1082.847, 'weight': 2, 'content': [{'end': 1084.868, 'text': "Let's 
take a quick look at the results.", 'start': 1082.847, 'duration': 2.021}, {'end': 1104.617, 'text': 'You can see that there is a tuple for every possible combination of n neighbors and weights.', 'start': 1097.834, 'duration': 6.783}, {'end': 1110.36, 'text': "Once again, we'll examine the best score and the best parameters.", 'start': 1106.298, 'duration': 4.062}, {'end': 1123.315, 'text': 'It turns out that the best score did not improve and thus, using nNeighbors equals 13 with the default value for weights,', 'start': 1114.409, 'duration': 8.906}, {'end': 1126.478, 'text': 'is still the best set of parameters for this model.', 'start': 1123.315, 'duration': 3.163}, {'end': 1139.567, 'text': "Next, I want to briefly remind you of what to do with these optimal tuning parameters once you've found them.", 'start': 1132.302, 'duration': 7.265}, {'end': 1150.924, 'text': 'Before you make predictions on out of sample data, it is critical that you train the model with the best known parameters using all of your data.', 'start': 1141.158, 'duration': 9.766}, {'end': 1159.109, 'text': 'As you can see, I instantiate kNeighborsClassifier with the best parameters I found above.', 'start': 1152.425, 'duration': 6.684}, {'end': 1166.072, 'text': 'and then fit it with X and Y, not X train and Y train.', 'start': 1161.468, 'duration': 4.604}, {'end': 1173.219, 'text': 'In other words, even if you use train test split earlier in the model building process,', 'start': 1167.274, 'duration': 5.945}, {'end': 1178.604, 'text': 'you should train your model on X and Y before making predictions on new data.', 'start': 1173.219, 'duration': 5.385}, {'end': 1185.511, 'text': 'Otherwise, you will be throwing away potentially valuable data that your model can learn from.', 'start': 1179.585, 'duration': 5.926}], 'summary': "The best parameters for the model are nneighbors=13 with default weights, yielding the best score; it's crucial to train the model with these parameters using all data 
before making predictions.", 'duration': 90.372, 'max_score': 1082.847, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/Gol_qOgRqfA/pics/Gol_qOgRqfA1082847.jpg'}, {'end': 1237.118, 'src': 'embed', 'start': 1202.783, 'weight': 5, 'content': [{'end': 1212.072, 'text': 'By default, GridSearchCV will refit the model using the entire dataset and the best parameters it found.', 'start': 1202.783, 'duration': 9.289}, {'end': 1224.674, 'text': 'That fitted model is stored within the grid object and it exposes a predict method to allow you to make predictions using the fitted model.', 'start': 1213.45, 'duration': 11.224}, {'end': 1237.118, 'text': 'In other words, the code in this cell accomplishes the same thing as the code in this cell.', 'start': 1229.675, 'duration': 7.443}], 'summary': 'Gridsearchcv refits model with best parameters and provides predict method.', 'duration': 34.335, 'max_score': 1202.783, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/Gol_qOgRqfA/pics/Gol_qOgRqfA1202783.jpg'}, {'end': 1331.43, 'src': 'heatmap', 'start': 1266.567, 'weight': 6, 'content': [{'end': 1277.409, 'text': "The problem that randomized search CV aims to solve is that when you're performing an exhaustive search of many different parameters at once,", 'start': 1266.567, 'duration': 10.842}, {'end': 1281.29, 'text': 'the search can quickly become computationally infeasible.', 'start': 1277.409, 'duration': 3.881}, {'end': 1292.833, 'text': 'For example, searching 10 different parameter values for each of four parameters will require 10,000 trials of cross-validation,', 'start': 1282.611, 'duration': 10.222}, {'end': 1297.655, 'text': 'which would equate to 100,000 model fits and 100,000 sets of predictions if tenfold cross-validation is being used.', 'start': 1293.752, 'duration': 3.903}, {'end': 1320.687, 'text': 'Randomized Search CV solves this problem by searching only a random subset of the provided parameters and 
allowing you to explicitly control the number of different parameter combinations that are attempted.', 'start': 1305.242, 'duration': 15.445}, {'end': 1331.43, 'text': 'As such, you can effectively decide how long you want it to run for, depending on the computational time you have available.', 'start': 1322.007, 'duration': 9.423}], 'summary': 'RandomizedSearchCV solves the computational infeasibility of exhaustive search by searching random parameter subsets and controlling the number of combinations tried.', 'duration': 26.188, 'max_score': 1266.567, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/Gol_qOgRqfA/pics/Gol_qOgRqfA1266567.jpg'}], 'start': 964.943, 'title': 'Grid and randomized search in machine learning', 'summary': 'Discusses the need for simultaneous parameter optimization using GridSearchCV for K and N, and highlights the importance of tuning the weights parameter for K nearest neighbors. It also explains the exhaustive grid search using GridSearchCV and introduces RandomizedSearchCV as a solution for computationally infeasible searches.', 'chapters': [{'end': 1015.926, 'start': 964.943, 'title': 'GridSearchCV for parameter optimization', 'summary': 'Discusses the need to simultaneously search for multiple parameters, like K and N, using GridSearchCV to optimize model performance, and also highlights the importance of tuning the weights parameter for K nearest neighbors, with options like uniform and distance weighting.', 'duration': 50.983, 'highlights': ['The need to simultaneously search for multiple parameters, like K and N, using GridSearchCV to optimize model performance', 'The importance of tuning the weights parameter for K nearest neighbors, with options like uniform and distance weighting']}, {'end': 1431.194, 'start': 1017.284, 'title': 'Grid and randomized search in machine learning', 'summary': 'Explains the exhaustive grid search using GridSearchCV and the importance of training the model with the best parameters
and introduces RandomizedSearchCV as a solution for computationally infeasible searches.', 'duration': 413.91, 'highlights': ['GridSearchCV tries every possible combination of the n_neighbors parameter and the weights parameter, resulting in 60 rounds of tenfold cross-validation due to 30 options for n_neighbors and two options for weights.', 'The best set of parameters for the model remains n_neighbors equals 13 with the default value for weights, as it achieved the best score.', 'The importance of training the model with the best known parameters using all of the data is emphasized, ensuring valuable data is not discarded and the model can learn
effectively', 'GridSearchCV provides a shortcut by refitting the model using the entire dataset and the best parameters it found, making it quite handy at times', 'RandomizedSearchCV aims to solve the computational infeasibility issue by searching only a random subset of the provided parameters and allowing explicit control over the number of different parameter combinations attempted']}, {'end': 1663.549, 'segs': [{'end': 1655.401, 'src': 'embed', 'start': 1471.06, 'weight': 0, 'content': [{'end': 1479.263, 'text': 'it still managed to find a combination that has just as high a score as the combination found by GridSearchCV.', 'start': 1471.06, 'duration': 8.203}, {'end': 1487.386, 'text': "It's certainly possible that RandomizedSearchCV will not find as good a result as GridSearchCV,", 'start': 1480.763, 'duration': 6.623}, {'end': 1498.318, 'text': 'but you might be surprised how often it finds the best result, or something very close, in a fraction of the time that GridSearchCV would have taken.', 'start': 1488.493, 'duration': 9.825}, {'end': 1511.466, 'text': "I've set up a quick experiment here in which I run RandomizedSearchCV with n_iter equals 10, and I repeat this process 20 times,", 'start': 1499.979, 'duration': 11.487}, {'end': 1513.907, 'text': 'recording the best score each time.', 'start': 1511.466, 'duration': 2.441}, {'end': 1534.457, 'text': "As you can see, most of the time it does result in a score of 0.98, and when it doesn't find that score, it's still close to 0.98.", 'start': 1521.661, 'duration': 12.796}, {'end': 1541.038, 'text': 'In terms of practical recommendations, I would usually recommend starting with GridSearchCV,', 'start': 1534.457, 'duration': 6.581}, {'end': 1548.459, 'text': 'but then switching to RandomizedSearchCV if GridSearchCV is taking longer than the time you have available.', 'start': 1541.038, 'duration': 7.421}, {'end': 1552.64, 'text': 'When using RandomizedSearchCV,', 'start': 1549.7, 'duration': 2.94},
{'end': 1566.694, 'text': 'start with a very small value of n_iter, time how long that takes, and then do the math for how large you can set the n_iter parameter without exceeding the amount of time you have available.', 'start': 1552.64, 'duration': 14.054}, {'end': 1578.079, 'text': 'As always, the scikit-learn documentation is helpful if you want to learn more.', 'start': 1573.237, 'duration': 4.842}, {'end': 1585.202, 'text': 'The first link is their user guide covering the topic of grid search in general,', 'start': 1579.32, 'duration': 5.882}, {'end': 1591.145, 'text': 'followed by links to the detailed documentation for GridSearchCV and RandomizedSearchCV.', 'start': 1585.202, 'duration': 5.943}, {'end': 1600.4, 'text': 'Next is a comparison of randomized search and grid search, also from the scikit-learn documentation,', 'start': 1593.576, 'duration': 6.824}, {'end': 1609.285, 'text': 'which provides some code that demonstrates how much time a randomized search can save you over an exhaustive grid search.', 'start': 1600.4, 'duration': 8.885}, {'end': 1620.492, 'text': 'Next is a video segment in an IPython Notebook on randomized search by Andreas Mueller, who is one of the core contributors to scikit-learn.', 'start': 1611.286, 'duration': 9.206}, {'end': 1627.601, 'text': "The relevant portion of the video, which I've linked to, is just three minutes long,", 'start': 1621.738, 'duration': 5.863}, {'end': 1633.644, 'text': 'but the rest of the video is worth watching if you want to learn about the more advanced features of scikit-learn.', 'start': 1627.601, 'duration': 6.043}, {'end': 1643.149, 'text': 'The final resource is a paper by Yoshua Bengio, famous for his research on deep learning,', 'start': 1635.505, 'duration': 7.644}, {'end': 1648.492, 'text': 'which argues in favor of using a random search process for parameter tuning.', 'start': 1643.149, 'duration': 5.343}, {'end': 1655.401, 'text': 'As always, I appreciate you watching and look forward to your
comments and questions.', 'start': 1651.317, 'duration': 4.084}], 'summary': 'RandomizedSearchCV often finds a result close to the best in a fraction of the time compared to GridSearchCV.', 'duration': 184.341, 'max_score': 1471.06, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/Gol_qOgRqfA/pics/Gol_qOgRqfA1471060.jpg'}], 'start': 1431.194, 'title': 'Comparing search techniques', 'summary': 'Compares RandomizedSearchCV and GridSearchCV, showing that RandomizedSearchCV achieves a high score of 0.98 in a fraction of the time with n_iter set to 10. It also explores grid search and randomized search techniques, providing various resources and advocating random search for parameter tuning.', 'chapters': [{'end': 1566.694, 'start': 1431.194, 'title': 'RandomizedSearchCV vs GridSearchCV', 'summary': 'Compares RandomizedSearchCV and GridSearchCV, highlighting that RandomizedSearchCV can often find the best result in a fraction of the time, as demonstrated through an experiment with n_iter set to 10, resulting in a score of 0.98 most of the time.', 'duration': 135.5, 'highlights': ['RandomizedSearchCV can often find the best result in a fraction of the time compared to GridSearchCV, as demonstrated through an experiment with n_iter set to 10, resulting in a score of 0.98 most of the time.', "It's possible that RandomizedSearchCV may not find as good a result as GridSearchCV, but it can surprise by finding the best result or something very close in a fraction of the time.", 'Practical recommendation includes starting with GridSearchCV and then switching to RandomizedSearchCV if GridSearchCV is taking longer than the available time, with a suggestion to start with a very small value of n_iter and then adjust based on the available time.']}, {'end': 1663.549, 'start': 1573.237, 'title': 'Grid search resources and comparison', 'summary': 'Explores grid search and randomized search techniques, providing links to the scikit-learn user guide, detailed
documentation, a comparison of time efficiency between randomized search and grid search, a video segment by Andreas Mueller, and a paper by Yoshua Bengio advocating a random search process for parameter tuning.', 'duration': 90.312, 'highlights': ['The first link is the user guide covering grid search, followed by detailed documentation for GridSearchCV and RandomizedSearchCV, providing comprehensive resources for learning about grid search techniques.', 'A comparison of randomized search and grid search from the scikit-learn documentation demonstrates the time saved by a randomized search over an exhaustive grid search, highlighting the efficiency of randomized search.', 'A video segment in an IPython Notebook by Andreas Mueller, a core contributor to scikit-learn, provides insights into randomized search, offering a valuable learning resource for advanced features of scikit-learn.', "A paper by Yoshua Bengio argues in favor of using a random search process for parameter tuning, leveraging Bengio's expertise in deep learning and supporting the effectiveness of random search for parameter optimization."]}], 'duration': 232.355, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/Gol_qOgRqfA/pics/Gol_qOgRqfA1431194.jpg', 'highlights': ['RandomizedSearchCV achieves a high score of 0.98 in a fraction of the time with n_iter set to 10', 'RandomizedSearchCV may find the best result or something very close in a fraction of the time compared to GridSearchCV', 'Practical recommendation includes starting with GridSearchCV and then switching to RandomizedSearchCV if GridSearchCV is taking longer', 'Comprehensive resources for learning about grid search techniques are available in the user guide and detailed documentation', 'A comparison from the scikit-learn documentation demonstrates the time saved by a randomized search over an exhaustive grid search', 'A video segment in an IPython Notebook by Andreas Mueller provides insights into randomized search,
offering a valuable learning resource', 'A paper by Yoshua Bengio argues in favor of using a random search process for parameter tuning, supporting the effectiveness of random search']}], 'highlights': ['GridSearchCV allows you to define a set of parameters and automatically run cross-validation using each of those parameters, replacing the need for manual for loops and providing additional functionality.', 'The grid object is ready to do tenfold cross-validation on a KNN model using classification accuracy as the evaluation metric and repeat the process 30 times with different values for the n_neighbors parameter from the list.', 'RandomizedSearchCV aims to solve the computational infeasibility issue by searching only a random subset of the provided parameters and allowing explicit control over the number of different parameter combinations attempted.', 'The importance of training the model with the best known parameters using all of the data is emphasized, ensuring valuable data is not discarded and the model can learn effectively.', 'RandomizedSearchCV achieves a high score of 0.98 in a fraction of the time with n_iter set to 10.']}
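The workflow summarized above can be sketched in code. This is a minimal sketch, not the video's exact notebook: it assumes the iris dataset and a KNN model as in the tutorial, with the parameter ranges described in the transcript (n_neighbors from 1 to 30, weights uniform or distance); `random_state=5` is an illustrative choice to make the randomized search reproducible.

```python
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV

X, y = load_iris(return_X_y=True)

# Parameter values to search: 30 x 2 = 60 combinations
param_grid = {
    'n_neighbors': list(range(1, 31)),
    'weights': ['uniform', 'distance'],
}

# Exhaustive search: 60 combinations x 10 folds = 600 model fits
grid = GridSearchCV(KNeighborsClassifier(), param_grid,
                    cv=10, scoring='accuracy')
grid.fit(X, y)
print(grid.best_score_, grid.best_params_)

# Randomized search: only n_iter of the 60 combinations are tried,
# so you control the computational budget explicitly
rand = RandomizedSearchCV(KNeighborsClassifier(), param_grid,
                          cv=10, scoring='accuracy',
                          n_iter=10, random_state=5)
rand.fit(X, y)
print(rand.best_score_, rand.best_params_)

# Both refit the best model on ALL of the data by default (refit=True),
# so the fitted search object can predict on new observations directly:
print(grid.predict([[3, 5, 4, 2]]))
```

Note that `grid.predict(...)` is the shortcut the transcript mentions: because of the default refit, it is equivalent to manually instantiating `KNeighborsClassifier` with `grid.best_params_`, fitting it on `X` and `y`, and then predicting.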