title
Live Day 3- Discussing KNN And Naive Baye's Machine Learning Algorithms

description
Join the community session: https://ineuron.ai/course/Mega-Community (all the materials will be uploaded there). Playlist: https://www.youtube.com/watch?v=11unm2hmvOQ&list=PLZoTAELRMXVMgtxAboeAx-D9qbnY94Yay The OneNeuron lifetime subscription has been extended. On the OneNeuron platform you will get access to 100+ courses (at least 20 courses will be added monthly, based on your demand). Features of the course: 1. You can raise any course demand (fulfilled within 45-60 days). 2. You can access the innovation lab from iNeuron. 3. You can use our incubation based on your ideas. 4. Live sessions coming soon (mostly till Feb). Use coupon code KRISH10 for an additional 10% discount. And many more. Enroll now. OneNeuron link: https://one-neuron.ineuron.ai/ Call our team directly in case of any queries: 8788503778 6260726925 9538303385 866003424

detail
{'title': "Live Day 3- Discussing KNN And Naive Baye's Machine Learning Algorithms", 'heatmap': [{'end': 670.206, 'start': 582.14, 'weight': 0.718}, {'end': 7137.827, 'start': 6911.9, 'weight': 0.753}], 'summary': "The live session covers machine learning topics, including linear regression, logistic regression, probability concepts, and knn algorithm, aiming for 200 likes and emphasizing engaging the audience. it provides insights into datasets with 12 features, achieves a 95% best score in logistic regression, and discusses the knn algorithm's applications, distances used, and limitations.", 'chapters': [{'end': 396.243, 'segs': [{'end': 128.866, 'src': 'embed', 'start': 30.199, 'weight': 0, 'content': [{'end': 32.42, 'text': 'hello, guys, how are you?', 'start': 30.199, 'duration': 2.221}, {'end': 34.74, 'text': 'i hope everybody is able to hear me.', 'start': 32.42, 'duration': 2.32}, {'end': 38.881, 'text': 'so we are in the third day of machine learning live sessions.', 'start': 34.74, 'duration': 4.141}, {'end': 53.405, 'text': 'so this is the third day live on machine learning algorithm machine learning algorithms.', 'start': 38.881, 'duration': 14.524}, {'end': 56.986, 'text': "So we'll go ahead with the agenda.", 'start': 55.205, 'duration': 1.781}, {'end': 60.309, 'text': 'I hope everybody is able to hear me.', 'start': 57.066, 'duration': 3.243}, {'end': 63.191, 'text': "Deepak Parwal says that I'm here to waste time.", 'start': 60.909, 'duration': 2.282}, {'end': 67.354, 'text': 'Then Deepak, please skip the session and go home.', 'start': 63.792, 'duration': 3.562}, {'end': 69.156, 'text': 'Go whatever you are doing.', 'start': 68.055, 'duration': 1.101}, {'end': 70.537, 'text': 'Do Aramse.', 'start': 69.796, 'duration': 0.741}, {'end': 73.339, 'text': "Okay I've never forced anyone to come into this session.", 'start': 70.597, 'duration': 2.742}, {'end': 74.12, 'text': 'Hit like.', 'start': 73.659, 'duration': 0.461}, {'end': 76.521, 'text': "So then 
I'll start writing the agenda.", 'start': 74.84, 'duration': 1.681}, {'end': 81.225, 'text': "Let's cross around 200 likes at least before we start the agenda.", 'start': 76.561, 'duration': 4.664}, {'end': 92.387, 'text': 'So, first of all, in the previous session, in the previous session, what all things we discussed?', 'start': 82.822, 'duration': 9.565}, {'end': 98.03, 'text': 'So the first thing that we discussed was linear regression.', 'start': 95.289, 'duration': 2.741}, {'end': 106.375, 'text': 'So how was the experience till two sessions guys? So second one, probably we discussed about ridge and lasso.', 'start': 98.831, 'duration': 7.544}, {'end': 110.617, 'text': 'And the third one was logistic regression.', 'start': 108.496, 'duration': 2.121}, {'end': 117.195, 'text': 'So within two days, we were able to cover all these things.', 'start': 114.352, 'duration': 2.843}, {'end': 128.866, 'text': 'So in two days, we were able to cover these things, which is pretty much amazing because these are some complex algorithms.', 'start': 120.718, 'duration': 8.148}], 'summary': '3rd day of machine learning live sessions covering linear, ridge, lasso, and logistic regression. impressive progress in 2 days.', 'duration': 98.667, 'max_score': 30.199, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/TelJFE7bx0Q/pics/TelJFE7bx0Q30199.jpg'}, {'end': 313.326, 'src': 'embed', 'start': 286.014, 'weight': 4, 'content': [{'end': 291.047, 'text': 'okay. So here you have to basically subscribe.', 'start': 286.014, 'duration': 5.033}, {'end': 291.968, 'text': "I've pinned the message.", 'start': 291.107, 'duration': 0.861}, {'end': 296.011, 'text': "So let's proceed and let's enjoy today's session.", 'start': 292.608, 'duration': 3.403}, {'end': 300.555, 'text': 'How do we enjoy? 
First of all, we enjoy by creating a practical problem.', 'start': 296.832, 'duration': 3.723}, {'end': 304.518, 'text': 'So I am actually opening a notebook file in front of you.', 'start': 301.255, 'duration': 3.263}, {'end': 313.326, 'text': 'So here we will try to solve it with the help of linear regression, ridge, lasso, and try to solve some problems.', 'start': 305.139, 'duration': 8.187}], 'summary': "Learn to solve practical problems using linear regression, ridge, and lasso in today's session.", 'duration': 27.312, 'max_score': 286.014, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/TelJFE7bx0Q/pics/TelJFE7bx0Q286014.jpg'}], 'start': 30.199, 'title': 'Machine learning live: recap and practical agenda', 'summary': 'Recaps the machine learning live sessions covering linear regression and introduces upcoming sessions on projects, with a goal of engaging the audience and reaching 200 likes. it also emphasizes subscribing to the hindi channel for daily content.', 'chapters': [{'end': 98.03, 'start': 30.199, 'title': 'Day 3 machine learning live: linear regression', 'summary': 'Covers the third day of machine learning live sessions, focusing on linear regression, with an emphasis on engaging the audience and reaching a target of 200 likes before commencing the agenda.', 'duration': 67.831, 'highlights': ['The session emphasizes audience engagement, setting a target of 200 likes before commencing the agenda.', 'The chapter delves into linear regression as a topic of discussion in the previous session.']}, {'end': 396.243, 'start': 98.831, 'title': 'Machine learning sessions recap and practical agenda', 'summary': 'Covered ridge, lasso, and logistic regression in two sessions, with plans for hyperparameter tuning and practicals. 
the session also introduced upcoming live sessions on machine learning projects and emphasized subscribing to the hindi channel for daily content.', 'duration': 297.412, 'highlights': ['The chapter covered ridge, lasso, and logistic regression in two sessions, with plans for hyperparameter tuning and practicals. The two-day session covered complex algorithms including ridge, lasso, and logistic regression, laying the foundation for understanding other algorithms. Plans for hyperparameter tuning and practical implementation were also discussed.', 'Introduction of upcoming live sessions on machine learning projects and emphasis on subscribing to the Hindi channel for daily content. The session introduced the plan for upcoming seven-day live sessions on solving machine learning projects, along with a request to subscribe to the Hindi channel for daily content, highlighting the importance of sharing the channel with friends interested in learning in Hindi.', 'Emphasis on practical problem-solving using linear regression, ridge, and lasso, aiming for better understanding. 
The session emphasized practical problem-solving using linear regression, ridge, and lasso, with the aim of facilitating better understanding and learning of basic concepts.']}], 'duration': 366.044, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/TelJFE7bx0Q/pics/TelJFE7bx0Q30199.jpg', 'highlights': ['The session emphasizes audience engagement, setting a target of 200 likes before commencing the agenda.', 'Introduction of upcoming live sessions on machine learning projects and emphasis on subscribing to the Hindi channel for daily content.', 'The chapter covered ridge, lasso, and logistic regression in two sessions, with plans for hyperparameter tuning and practicals.', 'The chapter delves into linear regression as a topic of discussion in the previous session.', 'Emphasis on practical problem-solving using linear regression, ridge, and lasso, aiming for better understanding.']}, {'end': 858.256, 'segs': [{'end': 468.829, 'src': 'embed', 'start': 437.594, 'weight': 0, 'content': [{'end': 441.175, 'text': 'Very nice session we will do and we will try to complete it.', 'start': 437.594, 'duration': 3.581}, {'end': 442.655, 'text': "Okay So let's start.", 'start': 441.295, 'duration': 1.36}, {'end': 455.798, 'text': "The first thing we'll start with linear regression, linear regression, and then we will go ahead and discuss with ridge and lasso.", 'start': 443.435, 'duration': 12.363}, {'end': 459.542, 'text': "Okay I'm just going to make this as markdown.", 'start': 456.478, 'duration': 3.064}, {'end': 468.829, 'text': 'How many different libraries for linear regression? 
You can do with stats, you can do with SciPy, you can do with many things.', 'start': 463.485, 'duration': 5.344}], 'summary': 'Linear regression will be discussed, along with ridge and lasso techniques, using various libraries.', 'duration': 31.235, 'max_score': 437.594, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/TelJFE7bx0Q/pics/TelJFE7bx0Q437594.jpg'}, {'end': 531.304, 'src': 'embed', 'start': 502.906, 'weight': 1, 'content': [{'end': 506.507, 'text': 'A simple data set which is already present in sklearn only.', 'start': 502.906, 'duration': 3.601}, {'end': 513.227, 'text': 'Now in order to import the data set, I will write a line of code which is like from sklearn.datasets.', 'start': 506.927, 'duration': 6.3}, {'end': 516.769, 'text': 'Datasets import.', 'start': 515.549, 'duration': 1.22}, {'end': 520.335, 'text': 'load_boston.', 'start': 518.914, 'duration': 1.421}, {'end': 523.798, 'text': 'so we have some Boston house pricing data set.', 'start': 520.335, 'duration': 3.463}, {'end': 525.739, 'text': "so i'm just going to execute this.", 'start': 523.798, 'duration': 1.941}, {'end': 531.304, 'text': "i'm also going to make a lot of cells so that i don't have to again go ahead and create all the cells again.", 'start': 525.739, 'duration': 5.565}], 'summary': 'Imported Boston house pricing dataset from sklearn.datasets', 'duration': 28.398, 'max_score': 502.906, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/TelJFE7bx0Q/pics/TelJFE7bx0Q502906.jpg'}, {'end': 670.206, 'src': 'heatmap', 'start': 582.14, 'weight': 0.718, 'content': [{'end': 589.426, 'text': 'Okay, Now, in order to load this particular data set, I will just use this library called as load_boston,', 'start': 582.14, 'duration': 7.286}, {'end': 590.908, 'text': "and I'm going to just initialize this.", 'start': 589.426, 'duration': 1.482}, {'end': 598.19, 'text': 'So if you press shift tab, you will be able to 
see that return load and return the Boston house prices data set.', 'start': 591.488, 'duration': 6.702}, {'end': 599.37, 'text': 'It is a regression problem.', 'start': 598.23, 'duration': 1.14}, {'end': 600.75, 'text': 'It is saying, okay.', 'start': 599.39, 'duration': 1.36}, {'end': 603.271, 'text': "And then probably I'm just going to execute it.", 'start': 601.09, 'duration': 2.181}, {'end': 607.432, 'text': 'Now, once I execute it, I will go and probably see the type of DF.', 'start': 604.031, 'duration': 3.401}, {'end': 611.073, 'text': 'So it is basically saying sklearn.utils.bunch.', 'start': 607.972, 'duration': 3.101}, {'end': 617.114, 'text': "Now, if I go and probably execute DF, you'll be able to see that this will be in the form of key value pairs.", 'start': 611.653, 'duration': 5.461}, {'end': 618.994, 'text': 'Okay Like target is here.', 'start': 617.534, 'duration': 1.46}, {'end': 620.375, 'text': 'Data is here.', 'start': 619.615, 'duration': 0.76}, {'end': 626.019, 'text': "Okay So data is here, target is here and probably you'll be able to find out feature names is here.", 'start': 620.615, 'duration': 5.404}, {'end': 628.921, 'text': 'Okay So we definitely require feature names.', 'start': 626.439, 'duration': 2.482}, {'end': 632.844, 'text': 'We require our target value and our data value.', 'start': 629.502, 'duration': 3.342}, {'end': 639.85, 'text': 'So we really need to combine this specific thing in a proper way in the form of a data frame so that you will be able to see this.', 'start': 633.004, 'duration': 6.846}, {'end': 644.973, 'text': "Okay So what I'm actually going to do over here, I'm just going to say pd.dataframe.", 'start': 639.99, 'duration': 4.983}, {'end': 647.615, 'text': "I'll convert this entirely into a data frame.", 'start': 645.273, 'duration': 2.342}, {'end': 650.017, 'text': 'And I will say df.data.', 'start': 648.115, 'duration': 1.902}, {'end': 655.18, 'text': 'See, this is a key value pair, right? 
So df.data is basically giving me all the features value.', 'start': 650.097, 'duration': 5.083}, {'end': 657.102, 'text': 'So if I write df.data.', 'start': 655.581, 'duration': 1.521}, {'end': 667.324, 'text': 'And just execute it, you will be able to see that I will be able to get my entire data set in this way, right? My entire data set in this way.', 'start': 658.979, 'duration': 8.345}, {'end': 670.206, 'text': 'This is my feature 1, feature 2, feature 3, feature 4.', 'start': 667.364, 'duration': 2.842}], 'summary': 'Using load underscore boston library to load regression dataset and convert it into a dataframe.', 'duration': 88.066, 'max_score': 582.14, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/TelJFE7bx0Q/pics/TelJFE7bx0Q582140.jpg'}, {'end': 647.615, 'src': 'embed', 'start': 620.615, 'weight': 2, 'content': [{'end': 626.019, 'text': "Okay So data is here, target is here and probably you'll be able to find out feature names is here.", 'start': 620.615, 'duration': 5.404}, {'end': 628.921, 'text': 'Okay So we definitely require feature names.', 'start': 626.439, 'duration': 2.482}, {'end': 632.844, 'text': 'We require our target value and our data value.', 'start': 629.502, 'duration': 3.342}, {'end': 639.85, 'text': 'So we really need to combine this specific thing in a proper way in the form of a data frame so that you will be able to see this.', 'start': 633.004, 'duration': 6.846}, {'end': 644.973, 'text': "Okay So what I'm actually going to do over here, I'm just going to say pd.dataframe.", 'start': 639.99, 'duration': 4.983}, {'end': 647.615, 'text': "I'll convert this entirely into a data frame.", 'start': 645.273, 'duration': 2.342}], 'summary': 'Combine feature names, target, and data into a dataframe.', 'duration': 27, 'max_score': 620.615, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/TelJFE7bx0Q/pics/TelJFE7bx0Q620615.jpg'}, {'end': 783.549, 'src': 'embed', 'start': 754.843, 
'weight': 3, 'content': [{'end': 758.625, 'text': 'if i go and execute without print, you will be able to see my entire data set.', 'start': 754.843, 'duration': 3.782}, {'end': 762.726, 'text': 'So these are my features with respect to different, different things.', 'start': 759.285, 'duration': 3.441}, {'end': 764.946, 'text': 'I hope everybody is able to understand.', 'start': 763.306, 'duration': 1.64}, {'end': 768.507, 'text': 'Okay And this is basically a house pricing data set.', 'start': 764.966, 'duration': 3.541}, {'end': 780.01, 'text': 'So initially I have these features: CRIM, ZN, INDUS, CHAS, NOX, RM, AGE, DIS, RAD, TAX, PTRATIO, B, LSTAT.', 'start': 768.527, 'duration': 11.483}, {'end': 783.549, 'text': 'Okay, so I have my entire data set over here.', 'start': 780.686, 'duration': 2.863}], 'summary': 'The dataset includes features such as CRIM, ZN, INDUS, CHAS, NOX, RM, AGE, DIS, RAD, TAX, PTRATIO, B, and LSTAT for a house pricing dataset.', 'duration': 28.706, 'max_score': 754.843, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/TelJFE7bx0Q/pics/TelJFE7bx0Q754843.jpg'}], 'start': 396.243, 'title': 'Linear regression, boston house pricing, dataframe creation, and feature extraction', 'summary': 'Covers the preparation and discussion of linear regression, importing the boston house pricing dataset, creating a data frame, and extracting feature names and target values, providing insights into a dataset with 13 features.', 'chapters': [{'end': 598.19, 'start': 396.243, 'title': 'Linear regression and boston house pricing', 'summary': 'Covers the preparation for a session on linear regression and the discussion of importing the boston house pricing data set from sklearn, including the use of necessary libraries and initializing the data set.', 'duration': 201.947, 'highlights': ['Preparation for a session on linear regression The instructor emphasizes the preparation for a seven-day live session on linear 
regression, indicating the focus on linear regression, ridge, and lasso, and the use of different libraries for linear regression.', 'Importing the Boston house pricing data set from sklearn The process of importing the Boston house pricing data set from sklearn is discussed, including the initialization of the data set and the use of necessary libraries such as numpy, pandas, seaborn, and matplotlib.pyplot.']}, {'end': 858.256, 'start': 598.23, 'title': 'Dataframe creation and feature extraction', 'summary': 'Discusses the creation of a data frame from a regression dataset, including extracting feature names and target values, and provides insights into the house pricing dataset with 13 features and their meanings.', 'duration': 260.026, 'highlights': ['The transcript discusses creating a data frame from a regression dataset and extracting feature names and target values. The speaker explains the process of converting a dataset into a data frame and extracting the feature names and target values using the pandas library.', 'Insights into the house pricing dataset are provided, including details about the 13 features and their meanings. 
The transcript includes explanations about the features in the house pricing dataset, such as CRIM, ZN, INDUS, CHAS, NOX, RM, AGE, DIS, RAD, TAX, PTRATIO, B, and LSTAT, along with their respective meanings.']}], 'duration': 462.013, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/TelJFE7bx0Q/pics/TelJFE7bx0Q396243.jpg', 'highlights': ['The instructor emphasizes the preparation for a seven-day live session on linear regression, ridge, and lasso, and the use of different libraries for linear regression.', 'Importing the Boston house pricing data set from sklearn is discussed, including the initialization of the data set and the use of necessary libraries such as numpy, pandas, seaborn, and matplotlib.pyplot.', 'The transcript discusses creating a data frame from a regression dataset and extracting feature names and target values.', 'Insights into the house pricing dataset are provided, including details about the 13 features and their meanings.']}, {'end': 1954.184, 'segs': [{'end': 960.796, 'src': 'embed', 'start': 930.85, 'weight': 2, 'content': [{'end': 936.713, 'text': 'this target value, this target value is basically the sale, the price of the houses.', 'start': 930.85, 'duration': 5.863}, {'end': 938.394, 'text': 'right?. 
It is again in the form of array.', 'start': 936.713, 'duration': 1.681}, {'end': 941.676, 'text': "So I'm going to take this and put it as a dependent feature.", 'start': 939.094, 'duration': 2.582}, {'end': 945.757, 'text': "So here, you'll be able to see that my price will be my dependent feature.", 'start': 941.716, 'duration': 4.041}, {'end': 948.339, 'text': "So here, I'll basically write df dot target.", 'start': 945.818, 'duration': 2.521}, {'end': 953.156, 'text': 'So once I execute it and now if I probably go and see my dataset.head,', 'start': 948.979, 'duration': 4.177}, {'end': 960.796, 'text': 'you will be able to see features over here and one more feature is getting added, that is price.', 'start': 954.912, 'duration': 5.884}], 'summary': 'The target value, representing house prices, is used as a dependent feature in the dataset.', 'duration': 29.946, 'max_score': 930.85, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/TelJFE7bx0Q/pics/TelJFE7bx0Q930850.jpg'}, {'end': 1564.874, 'src': 'embed', 'start': 1540.264, 'weight': 1, 'content': [{'end': 1545.748, 'text': "Okay And then probably I'll print, I will print my MS mean underscore MSC.", 'start': 1540.264, 'duration': 5.484}, {'end': 1548.871, 'text': 'So this will be my average score with respect to this.', 'start': 1546.549, 'duration': 2.322}, {'end': 1553.034, 'text': 'The negative value is there because we have used negative mean squared error.', 'start': 1549.571, 'duration': 3.463}, {'end': 1557.665, 'text': 'But if you just consider mean squared error, then it is only 37.13.', 'start': 1553.094, 'duration': 4.571}, {'end': 1560.609, 'text': 'Okay, so this I have actually shown you how to do cross validation.', 'start': 1557.665, 'duration': 2.944}, {'end': 1564.874, 'text': "See, with respect to linear regression, you can't modify much with the parameter.", 'start': 1560.769, 'duration': 4.105}], 'summary': 'Average score is -37.13 with negative mean squared 
error, demonstrating cross validation for linear regression.', 'duration': 24.61, 'max_score': 1540.264, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/TelJFE7bx0Q/pics/TelJFE7bx0Q1540264.jpg'}, {'end': 1672.432, 'src': 'embed', 'start': 1645.695, 'weight': 0, 'content': [{'end': 1649.777, 'text': 'okay, so for the ridge it is also present in linear_model.', 'start': 1645.695, 'duration': 4.082}, {'end': 1657.362, 'text': 'for doing the hyper parameter tuning i will be using from sklearn.model_selection.', 'start': 1649.777, 'duration': 7.585}, {'end': 1661.644, 'text': "okay, and then i'm going to import GridSearchCV.", 'start': 1657.362, 'duration': 4.282}, {'end': 1664.926, 'text': "okay, so these are the two libraries that i'm actually going to use.", 'start': 1661.644, 'duration': 3.282}, {'end': 1667.588, 'text': 'GridSearchCV will be able to help you out with.', 'start': 1664.926, 'duration': 2.662}, {'end': 1672.432, 'text': 'uh, Okay, it will be able to help you out with hyper parameter tuning.', 'start': 1667.588, 'duration': 4.844}], 'summary': 'Using GridSearchCV for hyperparameter tuning in ridge model.', 'duration': 26.737, 'max_score': 1645.695, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/TelJFE7bx0Q/pics/TelJFE7bx0Q1645695.jpg'}], 'start': 858.296, 'title': 'Linear and ridge regression', 'summary': 'Covers linear regression data preparation, including defining features and data splitting, and showcases cross validation with an average mean squared error score of 37.13 after 5-fold cross validation. 
it also explains ridge regression and hyperparameter tuning through grid search cv.', 'chapters': [{'end': 1133.514, 'start': 858.296, 'title': 'Linear regression data preparation', 'summary': 'Discusses the preparation of data for linear regression, including defining independent and dependent features, and dividing the dataset into train and test sets, while using iloc to select specific columns and skip the last feature.', 'duration': 275.218, 'highlights': ["The chapter demonstrates the creation of a new feature, 'price', representing the price of houses, and assigns values to this target, which is in the form of an array.", 'It explains the process of dividing the dataset into independent and dependent features using iloc, specifying columns for independent features and selecting the last column for the dependent feature.', 'The speaker emphasizes the importance of properly dividing independent and dependent features for solving linear regression, and discusses the use of iloc to skip the last feature while selecting columns for independent features.']}, {'end': 1560.609, 'start': 1133.514, 'title': 'Linear regression and cross validation', 'summary': 'Covers the implementation of linear regression and cross validation for model evaluation, emphasizing the importance of libraries and hyperparameter tuning, and showcases the process of cross validation with an average mean squared error score of 37.13 after 5-fold cross validation.', 'duration': 427.095, 'highlights': ['The chapter emphasizes the importance of libraries and hyperparameter tuning when implementing linear regression, showcasing the process of cross validation with an average mean squared error score of 37.13 after 5-fold cross validation. 
The chapter stresses the significance of libraries and hyperparameter tuning in implementing linear regression, followed by demonstrating the process of cross validation with an average mean squared error score of 37.13 after 5-fold cross validation.', 'The speaker discusses the process of cross validation and its significance in model evaluation, explaining the iteration through different combinations of train and test data for improved accuracy. The speaker explains the significance of cross validation in model evaluation, iterating through different combinations of train and test data for improved accuracy.', 'The speaker emphasizes the flexibility of cross validation, allowing for the adjustment of the cross-validation value, and encourages the audience to experiment with multiple cross-validation values. The speaker underlines the flexibility of cross validation, allowing for adjustment of the cross-validation value, and encourages experimentation with multiple cross-validation values.']}, {'end': 1954.184, 'start': 1560.769, 'title': 'Ridge regression and hyperparameter tuning', 'summary': 'Demonstrates the process of ridge regression and hyperparameter tuning through grid search cv, explaining the use of alpha and max iteration in ridge regression and the use of grid search cv for hyperparameter tuning.', 'duration': 393.415, 'highlights': ['The chapter demonstrates the process of ridge regression and hyperparameter tuning through grid search CV. The chapter focuses on ridge regression and hyperparameter tuning using grid search CV.', 'The transcript explains the use of alpha and max iteration in ridge regression for hyperparameter tuning. The use of alpha and max iteration in ridge regression for hyperparameter tuning is highlighted.', 'The speaker mentions specific alpha values for hyperparameter tuning in ridge regression, such as 1e-15, 1e-10, 1e-8, and 1e-3. 
Specific alpha values for hyperparameter tuning in ridge regression, such as 1e-15, 1e-10, 1e-8, and 1e-3, are mentioned.', 'The process of defining parameters for grid search CV is detailed, including the use of dictionaries to define alpha values. The process of defining parameters for grid search CV, including the use of dictionaries to define alpha values, is explained.']}], 'duration': 1095.888, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/TelJFE7bx0Q/pics/TelJFE7bx0Q858296.jpg', 'highlights': ['The chapter demonstrates the process of ridge regression and hyperparameter tuning through grid search CV.', 'The chapter emphasizes the importance of libraries and hyperparameter tuning when implementing linear regression, showcasing the process of cross validation with an average mean squared error score of 37.13 after 5-fold cross validation.', "The chapter demonstrates the creation of a new feature, 'price', representing the price of houses, and assigns values to this target, which is in the form of an array."]}, {'end': 3352.885, 'segs': [{'end': 1978.488, 'src': 'embed', 'start': 1954.184, 'weight': 0, 'content': [{'end': 1960.67, 'text': 'Okay, and then probably I can have 1, 5, 10, 20, something like this.', 'start': 1954.184, 'duration': 6.486}, {'end': 1963.713, 'text': "So I'm going to play with all these particular parameters for right now,", 'start': 1961.031, 'duration': 2.682}, {'end': 1972.622, 'text': 'because in GridSearchCV what they do is that they take all the combinations of these alpha values and wherever your model performs well,', 'start': 1963.713, 'duration': 8.909}, {'end': 1975.825, 'text': 'it is going to take that specific parameter and it is going to give you that.', 'start': 1972.622, 'duration': 3.203}, {'end': 1978.488, 'text': 'Okay, this is the best fit parameter that got selected.', 'start': 1976.165, 'duration': 2.323}], 'summary': 'Experimenting with alpha values for model optimization.', 'duration': 
24.304, 'max_score': 1954.184, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/TelJFE7bx0Q/pics/TelJFE7bx0Q1954184.jpg'}, {'end': 2034.185, 'src': 'embed', 'start': 2004.097, 'weight': 1, 'content': [{'end': 2010.799, 'text': "Okay So here I have a grid, sorry, grid, grid, I'm saying.", 'start': 2004.097, 'duration': 6.702}, {'end': 2013.76, 'text': 'ridge_regressor.', 'start': 2012.72, 'duration': 1.04}, {'end': 2015.941, 'text': "So I'm going to use GridSearchCV.", 'start': 2014.02, 'duration': 1.921}, {'end': 2022.65, 'text': "GridSearchCV, and here I'm basically going to take the parameters Ridge okay,", 'start': 2018.362, 'duration': 4.288}, {'end': 2027.58, 'text': 'Ridge is my first model and then I will take up all these params that I have actually defined.', 'start': 2022.65, 'duration': 4.93}, {'end': 2034.185, 'text': 'See, in GridSearchCV, if I press Shift tab, Come on, just a second.', 'start': 2027.62, 'duration': 6.565}], 'summary': 'Using GridSearchCV to optimize parameters for ridge regression model.', 'duration': 30.088, 'max_score': 2004.097, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/TelJFE7bx0Q/pics/TelJFE7bx0Q2004097.jpg'}, {'end': 2616.962, 'src': 'embed', 'start': 2581.258, 'weight': 2, 'content': [{'end': 2584.84, 'text': 'You can also add and just try to execute.', 'start': 2581.258, 'duration': 3.582}, {'end': 2595.403, 'text': 'And now if I go and probably see this is my..', 'start': 2585.92, 'duration': 9.483}, {'end': 2599.105, 'text': "First I've tried for ridge, I'm getting minus 29.", 'start': 2595.403, 'duration': 3.702}, {'end': 2602.466, 'text': 'Do you see, after adding more parameters, what happened in ridge?', 'start': 2599.105, 'duration': 3.361}, {'end': 2605.747, 'text': 'After adding more parameters, what happened in ridge?', 'start': 2603.466, 'duration': 2.281}, {'end': 2612.699, 'text': 'After adding more parameters in ridge, what happened?', 
'start': 2610.197, 'duration': 2.502}, {'end': 2616.962, 'text': 'You can see OM minus 29..', 'start': 2613.219, 'duration': 3.743}], 'summary': "After adding more parameters, ridge regression's negative MSE score improved to -29.", 'duration': 35.704, 'max_score': 2581.258, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/TelJFE7bx0Q/pics/TelJFE7bx0Q2581258.jpg'}, {'end': 3164.652, 'src': 'embed', 'start': 3138.006, 'weight': 3, 'content': [{'end': 3143.509, 'text': 'R2 score is there but adjusted R2 should be here somewhere in some manner.', 'start': 3138.006, 'duration': 5.503}, {'end': 3154.466, 'text': 'okay. so this is how your output looks like with respect to by using this lasso regressor.', 'start': 3148.263, 'duration': 6.203}, {'end': 3156.127, 'text': 'okay, which is very good.', 'start': 3154.466, 'duration': 1.661}, {'end': 3157.088, 'text': 'okay, it should be.', 'start': 3156.127, 'duration': 0.961}, {'end': 3158.709, 'text': 'i told it should be near 100.', 'start': 3157.088, 'duration': 1.621}, {'end': 3161.19, 'text': "right now i'm getting 67.", 'start': 3158.709, 'duration': 2.481}, {'end': 3164.652, 'text': 'if i want to try with the ridge, you can also try that.', 'start': 3161.19, 'duration': 3.462}], 'summary': 'R2 score should be near 100; it is currently 67 with the lasso regressor, and ridge can also be tried.', 'duration': 26.646, 'max_score': 3138.006, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/TelJFE7bx0Q/pics/TelJFE7bx0Q3138006.jpg'}, {'end': 3266.079, 'src': 'embed', 'start': 3233.457, 'weight': 5, 'content': [{'end': 3239.261, 'text': 'so for that you basically have to use other algorithms like xgboost, naive bayes and all.', 'start': 3233.457, 'duration': 5.804}, {'end': 3241.882, 'text': 'so many algorithms are there.', 'start': 3239.261, 'duration': 2.621}, {'end': 3247.286, 'text': "so clear guys, with these three now let's go ahead and discuss about logistic.", 'start': 3241.882, 
'duration': 5.404}, {'end': 3253.684, 'text': 'or should i continue?', 'start': 3251.281, 'duration': 2.403}, {'end': 3254.886, 'text': 'uh, theory or practical?', 'start': 3253.684, 'duration': 1.202}, {'end': 3256.648, 'text': 'should i move to Naive Bayes?', 'start': 3254.886, 'duration': 1.762}, {'end': 3259.231, 'text': 'probably logistic we can do tomorrow.', 'start': 3256.648, 'duration': 2.583}, {'end': 3259.831, 'text': 'you just tell me.', 'start': 3259.231, 'duration': 0.6}, {'end': 3265.658, 'text': "okay, i'll just wait for your answer.", 'start': 3259.831, 'duration': 5.827}, {'end': 3266.079, 'text': "it's okay.", 'start': 3265.658, 'duration': 0.421}], 'summary': 'Mentions other algorithms such as XGBoost and Naive Bayes, then asks whether to continue with logistic regression or move to Naive Bayes.', 'duration': 32.622, 'max_score': 3233.457, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/TelJFE7bx0Q/pics/TelJFE7bx0Q3233457.jpg'}], 'start': 1954.184, 'title': 'Machine learning techniques', 'summary': 'Discusses grid search cv for parameter selection, achieving an optimized model performance. It also covers implementing ridge and lasso regression, leading to improved scores. 
Additionally, it introduces train test split and predictive modeling in sklearn, aiming for high performance and predictive accuracy.', 'chapters': [{'end': 2124.546, 'start': 1954.184, 'title': 'Grid search cv for parameter selection', 'summary': 'Discusses the process of applying grid search cv to select the best fit parameter for the ridge regressor model, using specified parameters and scoring metrics, aiming to optimize model performance.', 'duration': 170.362, 'highlights': ['The process involves trying specific parameter values such as 1, 5, 10, 20 to identify the best fit parameter for the model with GridSearchCV.', 'Applying grid search CV to select the best fit parameter for the Ridge regressor model using parameters and scoring metrics, such as negative mean squared error, to optimize model performance.', 'Emphasizing the importance of executing train test split on x and y before calling fit on X and y with the ridge regressor, highlighting the flexibility in choice for this step.']}, {'end': 2728.515, 'start': 2125.766, 'title': 'Implementing ridge and lasso regression', 'summary': 'Discusses implementing ridge and lasso regression, where ridge regression improves the negative mean squared error from -37 to -32, while lasso regression leads to a score of -35. Further, adding more parameters results in an improved performance with ridge regression, achieving a best negative mean squared error of -29.', 'duration': 602.749, 'highlights': ['Ridge regression improves the negative mean squared error from -37 to -32, moving it closer to zero.', 'Lasso regression yields a score of -35.', 'Adding more parameters results in an improved performance with Ridge regression, achieving a minimum error of -29. 
By adding more parameters, the performance of Ridge regression improves, achieving a minimum error of -29, showcasing the impact of parameter tuning on model performance.']}, {'end': 3015.379, 'start': 2731.802, 'title': 'Teaching train test split in machine learning', 'summary': 'The chapter introduces train test split in machine learning by explaining the process and demonstrating practical implementation using sklearn.model_selection, with a focus on achieving a performance score close to zero.', 'duration': 283.577, 'highlights': ['The process of train test split using sklearn.model_selection is demonstrated, with a focus on achieving a performance score close to zero, which indicates better performance in machine learning models.', 'The implementation involves using X train and Y train data with a test size of 0.33, resulting in 33% test data and 67% train data, with an aim to improve performance towards zero.', 'Practical demonstration of train test split in machine learning includes examples of obtaining performance scores for ridge, with an emphasis on achieving improvement towards zero performance score for better model performance.']}, {'end': 3352.885, 'start': 3015.379, 'title': 'Predictive modeling in sklearn', 'summary': 'Explains using lasso regressor to predict y test values, obtaining r2 score of 0.67, comparing different regressors, limitations of linear regression, and planning to move on to logistic regression.', 'duration': 337.506, 'highlights': ['Using lasso regressor to predict y test values and obtaining an R2 score of 0.67.', 'Comparing different regressors such as ridge regressor and linear regressor, with R2 score around 68 percent.', 'Highlighting the limitations of linear regression and the need to explore other algorithms like XGBoost and Naive Bayes.', 'Planning to move on to logistic regression and mentioning the implementation of theoretical concepts in practical sessions.']}], 'duration': 1398.701, 'thumbnail': 
'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/TelJFE7bx0Q/pics/TelJFE7bx0Q1954184.jpg', 'highlights': ['The process involves trying specific parameter values such as 1, 5, 10, 20 to identify the best fit parameter for the model with GridSearchCV.', 'Applying grid search CV to select the best fit parameter for the Ridge regressor model using parameters and scoring metrics, such as negative mean squared error, to optimize model performance.', 'Adding more parameters results in an improved performance with Ridge regression, achieving a minimum error of -29.', 'Using lasso regressor to predict y test values and obtaining an R2 score of 0.67.', 'Comparing different regressors such as ridge regressor and linear regressor, with R2 score around 68 percent.', 'Highlighting the limitations of linear regression and the need to explore other algorithms like XGBoost and Naive Bayes.']}, {'end': 4260.707, 'segs': [{'end': 3419.491, 'src': 'embed', 'start': 3380.35, 'weight': 5, 'content': [{'end': 3382.071, 'text': "So I'm going to use logistic regression.", 'start': 3380.35, 'duration': 1.721}, {'end': 3389.016, 'text': "And apart from that, we know that let's take a new data set because for logistic, we need to solve using classification problem.", 'start': 3382.251, 'duration': 6.765}, {'end': 3392.098, 'text': 'Okay So this is basically my logistic regression.', 'start': 3389.236, 'duration': 2.862}, {'end': 3393.479, 'text': "I'll take one data set.", 'start': 3392.158, 'duration': 1.321}, {'end': 3395.921, 'text': 'So from sklearn.linear.', 'start': 3394.84, 'duration': 1.081}, {'end': 3408.129, 'text': "data sets import, we'll take a data set which is like breast cancer data set.", 'start': 3398.562, 'duration': 9.567}, {'end': 3419.491, 'text': "so that is also present in sklearn, okay, so with respect to the breast cancer data set, i'm just going to use this, see, load_breast_cancer data set.", 'start': 3408.129, 'duration': 11.362}], 'summary': 
'Using logistic regression for breast cancer classification with the breast cancer dataset from sklearn.datasets.', 'duration': 39.141, 'max_score': 3380.35, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/TelJFE7bx0Q/pics/TelJFE7bx0Q3380350.jpg'}, {'end': 3728.263, 'src': 'embed', 'start': 3700.983, 'weight': 4, 'content': [{'end': 3704.726, 'text': 'okay, with respect to whatever things we have discussed in logistic.', 'start': 3700.983, 'duration': 3.743}, {'end': 3706.587, 'text': 'okay. and then the c value.', 'start': 3704.726, 'duration': 1.861}, {'end': 3709.45, 'text': 'these two parameter values are very much important.', 'start': 3706.587, 'duration': 2.863}, {'end': 3713.993, 'text': 'if i probably show you over here the penalty, what kind of penalty?', 'start': 3709.45, 'duration': 4.543}, {'end': 3718.356, 'text': 'whether you want to add a l2 penalty, l1 penalty, you can use l2 or l1.', 'start': 3713.993, 'duration': 4.363}, {'end': 3720.177, 'text': 'the next thing is c.', 'start': 3718.356, 'duration': 1.821}, {'end': 3722.339, 'text': 'this is nothing but inverse of regularization.', 'start': 3720.177, 'duration': 2.162}, {'end': 3723.52, 'text': 'strength, okay.', 'start': 3722.339, 'duration': 1.181}, {'end': 3728.263, 'text': 'So this basically says 1 by lambda, something like that, okay.', 'start': 3724.22, 'duration': 4.043}], 'summary': 'Logistic regression parameters, including penalty and regularization, are crucial for model performance.', 'duration': 27.28, 'max_score': 3700.983, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/TelJFE7bx0Q/pics/TelJFE7bx0Q3700983.jpg'}, {'end': 3772.141, 'src': 'embed', 'start': 3747.125, 'weight': 3, 'content': [{'end': 3752.328, 'text': 'If probably your data set is balanced, you can directly use class weight is equal to balanced.', 'start': 3747.125, 'duration': 5.203}, {'end': 3756.55, 'text': 'Okay Other than that, you can use other, other weights, which you basically 
want.', 'start': 3752.728, 'duration': 3.822}, {'end': 3763.193, 'text': 'Okay So this is specifically some of this, right? No, this is not ridge or lasso.', 'start': 3757.331, 'duration': 5.862}, {'end': 3764.735, 'text': 'Okay, this is logistic.', 'start': 3763.774, 'duration': 0.961}, {'end': 3768.297, 'text': 'In logistic also you have L1 norm and L2 norms.', 'start': 3765.655, 'duration': 2.642}, {'end': 3772.141, 'text': 'Understand, probably I missed that particular part in the theory.', 'start': 3769.498, 'duration': 2.643}], 'summary': 'Use class weight balanced for logistic regression with l1 and l2 norms.', 'duration': 25.016, 'max_score': 3747.125, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/TelJFE7bx0Q/pics/TelJFE7bx0Q3747125.jpg'}, {'end': 3993.979, 'src': 'embed', 'start': 3930.017, 'weight': 0, 'content': [{'end': 3933.884, 'text': 'So once I execute it, here you can see all the output along with warnings.', 'start': 3930.017, 'duration': 3.867}, {'end': 3938.432, 'text': "A lot of warnings will be coming, I don't know, because this many parameters are there.", 'start': 3933.904, 'duration': 4.528}, {'end': 3950.275, 'text': 'And finally, you can see that this has got selected, okay? Now if you really want to find out what is your best params score, model.bestparams.', 'start': 3938.993, 'duration': 11.282}, {'end': 3954.878, 'text': 'So here you can see max iteration is 150.', 'start': 3950.896, 'duration': 3.982}, {'end': 3965.886, 'text': 'And what you can actually do with respect to your best score, model.bestscore is 95%.', 'start': 3954.878, 'duration': 11.008}, {'end': 3968.368, 'text': 'But still we want to test it with test data.', 'start': 3965.886, 'duration': 2.482}, {'end': 3971.05, 'text': 'So can we do it? 
Yes, we can definitely do it.', 'start': 3968.808, 'duration': 2.242}, {'end': 3972.251, 'text': 'I will say model.', 'start': 3971.07, 'duration': 1.181}, {'end': 3989.017, 'text': "score, or I'll say model dot predict on my x test data, and this will basically be my y pred.", 'start': 3974.534, 'duration': 14.483}, {'end': 3993.979, 'text': "right. so this will be my y pred, all the y prediction that I'm actually getting.", 'start': 3989.017, 'duration': 4.962}], 'summary': 'Model achieved 95% best score with 150 max iterations, and is ready for testing.', 'duration': 63.962, 'max_score': 3930.017, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/TelJFE7bx0Q/pics/TelJFE7bx0Q3930017.jpg'}, {'end': 4127.404, 'src': 'embed', 'start': 4098.313, 'weight': 2, 'content': [{'end': 4102.216, 'text': 'Since this is a balanced data set, obviously the performance will be best.', 'start': 4098.313, 'duration': 3.903}, {'end': 4114.615, 'text': 'Okay Everybody clear? Yes, you can also use ROC.', 'start': 4107.54, 'duration': 7.075}, {'end': 4119.118, 'text': "See, I'll also show you how to use ROC and probably you'll be able to see this.", 'start': 4115.154, 'duration': 3.964}, {'end': 4124.943, 'text': "You have to probably calculate false positive rate to positive rate, but don't worry about ROC.", 'start': 4120.46, 'duration': 4.483}, {'end': 4127.404, 'text': 'I will first of all explain you the theoretical part.', 'start': 4124.962, 'duration': 2.442}], 'summary': 'Balanced data set yields best performance; roc can be used for analysis.', 'duration': 29.091, 'max_score': 4098.313, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/TelJFE7bx0Q/pics/TelJFE7bx0Q4098313.jpg'}], 'start': 3352.885, 'title': 'Implementing logistic regression and model evaluation', 'summary': 'Covers implementing logistic regression on the breast cancer dataset, emphasizing classification, balancing the dataset with 357 ones and 212 zeros, 
discussing logistic regression parameters, and implementing grid search cross validation with a best score of 95% and model evaluation metrics.', 'chapters': [{'end': 3454.496, 'start': 3352.885, 'title': 'Logistic regression implementation', 'summary': 'Covers the implementation of logistic regression using the breast cancer dataset from sklearn, emphasizing the importance of understanding and determining the presence of cancer based on a set of input features, with a focus on teaching and utilizing logistic regression for classification problems.', 'duration': 101.611, 'highlights': ['The chapter focuses on quickly implementing logistic regression using the breast cancer dataset from sklearn, emphasizing the importance of understanding and determining the presence of cancer based on input features, with a focus on teaching and utilizing logistic regression for classification problems.', 'The instructor emphasizes giving 100% effort to teach and expects students to attend previous classes for understanding.', 'The process involves importing logistic regression from sklearn.linear_model and utilizing the breast cancer dataset for classification, with independent features being determined as input for determining the presence of cancer.']}, {'end': 3678.764, 'start': 3461.311, 'title': 'Data set balancing and train test split', 'summary': 'Covers the practical application of creating a dependent feature from a dataset, checking for dataset balance, and performing a train-test split, with 357 ones and 212 zeros in the target feature, indicating a balanced dataset.', 'duration': 217.453, 'highlights': ["Creating a dependent feature 'Y' using 'df.target' and checking for balanced dataset with 357 ones and 212 zeros in the target feature. 
The process involves creating the dependent feature 'Y' using 'df.target' and checking for a balanced dataset with 357 ones and 212 zeros in the target feature.", "The speaker explains the process of checking for dataset balance by using 'value_counts' to count the occurrences of ones and zeros in the target feature.", 'The speaker mentions the intention to perform a train test split after checking the dataset balance.']}, {'end': 3820.402, 'start': 3678.764, 'title': 'Logistic regression parameters and class weights', 'summary': 'Discusses the key parameters of logistic regression, including the penalty (l1 or l2) and C, as well as the importance of class weights for imbalanced datasets.', 'duration': 141.638, 'highlights': ['The penalty (l1 or l2) and C are the key logistic regression parameters, with C being the inverse of regularization strength (roughly 1 by lambda).', 'Applying class weights is crucial for handling imbalanced datasets in logistic regression.', 'Logistic regression can be learned through probabilistic and geometric methods, with l1 and l2 norms playing a role in classification problems.']}, {'end': 4260.707, 'start': 3820.402, 'title': 'Grid search cv for logistic regression', 'summary': "Covers the implementation of grid search cross validation for logistic regression, achieving a best score of 95% with max iteration of 150, and further evaluates the model's performance using confusion matrix, accuracy score, precision, recall, and f1 score.", 'duration': 440.305, 'highlights': ["The best score achieved using Grid Search Cross Validation with logistic regression is 95% with max iteration set to 150. 
The model's performance is quantified with a best score of 95% using Grid Search Cross Validation.", "The model's accuracy score is determined to be 96%.", "The chapter emphasizes the importance of precision, recall, and F1 score in evaluating the model's performance for the balanced data set."]}], 'duration': 907.822, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/TelJFE7bx0Q/pics/TelJFE7bx0Q3352885.jpg', 'highlights': ['The best score achieved using Grid Search Cross Validation with logistic regression is 95% with max iteration set to 150.', "The model's accuracy score is determined to be 96%.", "The chapter emphasizes the importance of precision, recall, and F1 score in evaluating the model's performance for the balanced data set.", 'Applying class weights is crucial for handling imbalanced datasets in logistic regression.', 'The penalty (l1 or l2) and C are the key logistic regression parameters, with C being the inverse of regularization strength (roughly 1 by lambda).', 'The process involves importing logistic regression from sklearn.linear_model and utilizing the breast cancer dataset for classification, with independent features being determined as input for determining the presence of cancer.']}, {'end': 4766.483, 'segs': [{'end': 4362.971, 'src': 'embed', 'start': 4262.528, 'weight': 0, 'content': [{'end': 4266.771, 'text': 'Okay So, with respect to Bayes theorem, we will try to understand what all things we need to discuss.', 'start': 4262.528, 'duration': 4.243}, {'end': 4270.913, 'text': 'Okay So, let us go ahead and let us talk about this Bayes theorem.', 'start': 4267.511, 'duration': 3.402}, {'end': 4278.038, 'text': 'Let us say that guys, I have Bayes theorem.', 'start': 4272.214, 'duration': 
5.824}, {'end': 4286.427, 'text': 'Let us say that I have an experiment which is called as rolling a dice.', 'start': 4278.538, 'duration': 7.889}, {'end': 4290.753, 'text': 'now, in rolling a dice, how many number of elements i have?', 'start': 4286.427, 'duration': 4.326}, {'end': 4299.265, 'text': 'okay, so if i say what is the probability of one, can anybody tell me what is the probability of one?', 'start': 4290.753, 'duration': 8.512}, {'end': 4305.985, 'text': 'What is the probability of 1 coming in this when I roll a dice?', 'start': 4302.742, 'duration': 3.243}, {'end': 4308.167, 'text': "Then obviously you'll be saying 1 by 6..", 'start': 4306.725, 'duration': 1.442}, {'end': 4311.99, 'text': "If I say probability of 2, then also here you'll say 1 by 6.", 'start': 4308.167, 'duration': 3.823}, {'end': 4317.334, 'text': 'If I say probability of 3, then I will definitely say it is 1 by 6.', 'start': 4311.99, 'duration': 5.344}, {'end': 4323.059, 'text': 'So here you know that this kind of events are basically called as independent events.', 'start': 4317.334, 'duration': 5.725}, {'end': 4325.621, 'text': 'We discussed this in probability.', 'start': 4324.18, 'duration': 1.441}, {'end': 4327.503, 'text': 'Independent events.', 'start': 4326.582, 'duration': 0.921}, {'end': 4333.667, 'text': 'If you have attended my 7 days live session on stats, you have understood that this is basically independent events.', 'start': 4327.603, 'duration': 6.064}, {'end': 4335.468, 'text': 'Now rolling a dice.', 'start': 4334.427, 'duration': 1.041}, {'end': 4336.969, 'text': 'why it is called as an independent event?', 'start': 4335.468, 'duration': 1.501}, {'end': 4343.954, 'text': 'Because getting 1 or 2 in every experiment, 1 is not dependent on 2, 2 is not dependent on 3..', 'start': 4337.009, 'duration': 6.945}, {'end': 4345.615, 'text': 'So they are all independent.', 'start': 4343.954, 'duration': 1.661}, {'end': 4350.058, 'text': 'So that is the reason why we 
specifically say it is an independent event.', 'start': 4346.656, 'duration': 3.402}, {'end': 4352.319, 'text': 'But if I take an example of dependent events.', 'start': 4350.118, 'duration': 2.201}, {'end': 4357.465, 'text': "Okay If I probably take an example of dependent events, let's say what is a dependent event.", 'start': 4353.26, 'duration': 4.205}, {'end': 4362.971, 'text': "Okay Let's consider that I have a bag of marbles.", 'start': 4358.366, 'duration': 4.605}], 'summary': 'Exploring Bayes theorem and probability with examples of independent and dependent events.', 'duration': 100.443, 'max_score': 4262.528, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/TelJFE7bx0Q/pics/TelJFE7bx0Q4262528.jpg'}, {'end': 4766.483, 'src': 'embed', 'start': 4730.439, 'weight': 2, 'content': [{'end': 4733.901, 'text': 'Okay, and this is the crux behind Naive Bayes.', 'start': 4730.439, 'duration': 3.462}, {'end': 4742.287, 'text': 'Okay, this is the crux behind Naive Bayes.', 'start': 4739.585, 'duration': 2.702}, {'end': 4745.81, 'text': 'I hope everybody is able to understand it.', 'start': 4744.089, 'duration': 1.721}, {'end': 4752.507, 'text': 'Okay, the crux, the entire theorem that is specifically getting used is this.', 'start': 4747.341, 'duration': 5.166}, {'end': 4757.733, 'text': 'Okay, so understand this is the crux behind the Bayes theorem.', 'start': 4753.569, 'duration': 4.164}, {'end': 4763.1, 'text': 'Now, let us go ahead and let us discuss about how we are using this to solve.', 'start': 4758.474, 'duration': 4.626}, {'end': 4766.483, 'text': 'okay how we are using this to solve.', 'start': 4764.321, 'duration': 2.162}], 'summary': 'Crux behind Naive Bayes and Bayes theorem explained.', 'duration': 36.044, 'max_score': 4730.439, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/TelJFE7bx0Q/pics/TelJFE7bx0Q4730439.jpg'}], 'start': 4262.528, 'title': 'Probability concepts', 'summary': 'Covers the 
understanding of bayes theorem through a dice rolling experiment with equal probabilities, and further delves into independent and dependent events, conditional probability, and the application of bayes theorem in naive bayes.', 'chapters': [{'end': 4317.334, 'start': 4262.528, 'title': 'Understanding bayes theorem', 'summary': 'Introduces the concept of bayes theorem by discussing the probability of different outcomes in a dice rolling experiment, where each number has a probability of 1/6.', 'duration': 54.806, 'highlights': ['The experiment discussed is rolling a dice, where the probability of each number (1,2,3, etc.) appearing is 1/6.', 'The concept of Bayes theorem is introduced in the context of understanding probabilities in the dice rolling experiment.']}, {'end': 4766.483, 'start': 4317.334, 'title': 'Independent and dependent events', 'summary': 'Explains the concept of independent and dependent events in probability, using examples of rolling dice and drawing marbles from a bag, and further delves into conditional probability, bayes theorem, and its application in naive bayes.', 'duration': 449.149, 'highlights': ['The concept of independent events is explained using the example of rolling a dice, where the outcome of one roll does not affect the outcome of the next, illustrating the concept of independent events.', 'The concept of dependent events is illustrated through the example of drawing marbles from a bag, where the probability of subsequent events changes based on the outcome of previous events, clearly demonstrating dependent events.', 'Conditional probability is introduced with the formula for finding the probability of two dependent events occurring, which is the product of the probability of the first event and the probability of the second event given the first has occurred, providing a clear explanation of conditional probability.', 'The discussion progresses to the concept of Bayes theorem, which is derived from the probability of two 
events occurring in different orders, with a clear demonstration of the derivation and its application.', 'The chapter concludes with the explanation of how Bayes theorem forms the crux behind Naive Bayes, showcasing the practical application of the theory in solving problems, emphasizing the significance of the theorem.']}], 'duration': 503.955, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/TelJFE7bx0Q/pics/TelJFE7bx0Q4262528.jpg', 'highlights': ['The experiment discussed is rolling a dice, where the probability of each number (1,2,3, etc.) appearing is 1/6.', 'The concept of Bayes theorem is introduced in the context of understanding probabilities in the dice rolling experiment.', 'The chapter concludes with the explanation of how Bayes theorem forms the crux behind Naive Bayes, showcasing the practical application of the theory in solving problems, emphasizing the significance of the theorem.', 'The concept of dependent events is illustrated through the example of drawing marbles from a bag, where the probability of subsequent events changes based on the outcome of previous events, clearly demonstrating dependent events.', 'The concept of independent events is explained using the example of rolling a dice, where the outcome of one roll does not affect the outcome of the next, illustrating the concept of independent events.']}, {'end': 5495.493, 'segs': [{'end': 5026.741, 'src': 'embed', 'start': 4989.053, 'weight': 0, 'content': [{'end': 4990.113, 'text': 'Can I write something like this?', 'start': 4989.053, 'duration': 1.06}, {'end': 4997.377, 'text': 'Probability of Y multiplied by probability of X1, given Y.', 'start': 4990.774, 'duration': 6.603}, {'end': 5006.321, 'text': 'sorry, given Y multiplied by probability of X2 given Y.', 'start': 4997.377, 'duration': 8.944}, {'end': 5009.843, 'text': 'probability of X3 given Y.', 'start': 5006.321, 'duration': 3.522}, {'end': 5012.744, 'text': 'And like this, it will be probability 
of Xn given Y.', 'start': 5009.843, 'duration': 2.901}, {'end': 5018.13, 'text': 'Okay, so this will also be y1, y2, y3, yn.', 'start': 5013.925, 'duration': 4.205}, {'end': 5026.741, 'text': 'This, I can expand it like this, and then this will basically become probability of x1 multiplied by probability of x2,', 'start': 5019.372, 'duration': 7.369}], 'summary': 'Discussing conditional probability and expansion of probability terms.', 'duration': 37.688, 'max_score': 4989.053, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/TelJFE7bx0Q/pics/TelJFE7bx0Q4989053.jpg'}, {'end': 5233.511, 'src': 'embed', 'start': 5205.191, 'weight': 2, 'content': [{'end': 5217.458, 'text': 'Divided by probability of X1 multiplied by probability of X2 multiplied by probability of X3 multiplied by probability of X4.', 'start': 5205.191, 'duration': 12.267}, {'end': 5222.069, 'text': 'Right? This is clear? I hope everybody is able to understand.', 'start': 5217.799, 'duration': 4.27}, {'end': 5228.11, 'text': 'We are able to write like this, right? So, y is fixed.', 'start': 5222.089, 'duration': 6.021}, {'end': 5229.83, 'text': 'It may be yes or it may be no.', 'start': 5228.37, 'duration': 1.46}, {'end': 5233.511, 'text': 'But with respect to different records, this value may change, okay?', 'start': 5229.99, 'duration': 3.521}], 'summary': 'Discussing the multiplication of probabilities and the potential changes in the value of y based on different records.', 'duration': 28.32, 'max_score': 5205.191, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/TelJFE7bx0Q/pics/TelJFE7bx0Q5205191.jpg'}, {'end': 5309.404, 'src': 'embed', 'start': 5276.873, 'weight': 3, 'content': [{'end': 5278.534, 'text': 'So I need to find both the probability.', 'start': 5276.873, 'duration': 1.661}, {'end': 5287.28, 'text': 'Right? 
So probability of x1 multiplied by probability of x2 multiplied by probability of x3 multiplied by probability of x4.', 'start': 5278.795, 'duration': 8.485}, {'end': 5293.763, 'text': 'Right? See, with respect to any x of i, the output can be yes or no.', 'start': 5288.2, 'duration': 5.563}, {'end': 5296.165, 'text': 'And I really need to find out the probability.', 'start': 5294.104, 'duration': 2.061}, {'end': 5297.906, 'text': 'So both the formula is written over here.', 'start': 5296.245, 'duration': 1.661}, {'end': 5300.837, 'text': 'What is the probability with respect to yes?', 'start': 5298.695, 'duration': 2.142}, {'end': 5303.059, 'text': 'And what is the probability with respect to no?', 'start': 5301.197, 'duration': 1.862}, {'end': 5309.404, 'text': 'Okay? Now, in this case, one common thing you see that this denominator is fixed.', 'start': 5304.32, 'duration': 5.084}], 'summary': 'Finding probabilities for x1, x2, x3, x4 outputs yes or no.', 'duration': 32.531, 'max_score': 5276.873, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/TelJFE7bx0Q/pics/TelJFE7bx0Q5276873.jpg'}, {'end': 5464.232, 'src': 'embed', 'start': 5394.81, 'weight': 4, 'content': [{'end': 5399.412, 'text': 'Tell me whether this should be 1 or 0 or which one should be the output in this particular scenario.', 'start': 5394.81, 'duration': 4.602}, {'end': 5408.536, 'text': 'Whether it should be 1 or 0, the output.', 'start': 5405.875, 'duration': 2.661}, {'end': 5411.657, 'text': "Obviously, I'm getting 0.13, 0.05.", 'start': 5409.177, 'duration': 2.48}, {'end': 5414.499, 'text': 'So we do something called as normalization.', 'start': 5411.658, 'duration': 2.841}, {'end': 5419.241, 'text': 'Okay, we will do something called as normalization.', 'start': 5416.98, 'duration': 2.261}, {'end': 5420.721, 'text': 'What is normalization basically saying?', 'start': 5419.281, 'duration': 1.44}, {'end': 5428.364, 'text': 'It says that if I really want to find out 
the probability of x with x of i, if I do normalization, it is nothing but 0.13,', 'start': 5421.462, 'duration': 6.902}, {'end': 5432.286, 'text': 'divided by 0.13 plus 0.05..', 'start': 5428.364, 'duration': 3.922}, {'end': 5435.967, 'text': 'Now tell me how much this is if I open a calculator.', 'start': 5432.286, 'duration': 3.681}, {'end': 5446.962, 'text': 'Okay, if I open a calculator, so this is 0.13, divided by what it is 0.13 plus 0.05, it is 0.18.', 'start': 5438.948, 'duration': 8.014}, {'end': 5464.232, 'text': 'so if I 0.13 divided by 0.18, so if I go and see, it is nothing but 72, 0.72.', 'start': 5446.962, 'duration': 17.27}], 'summary': 'Normalization calculates probability of x as 0.72 in scenario with 0.13 and 0.05.', 'duration': 69.422, 'max_score': 5394.81, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/TelJFE7bx0Q/pics/TelJFE7bx0Q5394810.jpg'}], 'start': 4766.483, 'title': 'Probability in data science', 'summary': "Covers understanding probability in data science, including the calculation and relationships between probabilities of y and independent features x1, x2, x3 up to xn, as well as the probability computation for binary classification and a specific example showing probabilities of 0.72 for 'yes' and 0.28 for 'no'.", 'chapters': [{'end': 5065.507, 'start': 4766.483, 'title': 'Understanding probability in data science', 'summary': 'Explains the concept of probability in predicting output values based on input features, emphasizing the calculation and relationships between the probabilities of y and independent features x1, x2, x3 up to xn.', 'duration': 299.024, 'highlights': ['Explaining the probability of Y given X1, X2, X3, Xn and the relationship between probabilities of Y and independent features. 
The chapter extensively discusses the concept of probability of Y given X1, X2, X3, Xn, emphasizing the relationship between the probabilities of Y and independent features.', 'Elaborating on the expansion of probability of Y given X1, X2, X3, Xn into the multiplication of probabilities of X1, X2, X3, up to Xn given Y. It elaborates on the expansion of probability of Y given X1, X2, X3, Xn into the multiplication of probabilities of X1, X2, X3, up to Xn given Y, providing a clear understanding of the relationship between the probabilities.', 'Clarifying the differentiation of Y for each record and its impact on the probability calculations. The chapter clarifies the differentiation of Y for each record and its impact on the probability calculations, emphasizing its significance in predicting output values.']}, {'end': 5495.493, 'start': 5065.567, 'title': 'Probability of binary classification', 'summary': "Explains the probability computation for binary classification, determining the probability of 'yes' or 'no' given a set of features, and the normalization process to obtain the final output, with an example showing a 0.72 probability for 'yes' and 0.28 for 'no'.", 'duration': 429.926, 'highlights': ["The process of computing the probability for binary classification involves determining the probability of 'yes' or 'no' given a set of features, using the formula of probability of yes/no multiplied by the probability of each feature given yes/no, divided by the product of probabilities of all features, and considering the denominator as a constant for the specific problem.", "Normalization is then performed to obtain the final probability values, with an example showing a 0.72 probability for 'yes' and 0.28 for 'no, which determines the final output for binary classification.", "The example demonstrates the normalization process, where the probability of 'yes' given a set of features is obtained by dividing 0.13 by the sum of 0.13 and 0.05, resulting in 0.72, and 
the probability of 'no' is then calculated as 1 minus 0.72, resulting in 0.28."]}], 'duration': 729.01, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/TelJFE7bx0Q/pics/TelJFE7bx0Q4766483.jpg', 'highlights': ['The chapter extensively discusses the concept of probability of Y given X1, X2, X3, Xn, emphasizing the relationship between the probabilities of Y and independent features.', 'Elaborating on the expansion of probability of Y given X1, X2, X3, Xn into the multiplication of probabilities of X1, X2, X3, up to Xn given Y, providing a clear understanding of the relationship between the probabilities.', 'The chapter clarifies the differentiation of Y for each record and its impact on the probability calculations, emphasizing its significance in predicting output values.', "The process of computing the probability for binary classification involves determining the probability of 'yes' or 'no' given a set of features, using the formula of probability of yes/no multiplied by the probability of each feature given yes/no, divided by the product of probabilities of all features, and considering the denominator as a constant for the specific problem.", "Normalization is then performed to obtain the final probability values, with an example showing a 0.72 probability for 'yes' and 0.28 for 'no, which determines the final output for binary classification.", "The example demonstrates the normalization process, where the probability of 'yes' given a set of features is obtained by dividing 0.13 by the sum of 0.13 and 0.05, resulting in 0.72, and the probability of 'no' is then calculated as 1 minus 0.72, resulting in 0.28."]}, {'end': 6604.362, 'segs': [{'end': 5659.834, 'src': 'embed', 'start': 5632.158, 'weight': 3, 'content': [{'end': 5641.061, 'text': "Okay Everybody take this data set and I'll try to show you how you can basically solve all these problems, guys.", 'start': 5632.158, 'duration': 8.903}, {'end': 5642.502, 'text': 'I am going 
to..', 'start': 5641.381, 'duration': 1.121}, {'end': 5643.423, 'text': 'See, I am here.', 'start': 5642.502, 'duration': 0.921}, {'end': 5645.324, 'text': 'Trust me on my teaching.', 'start': 5643.443, 'duration': 1.881}, {'end': 5648.246, 'text': "If you know this much, you'll be able to clear your interviews.", 'start': 5645.584, 'duration': 2.662}, {'end': 5650.088, 'text': 'Okay, later on.', 'start': 5649.267, 'duration': 0.821}, {'end': 5652.89, 'text': "But if you don't want to try it out, it's fine.", 'start': 5650.908, 'duration': 1.982}, {'end': 5654.11, 'text': "It's your life.", 'start': 5653.49, 'duration': 0.62}, {'end': 5659.834, 'text': "Who am I to stop it? I've zoomed in, guys.", 'start': 5654.451, 'duration': 5.383}], 'summary': 'Teaching how to solve problems using data set to clear interviews.', 'duration': 27.676, 'max_score': 5632.158, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/TelJFE7bx0Q/pics/TelJFE7bx0Q5632158.jpg'}, {'end': 5769.879, 'src': 'embed', 'start': 5742.678, 'weight': 4, 'content': [{'end': 5746.2, 'text': 'I will just try to create a smaller table which will give some information.', 'start': 5742.678, 'duration': 3.522}, {'end': 5751.204, 'text': 'Okay Now based on outlook, first of all, try to find out how many categories are there.', 'start': 5746.981, 'duration': 4.223}, {'end': 5759.269, 'text': 'In outlook, one is sunny, one is overcast and one is rain, right? 
Three categories are there.', 'start': 5752.124, 'duration': 7.145}, {'end': 5761.051, 'text': "So I'm going to write it down over here.", 'start': 5759.75, 'duration': 1.301}, {'end': 5764.433, 'text': 'Sunny, overcast,', 'start': 5761.571, 'duration': 2.862}, {'end': 5768.359, 'text': 'and rain.', 'start': 5766.698, 'duration': 1.661}, {'end': 5769.879, 'text': 'So these three are my categories.', 'start': 5768.359, 'duration': 1.52}], 'summary': 'Creating a table with 3 weather categories: sunny, overcast, and rain.', 'duration': 27.201, 'max_score': 5742.678, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/TelJFE7bx0Q/pics/TelJFE7bx0Q5742678.jpg'}, {'end': 6012.756, 'src': 'embed', 'start': 5983.447, 'weight': 0, 'content': [{'end': 5990.109, 'text': 'Now if I say, what is the probability of no given sunny? Okay, given sunny.', 'start': 5983.447, 'duration': 6.662}, {'end': 5996.511, 'text': 'Now see, probability of yes given sunny, probability of yes given overcast, probability of yes given rain.', 'start': 5990.509, 'duration': 6.002}, {'end': 6001.372, 'text': 'So it is basically that I will just try to write it in a simpler manner so that you will not get confused.', 'start': 5996.611, 'duration': 4.761}, {'end': 6005.894, 'text': 'Okay, so this is my probability of yes and this is my probability of no.', 'start': 6002.033, 'duration': 3.861}, {'end': 6012.756, 'text': 'Okay, but understand what does this basically mean? 
This terminology basically means probability of yes given sunny.', 'start': 6006.334, 'duration': 6.422}], 'summary': 'Discussing conditional probabilities and simplifying terminology.', 'duration': 29.309, 'max_score': 5983.447, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/TelJFE7bx0Q/pics/TelJFE7bx0Q5983447.jpg'}, {'end': 6061.898, 'src': 'embed', 'start': 6035.512, 'weight': 2, 'content': [{'end': 6039.215, 'text': 'I hope you are able to learn it in a very beautiful way.', 'start': 6035.512, 'duration': 3.703}, {'end': 6045.643, 'text': "Okay Now with respect to the next feature, let's consider that I'm going to consider one more feature.", 'start': 6040.516, 'duration': 5.127}, {'end': 6048.906, 'text': "And in this feature, I will say, let's consider temperature.", 'start': 6045.783, 'duration': 3.123}, {'end': 6051.689, 'text': "Okay, let's consider temperature.", 'start': 6048.926, 'duration': 2.763}, {'end': 6061.898, 'text': 'Now in temperature, how many features I have or how many categories I have? 
I have hot, you can see hot, mild and cold.', 'start': 6052.609, 'duration': 9.289}], 'summary': 'Consider temperature with 3 categories: hot, mild, and cold.', 'duration': 26.386, 'max_score': 6035.512, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/TelJFE7bx0Q/pics/TelJFE7bx0Q6035512.jpg'}, {'end': 6163.191, 'src': 'embed', 'start': 6133.527, 'weight': 1, 'content': [{'end': 6135.049, 'text': 'Cool, cool or cold?', 'start': 6133.527, 'duration': 1.522}, {'end': 6142.735, 'text': '1 yes, 1 no, 2 yes, 3 yes, 3 yes and 1 no, right?', 'start': 6135.069, 'duration': 7.666}, {'end': 6145.578, 'text': 'So here I have specifically have 3 yes and 1 no.', 'start': 6143.196, 'duration': 2.382}, {'end': 6152.323, 'text': 'Again the total number is 9 and 5 which will be equal to the same thing that we have got.', 'start': 6146.478, 'duration': 5.845}, {'end': 6156.126, 'text': 'Now really go ahead with finding probability of yes given hot.', 'start': 6152.743, 'duration': 3.383}, {'end': 6158.228, 'text': 'So it will be 2 by 9 over here.', 'start': 6156.686, 'duration': 1.542}, {'end': 6161.85, 'text': 'Then here it will be how much? 
4 by 9.', 'start': 6159.208, 'duration': 2.642}, {'end': 6163.191, 'text': 'Here it will be 3 by 9 again.', 'start': 6161.85, 'duration': 1.341}], 'summary': "Among the 9 'yes' records, 'hot' occurs with probability 2/9, 'mild' with 4/9, and 'cold' with 3/9.", 'duration': 29.664, 'max_score': 6133.527, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/TelJFE7bx0Q/pics/TelJFE7bx0Q6133527.jpg'}], 'start': 5496.854, 'title': 'Probability calculation in data analysis', 'summary': "Covers analyzing data sets, calculating probabilities for different features, such as weather conditions and temperature categories, with examples and normalization, resulting in normalized probabilities of 27% for 'yes' and 73% for 'no' in the sunny-and-hot example.", 'chapters': [{'end': 6032.03, 'start': 5496.854, 'title': 'Data set analysis and probability', 'summary': "The chapter discusses analyzing a data set and calculating probabilities based on different features, such as finding the probability of 'yes' and 'no' given different weather conditions in a data set, with a total of 9 'yes' and 5 'no' instances out of 14 records.", 'duration': 535.176, 'highlights': ["The chapter discusses analyzing a data set and calculating probabilities based on different features, such as finding the probability of 'yes' and 'no' given different weather conditions in a data set, with a total of 9 'yes' and 5 'no' instances out of 14 records.", "The instructor emphasizes the importance of understanding the material beyond what's available in textbooks and encourages students to take the session seriously for their future interviews.", "The instructor shows a famous data set and explains how to extract information from it, specifically focusing on the features and their categories, such as 'sunny', 'overcast', and 'rain', and illustrates how to calculate the probability of 'yes' and 'no' for each category.", "The instructor provides step-by-step explanations and calculations for the probability of 'yes' and 'no' given different weather conditions, 
demonstrating a clear and structured approach to data analysis.", "The instructor emphasizes the relevance of the material to students' future interviews and encourages them to trust the teaching, highlighting the practical applicability of the knowledge being shared."]}, {'end': 6225.515, 'start': 6035.512, 'title': 'Probability calculation on temperature categories', 'summary': 'The chapter discusses the calculation of probabilities for different temperature categories (hot, mild, cold) based on a dataset, with examples and calculations shown, emphasizing the importance of understanding and appreciating the concepts being taught.', 'duration': 190.003, 'highlights': ['The chapter discusses the calculation of probabilities for different temperature categories (hot, mild, cold) based on a dataset, with examples and calculations shown. The instructor goes through the process of calculating probabilities for hot, mild, and cold categories based on a dataset, providing specific examples and numerical calculations for each category.', 'Emphasizing the importance of understanding and appreciating the concepts being taught. The instructor encourages the audience to appreciate and understand the concepts being taught, urging them to participate and engage with the material for better learning outcomes.', 'The play column has 9 yes records and 5 no records, 14 in total. 
The instructor presents the total number of plays and the corresponding probabilities of yes and no, with a clear breakdown of the values for better comprehension.']}, {'end': 6604.362, 'start': 6226.416, 'title': 'Probability calculation in data analysis', 'summary': "The chapter discusses the calculation of probabilities for different weather conditions, with a focus on sunny and hot, and demonstrates the process of normalization to determine the probability of 'yes' and 'no' outcomes, resulting in 27% and 73% respectively.", 'duration': 377.946, 'highlights': ["The process of calculating the probability of 'yes' given sunny and hot, involving the substitution of probabilities and cancellation of constants, resulting in an unnormalized score of 0.031, which is about 27% after normalization.", "The calculation of the probability of 'no' given sunny and hot, which involves similar substitution and cancellation processes, resulting in an unnormalized score of 0.085, which is about 73% after normalization."]}], 'duration': 1107.508, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/TelJFE7bx0Q/pics/TelJFE7bx0Q5496854.jpg', 'highlights': ["The instructor provides step-by-step explanations and calculations for the probability of 'yes' and 'no' given different weather conditions, demonstrating a clear and structured approach to data analysis.", "The process of calculating the probability of 'yes' given sunny and hot, involving the substitution of probabilities and cancellation of constants, resulting in an unnormalized score of 0.031, which is about 27% after normalization.", 'The chapter discusses the calculation of probabilities for different temperature categories (hot, mild, cold) based on a dataset, with examples and calculations shown.', "The instructor emphasizes the relevance of the material to students' future interviews and encourages them to trust the teaching, highlighting the practical applicability of the knowledge being shared.", "The instructor shows a famous data set 
and explains how to extract information from it, specifically focusing on the features and their categories, such as 'sunny', 'overcast', and 'rain', and illustrates how to calculate the probability of 'yes' and 'no' for each category."]}, {'end': 7354.886, 'segs': [{'end': 6663.552, 'src': 'embed', 'start': 6604.362, 'weight': 2, 'content': [{'end': 6617.514, 'text': 'So, if the input comes as sunny and hot, if the weather is sunny and hot, what will the person do? Whether he will play or not? The answer is no.', 'start': 6604.362, 'duration': 13.152}, {'end': 6621.638, 'text': 'So, this answer is no.', 'start': 6620.117, 'duration': 1.521}, {'end': 6633.411, 'text': 'Cleaned Right? Solve the problem, guys.', 'start': 6621.658, 'duration': 11.753}, {'end': 6635.072, 'text': 'I solved the problem.', 'start': 6634.171, 'duration': 0.901}, {'end': 6660.049, 'text': 'Okay?, clear, now try to solve another problem.', 'start': 6649.1, 'duration': 10.949}, {'end': 6663.552, 'text': 'pen nahi kharaab karna hai.', 'start': 6660.049, 'duration': 3.503}], 'summary': 'In sunny and hot weather, the person will not play. 
problem solved.', 'duration': 59.19, 'max_score': 6604.362, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/TelJFE7bx0Q/pics/TelJFE7bx0Q6604362.jpg'}, {'end': 6834.431, 'src': 'embed', 'start': 6798.363, 'weight': 0, 'content': [{'end': 6799.563, 'text': 'If you want to divide it, divide it.', 'start': 6798.363, 'duration': 1.2}, {'end': 6809.627, 'text': "Okay let's finish off KNN because again tomorrow I will plan for something else.", 'start': 6804.705, 'duration': 4.922}, {'end': 6811.247, 'text': 'Tomorrow we will do SVM and all.', 'start': 6809.707, 'duration': 1.54}, {'end': 6816.769, 'text': 'So the second algorithm that we are going to discuss about is something called as KNN algorithm.', 'start': 6812.187, 'duration': 4.582}, {'end': 6824.683, 'text': 'KNN algorithm is a very simple problem statement.', 'start': 6821.84, 'duration': 2.843}, {'end': 6833.851, 'text': 'Okay Which can be used to solve both classification and regression problem.', 'start': 6825.623, 'duration': 8.228}, {'end': 6834.431, 'text': '15 minutes guys.', 'start': 6833.871, 'duration': 0.56}], 'summary': 'Introduction to knn algorithm for classification and regression problems, with plans for svm tomorrow.', 'duration': 36.068, 'max_score': 6798.363, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/TelJFE7bx0Q/pics/TelJFE7bx0Q6798363.jpg'}, {'end': 7137.827, 'src': 'heatmap', 'start': 6911.9, 'weight': 0.753, 'content': [{'end': 6916.263, 'text': 'Okay So what it is going to do, it is going to basically take the five nearest closest point.', 'start': 6911.9, 'duration': 4.363}, {'end': 6919.305, 'text': "Let's say from this, you have two nearest closest point.", 'start': 6916.743, 'duration': 2.562}, {'end': 6922.207, 'text': 'And from here you have three nearest closest point.', 'start': 6919.845, 'duration': 2.362}, {'end': 6928.771, 'text': 'Okay So here we basically see from the distance, the distance that which is my 
nearest point.', 'start': 6923.207, 'duration': 5.564}, {'end': 6933.274, 'text': 'Now in this particular case, you see that maximum number of points are from red categories.', 'start': 6928.811, 'duration': 4.463}, {'end': 6938.477, 'text': "From red, from red categories, I'm getting three points.", 'start': 6934.715, 'duration': 3.762}, {'end': 6941.299, 'text': "And from white categories, I'm getting two points.", 'start': 6939.057, 'duration': 2.242}, {'end': 6944.689, 'text': 'Two points.', 'start': 6944.209, 'duration': 0.48}, {'end': 6952.935, 'text': 'Okay Now in this particular scenario, maximum number of categories from where it is coming, we basically categorize that into that particular class.', 'start': 6945.09, 'duration': 7.845}, {'end': 6956.157, 'text': 'Okay That is what K nearest neighbor is.', 'start': 6953.856, 'duration': 2.301}, {'end': 6957.458, 'text': 'Very simple.', 'start': 6956.958, 'duration': 0.5}, {'end': 6962.202, 'text': 'Just with the help of distance, which all distance we specifically use, we use two distance.', 'start': 6958.139, 'duration': 4.063}, {'end': 6963.463, 'text': 'One is Euclidean distance.', 'start': 6962.302, 'duration': 1.161}, {'end': 6967.766, 'text': 'And the other one is something called as Manhattan distance.', 'start': 6965.324, 'duration': 2.442}, {'end': 6973.81, 'text': 'Okay So Euclidean and Manhattan distance.', 'start': 6970.988, 'duration': 2.822}, {'end': 6976.101, 'text': 'Rajan, you can go.', 'start': 6975.08, 'duration': 1.021}, {'end': 6978.262, 'text': "See, don't worry if your mess is closed.", 'start': 6976.121, 'duration': 2.141}, {'end': 6979.523, 'text': 'Just go and have your food.', 'start': 6978.282, 'duration': 1.241}, {'end': 6981.965, 'text': 'This you can probably see it after class also.', 'start': 6979.903, 'duration': 2.062}, {'end': 6985.487, 'text': 'Okay So two distances specifically used.', 'start': 6982.825, 'duration': 2.662}, {'end': 6991.811, 'text': 'One is Euclidean 
distance and the other one is Manhattan distance.', 'start': 6985.607, 'duration': 6.204}, {'end': 6995.331, 'text': 'Okay, Manhattan distance.', 'start': 6994.15, 'duration': 1.181}, {'end': 7003.118, 'text': 'Now what does Euclidean distance basically say? Suppose if this is your two points, which is denoted by x1, y1, x2, y2.', 'start': 6996.412, 'duration': 6.706}, {'end': 7009.641, 'text': 'euclidean distance.', 'start': 7008.82, 'duration': 0.821}, {'end': 7019.087, 'text': 'in order to calculate, we apply a formula which looks like this x2 minus x1 whole square, plus y2 minus y1 whole square okay,', 'start': 7009.641, 'duration': 9.446}, {'end': 7023.771, 'text': 'whereas in the case of manhattan distance suppose these are my two points.', 'start': 7019.087, 'duration': 4.684}, {'end': 7025.752, 'text': 'then we calculate the distance in this way.', 'start': 7023.771, 'duration': 1.981}, {'end': 7027.653, 'text': 'we calculate the distance from here.', 'start': 7025.752, 'duration': 1.901}, {'end': 7030.435, 'text': 'then here, right, this is the distance we calculate.', 'start': 7027.653, 'duration': 2.782}, {'end': 7033.037, 'text': "we don't calculate the hypotenuse distance.", 'start': 7030.435, 'duration': 2.602}, {'end': 7036.64, 'text': 'okay, so this is the basic difference between euclidean and manhattan distance.', 'start': 7033.037, 'duration': 3.603}, {'end': 7041.321, 'text': 'Okay, now, you may be thinking Krish, then fine, that is for classification problem.', 'start': 7037.259, 'duration': 4.062}, {'end': 7047.325, 'text': 'For regression, what do we do? See, for regression, what do we do? 
For regression also, it is very much simple.', 'start': 7041.341, 'duration': 5.984}, {'end': 7050.807, 'text': 'Suppose I have all the data points which looks like this.', 'start': 7048.506, 'duration': 2.301}, {'end': 7057.778, 'text': 'okay, now for a new data point like this.', 'start': 7054.716, 'duration': 3.062}, {'end': 7061.6, 'text': 'if i want to calculate, then how do i calculate?', 'start': 7057.778, 'duration': 3.822}, {'end': 7063.962, 'text': 'then we basically take up the nearest five points.', 'start': 7061.6, 'duration': 2.362}, {'end': 7065.503, 'text': "let's say my k is five.", 'start': 7063.962, 'duration': 1.541}, {'end': 7069.506, 'text': 'k is a hyper parameter, which we play.', 'start': 7065.503, 'duration': 4.003}, {'end': 7072.227, 'text': 'k is a hyper parameter.', 'start': 7069.506, 'duration': 2.721}, {'end': 7080.508, 'text': "okay, now, suppose let's say that k, it finds the nearest point over here, here, here, here and here.", 'start': 7072.227, 'duration': 8.281}, {'end': 7086.454, 'text': 'so if we need to find out the point for this particular output with respect to the k is equal to 5,', 'start': 7080.508, 'duration': 5.946}, {'end': 7090.138, 'text': 'it will try to calculate the average of all the points.', 'start': 7086.454, 'duration': 3.684}, {'end': 7096.023, 'text': 'once it calculates the average of all the points, that becomes your output.', 'start': 7090.138, 'duration': 5.885}, {'end': 7099.607, 'text': 'okay. so regression and classification, that is the only difference.', 'start': 7096.023, 'duration': 3.584}, {'end': 7109.022, 'text': 'okay. 
so probably in the next class i will try to show you practically how to implement this, because this k is actually an hyperparameter.', 'start': 7100.515, 'duration': 8.507}, {'end': 7119.391, 'text': 'we try with k is equal to 1 to 50, and then we probably try to check the error rate and if the error rate is less, then only we select the model.', 'start': 7109.022, 'duration': 10.369}, {'end': 7128.539, 'text': 'okay. so this is regarding the k, nearest neighbor.', 'start': 7119.391, 'duration': 9.148}, {'end': 7133.444, 'text': 'okay. Now two more things with respect to k-nearest neighbor.', 'start': 7128.539, 'duration': 4.905}, {'end': 7137.827, 'text': 'k-nearest neighbor works very bad with respect to two things.', 'start': 7133.684, 'duration': 4.143}], 'summary': 'K-nearest neighbor algorithm uses euclidean and manhattan distances for classification and regression, and requires careful selection of the hyperparameter k.', 'duration': 225.927, 'max_score': 6911.9, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/TelJFE7bx0Q/pics/TelJFE7bx0Q6911900.jpg'}, {'end': 7099.607, 'src': 'embed', 'start': 7069.506, 'weight': 1, 'content': [{'end': 7072.227, 'text': 'k is a hyper parameter.', 'start': 7069.506, 'duration': 2.721}, {'end': 7080.508, 'text': "okay, now, suppose let's say that k, it finds the nearest point over here, here, here, here and here.", 'start': 7072.227, 'duration': 8.281}, {'end': 7086.454, 'text': 'so if we need to find out the point for this particular output with respect to the k is equal to 5,', 'start': 7080.508, 'duration': 5.946}, {'end': 7090.138, 'text': 'it will try to calculate the average of all the points.', 'start': 7086.454, 'duration': 3.684}, {'end': 7096.023, 'text': 'once it calculates the average of all the points, that becomes your output.', 'start': 7090.138, 'duration': 5.885}, {'end': 7099.607, 'text': 'okay. 
so regression and classification, that is the only difference.', 'start': 7096.023, 'duration': 3.584}], 'summary': 'K is a hyperparameter for finding nearest points, used for regression and classification.', 'duration': 30.101, 'max_score': 7069.506, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/TelJFE7bx0Q/pics/TelJFE7bx0Q7069506.jpg'}, {'end': 7276.193, 'src': 'embed', 'start': 7221.246, 'weight': 3, 'content': [{'end': 7222.527, 'text': 'It gets plotted somewhere right?', 'start': 7221.246, 'duration': 1.281}, {'end': 7226.95, 'text': 'Okay?, Okay?', 'start': 7222.547, 'duration': 4.403}, {'end': 7229.824, 'text': 'Formula for Manhattan distance.', 'start': 7228.223, 'duration': 1.601}, {'end': 7245.63, 'text': 'it uses modulus x2 minus x1 plus y2 minus y1, modulus mode mode, so not modulus mode x2 minus x1, y2 minus y1..', 'start': 7229.824, 'duration': 15.806}, {'end': 7269.786, 'text': 'Okay, guys.', 'start': 7269.366, 'duration': 0.42}, {'end': 7273.79, 'text': 'So I hope you like this session.', 'start': 7272.289, 'duration': 1.501}, {'end': 7276.193, 'text': 'How many of you like this session?', 'start': 7274.771, 'duration': 1.422}], 'summary': 'Explained manhattan distance formula, seeking feedback.', 'duration': 54.947, 'max_score': 7221.246, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/TelJFE7bx0Q/pics/TelJFE7bx0Q7221246.jpg'}], 'start': 6604.362, 'title': 'Probability and knn algorithm', 'summary': 'Discusses the probability of playing based on weather conditions and provides an overview of the knn algorithm, including its applications, distances used, and limitations with outliers and imbalanced datasets.', 'chapters': [{'end': 6755.742, 'start': 6604.362, 'title': 'Probability and weather conditions', 'summary': 'Discusses the probability of playing based on weather conditions, introducing a new scenario and assigning an exercise, and concludes with a request for feedback on the 
session.', 'duration': 151.38, 'highlights': ['The chapter discusses the probability of playing based on weather conditions, introducing a new scenario and assigning an exercise.', 'The answer is no for playing when the weather is sunny and hot.', 'The speaker assigns an exercise to calculate the probability using Naive Bayes with new data of overcast and mild weather.', 'The speaker requests feedback on the session and encourages the audience to hit like if they found it useful.']}, {'end': 7354.886, 'start': 6756.923, 'title': 'KNN algorithm overview', 'summary': 'Provides an overview of the KNN algorithm, explaining its application in both classification and regression problems, using Euclidean and Manhattan distances, and discussing its limitations with outliers and imbalanced datasets.', 'duration': 597.963, 'highlights': ['KNN algorithm can be used for both classification and regression problems, with a discussion on the application of K nearest neighbor and the concept of using k value for classification and regression. The KNN algorithm is applicable to both classification and regression problems, utilizing the concept of K nearest neighbor and the use of the k value for classification and regression.', 'Explanation of using Euclidean and Manhattan distances for determining the nearest points in the KNN algorithm. The explanation of utilizing Euclidean and Manhattan distances for determining the nearest points in the KNN algorithm.', 'Discussion on the limitations of KNN algorithm with outliers and imbalanced datasets. 
An overview of the limitations of the KNN algorithm with outliers and imbalanced datasets.']}], 'duration': 750.524, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/TelJFE7bx0Q/pics/TelJFE7bx0Q6604362.jpg', 'highlights': ['The KNN algorithm is applicable to both classification and regression problems, utilizing the concept of K nearest neighbor and the use of the k value for classification and regression.', 'The chapter discusses the probability of playing based on weather conditions, introducing a new scenario and assigning an exercise.', 'Explanation of using Euclidean and Manhattan distances for determining the nearest points in the KNN algorithm.', 'The answer is no for playing when the weather is sunny and hot.', 'Discussion on the limitations of KNN algorithm with outliers and imbalanced datasets.']}], 'highlights': ['The best score achieved using Grid Search Cross Validation with logistic regression is 95% with max iteration set to 150.', "The model's accuracy score is determined to be 96%.", "The chapter emphasizes the importance of precision, recall, and F1 score in evaluating the model's performance for the balanced data set.", 'Applying class weights is crucial for handling imbalanced datasets in logistic regression.', 'The parameters l1 and l2 norms in logistic regression are important, with l2 being the inverse of regularization strength and l1 being used for regularization.', 'The process involves importing logistic regression from sklearn.linear_model and utilizing the breast cancer dataset for classification, with independent features being determined as input for determining the presence of cancer.', 'The session emphasizes audience engagement, setting a target of 200 likes before commencing the agenda.', 'The live session covers machine learning topics, including linear regression, logistic regression, probability concepts, and knn algorithm, aiming for 200 likes and emphasizing engaging the audience.', 'The chapter 
covered ridge, lasso, and logistic regression in two sessions, with plans for hyperparameter tuning and practicals.', 'The chapter delves into linear regression as a topic of discussion in the previous session.', 'Emphasis on practical problem-solving using linear regression, ridge, and lasso, aiming for better understanding.', 'The instructor emphasizes the preparation for a seven-day live session on linear regression, ridge, and lasso, and the use of different libraries for linear regression.', 'Importing the Boston house pricing data set from sklearn is discussed, including the initialization of the data set and the use of necessary libraries such as numpy, pandas, seaborn, and matplotlib.pyplot.', 'Insights into the house pricing dataset are provided, including details about the 12 features and their meanings.', 'The chapter demonstrates the process of ridge regression and hyperparameter tuning through grid search CV.', 'The chapter emphasizes the importance of libraries and hyperparameter tuning when implementing linear regression, showcasing the process of cross validation with an average mean squared error score of 37.13 after 5-fold cross validation.', "The chapter demonstrates the creation of a new feature, 'price', representing the price of houses, and assigns values to this target, which is in the form of an array.", 'The process involves playing with specific parameters such as 1, 5, 10, 20 to identify the best fit parameter for the model in GridSearchCV.', 'Applying grid search CV to select the best fit parameter for the Ridge regressor model using parameters and scoring metrics, such as negative mean squared error, to optimize model performance.', 'Adding more parameters results in an improved performance with Ridge regression, achieving a best negative mean squared error score of -29.', 'Using lasso regressor to predict y test values and obtaining an R2 score of 0.67.', 'Comparing different regressors such as ridge regressor and linear regressor, with R2 score around 68 
percent.', 'Highlighting the limitations of linear regression and the need to explore other algorithms like XGBoost and logistic regression.', 'The experiment discussed is rolling a die, where the probability of each number (1, 2, 3, etc.) appearing is 1/6.', 'The concept of Bayes theorem is introduced in the context of understanding probabilities in the die-rolling experiment.', 'The chapter concludes with the explanation of how Bayes theorem forms the crux behind Naive Bayes, showcasing the practical application of the theory in solving problems, emphasizing the significance of the theorem.', 'The concept of dependent events is illustrated through the example of drawing marbles from a bag, where the probability of subsequent events changes based on the outcome of previous events.', 'The concept of independent events is explained using the example of rolling a die, where the outcome of one roll does not affect the outcome of the next.', 'The chapter extensively discusses the concept of probability of Y given X1, X2, X3, up to Xn, emphasizing the relationship between the probabilities of Y and independent features.', 'Elaborating on the expansion of the probability of Y given X1, X2, X3, up to Xn into P(Y) multiplied by the probabilities of X1, X2, X3, up to Xn given Y, divided by the joint probability of the features, providing a clear understanding of the relationship between the probabilities.', 'The chapter clarifies the differentiation of Y for each record and its impact on the probability calculations, emphasizing its significance in predicting output values.', "The process of computing the probability for binary classification involves determining the probability of 'yes' or 'no' given a set of features, using the formula of probability of yes/no multiplied by the probability of each feature given yes/no, divided by the product of probabilities of all features, and treating the denominator as a constant, since it is the same for 'yes' and 'no'.", 
"Normalization is then performed to obtain the final probability values, with an example showing a 0.72 probability for 'yes' and 0.28 for 'no', which determines the final output for binary classification.", "The example demonstrates the normalization process, where the probability of 'yes' given a set of features is obtained by dividing 0.13 by the sum of 0.13 and 0.05, resulting in 0.72, and the probability of 'no' is then calculated as 1 minus 0.72, resulting in 0.28.", "The instructor provides step-by-step explanations and calculations for the probability of 'yes' and 'no' given different weather conditions, demonstrating a clear and structured approach to data analysis.", "The process of calculating the probability of 'yes' given sunny and hot involves substituting the probabilities and cancelling the constant denominator, resulting in an unnormalized value of 0.031 for 'yes'; after normalization, 'no' comes out at about 73%, so the prediction is 'no'.", 'The chapter discusses the calculation of probabilities for different temperature categories (hot, mild, cold) based on a dataset, with examples and calculations shown.', "The instructor emphasizes the relevance of the material to students' future interviews and encourages them to trust the teaching, highlighting the practical applicability of the knowledge being shared.", "The instructor shows a famous data set and explains how to extract information from it, specifically focusing on the features and their categories, such as 'sunny', 'overcast', and 'rain', and illustrates how to calculate the probability of 'yes' and 'no' for each category.", 'The KNN algorithm is applicable to both classification and regression problems, using the K nearest neighbors of a point, with the value of K setting how many neighbors are consulted.', 'The chapter discusses the probability of playing based on weather conditions, introducing a new scenario and assigning an exercise.', 'Explanation of using Euclidean and Manhattan distances for determining the nearest points in the KNN 
algorithm.', 'The answer is no for playing when the weather is sunny and hot.', 'Discussion on the limitations of KNN algorithm with outliers and imbalanced datasets.']}