title
Data Analytics Crash Course: Teach Yourself in 30 Days

description
The course is an introduction to Python-based data analytics. You will get a basic understanding of the workings of Python to the point where you can confidently find and manipulate data sources and use a Jupyter environment to derive insights from your data.
🔗 Course Website: https://stories.thedataproject.net/
💻 Code: https://github.com/dbclinton/jupyter_data
✏️ Course developed by David Clinton.
⭐️ Contents ⭐️
⌨️ (06:50) Installing Python and Jupyter
⌨️ (09:35) Working with the Jupyter environment
⌨️ (12:05) Finding data sources and using APIs
⌨️ (16:35) Working with data
⌨️ (24:45) Plotting data
⌨️ (32:45) Understanding data
🎉 Thanks to our Champion and Sponsor supporters:
👾 Wong Voon jinq
👾 hexploitation
👾 Katia Moran
👾 BlckPhantom
👾 Nick Raker
👾 Otis Morgan
👾 DeezMaster
👾 Treehouse
--
Learn to code for free and get a developer job: https://www.freecodecamp.org
Read hundreds of articles on programming: https://freecodecamp.org/news

detail
{'title': 'Data Analytics Crash Course: Teach Yourself in 30 Days', 'heatmap': [{'end': 920.789, 'start': 893.635, 'weight': 0.725}, {'end': 990.388, 'start': 937.597, 'weight': 0.774}, {'end': 1130.524, 'start': 1102.193, 'weight': 1}], 'summary': "The 'data analytics crash course: teach yourself in 30 days' video covers a 30-day course by david clinton, emphasizing hands-on experience and tools for data manipulation. it discusses various analytics tools, python setup, data analysis, accessing bls api, and analyzing cpi and wage data, highlighting techniques for data visualization and analytics overview.", 'chapters': [{'end': 162.392, 'segs': [{'end': 33.464, 'src': 'embed', 'start': 0.149, 'weight': 0, 'content': [{'end': 5.694, 'text': 'David Clinton has written and created many popular technical books and video courses.', 'start': 0.149, 'duration': 5.545}, {'end': 14.3, 'text': 'This data analytics course, along with the accompanying website and Jupyter notebooks will help you learn data analytics in 30 days.', 'start': 6.054, 'duration': 8.246}, {'end': 15.962, 'text': 'Welcome to my course.', 'start': 14.941, 'duration': 1.021}, {'end': 18.043, 'text': "I'm really glad to have you here.", 'start': 16.361, 'duration': 1.682}, {'end': 23.147, 'text': "And I'm even happier that you've decided to join the data analytics party.", 'start': 18.644, 'duration': 4.503}, {'end': 33.464, 'text': "Who am I? 
I'm the author of more than a dozen books on Linux and AWS administration, digital security, and dozens of courses on Pluralsight.", 'start': 23.928, 'duration': 9.536}], 'summary': 'David clinton offers a 30-day data analytics course with books and video courses.', 'duration': 33.315, 'max_score': 0.149, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/jcTj6FgWOpo/pics/jcTj6FgWOpo149.jpg'}, {'end': 71.632, 'src': 'embed', 'start': 47.585, 'weight': 1, 'content': [{'end': 54.606, 'text': "Since you've already seen my claim that this will only take you 30 days, I should explain what this actually is.", 'start': 47.585, 'duration': 7.021}, {'end': 63.808, 'text': "I'm going to show you the tools you'll need to find and manipulate raw data and use various graphing tools to help you understand and interpret it.", 'start': 55.206, 'duration': 8.602}, {'end': 71.632, 'text': "But don't expect us to cover a full data science curriculum here, complete with single and multivariable calculus,", 'start': 64.507, 'duration': 7.125}], 'summary': 'Learn data manipulation and graphing tools in 30 days.', 'duration': 24.047, 'max_score': 47.585, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/jcTj6FgWOpo/pics/jcTj6FgWOpo47585.jpg'}], 'start': 0.149, 'title': 'A 30-day data analytics course', 'summary': 'Introduces a 30-day data analytics course created by david clinton, emphasizing the need for hands-on experience and offering free resources for further learning. 
it covers tools for finding and manipulating raw data, providing a comprehensive learning experience.', 'chapters': [{'end': 162.392, 'start': 0.149, 'title': '30-day data analytics course', 'summary': 'Introduces a 30-day data analytics course created by david clinton, covering tools for finding and manipulating raw data, and emphasizes the need for hands-on experience and offers free resources for further learning.', 'duration': 162.243, 'highlights': ['David Clinton, author of technical books and video courses, introduces a 30-day data analytics course to learn data analytics in 30 days.', 'The course provides tools for finding and manipulating raw data and using various graphing tools to interpret it, emphasizing the need for hands-on experience for real learning.', 'David Clinton offers free resources including Jupyter notebooks, a website, and exercises to further enhance learning, with the option to purchase the content in book format.']}], 'duration': 162.243, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/jcTj6FgWOpo/pics/jcTj6FgWOpo149.jpg', 'highlights': ['David Clinton introduces a 30-day data analytics course to learn data analytics in 30 days.', 'The course provides tools for finding and manipulating raw data and using various graphing tools to interpret it, emphasizing the need for hands-on experience for real learning.', 'David Clinton offers free resources including Jupyter notebooks, a website, and exercises to further enhance learning, with the option to purchase the content in book format.']}, {'end': 397.154, 'segs': [{'end': 212.168, 'src': 'embed', 'start': 163.033, 'weight': 0, 'content': [{'end': 166.597, 'text': "But right now, let's talk about data analytics tools.", 'start': 163.033, 'duration': 3.564}, {'end': 170.768, 'text': 'There are many ways to consume data.', 'start': 167.967, 'duration': 2.801}, {'end': 176.39, 'text': 'The one you choose will reflect your specific needs and your comfort with 
various skills.', 'start': 171.388, 'duration': 5.002}, {'end': 185.353, 'text': 'Spreadsheets, as you probably already know, are much more than just fancy calculators or places to keep your household budget numbers.', 'start': 177.23, 'duration': 8.123}, {'end': 191.515, 'text': 'They also come with powerful functions, external integrations, and graphing capabilities.', 'start': 186.033, 'duration': 5.482}, {'end': 201.501, 'text': "Enterprise strength tools like Tableau, Splunk or Microsoft's Power BI are also great for crunching numbers and visualizing insights,", 'start': 192.515, 'duration': 8.986}, {'end': 203.542, 'text': 'which you can then share with your team members.', 'start': 201.501, 'duration': 2.041}, {'end': 212.168, 'text': "So then what's the big deal with Python? Well, the Python ecosystem is much, much broader than those purpose built tools.", 'start': 204.503, 'duration': 7.665}], 'summary': 'Data analytics tools include spreadsheets, tableau, splunk, power bi, and python, each offering unique functions and insights.', 'duration': 49.135, 'max_score': 163.033, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/jcTj6FgWOpo/pics/jcTj6FgWOpo163033.jpg'}, {'end': 260.923, 'src': 'embed', 'start': 235.33, 'weight': 2, 'content': [{'end': 244.195, 'text': 'Okay, but what about Jupiter? 
Jupyter is an open source platform within which you can load your data and execute your Python code.', 'start': 235.33, 'duration': 8.865}, {'end': 248.997, 'text': "It's a lot like a programming IDE like Microsoft's Visual Studio.", 'start': 244.875, 'duration': 4.122}, {'end': 256.44, 'text': 'And while Jupyter notebooks can be used with a growing number of languages and for as many tasks as you can imagine,', 'start': 249.498, 'duration': 6.942}, {'end': 260.923, 'text': "it's best known and loved as a host for Python data heroics.", 'start': 256.44, 'duration': 4.483}], 'summary': 'Jupyter is an open source platform for executing Python code, similar to an IDE like Visual Studio, best known for Python data heroics.', 'duration': 25.593, 'max_score': 235.33, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/jcTj6FgWOpo/pics/jcTj6FgWOpo235330.jpg'}, {'end': 372.895, 'src': 'embed', 'start': 284.724, 'weight': 4, 'content': [{'end': 288.548, 'text': "That would make it harder to troubleshoot when something didn't go according to spec.", 'start': 284.724, 'duration': 3.824}, {'end': 294.876, 'text': 'But it would also make it a lot harder to play around with specific details just to see what happens.', 'start': 289.189, 'duration': 5.687}, {'end': 300.062, 'text': 'And it also made it tough to share live versions of your code across the internet.', 'start': 295.597, 'duration': 4.465}, {'end': 307.446, 'text': "As we'll soon see, Jupyter notebooks let you run your code a single line at a time, or all together.", 'start': 300.843, 'duration': 6.603}, {'end': 314.49, 'text': 'That flexibility makes it easier to understand your code, and when things go wrong, to troubleshoot it.', 'start': 308.127, 'duration': 6.363}, {'end': 316.671, 'text': 'Notebooks, by the way,', 'start': 315.19, 'duration': 1.481}, {'end': 328.257, 'text': 'are JSON based files that effectively move the processing environment for just about any data oriented 
programming code from your server or workstation to your web browser.', 'start': 316.671, 'duration': 11.586}, {'end': 336.48, 'text': 'You can download Jupyter to your PC or a private server and access the interface through any browser with network access.', 'start': 328.897, 'duration': 7.583}, {'end': 344.143, 'text': "Or you can run notebooks on third party hosting services like Google's Colaboratory or, for a cost,", 'start': 337.04, 'duration': 7.103}, {'end': 349.945, 'text': "cloud providers like Amazon's SageMaker Studio notebooks or Microsoft's Azure notebook.", 'start': 344.143, 'duration': 5.802}, {'end': 352.166, 'text': 'Jupyter comes in three flavors.', 'start': 350.365, 'duration': 1.801}, {'end': 357.648, 'text': "The two you're most likely to encounter are classic notebooks and the newer JupyterLab.", 'start': 352.666, 'duration': 4.982}, {'end': 360.329, 'text': 'Both run nicely within your browser.', 'start': 358.228, 'duration': 2.101}, {'end': 368.733, 'text': 'But JupyterLab comes with more extensions and lets you work with multiple notebook files and terminal access within a single browser tab.', 'start': 360.449, 'duration': 8.284}, {'end': 372.895, 'text': "I'll be using the classic notebook environment for the demos in this course.", 'start': 369.253, 'duration': 3.642}], 'summary': 'Jupyter notebooks enable running code flexibly, sharing live versions online, and come in classic and JupyterLab flavors.', 'duration': 88.171, 'max_score': 284.724, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/jcTj6FgWOpo/pics/jcTj6FgWOpo284724.jpg'}], 'start': 163.033, 'title': 'Data analytics tools and Jupyter notebooks', 'summary': 'Discusses various data analytics tools such as spreadsheets, Tableau, Splunk, Power BI, Python, and Jupyter, emphasizing their capabilities. 
it also highlights the benefits of jupyter notebooks for efficient code execution, troubleshooting, and collaboration across platforms, serving multiple users.', 'chapters': [{'end': 260.923, 'start': 163.033, 'title': 'Data analytics tools overview', 'summary': "Discusses various data analytics tools, including spreadsheets, enterprise strength tools like tableau, splunk, and microsoft's power bi, python, and jupyter, highlighting their capabilities and benefits for data analysis and visualization.", 'duration': 97.89, 'highlights': ['Python ecosystem offers a broader range of data-specific libraries and modules compared to purpose-built tools like Tableau and Splunk, providing extensive resources for industrial-strength data analysis.', 'Spreadsheets have powerful functions, external integrations, and graphing capabilities beyond basic calculations, making them versatile tools for data consumption and analysis.', 'Jupyter serves as an open source platform for executing Python code and is renowned for its effectiveness in Python data analysis and visualization, functioning as a versatile programming IDE for various tasks.', "Enterprise strength tools like Tableau, Splunk, and Microsoft's Power BI are ideal for crunching numbers and visualizing insights, allowing seamless collaboration and sharing of findings with team members."]}, {'end': 397.154, 'start': 261.684, 'title': 'Jupyter notebooks for efficient code execution and collaboration', 'summary': 'Highlights the benefits of jupyter notebooks in allowing flexible code execution, easier troubleshooting, and collaboration through its json-based files, accessible from various platforms, including third-party hosting services, with the capability to serve multiple users.', 'duration': 135.47, 'highlights': ['The flexibility of Jupyter notebooks allows running code line by line or all together, making it easier to understand and troubleshoot code, as well as providing the capability to share live versions of code 
across the internet.', 'Jupyter notebooks are JSON-based files that move the processing environment for any data-oriented programming code from the server or workstation to the web browser, facilitating easier code understanding and troubleshooting.', 'Jupyter comes in three flavors: classic notebooks, Jupyter lab with more extensions and terminal access, and Jupyter hub for authenticated notebook access to multiple users, with the capability to serve up to around 100 users from a single cloud server.', "You can download Jupyter to your PC or a private server and access the interface through any browser with network access, or run notebooks on third-party hosting services like Google's colab or cloud providers like Amazon's SageMaker Studio notebooks or Microsoft's Azure notebook."]}], 'duration': 234.121, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/jcTj6FgWOpo/pics/jcTj6FgWOpo163033.jpg', 'highlights': ['Python ecosystem offers extensive resources for industrial-strength data analysis', 'Spreadsheets have powerful functions, external integrations, and graphing capabilities', 'Jupyter serves as an open source platform for executing Python code and is renowned for its effectiveness in Python data analysis and visualization', "Enterprise strength tools like Tableau, Splunk, and Microsoft's Power BI are ideal for crunching numbers and visualizing insights", 'The flexibility of Jupyter notebooks allows running code line by line or all together, making it easier to understand and troubleshoot code', 'Jupyter notebooks are JSON-based files that move the processing environment for any data-oriented programming code from the server or workstation to the web browser', 'Jupyter comes in three flavors: classic notebooks, Jupyter lab with more extensions and terminal access, and Jupyter hub for authenticated notebook access to multiple users', 'You can download Jupyter to your PC or a private server and access the interface through any browser 
with network access']}, {'end': 875.188, 'segs': [{'end': 491.598, 'src': 'embed', 'start': 466.871, 'weight': 1, 'content': [{'end': 474.461, 'text': "You may, for instance, find that you need a library written for version 3.9, but that there's no way to get it working on your 3.7 system.", 'start': 466.871, 'duration': 7.59}, {'end': 480.41, 'text': 'Upgrading your system version to 3.9 might work out well for you.', 'start': 476.647, 'duration': 3.763}, {'end': 485.233, 'text': 'But it could also cause some unexpected and unpleasant consequences.', 'start': 481.09, 'duration': 4.143}, {'end': 491.598, 'text': "It's hard to know when a particular Python library might also be needed by your core operating system.", 'start': 485.774, 'duration': 5.824}], 'summary': 'Upgrading to python 3.9 may be beneficial but could have unexpected consequences.', 'duration': 24.727, 'max_score': 466.871, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/jcTj6FgWOpo/pics/jcTj6FgWOpo466871.jpg'}, {'end': 580.312, 'src': 'embed', 'start': 534.085, 'weight': 0, 'content': [{'end': 540.847, 'text': 'you want to read the official documentation for the virtual environment instructions specific to your host OS,', 'start': 534.085, 'duration': 6.762}, {'end': 542.527, 'text': 'whichever version of Jupyter you choose.', 'start': 540.847, 'duration': 1.68}, {'end': 545.828, 'text': 'If you decide to install and run it locally,', 'start': 543.027, 'duration': 2.801}, {'end': 554.11, 'text': 'the Jupyter project officially recommends doing it through the Python and a conda distribution and its binary package manager, conda.', 'start': 545.828, 'duration': 8.282}, {'end': 558.251, 'text': 'various guides to doing that are available for various OS hosts.', 'start': 554.85, 'duration': 3.401}, {'end': 561.032, 'text': 'But this official page is a good place to start.', 'start': 558.771, 'duration': 2.261}, {'end': 565.813, 'text': 'As you can see, though, the Python 
pip package manager is also an option.', 'start': 561.492, 'duration': 4.321}, {'end': 572.015, 'text': "Once all that's done, you should be able to open a notebook right in your browser and get right down to work.", 'start': 566.573, 'duration': 5.442}, {'end': 580.312, 'text': 'For me, a notebooks most powerful feature is the way you can run subsets of your code within individual cells.', 'start': 573.43, 'duration': 6.882}], 'summary': 'For jupyter installation, use python and conda, with guides for various os; pip is also an option. once installed, run code subsets in browser notebooks.', 'duration': 46.227, 'max_score': 534.085, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/jcTj6FgWOpo/pics/jcTj6FgWOpo534085.jpg'}, {'end': 807.655, 'src': 'embed', 'start': 776.963, 'weight': 3, 'content': [{'end': 785.505, 'text': "To answer those questions, we're going to access to data sets collected and maintained by the US government's Bureau of Labor Statistics.", 'start': 776.963, 'duration': 8.542}, {'end': 792.625, 'text': 'One of the many nice things about the Bureau of Labor Statistics, usually referred to as BLS,', 'start': 786.621, 'duration': 6.004}, {'end': 797.048, 'text': 'is that they provide an API for access from within our Python scripts.', 'start': 792.625, 'duration': 4.423}, {'end': 807.655, 'text': "To make that work, you'll need to know the BLS endpoint address matching the specific data series you need the Python code to initiate the request.", 'start': 797.748, 'duration': 9.907}], 'summary': 'Access us government data via bls api for python scripts', 'duration': 30.692, 'max_score': 776.963, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/jcTj6FgWOpo/pics/jcTj6FgWOpo776963.jpg'}], 'start': 397.154, 'title': 'Python setup and data analysis', 'summary': 'Discusses setting up python for jupyter, emphasizing the use of python 3 and potential issues with library compatibility. 
it also covers data analysis with python and jupyter notebooks, including accessing public apis for retrieving and analyzing data on wages and standard of living over the past 20 years from us government data.', 'chapters': [{'end': 558.251, 'start': 397.154, 'title': 'Setting up python for jupyter', 'summary': 'Discusses the process of setting up a python environment for jupyter, emphasizing the importance of using python 3 and the potential issues with library compatibility and system damage. it also outlines the use of virtual environments to isolate python projects and recommends using conda for jupyter installation.', 'duration': 161.097, 'highlights': ['The chapter discusses the process of setting up a Python environment for Jupyter, emphasizing the importance of using Python 3 and the potential issues with library compatibility and system damage.', 'It also outlines the use of virtual environments to isolate Python projects and recommends using conda for Jupyter installation.', 'If you decide to install and run it locally, the Jupyter project officially recommends doing it through the Python and a conda distribution and its binary package manager, conda.']}, {'end': 875.188, 'start': 558.771, 'title': 'Data analysis with python and jupiter', 'summary': 'Covers the basics of using python and jupiter notebooks for data analysis, including running code in cells, formatting cells for code or markdown, and accessing public apis for data retrieval and analysis, with a focus on analyzing wages and standard of living over the past 20 years from us government data.', 'duration': 316.417, 'highlights': ['Jupiter notebooks most powerful feature is running subsets of code within individual cells, making it easier to break down long and complex programs into easily readable and executable snippets.', 'Python code creates values that remain in the kernel memory until the output from a particular cell or for the entire kernel are cleared, allowing for re-running previous 
or subsequent cells to see the impact of changes.', 'Accessing public APIs for data retrieval and analysis, with a focus on analyzing wages and standard of living over the past 20 years from US government data, using the Bureau of Labor Statistics API.']}], 'duration': 478.034, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/jcTj6FgWOpo/pics/jcTj6FgWOpo397154.jpg', 'highlights': ["Jupyter notebooks' powerful feature is running subsets of code within individual cells", 'Setting up Python environment for Jupyter emphasizes using Python 3 and potential library compatibility issues', 'Outlines use of virtual environments to isolate Python projects and recommends using conda for Jupyter installation', 'Accessing public APIs for data retrieval and analysis, focusing on analyzing wages and standard of living over the past 20 years from US government data']}, {'end': 1254.47, 'segs': [{'end': 920.789, 'src': 'heatmap', 'start': 875.188, 'weight': 2, 'content': [{'end': 883.651, 'text': 'So how do you turn those series IDs into Python-friendly data? Manually writing GET and PUT requests can be very picky.', 'start': 875.188, 'duration': 8.463}, {'end': 886.933, 'text': "And it'll take a lot of tries before you get it exactly right.", 'start': 884.091, 'duration': 2.842}, {'end': 893.215, 'text': 'To avoid all that, I decided to go with a third party Python library called bls.', 'start': 887.993, 'duration': 5.222}, {'end': 901.818, 'text': "That's available through Oliver Sherouse's GitHub repo. You install the library on your host machine using pip install bls.", 'start': 893.635, 'duration': 8.183}, {'end': 903.159, 'text': "That's all it'll take.", 'start': 902.419, 'duration': 0.74}, {'end': 908.865, 'text': "While we're here, we might as well activate our BLS API key.", 'start': 904.824, 'duration': 4.041}, {'end': 916.467, 'text': "You register for the API from this page and they'll send you an email with your key and a validation URL that 
you'll need to click.", 'start': 908.865, 'duration': 7.602}, {'end': 920.789, 'text': "Once you've got your key, you export it to your system environment.", 'start': 917.068, 'duration': 3.721}], 'summary': 'Use the bls python library to handle bls api requests and activate the api key for seamless data retrieval.', 'duration': 45.601, 'max_score': 875.188, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/jcTj6FgWOpo/pics/jcTj6FgWOpo875188.jpg'}, {'end': 994.088, 'src': 'heatmap', 'start': 937.597, 'weight': 0, 'content': [{'end': 943.321, 'text': 'The CPI is a measure of the price of a basket of essential consumer goods.', 'start': 937.597, 'duration': 5.724}, {'end': 950.427, 'text': "It's an important proxy for changes in the cost of living, which in turn is an indicator of the general health of the economy.", 'start': 943.842, 'duration': 6.585}, {'end': 955.63, 'text': 'Our wages data will come from the BLS Employment Cost Index,', 'start': 951.388, 'duration': 4.242}, {'end': 961.212, 'text': 'covering wages and salaries for private industry workers in all industries and occupations.', 'start': 955.63, 'duration': 5.582}, {'end': 966.974, 'text': 'A growing employment index would, at first glance, suggest that things are getting better for most people.', 'start': 961.912, 'duration': 5.062}, {'end': 972.857, 'text': "However, seeing the average employment wage trends in isolation isn't all that useful.", 'start': 967.794, 'duration': 5.063}, {'end': 979.041, 'text': "After all, the highest salary won't do you much good if your basic expenses are higher still.", 'start': 973.498, 'duration': 5.543}, {'end': 986.085, 'text': 'So the goal is to pull both the CPI and wages data sets, and then correlate them looking for patterns.', 'start': 979.741, 'duration': 6.344}, {'end': 990.388, 'text': 'This will show us how wages have been changing in relation to costs.', 'start': 986.726, 'duration': 3.662}, {'end': 994.088, 'text': 'Now let 
me show you how it actually works.', 'start': 992.006, 'duration': 2.082}], 'summary': 'Cpi and wages data are correlated to gauge changes in cost of living and economic health.', 'duration': 50.246, 'max_score': 937.597, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/jcTj6FgWOpo/pics/jcTj6FgWOpo937597.jpg'}, {'end': 1130.524, 'src': 'heatmap', 'start': 1102.193, 'weight': 1, 'content': [{'end': 1108.115, 'text': 'Running just CPI data will print out the first and last five lines of the data frame.', 'start': 1102.193, 'duration': 5.922}, {'end': 1115.058, 'text': 'The date column contains month and year values, and the second column contains our actual data.', 'start': 1108.756, 'duration': 6.302}, {'end': 1118.839, 'text': "I'd like to simplify the headers to make them easier to work with.", 'start': 1115.818, 'duration': 3.021}, {'end': 1121.46, 'text': "So I'll use the pandas columns attribute.", 'start': 1119.18, 'duration': 2.28}, {'end': 1123.181, 'text': 'I definitely prefer it this way.', 'start': 1121.961, 'duration': 1.22}, {'end': 1130.524, 'text': "However, we'll need to also see the wages data to know whether the formatted uses is compatible with our CPI set.", 'start': 1123.501, 'duration': 7.023}], 'summary': 'Cpi data shows month/year values; simplifying headers with pandas columns attribute. need to check compatibility with wages data.', 'duration': 28.331, 'max_score': 1102.193, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/jcTj6FgWOpo/pics/jcTj6FgWOpo1102193.jpg'}], 'start': 875.188, 'title': 'Accessing bls api and analyzing cpi and wage data using python', 'summary': 'Discusses using the bls python library to access bls api, emphasizing the convenience of using the third-party library and the process of obtaining and activating the bls api key. 
it also demonstrates retrieving and analyzing consumer price index (cpi) and wage and salary statistics data between 2002 and 2020 using python, highlighting the importance of correlating both datasets to understand changes in wages in relation to costs.', 'chapters': [{'end': 920.789, 'start': 875.188, 'title': 'Using bls python library for accessing bls api', 'summary': 'Discusses using the bls python library to access bls api, emphasizing the avoidance of manual get and put requests and the convenience of using the third-party library, along with the process of obtaining and activating the bls api key.', 'duration': 45.601, 'highlights': ["You can avoid manual writing of get and put requests by using the third party Python library called bls, available through Oliver sheroes's GitHub repo.", 'Activating the BLS API key involves registering for the API, receiving an email with the key and a validation URL, and exporting it to the system environment.', 'The library can be installed on the host machine using pip install bls.']}, {'end': 1254.47, 'start': 920.789, 'title': 'Analyzing cpi and wage data', 'summary': 'Demonstrates how to retrieve and analyze consumer price index (cpi) and wage and salary statistics data between 2002 and 2020 using python, emphasizing the importance of correlating both datasets to understand changes in wages in relation to costs.', 'duration': 333.681, 'highlights': ['The CPI is a measure of the price of a basket of essential consumer goods, serving as a proxy for changes in the cost of living and an indicator of the general health of the economy.', 'Correlating CPI and wages data allows for understanding how wages have been changing in relation to costs.', 'The transcript provides a step-by-step demonstration of retrieving and processing the CPI and wages data using Python libraries such as pandas, NumPy, and matplotlib.']}], 'duration': 379.282, 'thumbnail': 
'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/jcTj6FgWOpo/pics/jcTj6FgWOpo875188.jpg', 'highlights': ['The CPI is a measure of the price of a basket of essential consumer goods, serving as a proxy for changes in the cost of living and an indicator of the general health of the economy.', 'Correlating CPI and wages data allows for understanding how wages have been changing in relation to costs.', 'Activating the BLS API key involves registering for the API, receiving an email with the key and a validation URL, and exporting it to the system environment.', "You can avoid manual writing of get and put requests by using the third party Python library called bls, available through Oliver sheroes's GitHub repo.", 'The library can be installed on the host machine using pip install bls.']}, {'end': 1838.865, 'segs': [{'end': 1278.393, 'src': 'embed', 'start': 1254.95, 'weight': 1, 'content': [{'end': 1263.555, 'text': 'The data in the CPI set comes in absolute point values, while the wages are reported in percentages measuring growth.', 'start': 1254.95, 'duration': 8.605}, {'end': 1267.178, 'text': "As is, there's no way to accurately compare them.", 'start': 1264.556, 'duration': 2.622}, {'end': 1269.184, 'text': 'For one thing.', 'start': 1268.103, 'duration': 1.081}, {'end': 1278.393, 'text': 'each row of our wages data is the percentage by which wages would have risen that quarter had the current rate continued for a full 12 months.', 'start': 1269.184, 'duration': 9.209}], 'summary': 'Cpi data in points, wages in percentage growth make comparison difficult.', 'duration': 23.443, 'max_score': 1254.95, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/jcTj6FgWOpo/pics/jcTj6FgWOpo1254950.jpg'}, {'end': 1540.033, 'src': 'embed', 'start': 1510.482, 'weight': 2, 'content': [{'end': 1511.122, 'text': "Let's take a look.", 'start': 1510.482, 'duration': 0.64}, {'end': 1512.823, 'text': 'Our data is all there.', 'start': 1511.642, 
'duration': 1.181}, {'end': 1519.125, 'text': 'We could visually scan through the CPI and wages columns and look for any unusual relationships.', 'start': 1513.303, 'duration': 5.822}, {'end': 1520.826, 'text': 'But that defeats the point.', 'start': 1519.646, 'duration': 1.18}, {'end': 1525.228, 'text': 'Python data analytics is all about letting our code do that for us.', 'start': 1521.426, 'duration': 3.802}, {'end': 1526.769, 'text': "Let's plot the thing.", 'start': 1525.768, 'duration': 1.001}, {'end': 1533.491, 'text': "Here, we'll tell plot to take our merge data frame merge data and create a bar chart.", 'start': 1527.609, 'duration': 5.882}, {'end': 1540.033, 'text': "Because there's an awful lot of data here, I'll extend the size of the chart with a manual fig size value.", 'start': 1534.031, 'duration': 6.002}], 'summary': 'Using python data analytics to visually analyze cpi and wages data, creating a bar chart with an extended size.', 'duration': 29.551, 'max_score': 1510.482, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/jcTj6FgWOpo/pics/jcTj6FgWOpo1510482.jpg'}, {'end': 1634.547, 'src': 'embed', 'start': 1606.607, 'weight': 3, 'content': [{'end': 1611.691, 'text': "We'll also talk a bit about how regression lines work, and what kinds of insights they can show us.", 'start': 1606.607, 'duration': 5.084}, {'end': 1613.813, 'text': "We'll begin with scatter plots.", 'start': 1612.472, 'duration': 1.341}, {'end': 1620.618, 'text': 'This code is from the property rights and economic development chapter on my teach yourself data analytics website.', 'start': 1614.273, 'duration': 6.345}, {'end': 1623.48, 'text': 'You can catch up on the background over there.', 'start': 1621.139, 'duration': 2.341}, {'end': 1634.547, 'text': "But the code you're looking at comes from two data sources the World Bank's measure of per capita gross domestic product by country and the index of economic freedom,", 'start': 1624.181, 'duration': 
10.366}], 'summary': 'Exploring regression lines and scatter plots for economic data analysis.', 'duration': 27.94, 'max_score': 1606.607, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/jcTj6FgWOpo/pics/jcTj6FgWOpo1606607.jpg'}, {'end': 1799.042, 'src': 'embed', 'start': 1771.244, 'weight': 0, 'content': [{'end': 1774.667, 'text': "There's only so much we can assume based on visually viewing a graph.", 'start': 1771.244, 'duration': 3.423}, {'end': 1778.37, 'text': "At some point, we'll need hard numbers to describe what we're looking at.", 'start': 1774.847, 'duration': 3.523}, {'end': 1787.236, 'text': 'A simple linear regression analysis can give us a measure of the strength of the relationship between a dependent variable and the data model.', 'start': 1779.111, 'duration': 8.125}, {'end': 1794.42, 'text': 'R squared is a number between zero and 100%, where 100% would indicate a perfect fit.', 'start': 1787.896, 'duration': 6.524}, {'end': 1799.042, 'text': 'Of course, in the real world, a 100% fit is next to impossible.', 'start': 1794.7, 'duration': 4.342}], 'summary': 'Visual analysis has limitations; linear regression gives measure of relationship strength, with r squared indicating fit, but 100% fit is rare.', 'duration': 27.798, 'max_score': 1771.244, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/jcTj6FgWOpo/pics/jcTj6FgWOpo1771244.jpg'}], 'start': 1254.95, 'title': 'Analyzing cpi and wages data with python', 'summary': 'Discusses the incompatibility between cpi point values and wage growth percentages, proposing a method to adjust quarterly growth rates and fake math to convert percentages to match cpi values. 
it also covers merging data frames, plotting bar and line charts, exploring scatter plots, and adding regression lines to visualize relationships between economic indicators, with a strong correlation of around 55% revealed through a linear regression analysis.', 'chapters': [{'end': 1459.871, 'start': 1254.95, 'title': 'Comparing cpi and wages data', 'summary': 'Discusses the incompatibility between cpi point values and wage growth percentages, proposing a method to adjust quarterly growth rates and fake math to convert percentages to match cpi values.', 'duration': 204.921, 'highlights': ['The incompatibility between CPI point values and wage growth percentages is addressed, proposing a method to adjust quarterly growth rates and fake math to convert percentages to match CPI values.', 'The need to adjust wage growth percentages to accurately compare them with CPI point values is emphasized, with the proposal of dividing each quarterly growth rate by four to account for the actual growth during the specific time period.', 'A method involving fake math is proposed to convert percentages to match CPI values, using a function to adjust the original CPI value based on the related wage growth percentage.', 'The chapter highlights the use of arbitrary approximations to adjust the numbers for the purpose of comparison, acknowledging that it may not directly reflect the real world but deems it close enough for the intended analysis.']}, {'end': 1838.865, 'start': 1460.311, 'title': 'Python data analytics: merging, plotting, and regression', 'summary': 'Covers merging data frames, plotting bar and line charts, exploring scatter plots, and adding regression lines to visualize relationships between economic indicators, with a strong correlation of around 55% revealed through a linear regression analysis.', 'duration': 378.554, 'highlights': ['A linear regression analysis reveals a strong correlation of around 55% between economic indicators, indicating a visible trend up and 
to the right on the scatter plot.', 'Exploring scatter plots and adding regression lines to visualize the statistical relationship between economic indicators, and identifying outliers with the help of plotly tools.', "Merging data frames, plotting bar and line charts, and using Python's data analytics capabilities to compare CPI and wages, revealing higher growth rates in wages than in the CPI."]}], 'duration': 583.915, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/jcTj6FgWOpo/pics/jcTj6FgWOpo1254950.jpg', 'highlights': ['A linear regression analysis reveals a strong correlation of around 55% between economic indicators.', 'The need to adjust wage growth percentages to accurately compare them with CPI point values is emphasized.', "Merging data frames, plotting bar and line charts, and using Python's data analytics capabilities to compare CPI and wages.", 'Exploring scatter plots and adding regression lines to visualize the statistical relationship between economic indicators.']}, {'end': 2298.248, 'segs': [{'end': 1914.84, 'src': 'embed', 'start': 1889.327, 'weight': 2, 'content': [{'end': 1898.709, 'text': 'My goal was to visualize the distribution of their birth dates across all 12 months to see if their births were concentrated within a specific yearly season.', 'start': 1889.327, 'duration': 9.382}, {'end': 1904.133, 'text': "When I displayed the data using a histogram, we didn't see the pattern we'd expected.", 'start': 1899.71, 'duration': 4.423}, {'end': 1907.695, 'text': "In fact, the pattern wasn't truly representative of the real world.", 'start': 1904.373, 'duration': 3.322}, {'end': 1914.84, 'text': "That's because histograms are great for showing frequency distributions by grouping data points together into bins.", 'start': 1908.596, 'duration': 6.244}], 'summary': 'Visualized birth date distribution across 12 months with histogram, revealing unexpected pattern.', 'duration': 25.513, 'max_score': 1889.327, 'thumbnail': 
'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/jcTj6FgWOpo/pics/jcTj6FgWOpo1889327.jpg'}, {'end': 2021.378, 'src': 'embed', 'start': 1956.077, 'weight': 0, 'content': [{'end': 1965.714, 'text': "we're going to talk about understanding our data visualizations and integrating what we see in our Jupyter notebooks with stuff that happens out there in the real world.", 'start': 1956.077, 'duration': 9.637}, {'end': 1966.916, 'text': 'Stay tuned.', 'start': 1966.395, 'duration': 0.521}, {'end': 1970.813, 'text': "We're supposed to be doing data analytics here.", 'start': 1968.271, 'duration': 2.542}, {'end': 1974.776, 'text': "So just staring at pretty graphs probably isn't the whole point.", 'start': 1971.133, 'duration': 3.643}, {'end': 1983.484, 'text': 'The CPI and wages data sets we plotted in the previous chapter, for instance, showed us a clear general correlation.', 'start': 1975.577, 'duration': 7.907}, {'end': 1986.606, 'text': 'But there were some visually recognizable anomalies.', 'start': 1984.024, 'duration': 2.582}, {'end': 1994.112, 'text': 'Unless we can connect those anomalies with historical events and explain them in a historical context,', 'start': 1987.187, 'duration': 6.925}, {'end': 1996.534, 'text': "we won't be getting the full value from our data.", 'start': 1994.112, 'duration': 2.422}, {'end': 2004.202, 'text': 'But even before going there, we should confirm that our plots actually make sense in the context of their data sources.', 'start': 1997.555, 'duration': 6.647}, {'end': 2014.532, 'text': "Working with our BLS examples, let's look at graphs to compare CPI and wages data from both before and after our manipulation.", 'start': 2005.002, 'duration': 9.53}, {'end': 2021.378, 'text': "That way, we can be sure that our math and particularly our fake math didn't skew things too badly.", 'start': 2015.212, 'duration': 6.166}], 'summary': 'Integrating data visualizations with real-world events is crucial for deriving full value 
from data; confirming plot accuracy is essential.', 'duration': 65.301, 'max_score': 1956.077, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/jcTj6FgWOpo/pics/jcTj6FgWOpo1956077.jpg'}, {'end': 2080.018, 'src': 'embed', 'start': 2048.143, 'weight': 1, 'content': [{'end': 2051.043, 'text': 'Now, how about the wages data here?', 'start': 2048.143, 'duration': 2.9}, {'end': 2053.905, 'text': 'because we move from percentages to currency.', 'start': 2051.043, 'duration': 2.862}, {'end': 2059.107, 'text': 'the transformation was more intrusive and the risks of misrepresentation were greater.', 'start': 2053.905, 'duration': 5.202}, {'end': 2065.71, 'text': "We'll also need to take into account the way a percentage will display differently from an absolute value.", 'start': 2059.947, 'duration': 5.763}, {'end': 2067.63, 'text': "Here's the original data.", 'start': 2066.37, 'duration': 1.26}, {'end': 2073.054, 'text': "Note how there's no consistent curve, either upwards or downwards.", 'start': 2068.712, 'duration': 4.342}, {'end': 2080.018, 'text': "That's because we're measuring the rate of growth as it took place within each individual quarter, not the growth itself.", 'start': 2073.375, 'duration': 6.643}], 'summary': 'Transforming wages data from percentages to currency posed greater risks of misrepresentation, with no consistent growth curve due to measuring rate of growth within each individual quarter.', 'duration': 31.875, 'max_score': 2048.143, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/jcTj6FgWOpo/pics/jcTj6FgWOpo2048143.jpg'}, {'end': 2177.514, 'src': 'embed', 'start': 2154.182, 'weight': 8, 'content': [{'end': 2164.766, 'text': "the full curriculum is available on my thedataproject.net site and you're more than welcome to join all the cool kids over there and be in touch if you've got something to add to the conversation.", 'start': 2154.182, 'duration': 10.584}, {'end': 2172.09, 'text': "The 
main thing is to realize that the end of this course isn't anywhere near the end of your data analytics education.", 'start': 2165.646, 'duration': 6.444}, {'end': 2177.514, 'text': "Watching me calmly execute nice, clean code samples isn't really learning.", 'start': 2172.771, 'duration': 4.743}], 'summary': 'thedataproject.net offers the full curriculum, emphasizing ongoing education beyond the course.', 'duration': 23.332, 'max_score': 2154.182, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/jcTj6FgWOpo/pics/jcTj6FgWOpo2154182.jpg'}, {'end': 2245.945, 'src': 'embed', 'start': 2216.624, 'weight': 7, 'content': [{'end': 2223.349, 'text': 'But the more problems I faced and overcame, the deeper the process sank into my mind, and the better I got at it.', 'start': 2216.624, 'duration': 6.725}, {'end': 2227.151, 'text': 'And so will you. Just be prepared for tough times ahead.', 'start': 2223.889, 'duration': 3.262}, {'end': 2230.094, 'text': 'Before you all run off and get on with your day,', 'start': 2227.952, 'duration': 2.142}, {'end': 2233.316, 'text': "let's spend a moment or two reviewing everything we saw here.", 'start': 2230.554, 'duration': 2.762}, {'end': 2238.479, 'text': 'We spoke about the many ways you can work with Jupyter notebooks,', 'start': 2233.836, 'duration': 4.643}, {'end': 2245.945, 'text': "including online platforms like Google's Colaboratory and locally hosting either JupyterLab or classic notebooks.", 'start': 2238.479, 'duration': 7.466}], 'summary': 'Overcoming problems led to deeper understanding; discussed working with Jupyter notebooks.', 'duration': 29.321, 'max_score': 2216.624, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/jcTj6FgWOpo/pics/jcTj6FgWOpo2216624.jpg'}, {'end': 2289.265, 'src': 'embed', 'start': 2261.723, 'weight': 6, 'content': [{'end': 2265.226, 'text': 'Python libraries and modules were our next focus,', 'start': 2261.723, 'duration': 3.503}, {'end': 2271.611, 
'text': 'including how to import appropriate libraries to allow us to effectively clean and manipulate our data.', 'start': 2265.226, 'duration': 6.385}, {'end': 2279.843, 'text': 'And finally, turning to some actual data analytics, we learned some basics of plotting, including working with scatter plots,', 'start': 2272.441, 'duration': 7.402}, {'end': 2281.543, 'text': 'regression lines and histograms.', 'start': 2279.843, 'duration': 1.7}, {'end': 2289.265, 'text': 'And we closed out the course with a quick discussion of how to use our data visualizations to integrate our insights with the real world.', 'start': 2282.124, 'duration': 7.141}], 'summary': 'Learned python libraries/modules, data cleaning, plotting basics, and integrating insights with real world.', 'duration': 27.542, 'max_score': 2261.723, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/jcTj6FgWOpo/pics/jcTj6FgWOpo2261723.jpg'}], 'start': 1845.58, 'title': 'Data visualization and analytics overview', 'summary': 'Explains using histograms for data visualization, addressing challenges, and the need for alternative techniques. it also covers the importance of historical context, visualization, and topics such as jupyter notebooks and python libraries in the data analytics course.', 'chapters': [{'end': 2112.17, 'start': 1845.58, 'title': 'Histograms and data visualization', 'summary': 'Explains how histograms are used to visualize data distributions, highlighting the challenges in representing specific patterns and the need for alternative visualization techniques. 
it also emphasizes the importance of contextual understanding and validation of data visualizations for meaningful insights.', 'duration': 266.59, 'highlights': ['The histogram did not display the expected pattern of NHL player birth dates, indicating its limitations in representing specific events to calendar dates.', 'Visualization using a plain bar graph was found to be more suitable for representing the value counts of birth dates, addressing the limitations of the histogram.', 'The need to integrate data visualizations with real-world events and historical context is emphasized for obtaining comprehensive insights from the data.', 'Validation of data plots is crucial, as seen in the comparison of CPI and wages data before and after manipulation to ensure the accuracy and meaningful representation of the data.', 'The transformation of wages data from percentages to currency base values highlighted the differences in visualization and interpretation due to the scale variation.']}, {'end': 2298.248, 'start': 2118.16, 'title': 'Data analytics course overview', 'summary': 'Covers the importance of historical context in data analysis, the value of visualization in understanding data, and the various topics covered in the course, including working with jupyter notebooks, python libraries, and data visualization techniques.', 'duration': 180.088, 'highlights': ['The chapter emphasizes the importance of establishing historical context for data analysis and the value of visualization in understanding data.', 'The course covered topics such as working with Jupyter notebooks, Python libraries and modules, and data visualization techniques.', 'The speaker encourages independent learning by mentioning the availability of the full curriculum on thedataproject.net and invites further engagement with the data analytics community.', 'The speaker shares personal experiences of overcoming challenges in data analysis, highlighting the value of perseverance and learning from 
mistakes.']}], 'duration': 452.668, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/jcTj6FgWOpo/pics/jcTj6FgWOpo1845580.jpg', 'highlights': ['Validation of data plots is crucial, as seen in the comparison of CPI and wages data before and after manipulation to ensure the accuracy and meaningful representation of the data.', 'The transformation of wages data from percentages to currency base values highlighted the differences in visualization and interpretation due to the scale variation.', 'Visualization using a plain bar graph was found to be more suitable for representing the value counts of birth dates, addressing the limitations of the histogram.', 'The need to integrate data visualizations with real-world events and historical context is emphasized for obtaining comprehensive insights from the data.', 'The histogram did not display the expected pattern of NHL player birth dates, indicating its limitations in representing specific events to calendar dates.', 'The chapter emphasizes the importance of establishing historical context for data analysis and the value of visualization in understanding data.', 'The course covered topics such as working with Jupyter notebooks, Python libraries and modules, and data visualization techniques.', 'The speaker shares personal experiences of overcoming challenges in data analysis, highlighting the value of perseverance and learning from mistakes.', 'The speaker encourages independent learning by mentioning the availability of the full curriculum on thedataproject.net and invites further engagement with the data analytics community.']}], 'highlights': ['David Clinton introduces a 30-day data analytics course to learn data analytics in 30 days.', 'The course provides tools for finding and manipulating raw data and using various graphing tools to interpret it, emphasizing the need for hands-on experience for real learning.', 'Python ecosystem offers extensive resources for industrial-strength data 
analysis', 'Jupyter serves as an open source platform for executing Python code and is renowned for its effectiveness in Python data analysis and visualization', 'Accessing public APIs for data retrieval and analysis, focusing on analyzing wages and standard of living over the past 20 years from US government data', 'The CPI is a measure of the price of a basket of essential consumer goods, serving as a proxy for changes in the cost of living and an indicator of the general health of the economy.', 'A linear regression analysis reveals a strong correlation of around 55% between economic indicators.', 'Validation of data plots is crucial, as seen in the comparison of CPI and wages data before and after manipulation to ensure the accuracy and meaningful representation of the data.', 'The transformation of wages data from percentages to currency base values highlighted the differences in visualization and interpretation due to the scale variation.', 'The need to integrate data visualizations with real-world events and historical context is emphasized for obtaining comprehensive insights from the data.']}
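
The wage adjustment described in the transcript (dividing each annualized quarterly growth rate by four, then applying it to a CPI-style starting value so both series share a scale) can be sketched roughly as follows. This is only an illustration of the idea, not the course's actual notebook code: the function names `annualized_to_quarterly` and `wages_as_index`, the starting value of 100.0, and the sample rates are all hypothetical, and compounding quarter over quarter is an assumption about how the adjustment is applied.

```python
def annualized_to_quarterly(rate_pct):
    """Approximate the growth that actually happened in one quarter,
    given a rate (in percent) that assumes the pace held for 12 months."""
    return rate_pct / 4


def wages_as_index(base_value, annualized_rates_pct):
    """Convert annualized quarterly growth percentages into index points.

    Starts from base_value (for example, the first CPI reading) and grows
    it by each quarter's share of the annualized rate, compounding as it goes.
    """
    index = []
    current = base_value
    for rate in annualized_rates_pct:
        current = current * (1 + annualized_to_quarterly(rate) / 100)
        index.append(current)
    return index


# Hypothetical inputs for illustration only; these are not real BLS figures.
cpi_start = 100.0
wage_growth = [4.0, 4.0, 0.0, 8.0]

for quarter, value in enumerate(wages_as_index(cpi_start, wage_growth), start=1):
    print(f"Quarter {quarter}: {value:.2f}")
```

Because the result is expressed in the same kind of point values as the CPI series, the two can be merged and plotted on one axis; as the transcript itself notes, this is an approximation for comparison rather than a faithful measure of real wages.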