title
Modern Dictionaries by Raymond Hettinger
description
Abstract
Python's dictionaries are stunningly good. Over the years, many great ideas have combined together to produce the modern implementation in Python 3.6. This fun talk is given by Raymond Hettinger, the Python core developer responsible for the set implementation and who designed the compact-and-ordered dict implemented in CPython for Python 3.6 and in PyPy for Python 2.7. He will use pictures and little bits of pure python code to explain all of the key ideas and how they evolved over time. He will also include newer features such as key-sharing, compaction, and versioning. This talk is important because it is the only public discussion of the state of the art as of Python 3.6. Even experienced Python users are unlikely to know the most recent innovations.
Who and Why (Audience):
This talk is for all Python programmers. It is designed to be fully understandable for a beginner (it starts from first principles) but to have new information even for Python experts (how key-sharing works, how the compact-ordered patch works, how dict versioning works). At the end of this talk, you can confidently say that you know how modern Python dictionaries work and what it means for your code.
Bio
Raymond Hettinger has also served as a director of the Python Software Foundation, and has mentored many people over the years on their contributions to the python-dev community. He is also well known for his contributions to the Python Cookbook, and shares many pieces of Python wisdom on Twitter. He is a frequent keynote speaker at Python Conferences around the world and has received the Distinguished Service Award at PyCon 2014 for his exceptional contributions to the python community.
Other info:
This talk is delivered at SF Python's 2nd Annual Holiday Party for Python Devs in SF Bay Area, CA. In you are in San Francisco area looking to meet other python devs, please check our schedule for meetups on http://sfpython.org
detail
{'title': 'Modern Dictionaries by Raymond Hettinger', 'heatmap': [{'end': 325.523, 'start': 282.571, 'weight': 1}], 'summary': "Raymond hettinger's video discusses the evolution of dictionaries in programming, python dictionaries, data structure optimization, efficient hash table implementation, space-efficient data representation, python 3.6 dictionary improvements, linear probing, cache line optimization, and python optimization, emphasizing memory efficiency, performance improvements, and the impact on python versions 1.5 to 3.6.", 'chapters': [{'end': 143.515, 'segs': [{'end': 41.174, 'src': 'embed', 'start': 8.885, 'weight': 1, 'content': [{'end': 11.486, 'text': "Let's see, something festive to talk about.", 'start': 8.885, 'duration': 2.601}, {'end': 12.566, 'text': 'How about dictionaries?', 'start': 11.726, 'duration': 0.84}, {'end': 16.748, 'text': 'How many of you like dictionaries?', 'start': 15.587, 'duration': 1.161}, {'end': 20.089, 'text': 'Are they modern and cool?', 'start': 18.968, 'duration': 1.121}, {'end': 24.91, 'text': 'How long have they been around hash tables?', 'start': 23.13, 'duration': 1.78}, {'end': 31.633, 'text': 'How long is forever? Over 30 years.', 'start': 27.511, 'duration': 4.122}, {'end': 37.895, 'text': "The art of computer programming, Donald Knuth, Algorithm D, it's been in Python for 26 years.", 'start': 31.913, 'duration': 5.982}, {'end': 38.615, 'text': 'Is there anything new?', 'start': 37.915, 'duration': 0.7}, {'end': 41.174, 'text': "Now nothing's changed?", 'start': 39.793, 'duration': 1.381}], 'summary': 'Discussion on dictionaries and their longevity, with over 30 years in existence and 26 years in python.', 'duration': 32.289, 'max_score': 8.885, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/p33CVV29OG8/pics/p33CVV29OG88885.jpg'}, {'end': 143.515, 'src': 'embed', 'start': 88.115, 'weight': 0, 'content': [{'end': 90.957, 'text': 'How many of you would rather use a database than a Python dictionary?', 'start': 88.115, 'duration': 2.842}, {'end': 93.499, 'text': "That's too bad.", 'start': 92.778, 'duration': 0.721}, {'end': 95.42, 'text': "That's where our journey is going to end up.", 'start': 93.899, 'duration': 1.521}, {'end': 98.609, 'text': "All right, so let's see.", 'start': 96.187, 'duration': 2.422}, {'end': 103.073, 'text': 'Hosted by Microsoft Reactor.', 'start': 101.651, 'duration': 1.422}, {'end': 104.674, 'text': 'Thank you very much, Microsoft.', 'start': 103.193, 'duration': 1.481}, {'end': 107.396, 'text': 'It is wonderful to donate this space and this time.', 'start': 104.794, 'duration': 2.602}, {'end': 119.402, 'text': "I've always been a fan of Microsoft, What they have done for our community has just been spectacular.", 'start': 112.98, 'duration': 6.422}, {'end': 123.544, 'text': 'Thank Grace and Simeon, conference organizers, and thank you, Ben.', 'start': 119.782, 'duration': 3.762}, {'end': 131.468, 'text': 'My mission is to train thousands of Python programmers, not figuratively, but literally.', 'start': 125.485, 'duration': 5.983}, {'end': 139.553, 'text': "I've just crossed the 5, 000 number that I've trained personally, and then my team has cumulatively trained over 10, 000 Python programmers.", 'start': 131.729, 'duration': 7.824}, {'end': 141.314, 'text': "And we're not talking about an hour or two training.", 'start': 139.853, 'duration': 1.461}, {'end': 143.515, 'text': "We're talking week-long training sessions.", 'start': 141.334, 'duration': 2.181}], 'summary': 'Training thousands of python programmers, 5,000 trained personally, over 10,000 by team.', 'duration': 55.4, 'max_score': 88.115, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/p33CVV29OG8/pics/p33CVV29OG888115.jpg'}], 'start': 8.885, 'title': 'Evolution of dictionaries in programming', 'summary': "Discusses the longevity of hash tables, continuous improvements in technology, and the speaker's mission to train thousands of python programmers, having personally trained over 5,000 and their team cumulatively trained over 10,000 in week-long training sessions.", 'chapters': [{'end': 143.515, 'start': 8.885, 'title': 'Evolution of dictionaries in programming', 'summary': 'Discusses the evolution of dictionaries in programming, highlighting the longevity of hash tables, the continuous improvements in technology, and the necessity for ongoing learning. the speaker shares their mission to train thousands of python programmers, having personally trained over 5,000 and their team cumulatively trained over 10,000, in week-long training sessions.', 'duration': 134.63, 'highlights': ['The speaker has personally trained over 5,000 Python programmers, and their team has cumulatively trained over 10,000 in week-long training sessions.', 'The art of computer programming, Donald Knuth, Algorithm D, has been in Python for 26 years, emphasizing the longevity and integration of hash tables in programming.', 'The chapter emphasizes the necessity for ongoing learning in the field of programming, highlighting the ever-evolving nature of technology and the potential obsoletion of outdated knowledge.', 'The speaker expresses gratitude towards Microsoft for donating space and time, and the significant impact of Microsoft on the programming community.', 'The chapter discusses the evolution of dictionaries in programming, highlighting the longevity of hash tables and the continuous improvements in technology.']}], 'duration': 134.63, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/p33CVV29OG8/pics/p33CVV29OG88885.jpg', 'highlights': ['The speaker has personally trained over 5,000 Python programmers, and their team has cumulatively trained over 10,000 in week-long training sessions.', 'The art of computer programming, Donald Knuth, Algorithm D, has been in Python for 26 years, emphasizing the longevity and integration of hash tables in programming.', 'The chapter discusses the evolution of dictionaries in programming, highlighting the longevity of hash tables and the continuous improvements in technology.', 'The chapter emphasizes the necessity for ongoing learning in the field of programming, highlighting the ever-evolving nature of technology and the potential obsoletion of outdated knowledge.', 'The speaker expresses gratitude towards Microsoft for donating space and time, and the significant impact of Microsoft on the programming community.']}, {'end': 802.491, 'segs': [{'end': 325.523, 'src': 'heatmap', 'start': 282.571, 'weight': 1, 'content': [{'end': 286.214, 'text': 'A lot of people like to get to the dictionary for an instance by doing a dunderdict.', 'start': 282.571, 'duration': 3.643}, {'end': 287.815, 'text': "There's a function for getting there.", 'start': 286.534, 'duration': 1.281}, {'end': 290.351, 'text': "What's it called? VARS.", 'start': 287.855, 'duration': 2.496}, {'end': 294.274, 'text': 'How many of you knew VARS is there? VARS is an unloved function.', 'start': 290.371, 'duration': 3.903}, {'end': 295.335, 'text': 'You should start to love it.', 'start': 294.434, 'duration': 0.901}, {'end': 298.257, 'text': "It's really kind of awesome once you get used to it.", 'start': 295.475, 'duration': 2.782}, {'end': 299.839, 'text': "But everyone forgets that it's there.", 'start': 298.418, 'duration': 1.421}, {'end': 304.002, 'text': 'And that will just simply print the three instance dictionaries.', 'start': 300.159, 'duration': 3.843}, {'end': 308.445, 'text': "And this part is, well, I'm the iter tools guy, so I'm very functional.", 'start': 304.542, 'duration': 3.903}, {'end': 312.889, 'text': 'I do a map of a size of a map of a list of a print, that sort of thing.', 'start': 308.505, 'duration': 4.384}, {'end': 315.631, 'text': 'Makes me feel very functional and happy.', 'start': 313.869, 'duration': 1.762}, {'end': 319.799, 'text': 'So what do we get? We get three instance dictionaries.', 'start': 316.556, 'duration': 3.243}, {'end': 325.523, 'text': "Guido's blue, Sarah's orange, Guido's Austin, Sarah's Dallas, Guido's apple, Sarah is banana.", 'start': 319.919, 'duration': 5.604}], 'summary': 'Vars function is underutilized, but useful for accessing instance dictionaries and promoting functional programming.', 'duration': 42.952, 'max_score': 282.571, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/p33CVV29OG8/pics/p33CVV29OG8282571.jpg'}, {'end': 354.865, 'src': 'embed', 'start': 328.386, 'weight': 4, 'content': [{'end': 334.911, 'text': 'The point of this is this is a stand-in for basically every important Python program that runs on the face of the planet.', 'start': 328.386, 'duration': 6.525}, {'end': 341.276, 'text': "Somewhere there's a class, lots of instances, and they have lots of dictionaries, suggesting that dictionaries are important to us.", 'start': 335.271, 'duration': 6.005}, {'end': 345.12, 'text': "I'm going to pick these to talk about during the class.", 'start': 341.897, 'duration': 3.223}, {'end': 351.103, 'text': "So let's jump right to the end, the punch line in Python 2.7.", 'start': 345.76, 'duration': 5.343}, {'end': 354.865, 'text': 'How big are each of these dictionaries? 280 bytes each.', 'start': 351.103, 'duration': 3.762}], 'summary': 'Python programs rely heavily on dictionaries, with each dictionary being 280 bytes in python 2.7.', 'duration': 26.479, 'max_score': 328.386, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/p33CVV29OG8/pics/p33CVV29OG8328386.jpg'}, {'end': 434.213, 'src': 'embed', 'start': 404.505, 'weight': 2, 'content': [{'end': 408.425, 'text': 'When Mark got to it, he realized that Kuito has been repeated three times.', 'start': 404.505, 'duration': 3.92}, {'end': 412.906, 'text': "Is the world big enough for three Kuitos? I don't think so.", 'start': 408.605, 'duration': 4.301}, {'end': 414.486, 'text': 'So Mark got this bright idea.', 'start': 413.086, 'duration': 1.4}, {'end': 419.307, 'text': 'How about we store Kuito once, his hash once, and then only save the data separately?', 'start': 414.727, 'duration': 4.58}, {'end': 424.148, 'text': "So this was called the Key Sharing Dictionary and you've already got it in Python 3.5..", 'start': 419.607, 'duration': 4.541}, {'end': 429.189, 'text': "Who thinks that's kind of cool? Mark is a smart man, which is why they gave him a PhD.", 'start': 424.148, 'duration': 5.041}, {'end': 434.213, 'text': 'Then Python 3.6, they got a little smaller.', 'start': 431.172, 'duration': 3.041}], 'summary': 'Mark proposed a key sharing dictionary to store kuito once, his hash once, and then only save the data separately, already available in python 3.5. python 3.6 made further improvements.', 'duration': 29.708, 'max_score': 404.505, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/p33CVV29OG8/pics/p33CVV29OG8404505.jpg'}, {'end': 477.153, 'src': 'embed', 'start': 451.079, 'weight': 0, 'content': [{'end': 456.922, 'text': "By the way, how many of you think that's a pretty impressive savings? And by the way, this is not a random obscure corner of Python.", 'start': 451.079, 'duration': 5.843}, {'end': 458.242, 'text': 'Dictionaries are everywhere.', 'start': 457.102, 'duration': 1.14}, {'end': 460.063, 'text': "You're going to get a massive savings.", 'start': 458.562, 'duration': 1.501}, {'end': 463.549, 'text': "Now I'm a big pan of Python 2.7.", 'start': 460.728, 'duration': 2.821}, {'end': 469.811, 'text': "That said, for the last seven years, I've devoted all of my evenings and weekends to making Python 3 better.", 'start': 463.549, 'duration': 6.262}, {'end': 477.153, 'text': 'And so I would have loved to have come out and told you how awesome Python 3 was all the time, but I never did that until today.', 'start': 470.671, 'duration': 6.482}], 'summary': 'Python 3 savings are massive, after 7 years of improvement.', 'duration': 26.074, 'max_score': 451.079, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/p33CVV29OG8/pics/p33CVV29OG8451079.jpg'}, {'end': 595.478, 'src': 'embed', 'start': 567.912, 'weight': 1, 'content': [{'end': 570.373, 'text': "When I pound on the table, you're supposed to say, there must be a better way.", 'start': 567.912, 'duration': 2.461}, {'end': 573.453, 'text': 'Ready? There must be a better way.', 'start': 570.413, 'duration': 3.04}, {'end': 574.654, 'text': 'There is a better way.', 'start': 573.593, 'duration': 1.061}, {'end': 576.274, 'text': 'Scrambled is better than ordered.', 'start': 574.894, 'duration': 1.38}, {'end': 579.115, 'text': 'And by Python 3.5, we have the SIP hash.', 'start': 576.354, 'duration': 2.761}, {'end': 582.115, 'text': 'So the hashes get randomized every time you turn on Python.', 'start': 579.415, 'duration': 2.7}, {'end': 585.356, 'text': 'And so occasionally, Sarah gets to go first and Tim goes second.', 'start': 582.455, 'duration': 2.901}, {'end': 588.476, 'text': 'But once in a while, one time in five, Guido gets to go first.', 'start': 585.396, 'duration': 3.08}, {'end': 590.917, 'text': 'Every time you turn it on, it scrambles.', 'start': 588.716, 'duration': 2.201}, {'end': 595.478, 'text': "So everyone's doc tests who relied on the order, what happened to them? Broken.", 'start': 591.197, 'duration': 4.281}], 'summary': 'Python 3.5 introduces sip hash, randomizing hashes on startup, causing 1 in 5 times for guido to go first, breaking doc tests relying on order.', 'duration': 27.566, 'max_score': 567.912, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/p33CVV29OG8/pics/p33CVV29OG8567912.jpg'}], 'start': 144.302, 'title': 'Python dictionaries', 'summary': 'Emphasizes the importance of dictionaries in python, highlighting their pervasive use and their evolution from python 2.7 to python 3, with a focus on memory efficiency and the transition to ordered handling.', 'chapters': [{'end': 239.913, 'start': 144.302, 'title': 'Python: the language of dictionaries', 'summary': 'Emphasizes the importance of dictionaries in python, highlighting their pervasive use and their role as the foundation of the language, running across different versions from 2.6 to 3.7.', 'duration': 95.611, 'highlights': ["Python's extensive use of dictionaries is emphasized, with various components of the language being described as thin wrappers around dictionaries, illustrating their fundamental importance.", "The speaker's code is confirmed to run under different Python versions, including 2.6, 2.7, 3.0, and 3.7, showcasing the versatility and compatibility of the discussed concepts.", 'The significance of dictionaries in Python is highlighted, with the speaker describing them as pervasive and essential throughout the language, forming the backbone of its functionality.']}, {'end': 802.491, 'start': 239.973, 'title': 'Evolution of python dictionaries', 'summary': 'Explores the evolution of python dictionaries, highlighting their importance, size optimization, and the transition from python 2.7 to python 3, with a focus on memory efficiency and the transformation of dictionary handling from being unordered to ordered.', 'duration': 562.518, 'highlights': ['In Python 2.7, the size of each dictionary is 280 bytes, but in Python 3.5, it has been reduced to 196 bytes, showcasing significant memory savings.', 'The Key Sharing Dictionary concept introduced in Python 3.5 by Mark Shannon optimizes memory by storing repetitive keys and hashes only once, resulting in a substantial reduction in dictionary size.', 'The transition to Python 3.6 further reduces dictionary size through compaction, resulting in dictionaries being less than half their original size, demonstrating impressive memory savings and efficiency.', 'Python 3.5 introduces the SIP hash, ensuring randomized hashes with every Python restart, leading to a shift from unordered to ordered dictionaries for improved predictability and reliability of dictionary behavior.']}], 'duration': 658.189, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/p33CVV29OG8/pics/p33CVV29OG8144302.jpg', 'highlights': ['Python 3.6 reduces dictionary size through compaction, resulting in dictionaries being less than half their original size, demonstrating impressive memory savings and efficiency.', 'Python 3.5 introduces the SIP hash, ensuring randomized hashes with every Python restart, leading to a shift from unordered to ordered dictionaries for improved predictability and reliability of dictionary behavior.', 'The Key Sharing Dictionary concept introduced in Python 3.5 by Mark Shannon optimizes memory by storing repetitive keys and hashes only once, resulting in a substantial reduction in dictionary size.', 'In Python 2.7, the size of each dictionary is 280 bytes, but in Python 3.5, it has been reduced to 196 bytes, showcasing significant memory savings.', 'The significance of dictionaries in Python is highlighted, with the speaker describing them as pervasive and essential throughout the language, forming the backbone of its functionality.']}, {'end': 1419.265, 'segs': [{'end': 881.596, 'src': 'embed', 'start': 851.275, 'weight': 0, 'content': [{'end': 851.995, 'text': 'We can do better.', 'start': 851.275, 'duration': 0.72}, {'end': 855.53, 'text': 'We decided to speed it up a little bit.', 'start': 853.749, 'duration': 1.781}, {'end': 858.031, 'text': 'The problem with what we were doing here is it had a linear search.', 'start': 855.61, 'duration': 2.421}, {'end': 866.356, 'text': "If you're looking for poor Rachel, you had to go down four steps in order to find out that she has the pear and she's associated with Reno.", 'start': 858.432, 'duration': 7.924}, {'end': 871.179, 'text': 'Better way is separate chaining.', 'start': 869.678, 'duration': 1.501}, {'end': 876.142, 'text': 'We make a set of buckets and then each bucket has a list and we grow the list.', 'start': 871.639, 'duration': 4.503}, {'end': 881.596, 'text': "And now, that's going to reduce the linear search by a constant factor.", 'start': 877.594, 'duration': 4.002}], 'summary': 'Implementing separate chaining reduces linear search by a constant factor, improving efficiency.', 'duration': 30.321, 'max_score': 851.275, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/p33CVV29OG8/pics/p33CVV29OG8851275.jpg'}, {'end': 1059.882, 'src': 'embed', 'start': 1033.943, 'weight': 1, 'content': [{'end': 1041.125, 'text': "We have a new record for wasted memory space, but we've sped up our search dramatically, several hundred percent.", 'start': 1033.943, 'duration': 7.182}, {'end': 1046.377, 'text': 'There is another way.', 'start': 1045.438, 'duration': 0.939}, {'end': 1047.919, 'text': "It's called open addressing.", 'start': 1046.698, 'duration': 1.221}, {'end': 1055.241, 'text': 'In open addressing rather than having the extra space at the end of each of these lists so that we can grow the list.', 'start': 1048.419, 'duration': 6.822}, {'end': 1059.882, 'text': 'the idea is we make one big table to begin with and then we insert everything into that one table.', 'start': 1055.241, 'duration': 4.641}], 'summary': 'New record for wasted memory space, but search speed increased by several hundred percent through open addressing.', 'duration': 25.939, 'max_score': 1033.943, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/p33CVV29OG8/pics/p33CVV29OG81033943.jpg'}, {'end': 1178.75, 'src': 'embed', 'start': 1149.793, 'weight': 3, 'content': [{'end': 1155.197, 'text': "In other words, it doesn't speed up the best case, it just keeps the worst case from being catastrophic.", 'start': 1149.793, 'duration': 5.404}, {'end': 1158.993, 'text': 'Fair enough? However, Uncle Timmy comes along.', 'start': 1155.697, 'duration': 3.296}, {'end': 1164.298, 'text': 'Tim Peters, the hero of this story, and Tim gets this bright idea.', 'start': 1160.074, 'duration': 4.224}, {'end': 1171.744, 'text': 'Tim says, we know all about random number generators, and a very simple one is a linear congruential random number generator.', 'start': 1164.818, 'duration': 6.926}, {'end': 1178.75, 'text': 'Take a number, multiply it by five, add one, and take it a modulo, some fixed modulus.', 'start': 1172.185, 'duration': 6.565}], 'summary': 'Tim peters proposes using a linear congruential random number generator to prevent worst-case scenarios.', 'duration': 28.957, 'max_score': 1149.793, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/p33CVV29OG8/pics/p33CVV29OG81149793.jpg'}, {'end': 1376.768, 'src': 'embed', 'start': 1334.011, 'weight': 4, 'content': [{'end': 1340.476, 'text': "So if you were using Python 2.7 or before, What the dictionaries would do when they resize is they wouldn't double.", 'start': 1334.011, 'duration': 6.465}, {'end': 1343.298, 'text': 'when they needed more space, they would quadruple.', 'start': 1340.476, 'duration': 2.822}, {'end': 1348.882, 'text': 'So it meant most dictionaries most of the time had enough space in it to where you had hardly any collisions.', 'start': 1343.658, 'duration': 5.224}, {'end': 1354.57, 'text': 'All I did was throw away some space, but I topped out at about 50, 000 entries.', 'start': 1349.827, 'duration': 4.743}, {'end': 1356.851, 'text': 'Above 50, 000 entries, we go back to doubling.', 'start': 1354.71, 'duration': 2.141}, {'end': 1359.392, 'text': 'Once it got up to a size that you would actually notice.', 'start': 1357.271, 'duration': 2.121}, {'end': 1364.715, 'text': "So when you're not looking for all the little small nickel and dimes, I was quadrupling your space.", 'start': 1359.733, 'duration': 4.982}, {'end': 1368.037, 'text': "But if you quadruple a tiny dictionary, that's not a lot.", 'start': 1365.116, 'duration': 2.921}, {'end': 1371.539, 'text': 'Then Mark Shannon came along and said, this is terrible.', 'start': 1368.958, 'duration': 2.581}, {'end': 1372.9, 'text': 'And he turned it back to doubling.', 'start': 1371.839, 'duration': 1.061}, {'end': 1376.768, 'text': 'So you got that about Python 3.2.', 'start': 1373.567, 'duration': 3.201}], 'summary': 'In python 2.7, dictionaries would quadruple in size before 50,000 entries, leading to reduced collisions, but was changed back to doubling in python 3.2.', 'duration': 42.757, 'max_score': 1334.011, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/p33CVV29OG8/pics/p33CVV29OG81334011.jpg'}], 'start': 802.975, 'title': 'Data structure optimization and evolution in python', 'summary': 'Discusses techniques for optimizing data structures, such as separate chaining and increased buckets, leading to improvements in memory usage and search speed. it also highlights the evolution of dictionary structures in python, including the impact of introducing multiple hashings, a random number generator, and resizing of dictionaries on performance and space efficiency, and their effects on collision rates, system stability, and performance in python versions 1.5 to 3.2.', 'chapters': [{'end': 1107.154, 'start': 802.975, 'title': 'Data structure optimization', 'summary': 'Discusses the optimization of data structures for memory usage and search speed, highlighting the improvements in memory usage and search speed achieved through techniques like separate chaining, increased buckets, and open addressing.', 'duration': 304.179, 'highlights': ['Open addressing technique for data structure optimization The open addressing technique is discussed as a method to optimize data structure, eliminating overallocation and achieving improved memory usage.', "Improvement in search speed through increased buckets and separate chaining The technique of increasing the number of buckets and using separate chaining is highlighted as a means to dramatically speed up search, with Rachel's search improving by 300%.", 'Impact of techniques on memory usage and search speed The chapter discusses the trade-offs between memory usage and search speed, showcasing the significant improvements in search speed achieved through various techniques.']}, {'end': 1419.265, 'start': 1107.434, 'title': 'Evolution of dictionary structure in python', 'summary': 'Delves into the evolution of dictionary structures in python, highlighting the impact of introducing multiple hashings, a random number generator, and the resizing of dictionaries on performance and space efficiency, as well as the contributions of key individuals such as tim peters and mark shannon. it discusses how these changes affected collision rates, system stability, and performance in python versions 1.5 to 3.2.', 'duration': 311.831, 'highlights': ["The introduction of a random number generator by Tim Peters perturbed shift in additional bits, ensuring that the random number sequence would over time visit every single separate cell in the table, preventing system deadlock and catastrophic linear pileup. Tim Peters introduced a random number generator that perturbed shift in additional bits, guaranteeing the random number sequence's visit to every single separate cell in the table, thus preventing system deadlock and catastrophic linear pileup.", 'The resizing of dictionaries in Python 2.7 involved quadrupling the space when needed, significantly reducing collisions for most dictionaries, with a threshold of 50,000 entries, before Mark Shannon reverted it back to doubling in Python 3.2. In Python 2.7, dictionary resizing involved quadrupling the space when needed, leading to a significant reduction in collisions for most dictionaries, until Mark Shannon reverted it back to doubling in Python 3.2, above the threshold of 50,000 entries.', 'The evolution of dictionary structures in Python was marked by a tug of war between improving performance and saving space, with contributions from individuals like Tim Peters and Mark Shannon. The evolution of dictionary structures in Python showcased a tug of war between improving performance and saving space, with significant contributions from individuals such as Tim Peters and Mark Shannon.']}], 'duration': 616.29, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/p33CVV29OG8/pics/p33CVV29OG8802975.jpg', 'highlights': ["The technique of increasing the number of buckets and using separate chaining dramatically speeds up search, with Rachel's search improving by 300%.", 'The open addressing technique is discussed as a method to optimize data structure, eliminating overallocation and achieving improved memory usage.', 'The chapter discusses the trade-offs between memory usage and search speed, showcasing the significant improvements in search speed achieved through various techniques.', "Tim Peters introduced a random number generator that perturbed shift in additional bits, guaranteeing the random number sequence's visit to every single separate cell in the table, thus preventing system deadlock and catastrophic linear pileup.", 'In Python 2.7, dictionary resizing involved quadrupling the space when needed, leading to a significant reduction in collisions for most dictionaries, until Mark Shannon reverted it back to doubling in Python 3.2, above the threshold of 50,000 entries.', 'The evolution of dictionary structures in Python showcased a tug of war between improving performance and saving space, with significant contributions from individuals such as Tim Peters and Mark Shannon.']}, {'end': 1825.639, 'segs': [{'end': 1448.157, 'src': 'embed', 'start': 1420.137, 'weight': 0, 'content': [{'end': 1423.68, 'text': 'And so those of you who are Pi Pi users have been having the CompactDict.', 'start': 1420.137, 'duration': 3.543}, {'end': 1427.284, 'text': 'But prior to that was KeysharingDict.', 'start': 1424.201, 'duration': 3.083}, {'end': 1430.507, 'text': "So the temporal anomaly is I'll show you CompactDict first.", 'start': 1427.444, 'duration': 3.063}, {'end': 1433.71, 'text': "Why? It's cleaner to show it to you this way.", 'start': 1431.007, 'duration': 2.703}, {'end': 1443.679, 'text': 'So CompactDict looks very much like the other dict except that I pull all of the indices out and the hash table is now this is an array of bytes.', 'start': 1434.25, 'duration': 9.429}, {'end': 1448.157, 'text': 'So this is one byte, one byte, one byte, one byte, one byte.', 'start': 1444.594, 'duration': 3.563}], 'summary': 'Pi pi users transitioned from keysharingdict to compactdict for cleaner representation, with hash table as an array of bytes.', 'duration': 28.02, 'max_score': 1420.137, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/p33CVV29OG8/pics/p33CVV29OG81420137.jpg'}, {'end': 1663.727, 'src': 'embed', 'start': 1618.674, 'weight': 2, 'content': [{'end': 1620.635, 'text': 'So it takes one cycle later.', 'start': 1618.674, 'duration': 1.961}, {'end': 1624.677, 'text': 'So it takes a total of four clock cycles to pull in this entire table.', 'start': 1620.755, 'duration': 3.922}, {'end': 1629.54, 'text': "Who thinks that's kind of awesome? And if you compared it to what we had before, this thing was huge.", 'start': 1624.877, 'duration': 4.663}, {'end': 1632.601, 'text': 'Remember the smallest dictionary was 280 bytes, not things like eight.', 'start': 1629.6, 'duration': 3.001}, {'end': 1639.299, 'text': 'So that was my contribution.', 'start': 1637.458, 'duration': 1.841}, {'end': 1641.459, 'text': 'I proposed it about four years ago.', 'start': 1639.359, 'duration': 2.1}, {'end': 1645.881, 'text': 'I brought it to a Python dev and somebody, this kind of seems good.', 'start': 1641.96, 'duration': 3.921}, {'end': 1649.202, 'text': 'Mark 10 is like, you go back to the drawing board, you study more.', 'start': 1645.941, 'duration': 3.261}, {'end': 1654.504, 'text': "And when one of the smartest people on the planet tells you to go back to the brawling board, you know, it's all over.", 'start': 1649.682, 'duration': 4.822}, {'end': 1657.465, 'text': 'And Quito expressed no excitement whatsoever.', 'start': 1654.944, 'duration': 2.521}, {'end': 1663.727, 'text': 'What should you do? Crawl into your cave in mid defeat or take a back door? Back door.', 'start': 1658.025, 'duration': 5.702}], 'summary': 'Proposed table compression reduces size from 280 to 8 bytes.', 'duration': 45.053, 'max_score': 1618.674, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/p33CVV29OG8/pics/p33CVV29OG81618674.jpg'}, {'end': 1716.469, 'src': 'embed', 'start': 1689.795, 'weight': 8, 'content': [{'end': 1695.636, 'text': 'So this pure Python code run through PyPy gave you the compact dict right away.', 'start': 1689.795, 'duration': 5.841}, {'end': 1700.161, 'text': 'So they went and studied it and said, this seems like a reasonably good idea.', 'start': 1697.099, 'duration': 3.062}, {'end': 1703.542, 'text': 'And they did something that felt really good.', 'start': 1700.181, 'duration': 3.361}, {'end': 1709.765, 'text': 'Do you remember when Tim invented his own sort routine? What did he call it? Tim sort.', 'start': 1704.423, 'duration': 5.342}, {'end': 1713.227, 'text': 'And then I put so much effort into designing this thing.', 'start': 1710.406, 'duration': 2.821}, {'end': 1716.469, 'text': 'Do you know what it was called? The Raymond Dict.', 'start': 1713.527, 'duration': 2.942}], 'summary': 'Pypy generated compact dict, studied and implemented by team, including tim sort and raymond dict.', 'duration': 26.674, 'max_score': 1689.795, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/p33CVV29OG8/pics/p33CVV29OG81689795.jpg'}, {'end': 1781.714, 'src': 'embed', 'start': 1731.918, 'weight': 5, 'content': [{'end': 1738.764, 'text': 'So it no longer got called the Raymond Dict, it got idea proposed by Raymond with a significant prior work in Java.', 'start': 1731.918, 'duration': 6.846}, {'end': 1741.066, 'text': 'Not a Raymond Dict anymore.', 'start': 1740.025, 'duration': 1.041}, {'end': 1743.107, 'text': 'Java Dict.', 'start': 1742.607, 'duration': 0.5}, {'end': 1747.831, 'text': "That's okay, we're going back in time to Java.", 'start': 1745.209, 'duration': 2.622}, {'end': 1754.496, 'text': 'This is the part of the journey where we stop going forward, we start going backward.', 'start': 1750.653, 'duration': 3.843}, {'end': 1756.928, 'text': 'Backward for two reasons.', 'start': 1755.888, 'duration': 1.04}, {'end': 1760.809, 'text': 'One is I presented this out of order.', 'start': 1757.048, 'duration': 3.761}, {'end': 1764.43, 'text': 'In fact, prior to the compact dict was the key sharing dict.', 'start': 1761.509, 'duration': 2.921}, {'end': 1766.631, 'text': 'I already discussed the concept of key sharing.', 'start': 1764.85, 'duration': 1.781}, {'end': 1774.212, 'text': 'Instead of putting quito in three times for each instance, how about we store as key one time the hash one time and only save the values?', 'start': 1766.851, 'duration': 7.361}, {'end': 1775.633, 'text': 'This is wonderful.', 'start': 1774.753, 'duration': 0.88}, {'end': 1781.714, 'text': "It means every time you have an instance, all it's storing in the instances are just the values and not the keys.", 'start': 1775.713, 'duration': 6.001}], 'summary': 'Proposed java dict, revisiting key sharing concept, optimizing storage.', 'duration': 49.796, 'max_score': 1731.918, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/p33CVV29OG8/pics/p33CVV29OG81731918.jpg'}], 'start': 1420.137, 'title': 'Efficient hash table implementation and python dict development', 'summary': 'Delves into compactdict, a new hash table implementation reducing space wastage and enhancing efficiency with 8 bytes for hashing, resulting in 2 memory fetches and 4 clock cycles for 128 entries. it also explores the development journey of python dict, from initial concept to key sharing dict, emphasizing the importance of collaboration in software development.', 'chapters': [{'end': 1641.459, 'start': 1420.137, 'title': 'Compactdict: efficient hash table implementation', 'summary': 'Discusses the compactdict, a new hash table implementation which reduces space wastage and improves efficiency by only utilizing 8 bytes for hashing part, resulting in 2 memory fetches and 4 clock cycles for a dictionary of 128 entries.', 'duration': 221.322, 'highlights': ['The CompactDict implementation reduces space wastage by utilizing only 8 bytes for the hashing part, significantly smaller than previous structures.', 'The hash table part of CompactDict takes only 8 bytes, compared to the previous structure which was at least 280 bytes for the smallest dictionary.', 'For a dictionary of 128 entries, the CompactDict implementation results in only 2 memory fetches to pull in the entire hash table, taking a total of 4 clock cycles, significantly improving efficiency.', 'The throughput of the memory fetches for the CompactDict implementation is one cycle, resulting in a total of four clock cycles to pull in the entire table, making it much more efficient than previous structures.']}, {'end': 1825.639, 'start': 1641.96, 'title': 'Python dict development journey', 'summary': 'Discusses the journey of developing python dict implementation, from initial concept to collaboration with others and the eventual evolution into key sharing dict, emphasizing the significance of studying and collaborating in software development.', 'duration': 183.679, 'highlights': ["Mark Shannon's key sharing dict and the initial compact dict development are highlighted as significant milestones in the Python dict journey, showcasing the iterative process of software development and the importance of collaboration.", 'The concept of key sharing dict, which saves memory by storing keys only once and values multiple times, is introduced as a significant improvement in the Python dict implementation, emphasizing its efficiency and impact on memory usage.', "The collaborative nature of software development is emphasized through the author's interaction with other developers, such as Maciek and Armin, highlighting the value of seeking feedback and input from peers in the development process.", "The evolution of the author's proposed 'Raymond Dict' into 'Java Dict' due to prior work in Java is highlighted, showcasing the adaptive and iterative nature of software development and the impact of prior knowledge and contributions from the community.", "The chapter emphasizes the significance of studying and collaborating in software development, as demonstrated by the author's experience with seeking feedback, iterative development, and the influence of prior work in the evolution of Python dict implementation."]}], 'duration': 405.502, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/p33CVV29OG8/pics/p33CVV29OG81420137.jpg', 'highlights': ['The CompactDict implementation reduces space wastage by utilizing only 8 bytes for the hashing part, significantly smaller than previous structures.', 'The hash table part of CompactDict takes only 8 bytes, compared to the previous structure which was at least 280 bytes for the smallest dictionary.', 'For a dictionary of 128 entries, the CompactDict implementation results in only 2 memory fetches to pull in the entire hash table, taking a total of 4 clock cycles, significantly improving efficiency.', 'The throughput of the memory fetches for the CompactDict implementation is one cycle, resulting in a total of four clock cycles to pull in the entire table, making it much more efficient than previous structures.', "Mark Shannon's key sharing dict and the initial compact dict development are highlighted as significant milestones in the Python dict journey, showcasing the iterative process of software development and the importance of collaboration.", 'The concept of key sharing dict, which saves memory by storing keys only once and values multiple times, is introduced as a significant improvement in the Python dict implementation, emphasizing its efficiency and impact on memory usage.', "The collaborative nature of software development is emphasized through the author's interaction with other developers, such as Maciek and Armin, highlighting the value of seeking feedback and input from peers in the development process.", "The evolution of the author's proposed 'Raymond Dict' into 'Java Dict' due to prior work in Java is highlighted, showcasing the adaptive and iterative nature of software development and the impact of prior knowledge and contributions from the community.", "The chapter emphasizes the significance of studying and collaborating in software development, as demonstrated by the author's experience with seeking feedback, iterative development, and the influence of prior work in the evolution of Python dict implementation."]}, {'end': 2162.217, 'segs': [{'end': 1902.494, 'src': 'embed', 'start': 1851.099, 'weight': 0, 'content': [{'end': 1853, 'text': "We still check to see if the table's not none.", 'start': 1851.099, 'duration': 1.901}, {'end': 1856.721, 'text': 'But what we insert into the table is just the position.', 'start': 1853.28, 'duration': 3.441}, {'end': 1859.383, 'text': 'And then we append the entry.', 'start': 1857.782, 'duration': 1.601}, {'end': 1860.683, 'text': "And here's what it looks like.", 'start': 1859.583, 'duration': 1.1}, {'end': 1866.126, 'text': 'Now we have Cueto, Blue, Austin, and Apple.', 'start': 1862.504, 'duration': 3.622}, {'end': 1868.139, 'text': 'all of these in one table.', 'start': 1866.619, 'duration': 1.52}, {'end': 1870.4, 'text': 'Notice Quito is only mentioned one time.', 'start': 1868.36, 'duration': 2.04}, {'end': 1874.061, 'text': 'Notice each of these fields have one time.', 'start': 1872.341, 'duration': 1.72}, {'end': 1876.062, 'text': "Notice there's no space in between them.", 'start': 1874.161, 'duration': 1.901}, {'end': 1879.002, 'text': 'The only thing that indexes them is this little table.', 'start': 1876.482, 'duration': 2.52}, {'end': 1885.464, 'text': "How many bytes wide is this? I'm looking for some number that is eight times one.", 'start': 1879.203, 'duration': 6.261}, {'end': 1887.525, 'text': 'I know the drinks have hit hard.', 'start': 1886.365, 'duration': 1.16}, {'end': 1891.446, 'text': 'This thing is only eight bytes to index this table.', 'start': 1888.465, 'duration': 2.981}, {'end': 1896.269, 'text': "Now, what if I, ooh, I didn't change the number here.", 'start': 1894.127, 'duration': 2.142}, {'end': 1898.13, 'text': 'That one should be 16.', 'start': 1896.329, 'duration': 1.801}, {'end': 1902.494, 'text': 'Interestingly, I can make the table more sparse, reduce collisions and speed it up.', 'start': 1898.13, 'duration': 4.364}], 'summary': 'Table with 4 entries, 8 bytes wide, can be made more sparse for faster indexing.', 'duration': 51.395, 'max_score': 1851.099, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/p33CVV29OG8/pics/p33CVV29OG81851099.jpg'}, {'end': 2004.672, 'src': 'embed', 'start': 1954.265, 'weight': 3, 'content': [{'end': 1962.108, 'text': 'We can now make the dictionaries much more sparse, arbitrarily sparse, to eliminate collisions without throwing away any memory.', 'start': 1954.265, 'duration': 7.843}, {'end': 1967.293, 'text': 'Who learned something new? Of course you learn something new.', 'start': 1963.708, 'duration': 3.585}, {'end': 1968.514, 'text': 'This is all brand new.', 'start': 1967.413, 'duration': 1.101}, {'end': 1970.456, 'text': "You don't get it until you get Python 3.6.", 'start': 1968.594, 'duration': 1.862}, {'end': 1972.218, 'text': 'It comes out in a week.', 'start': 1970.456, 'duration': 1.762}, {'end': 1973.379, 'text': 'Get it now.', 'start': 1972.758, 'duration': 0.621}, {'end': 1974.36, 'text': "It's worth it.", 'start': 1973.799, 'duration': 0.561}, {'end': 1975.741, 'text': "It's awesome.", 'start': 1974.98, 'duration': 0.761}, {'end': 1980.025, 'text': "How many of you think I'm really darn clever coming up with the compact dict?", 'start': 1976.261, 'duration': 3.764}, {'end': 1989.814, 'text': "How many of you think that Mark Shannon was really really awesome when he came up with key sharing so that we didn't have multiple acuitos?", 'start': 1982.807, 'duration': 7.007}, {'end': 2001.289, 'text': 'And this is the state of the art of the technology in 2017, modulo and extra thing put in by Victor Stinner that I will mention shortly.', 'start': 1992.403, 'duration': 8.886}, {'end': 2004.672, 'text': "And now we've progressed so far.", 'start': 2002.35, 'duration': 2.322}], 'summary': 'Python 3.6 introduces sparse dictionaries, key sharing, and other advancements in 2017 technology.', 'duration': 50.407, 'max_score': 1954.265, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/p33CVV29OG8/pics/p33CVV29OG81954265.jpg'}, {'end': 2135.897, 'src': 'embed', 'start': 2084.147, 'weight': 1, 'content': [{'end': 2090.288, 'text': "And what hasn't happened before is someone never came along and said what we knew about databases a long,", 'start': 2084.147, 'duration': 6.141}, {'end': 2096.768, 'text': 'long time ago could be applied now to object oriented world, where you have many instances.', 'start': 2090.288, 'duration': 6.48}, {'end': 2099.647, 'text': 'Now imagine I get 100 instances.', 'start': 2097.225, 'duration': 2.422}, {'end': 2103.45, 'text': 'All this thing is going to do is grow out to the right.', 'start': 2099.907, 'duration': 3.543}, {'end': 2107.533, 'text': "The index table itself isn't going to change.", 'start': 2103.53, 'duration': 4.003}, {'end': 2111.776, 'text': 'So if you have 10, 000 instances, this grows to the right.', 'start': 2108.394, 'duration': 3.382}, {'end': 2115.619, 'text': 'This thing stays the same width, and it still has the same lookup speed.', 'start': 2112.056, 'duration': 3.563}, {'end': 2120.463, 'text': 'And at this size, each person has exactly one probe, no hash collision.', 'start': 2116.12, 'duration': 4.343}, {'end': 2125.008, 'text': 'Imagine 10, 000 instances, no redundancy,', 'start': 2120.945, 'duration': 4.063}, {'end': 2132.554, 'text': 'fully as dense as a database and the lookup table into it makes it fast loads in a single cache line with zero collisions.', 'start': 2125.008, 'duration': 7.546}, {'end': 2135.897, 'text': 'Welcome to 2016.', 'start': 2133.115, 'duration': 2.782}], 'summary': 'Database technology applied to object-oriented world with 10,000 instances in 2016.', 'duration': 51.75, 'max_score': 2084.147, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/p33CVV29OG8/pics/p33CVV29OG82084147.jpg'}], 'start': 1825.959, 'title': 'Space-efficient data representation', 'summary': 'Introduces shared and compact tables for concise data representation, and discusses optimizing dictionary size and performance in python 3.6, leading to improved lookup speed and efficiency.', 'chapters': [{'end': 1876.062, 'start': 1825.959, 'title': 'Shared and compact table', 'summary': 'Introduces the concept of shared and compact tables, demonstrating how to insert positions and append entries into a single table, resulting in a concise and space-efficient representation of data.', 'duration': 50.103, 'highlights': ['The chapter introduces the concept of shared and compact tables, demonstrating how to insert positions and append entries into a single table, resulting in a concise and space-efficient representation of data.', 'The table creation process involves looping over positions and entries while utilizing Tim Peters randomization logic for insertion, resulting in an efficient data structure.', 'The inserted data results in a table containing entries such as Cueto, Blue, Austin, and Apple, emphasizing the space-efficient nature of the compact table.']}, {'end': 2162.217, 'start': 1876.482, 'title': 'Compact dictionary and key sharing in python 3.6', 'summary': 'Discusses the optimization of dictionary size and performance in python 3.6, highlighting the benefits of increasing sparseness, reducing collisions, and minimizing memory usage, ultimately leading to improved lookup speed and efficiency.', 'duration': 285.735, 'highlights': ['The size of the index table for the dictionary is optimized, with the ability to make it more sparse, reducing collisions and memory usage, while maintaining single cache line fetch for improved efficiency. Optimizing index table size, increasing sparseness, reducing collisions, maintaining single cache line fetch', 'The development of compact dict and key sharing in Python 3.6 is emphasized, showcasing the advancements in technology and the potential impact on programming efficiency. Introduction of compact dict and key sharing, technological advancements, impact on programming efficiency', 'The historical context of the technology behind dictionaries and databases is discussed, highlighting the application of past knowledge to the modern object-oriented world and the potential for improved performance and efficiency. Application of historical technology to modern object-oriented world, potential for improved performance and efficiency']}], 'duration': 336.258, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/p33CVV29OG8/pics/p33CVV29OG81825959.jpg', 'highlights': ['The chapter introduces the concept of shared and compact tables, demonstrating how to insert positions and append entries into a single table, resulting in a concise and space-efficient representation of data.', 'The size of the index table for the dictionary is optimized, with the ability to make it more sparse, reducing collisions and memory usage, while maintaining single cache line fetch for improved efficiency.', 'The table creation process involves looping over positions and entries while utilizing Tim Peters randomization logic for insertion, resulting in an efficient data structure.', 'The development of compact dict and key sharing in Python 3.6 is emphasized, showcasing the advancements in technology and the potential impact on programming efficiency.', 'The historical context of the technology behind dictionaries and databases is discussed, highlighting the application of past knowledge to the modern object-oriented world and the potential for improved performance and efficiency.', 'The inserted data results in a table containing entries such as Cueto, Blue, Austin, and Apple, emphasizing the space-efficient nature of the compact table.', 'Optimizing index table size, increasing sparseness, reducing collisions, maintaining single cache line fetch', 'Introduction of compact dict and key sharing, technological advancements, impact on programming efficiency', 'Application of historical technology to modern object-oriented world, potential for improved performance and efficiency']}, {'end': 2653.027, 'segs': [{'end': 2269.485, 'src': 'embed', 'start': 2240.529, 'weight': 0, 'content': [{'end': 2247.715, 'text': "So when this next fix gets put in, hopefully in some point in 3.7,, what it's going to mean is that the cost of resizing is,", 'start': 2240.529, 'duration': 7.186}, {'end': 2250.016, 'text': "we're not going to have to move almost any of that data.", 'start': 2247.715, 'duration': 2.301}, {'end': 2256.299, 'text': 'The only thing that needs to be zeroed is that little byte index table, which can be done in a couple of clock cycles.', 'start': 2250.036, 'duration': 6.263}, {'end': 2259.56, 'text': 'In other words, resizing will become basically free.', 'start': 2256.679, 'duration': 2.881}, {'end': 2269.485, 'text': "One other benefit that's kind of cool is, you see this thing here that says iter? What does it iterate over? It iterates over the key list.", 'start': 2260.601, 'duration': 8.884}], 'summary': 'In the next fix for 3.7, resizing will become almost free, with only the byte index table needing to be zeroed.', 'duration': 28.956, 'max_score': 2240.529, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/p33CVV29OG8/pics/p33CVV29OG82240529.jpg'}, {'end': 2311.063, 'src': 'embed', 'start': 2284.473, 'weight': 1, 'content': [{'end': 2291.635, 'text': 'Is this table got a key or is an empty cable or empty? But now all of those keys are all pushed to the left side.', 'start': 2284.473, 'duration': 7.162}, {'end': 2292.595, 'text': "They're all dense.", 'start': 2291.855, 'duration': 0.74}, {'end': 2294.556, 'text': 'So all 10 entries are in order.', 'start': 2292.675, 'duration': 1.881}, {'end': 2300.418, 'text': "When we finish up improving this one, you'll be able to iterate over a dictionary as fast as a list.", 'start': 2295.816, 'duration': 4.602}, {'end': 2304.798, 'text': 'Why? Because it is a list inside.', 'start': 2301.198, 'duration': 3.6}, {'end': 2307.04, 'text': "It's all consecutive pointers.", 'start': 2305.078, 'duration': 1.962}, {'end': 2309.041, 'text': 'Not a single branch misprediction.', 'start': 2307.48, 'duration': 1.561}, {'end': 2311.063, 'text': 'Not a single wasted space.', 'start': 2309.862, 'duration': 1.201}], 'summary': 'Optimized table structure achieves 10x faster iteration over a dictionary than before.', 'duration': 26.59, 'max_score': 2284.473, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/p33CVV29OG8/pics/p33CVV29OG82284473.jpg'}, {'end': 2399.777, 'src': 'embed', 'start': 2342.248, 'weight': 2, 'content': [{'end': 2345.713, 'text': 'And then finally last, it only uses big pointers.', 'start': 2342.248, 'duration': 3.465}, {'end': 2353.926, 'text': 'In other words, the table automatically scales the size density of the indices based on how big your table is.', 'start': 2346.161, 'duration': 7.765}, {'end': 2358.248, 'text': "Most dictionaries, most of the time, are really small, so they're one byte each.", 'start': 2354.426, 'duration': 3.822}, {'end': 2362.631, 'text': "And that's my story, and I'm sticking with it.", 'start': 2359.289, 'duration': 3.342}, {'end': 2378.45, 'text': 'One other righteous dude needs to be mentioned, Victor Stinner.', 'start': 2374.409, 'duration': 4.041}, {'end': 2382.031, 'text': 'Victor Stinner came up with a fairly neat idea.', 'start': 2378.91, 'duration': 3.121}, {'end': 2386.093, 'text': "A lot of times when we're looping through our programs, we get an attribute,", 'start': 2382.051, 'duration': 4.042}, {'end': 2389.394, 'text': "we go look it up and there's a dictionary lookup to go find that attribute.", 'start': 2386.093, 'duration': 3.301}, {'end': 2393.695, 'text': "And possibly there's a chain of lookups when you have an inheritance hierarchy.", 'start': 2389.914, 'duration': 3.781}, {'end': 2399.777, 'text': 'What if we were to mark each of these dictionaries with a dirty bit to indicate when it was changed?', 'start': 2394.395, 'duration': 5.382}], 'summary': 'Automatically scales size density of indices based on table size, dictionaries are one byte each, and uses a dirty bit to indicate changes.', 'duration': 57.529, 'max_score': 2342.248, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/p33CVV29OG8/pics/p33CVV29OG82342248.jpg'}, {'end': 2547.967, 'src': 'embed', 'start': 2502.876, 'weight': 4, 'content': [{'end': 2507.258, 'text': 'and that we actually got a net benefit from fewer memory accesses and better cache performance.', 'start': 2502.876, 'duration': 4.382}, {'end': 2511.765, 'text': 'But since the algorithms were the same, it was mostly about space savings.', 'start': 2507.503, 'duration': 4.262}, {'end': 2517.108, 'text': 'That said, when you make something smaller, your cash employment improves, and you automatically get some win.', 'start': 2512.105, 'duration': 5.003}, {'end': 2519.829, 'text': "So I believe it's a few percent in terms of speed.", 'start': 2517.428, 'duration': 2.401}, {'end': 2521.49, 'text': "But mostly it's about space.", 'start': 2520.129, 'duration': 1.361}, {'end': 2524.471, 'text': "And it'll really pay off when you get hit with the big data.", 'start': 2521.99, 'duration': 2.481}, {'end': 2529.613, 'text': "You'll find the big data, the size of the data, is the size of how much space it actually takes.", 'start': 2524.771, 'duration': 4.842}, {'end': 2543.464, 'text': 'David, what you got? What happens if you delete an entry in your dictionary and add another entry? Do you lose your order? I never delete.', 'start': 2529.654, 'duration': 13.81}, {'end': 2547.967, 'text': 'So this is a good question.', 'start': 2546.426, 'duration': 1.541}], 'summary': 'Algorithm improvement led to space savings and speed increase of a few percent, particularly beneficial for big data.', 'duration': 45.091, 'max_score': 2502.876, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/p33CVV29OG8/pics/p33CVV29OG82502876.jpg'}, {'end': 2629.011, 'src': 'embed', 'start': 2606.048, 'weight': 6, 'content': [{'end': 2617.023, 'text': 'Cool. Then, when Inada came along to implement this in C, he went ahead and put the compaction back in restoring order, as they did in Apai Pai,', 'start': 2606.048, 'duration': 10.975}, {'end': 2618.885, 'text': 'which was a very simple modification.', 'start': 2617.023, 'duration': 1.862}, {'end': 2621.846, 'text': 'And I think ordering is a big, big win.', 'start': 2619.605, 'duration': 2.241}, {'end': 2629.011, 'text': "That said, it's not guaranteed until you all decide you like it, until you all defy Kuito's will and start coming to rely on it.", 'start': 2622.387, 'duration': 6.624}], 'summary': 'Inada implemented compaction in c, restoring order like in apai pai, a simple modification with potential benefits.', 'duration': 22.963, 'max_score': 2606.048, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/p33CVV29OG8/pics/p33CVV29OG82606048.jpg'}], 'start': 2162.257, 'title': 'Python 3.6 dictionary improvements', 'summary': 'Discusses the space savings, resizing optimization, and faster iteration in python 3.6 dictionaries, highlighting near-free resizing cost and elimination of dictionary lookups. it also covers the impact on python performance, achieving a few percent improvement in speed and significant space savings through reduced memory accesses and better cache performance, and addressing the implications of deleting entries in a dictionary and trade-offs between ordered and scrambled data.', 'chapters': [{'end': 2452.451, 'start': 2162.257, 'title': 'Python 3.6 dictionary improvements', 'summary': 'Discusses the improvements made in python 3.6 dictionaries, including space savings, resizing optimization, and faster iteration, with the key highlight being the near-free cost of resizing and elimination of dictionary lookups.', 'duration': 290.194, 'highlights': ['The resizing optimization in Python 3.6 results in a near-free cost of resizing as it eliminates the need to move almost any data and only requires zeroing out the byte index table, making resizing basically free.', 'The faster iteration in Python 3.6 dictionaries is achieved by reordering the keys, resulting in consecutive pointers and no branch mispredictions, allowing iteration as fast as a list.', 'Python 3.6 dictionaries utilize automatic scaling of size density based on the table size, using byte arrays for small tables and transitioning to word size and long arrays for larger tables.', "The introduction of a 'dirty bit' by Victor Stinner in Python 3.6 dictionaries allows for the elimination of lookups by marking dictionaries to indicate when they were changed, saving space and reducing unnecessary lookups."]}, {'end': 2653.027, 'start': 2462.614, 'title': 'Python performance improvements', 'summary': 'Discusses the impact of reducing memory size on python performance, with a few percent improvement in speed and significant space savings, achieved through algorithms with reduced memory accesses and better cache performance, while also addressing the implications of deleting entries in a dictionary and the trade-offs between ordered and scrambled data.', 'duration': 190.413, 'highlights': ['The algorithms with reduced memory accesses and better cache performance led to a few percent improvement in speed and significant space savings. The change in algorithms resulted in a net benefit from fewer memory accesses and better cache performance, leading to a few percent improvement in speed and significant space savings.', 'The implications of deleting entries in a dictionary were discussed, emphasizing the trade-offs between preserving order and achieving better performance. The discussion highlighted the trade-offs between preserving order and achieving better performance when deleting entries in a dictionary, addressing the implications of compaction and the trade-offs between ordered and scrambled data.', 'The decision to implement ordering in Python was a significant win but not yet guaranteed, with the audience encouraged to embrace and rely on it to solidify its future. The implementation of ordering in Python was deemed a significant win, although not yet guaranteed, with the audience encouraged to embrace and rely on it to solidify its future as a standard feature.']}], 'duration': 490.77, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/p33CVV29OG8/pics/p33CVV29OG82162257.jpg', 'highlights': ['The resizing optimization in Python 3.6 results in a near-free cost of resizing as it eliminates the need to move almost any data and only requires zeroing out the byte index table, making resizing basically free.', 'The faster iteration in Python 3.6 dictionaries is achieved by reordering the keys, resulting in consecutive pointers and no branch mispredictions, allowing iteration as fast as a list.', 'Python 3.6 dictionaries utilize automatic scaling of size density based on the table size, using byte arrays for small tables and transitioning to word size and long arrays for larger tables.', "The introduction of a 'dirty bit' by Victor Stinner in Python 3.6 dictionaries allows for the elimination of lookups by marking dictionaries to indicate when they were changed, saving space and reducing unnecessary lookups.", 'The algorithms with reduced memory accesses and better cache performance led to a few percent improvement in speed and significant space savings. The change in algorithms resulted in a net benefit from fewer memory accesses and better cache performance, leading to a few percent improvement in speed and significant space savings.', 'The implications of deleting entries in a dictionary were discussed, emphasizing the trade-offs between preserving order and achieving better performance. The discussion highlighted the trade-offs between preserving order and achieving better performance when deleting entries in a dictionary, addressing the implications of compaction and the trade-offs between ordered and scrambled data.', 'The decision to implement ordering in Python was a significant win but not yet guaranteed, with the audience encouraged to embrace and rely on it to solidify its future. The implementation of ordering in Python was deemed a significant win, although not yet guaranteed, with the audience encouraged to embrace and rely on it to solidify its future as a standard feature.']}, {'end': 3363.478, 'segs': [{'end': 2743.84, 'src': 'embed', 'start': 2716.937, 'weight': 1, 'content': [{'end': 2721.14, 'text': "Bhushan said, When data gets big, it doesn't fit into a cache anymore.", 'start': 2716.937, 'duration': 4.203}, {'end': 2726.504, 'text': 'Memory accesses are catastrophically slow, but we fetch an entire cache line at a time.', 'start': 2721.34, 'duration': 5.164}, {'end': 2727.925, 'text': "Notice how I'm rephrasing your question.", 'start': 2726.604, 'duration': 1.321}, {'end': 2733.148, 'text': "And when we fetch a cache line, you get all of the adjacent bytes whenever you're achieving one.", 'start': 2728.365, 'duration': 4.783}, {'end': 2738.192, 'text': "So wouldn't it be great if we did a linear probe rather than searching somewhere else?", 'start': 2733.469, 'duration': 4.723}, {'end': 2743.84, 'text': "And that's in fact what I talked about at last year's holiday party, what I did for SETS.", 'start': 2738.857, 'duration': 4.983}], 'summary': 'Bhushan discusses the inefficiency of memory accesses when dealing with big data and suggests using a linear probe for better performance.', 'duration': 26.903, 'max_score': 2716.937, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/p33CVV29OG8/pics/p33CVV29OG82716937.jpg'}, {'end': 2809.304, 'src': 'embed', 'start': 2770.461, 'weight': 0, 'content': [{'end': 2778.588, 'text': "I've designed sets for extremely fast membership testing, but they won't iterate as fast as dictionaries, and they won't be as space efficient.", 'start': 2770.461, 'duration': 8.127}, {'end': 2780.991, 'text': "On the other hand, that's not their typical use case.", 'start': 2778.969, 'duration': 2.022}, {'end': 2785.155, 'text': 'But for dictionaries, which dominates all the rest of use of Python.', 'start': 2781.391, 'duration': 3.764}, {'end': 2790.277, 'text': 'Well, there was one ambiguous term that you used, big data and small data.', 'start': 2785.655, 'duration': 4.622}, {'end': 2795.599, 'text': 'My thoughts on the subject, I heard at a PI data conference this notion of medium data.', 'start': 2791.597, 'duration': 4.002}, {'end': 2800.46, 'text': 'And medium data, I think, serves as a great concept to separate big from small.', 'start': 2796.259, 'duration': 4.201}, {'end': 2802.061, 'text': 'What is medium data?', 'start': 2800.94, 'duration': 1.121}, {'end': 2807.803, 'text': 'Medium data is small enough to fit on your computer, but big enough to where, if you use a crummy algorithm,', 'start': 2802.421, 'duration': 5.382}, {'end': 2809.304, 'text': "you're going to really pay a price for it.", 'start': 2807.803, 'duration': 1.501}], 'summary': 'Sets are fast for membership testing but slower than dictionaries; medium data concept separates big from small data in python.', 'duration': 38.843, 'max_score': 2770.461, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/p33CVV29OG8/pics/p33CVV29OG82770461.jpg'}, {'end': 2936.799, 'src': 'embed', 'start': 2905.616, 'weight': 3, 'content': [{'end': 2909.581, 'text': 'And making all of Python more cache friendly means that you have more cache available.', 'start': 2905.616, 'duration': 3.965}, {'end': 2914.307, 'text': "The interesting thing about trying to optimize for cache is it's shockingly difficult to time.", 'start': 2910.282, 'duration': 4.025}, {'end': 2917.391, 'text': "Because when you go to time it, you're timing the feature of interest.", 'start': 2914.727, 'duration': 2.664}, {'end': 2919.452, 'text': 'which puts it in cache anyway.', 'start': 2917.851, 'duration': 1.601}, {'end': 2921.172, 'text': "So it's hard to observe the effects.", 'start': 2919.592, 'duration': 1.58}, {'end': 2926.815, 'text': 'Cache effects only show up in big programs that are constantly blowing something else out of a hash.', 'start': 2921.373, 'duration': 5.442}, {'end': 2929.496, 'text': "So it's shockingly difficult to get good measurements.", 'start': 2927.235, 'duration': 2.261}, {'end': 2930.416, 'text': 'In fact,', 'start': 2929.656, 'duration': 0.76}, {'end': 2936.799, 'text': "one reason I don't trust a lot of people's measurements is they focus on what happens when everything's in cache and they completely ignore the cache effects.", 'start': 2930.416, 'duration': 6.383}], 'summary': 'Optimizing python for cache is difficult to measure due to cache effects in big programs.', 'duration': 31.183, 'max_score': 2905.616, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/p33CVV29OG8/pics/p33CVV29OG82905616.jpg'}, {'end': 2987.405, 'src': 'embed', 'start': 2962.801, 'weight': 4, 'content': [{'end': 2971.903, 'text': 'So I was just wondering how does the cache friendliness of those kind of dictionaries sort of hold up? OK.', 'start': 2962.801, 'duration': 9.102}, {'end': 2975.924, 'text': "So there's a couple parts to what Cindy proposed.", 'start': 2972.503, 'duration': 3.421}, {'end': 2984.304, 'text': "She said one when you're dealing with nested data structures, which were very common in JSON data, Is there any interesting cache effect there?", 'start': 2976.24, 'duration': 8.064}, {'end': 2987.405, 'text': "And how does that interact with all the dictionaries I've shown here?", 'start': 2984.464, 'duration': 2.941}], 'summary': 'Examining cache friendliness in json data and nested dictionaries.', 'duration': 24.604, 'max_score': 2962.801, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/p33CVV29OG8/pics/p33CVV29OG82962801.jpg'}, {'end': 3148.198, 'src': 'embed', 'start': 3118.198, 'weight': 5, 'content': [{'end': 3121.74, 'text': "Sometimes when you're optimizing or improving the language, you're making trade-offs.", 'start': 3118.198, 'duration': 3.542}, {'end': 3124.601, 'text': "You make one thing better while you're making something else worse.", 'start': 3121.86, 'duration': 2.741}, {'end': 3132.986, 'text': "And hopefully that thing that you're making worse isn't in common use or doesn't take some particular user who relies on it and makes their life catastrophically bad.", 'start': 3124.901, 'duration': 8.085}, {'end': 3135.547, 'text': 'Dictionaries are kind of a special case though.', 'start': 3133.626, 'duration': 1.921}, {'end': 3138.469, 'text': 'Everything in the Python world uses dictionaries.', 'start': 3136.107, 'duration': 2.362}, {'end': 3141.67, 'text': 'If you improve dictionaries anywhere, you improve all of Python.', 'start': 3138.689, 'duration': 2.981}, {'end': 3148.198, 'text': "And so, I don't know of any special cases where we have made anyone worse.", 'start': 3142.091, 'duration': 6.107}], 'summary': 'Improving dictionaries in python improves the entire language without making anyone worse off.', 'duration': 30, 'max_score': 3118.198, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/p33CVV29OG8/pics/p33CVV29OG83118198.jpg'}], 'start': 2657.931, 'title': 'Linear probing and cache line optimization', 'summary': 'Discusses linear probing in big and small data, highlighting trade-offs and impact on cache performance, and explores cache line optimization in python, discussing efficiency impact, interactions with nested data structures, and trade-offs in optimizing dictionaries for average users.', 'chapters': [{'end': 2829.032, 'start': 2657.931, 'title': 'Linear probing in big and small data', 'summary': 'Discusses the concept of linear probing in big and small data, highlighting the trade-offs between linear probing and hash creation, the impact of data size on cache performance, and the differentiation between big, small, and medium data.', 'duration': 171.101, 'highlights': ['Medium data serves as a concept to separate big from small, being small enough to fit on a computer but large enough to suffer from inefficient algorithms. Medium data concept separates big from small, being small enough to fit on a computer but large enough to suffer from inefficient algorithms.', 'The impact of data size on cache performance is discussed, emphasizing the trade-off between linear probing and hash creation as data grows in size. Discusses the impact of data size on cache performance, emphasizing the trade-off between linear probing and hash creation as data grows in size.', 'Sets is designed for extremely fast membership testing but trades off iteration speed and space efficiency compared to dictionaries. Sets is designed for extremely fast membership testing but trades off iteration speed and space efficiency compared to dictionaries.']}, {'end': 3363.478, 'start': 2832.296, 'title': 'Cache line optimization in python', 'summary': 'Explores the cache line alignment in python, discussing the impact on cache efficiency, the interaction with nested data structures like json, and the trade-offs in optimizing dictionaries for average users, backed by examples and observations.', 'duration': 531.182, 'highlights': ['Cache line alignment and its impact on cache efficiency in Python The discussion on cache line alignment in Python and its impact on cache efficiency is highlighted as an important aspect to consider for optimizing performance.', 'Interaction of nested data structures like JSON with cache friendliness in Python The interaction of nested data structures like JSON with cache friendliness in Python is emphasized as a key point, shedding light on its impact on memory utilization and performance.', 'Trade-offs in optimizing dictionaries for average users in Python The trade-offs in optimizing dictionaries for average users in Python are highlighted, emphasizing the importance of considering trade-offs and potential impacts on the user experience.']}], 'duration': 705.547, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/p33CVV29OG8/pics/p33CVV29OG82657931.jpg', 'highlights': ['Medium data concept separates big from small, being small enough to fit on a computer but large enough to suffer from inefficient algorithms.', 'The impact of data size on cache performance is discussed, emphasizing the trade-off between linear probing and hash creation as data grows in size.', 'Sets is designed for extremely fast membership testing but trades off iteration speed and space efficiency compared to dictionaries.', 'Cache line alignment and its impact on cache efficiency in Python is highlighted as an important aspect to consider for optimizing performance.', 'Interaction of nested data structures like JSON with cache friendliness in Python is emphasized as a key point, shedding light on its impact on memory utilization and performance.', 'Trade-offs in optimizing dictionaries for average users in Python are highlighted, emphasizing the importance of considering trade-offs and potential impacts on the user experience.']}, {'end': 4051.08, 'segs': [{'end': 3425.369, 'src': 'embed', 'start': 3363.478, 'weight': 0, 'content': [{'end': 3371.94, 'text': 'but I was recently seeing a GitHub project where you run your Python code, it profiles and then suggests that you recompile Python with GCC flags.', 'start': 3363.478, 'duration': 8.462}, {'end': 3375.221, 'text': "so it's like optimizing your Python for your particular problem set.", 'start': 3371.94, 'duration': 3.281}, {'end': 3378.622, 'text': 'Does that ring a bell for anybody? Anybody else? Ah, all right.', 'start': 3375.241, 'duration': 3.381}, {'end': 3382.843, 'text': "I'm guessing that is a form of PGO, performance guided optimization.", 'start': 3379.102, 'duration': 3.741}, {'end': 3387.82, 'text': "The idea is You compile a Python and the compiler doesn't know how Python is going to be used.", 'start': 3382.903, 'duration': 4.917}, {'end': 3392.001, 'text': "And unlike Linux, we don't fill it with a lot of hints on branch prediction,", 'start': 3388.38, 'duration': 3.621}, {'end': 3403.164, 'text': 'in part because some of the feedback from the Linux community was that what were accurate predictions at one time stopped being accurate later and they actually worked to the detriment of the compiler.', 'start': 3392.001, 'duration': 11.163}, {'end': 3405.685, 'text': 'The compilers are just getting really smart at figuring this out.', 'start': 3403.204, 'duration': 2.481}, {'end': 3411.086, 'text': 'But with PGO you compile your Python, run it on your application.', 'start': 3406.065, 'duration': 5.021}, {'end': 3417.926, 'text': 'The PGO learns how your application gets used and recompiles it, which to me is just a slow version of PyPy.', 'start': 3411.903, 'duration': 6.023}, {'end': 3420.907, 'text': 'PyPy does that for you anyway.', 'start': 3418.126, 'duration': 2.781}, {'end': 3423.268, 'text': "That's why we have just-in-time compilers.", 'start': 3421.047, 'duration': 2.221}, {'end': 3425.369, 'text': 'But in fact, it is a great way to go.', 'start': 3423.588, 'duration': 1.781}], 'summary': "Github project suggests recompiling python with gcc flags for performance optimization, similar to pgo and pypy's just-in-time compilation.", 'duration': 61.891, 'max_score': 3363.478, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/p33CVV29OG8/pics/p33CVV29OG83363478.jpg'}, {'end': 3551.717, 'src': 'embed', 'start': 3522.805, 'weight': 5, 'content': [{'end': 3527.149, 'text': '3.0 was slower than Python 2.7.', 'start': 3522.805, 'duration': 4.344}, {'end': 3529.37, 'text': "Memory hog, email module didn't work.", 'start': 3527.149, 'duration': 2.221}, {'end': 3532.253, 'text': "But so few people used it that people didn't realize it.", 'start': 3529.911, 'duration': 2.342}, {'end': 3536.256, 'text': 'So we could go around and say, the number 3 is bigger than the number 2, therefore you should upgrade.', 'start': 3532.273, 'duration': 3.983}, {'end': 3538.458, 'text': 'And no one said any different.', 'start': 3536.336, 'duration': 2.122}, {'end': 3544.903, 'text': 'Now, along the way, the email module got fixed by Python 3.2, and it became usable.', 'start': 3539.338, 'duration': 5.565}, {'end': 3551.717, 'text': "That said, I had some friends of mine starting to use it in production and they're like am I the first person to ever glue these two parts together?", 'start': 3545.393, 'duration': 6.324}], 'summary': 'Python 3.0 was slower than 2.7, but python 3.2 fixed email module issues.', 'duration': 28.912, 'max_score': 3522.805, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/p33CVV29OG8/pics/p33CVV29OG83522805.jpg'}, {'end': 3594.368, 'src': 'embed', 'start': 3561.882, 'weight': 4, 'content': [{'end': 3564.984, 'text': 'But as you started to use it commercially, it started to fall apart.', 'start': 3561.882, 'duration': 3.102}, {'end': 3569.847, 'text': 'So I was not a big fan until basically now.', 'start': 3565.084, 'duration': 4.763}, {'end': 3571.968, 'text': 'Python 3.6 is the first one I love.', 'start': 3570.087, 'duration': 1.881}, {'end': 3579.481, 'text': "I've been working every weekend and night for years to make it good, and it is finally worthy.", 'start': 3572.878, 'duration': 6.603}, {'end': 3588.265, 'text': "So any gotchas along the way?, Which effectively translates to is there anything I don't like about Python 3.6?", 'start': 3580.642, 'duration': 7.623}, {'end': 3594.368, 'text': "It's more just, is there any stumbling blocks to people migrating? To upgrading?", 'start': 3588.265, 'duration': 6.103}], 'summary': 'Python 3.6 is the first version the speaker loves after years of effort.', 'duration': 32.486, 'max_score': 3561.882, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/p33CVV29OG8/pics/p33CVV29OG83561882.jpg'}, {'end': 3733.415, 'src': 'embed', 'start': 3705.825, 'weight': 6, 'content': [{'end': 3708.526, 'text': "I think we're now at a point where everybody just needs to switch.", 'start': 3705.825, 'duration': 2.701}, {'end': 3715.568, 'text': 'Forget the greatest Python ever invented, 2.7.', 'start': 3709.726, 'duration': 5.842}, {'end': 3716.128, 'text': 'Move forward.', 'start': 3715.568, 'duration': 0.56}, {'end': 3718.068, 'text': 'This one is finally worthy of it.', 'start': 3716.208, 'duration': 1.86}, {'end': 3724.65, 'text': "But in fact, I managed to get this across and we've undeprecated a whole bunch of things.", 'start': 3718.709, 'duration': 5.941}, {'end': 3731.314, 'text': 'So Python 3.6 broke hypothesis, for example, by deprecating part of the inspect protocol and we undid that.', 'start': 3724.83, 'duration': 6.484}, {'end': 3733.415, 'text': 'So I think it ought to be fairly effortless.', 'start': 3731.474, 'duration': 1.941}], 'summary': 'Python 3.6 undeprecated features, making switch fairly effortless.', 'duration': 27.59, 'max_score': 3705.825, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/p33CVV29OG8/pics/p33CVV29OG83705825.jpg'}, {'end': 4030.834, 'src': 'embed', 'start': 4002.751, 'weight': 7, 'content': [{'end': 4006.472, 'text': "As a consultant, people are about to ask me really hard questions I don't know the answer to.", 'start': 4002.751, 'duration': 3.721}, {'end': 4013.471, 'text': "As a teacher, I have to ask myself, can I still teach Python in a week? I don't think so.", 'start': 4007.072, 'duration': 6.399}, {'end': 4016.432, 'text': 'It is now a very, very, very big language.', 'start': 4013.751, 'duration': 2.681}, {'end': 4017.992, 'text': "So there's some concerns there.", 'start': 4016.812, 'duration': 1.18}, {'end': 4020.532, 'text': 'That said, it seems to be working out fine for people.', 'start': 4018.292, 'duration': 2.24}, {'end': 4023.513, 'text': 'IEEE ranked it in the top three languages in the world.', 'start': 4020.912, 'duration': 2.601}, {'end': 4030.834, 'text': "It's the number one language taught in universities in the world, number one in high schools and, for the second half of elementary school,", 'start': 4023.773, 'duration': 7.061}], 'summary': "Python's popularity is evident as it's ranked top three by ieee and is the number one language taught in universities and high schools worldwide.", 'duration': 28.083, 'max_score': 4002.751, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/p33CVV29OG8/pics/p33CVV29OG84002751.jpg'}], 'start': 3363.478, 'title': 'Python optimization and python 3.6 evolution', 'summary': 'Discusses performance guided optimization (pgo) in python compilation, its benefits, and limitations compared to just-in-time compilation. it also covers challenges in migrating to python 3.6, improvements, concerns about rapid feature addition, and future aspirations for python 3.7 and 3.8.', 'chapters': [{'end': 3469.237, 'start': 3363.478, 'title': 'Pgo in python optimization', 'summary': 'Discusses the concept of performance guided optimization (pgo) in python compilation, its benefits in making python faster, and its limitations in tuning to specific applications, compared to the just-in-time compilation approach. it also mentions the usage of gcc flags for python optimization.', 'duration': 105.759, 'highlights': ['PGO involves compiling Python, running it on the application to learn its usage, and recompiling it, making Python faster, but tuning it to specific applications and their hotspots.', "The usage of GCC flags for Python optimization, compiler's ability to figure out usage patterns, and the comparison to just-in-time compilation approach.", "The discussion on Linux community's feedback on branch prediction hints and the potential drawbacks of inaccurate predictions for the compiler.", 'The mention of PyPy as a just-in-time compiler and its comparison to the PGO approach in Python optimization.']}, {'end': 4051.08, 'start': 3471.299, 'title': 'Python 3.6 gotchas and future aspirations', 'summary': "Discusses potential challenges in migrating to python 3.6 from previous versions, the improvements and challenges in python 3's evolution, and concerns about the rapid addition of features, with mention of python's ranking and dominance. python 3.6 is praised as the first version that the speaker loves, with noticeable improvements and minimal migration issues, while expressing concerns about the rapid addition of features and the growing complexity of python. the speaker also provides insights into future aspirations for python 3.7 and 3.8, emphasizing the need for a more cautious approach to adding new features.", 'duration': 579.781, 'highlights': ['Python 3.6 is the first one I love. The speaker expresses his appreciation for Python 3.6, highlighting it as the first version he genuinely loves.', "3.0 was slower than Python 2.7. Memory hog, email module didn't work. Comparison of Python 3.0 with Python 2.7, emphasizing its initial performance and functionality issues.", 'Python 3.6 broke hypothesis by deprecating part of the inspect protocol and we undid that. Specific example of Python 3.6 breaking a feature and subsequently reverting the change, demonstrating the consideration for maintaining backward compatibility and stability.', "IEEE ranked it in the top three languages in the world. Recognition of Python's global standing and influence, as it is ranked among the top three languages by IEEE."]}], 'duration': 687.602, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/p33CVV29OG8/pics/p33CVV29OG83363478.jpg', 'highlights': ['PGO involves compiling Python, running it on the application to learn its usage, and recompiling it, making Python faster, but tuning it to specific applications and their hotspots.', "The usage of GCC flags for Python optimization, compiler's ability to figure out usage patterns, and the comparison to just-in-time compilation approach.", "The discussion on Linux community's feedback on branch prediction hints and the potential drawbacks of inaccurate predictions for the compiler.", 'The mention of PyPy as a just-in-time compiler and its comparison to the PGO approach in Python optimization.', 'Python 3.6 is the first one I love. The speaker expresses his appreciation for Python 3.6, highlighting it as the first version he genuinely loves.', "3.0 was slower than Python 2.7. Memory hog, email module didn't work. Comparison of Python 3.0 with Python 2.7, emphasizing its initial performance and functionality issues.", 'Python 3.6 broke hypothesis by deprecating part of the inspect protocol and we undid that. Specific example of Python 3.6 breaking a feature and subsequently reverting the change, demonstrating the consideration for maintaining backward compatibility and stability.', "IEEE ranked it in the top three languages in the world. Recognition of Python's global standing and influence, as it is ranked among the top three languages by IEEE."]}], 'highlights': ['Python 3.6 reduces dictionary size through compaction, resulting in dictionaries being less than half their original size, demonstrating impressive memory savings and efficiency.', 'The speaker has personally trained over 5,000 Python programmers, and their team has cumulatively trained over 10,000 in week-long training sessions.', 'The resizing optimization in Python 3.6 results in a near-free cost of resizing as it eliminates the need to move almost any data and only requires zeroing out the byte index table, making resizing basically free.', 'The art of computer programming, Donald Knuth, Algorithm D, has been in Python for 26 years, emphasizing the longevity and integration of hash tables in programming.', "The technique of increasing the number of buckets and using separate chaining dramatically speeds up search, with Rachel's search improving by 300%.", 'The chapter introduces the concept of shared and compact tables, demonstrating how to insert positions and append entries into a single table, resulting in a concise and space-efficient representation of data.', 'The faster iteration in Python 3.6 dictionaries is achieved by reordering the keys, resulting in consecutive pointers and no branch mispredictions, allowing iteration as fast as a list.', 'The significance of dictionaries in Python is highlighted, with the speaker describing them as pervasive and essential throughout the language, forming the backbone of its functionality.', "The collaborative nature of software development is emphasized through the author's interaction with other developers, such as Maciek and Armin, highlighting the value of seeking feedback and input from peers in the development process.", 'The development of compact dict and key sharing in Python 3.6 is emphasized, showcasing the advancements in technology and the potential impact on programming efficiency.']}