title
Stanford Seminar - Programming Tools for the Future of Data Science, Sarah Chasins
description
Sarah Chasins is an Assistant Professor at University of California, Berkeley.
This talk was given on January 21, 2022.
In the future, anyone will be able to write programs that are currently the exclusive domain of advanced programmers. For now, there's still a big gap between the programming skills of occasional programmers - social scientists, journalists, data scientists - and the skills required to write the programs they want. However, the need is pressing; while there are about 20 million programmers in the world, there are now at least twice as many end users writing code to work with data. In this talk, I'll describe Helena, an ecosystem of programming languages and programming tools that I have used to study how we can support social scientists' programming needs. Non-programmers use Helena to collect datasets from the web and, more broadly, to develop custom web automation programs. It brings together the following key innovations: (i) The Helena programming environment uses Programming by Demonstration (PBD); it takes a single-shot learning approach, synthesizing scripts based on recording a single user demonstration. (ii) Helena's adaptive replayer makes scripts robust to webpage redesigns and obfuscation, which enables longitudinal experiments. (iii) With novel language constructs, non-coders can conduct programming tasks usually limited to expert programmers - e.g., failure recovery, parallelization.
Building Helena demanded novel insights into the web automation domain, but it also required a new design approach, a tightly coupled union of techniques from Programming Languages (PL) and Human-Computer Interaction (HCI). I'll connect this work to a discussion about how my lab is bringing together techniques from PL and HCI and why the PL-HCI combination is so powerful for democratizing computation.
Learn more about Stanford's Human-Computer Interaction Group: https://hci.stanford.edu
Learn about Stanford's Graduate Certificate in HCI: https://online.stanford.edu/programs/human-computer-interaction-graduate-certificate
View the full playlist of Stanford Seminars here: https://www.youtube.com/playlist?list=PLoROMvodv4rMyupDF2O00r19JsmolyXdD&disable_polymer=true
#datascience
0:00 Introduction
1:17 Bridge the gap
2:15 My background
2:47 Agenda
3:19 Framing
8:28 Program Synthesis
9:03 Pop quiz
9:36 Pop quiz 2
10:42 How to get to a better position
11:42 What we will talk about
12:03 How many people have written a web scraper
13:09 Housing voucher programs
14:45 End user web automation
15:52 Web automation programming
16:43 Why is this so hard
22:01 Web automation demo
detail
{'title': 'Stanford Seminar - Programming Tools for the Future of Data Science, Sarah Chasins', 'heatmap': [{'end': 1015.121, 'start': 972.989, 'weight': 0.739}, {'end': 1431.145, 'start': 1310.269, 'weight': 0.901}, {'end': 3006.031, 'start': 2967.177, 'weight': 1}], 'summary': 'The seminar discusses programming tools for data science, inclusive programming and hci integration, challenges in programming language research, real-time data and web automation, automating data collection from google scholar, program and web data synthesis challenges, and implementing user-friendly programming tools, emphasizing the need for collaboration and user studies.', 'chapters': [{'end': 50.141, 'segs': [{'end': 50.141, 'src': 'embed', 'start': 11.129, 'weight': 0, 'content': [{'end': 19.335, 'text': "So I'm going to talk today about a number of things around the general space of what programming tools do people doing data science need.", 'start': 11.129, 'duration': 8.206}, {'end': 24.438, 'text': "And as mentioned, I'm going to focus especially on the needs of domain experts from non-technical domains.", 'start': 19.895, 'duration': 4.543}, {'end': 35.167, 'text': "I'm also going to be talking about what it has looked like to do work in sort of the program synthesis and program languages space in the past,", 'start': 25.119, 'duration': 10.048}, {'end': 40.772, 'text': 'aiming to work for end users and maybe not always landing on something that is usable or useful.', 'start': 35.167, 'duration': 5.605}, {'end': 50.141, 'text': "So I'm going to talk a bit in that context about need finding to make things that are going to be useful and then formative assessment to actually figure out if things are usable.", 'start': 41.292, 'duration': 8.849}], 'summary': 'Focus on programming tools for data science needs of non-technical domain experts, addressing usability and usefulness.', 'duration': 39.012, 'max_score': 11.129, 'thumbnail': 
'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/KvMnpVMp0jk/pics/KvMnpVMp0jk11129.jpg'}], 'start': 11.129, 'title': 'Programming tools for data science', 'summary': 'Discusses the programming tools required for data science, emphasizing the needs of domain experts and the importance of creating usable solutions through need finding and formative assessment.', 'chapters': [{'end': 50.141, 'start': 11.129, 'title': 'Programming tools for data science domain experts', 'summary': 'Discusses the programming tools needed for data science, with a focus on the requirements of domain experts from non-technical fields, and emphasizes the importance of creating usable and useful solutions through need finding and formative assessment.', 'duration': 39.012, 'highlights': ['The importance of programming tools for data science, especially for domain experts from non-technical domains, is emphasized.', 'The focus is on creating usable and useful solutions through need finding and formative assessment.', 'The discussion includes the challenges in the program synthesis and program languages space when aiming to work for end users.']}], 'duration': 39.012, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/KvMnpVMp0jk/pics/KvMnpVMp0jk11129.jpg', 'highlights': ['Emphasizes the importance of programming tools for data science, especially for domain experts from non-technical domains.', 'Focuses on creating usable and useful solutions through need finding and formative assessment.', 'Discusses challenges in the program synthesis and program languages space when aiming to work for end users.']}, {'end': 431.076, 'segs': [{'end': 98.334, 'src': 'embed', 'start': 51.371, 'weight': 0, 'content': [{'end': 58.154, 'text': 'So I want to start by acknowledging that this talk is going to touch on a bunch of awesome work by my students, the folks in my lab.', 'start': 51.371, 'duration': 6.783}, {'end': 64.217, 'text': "I'm not going to go through all of 
them here, but I will pop up the pictures again when we get to their work.", 'start': 59.015, 'duration': 5.202}, {'end': 71.601, 'text': 'We have been calling ourselves the Plate Lab for basically talking about approachable and inclusive programming languages and tools.', 'start': 65.197, 'duration': 6.404}, {'end': 74.523, 'text': 'And these are some of the amazing students.', 'start': 72.842, 'duration': 1.681}, {'end': 76.704, 'text': 'All the good stuff is from them.', 'start': 75.703, 'duration': 1.001}, {'end': 81.618, 'text': 'I also want to talk about one of the primary goals of this lab,', 'start': 78.035, 'duration': 3.583}, {'end': 98.334, 'text': "which is to bridge the gap with folks from the social sciences who they have various needs but they don't necessarily know how to do the traditional programming processes that we would expect from folks to get their automation needs accomplished.", 'start': 81.618, 'duration': 16.716}], 'summary': 'Plate lab focuses on inclusive programming, aims to bridge gap with social sciences.', 'duration': 46.963, 'max_score': 51.371, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/KvMnpVMp0jk/pics/KvMnpVMp0jk51371.jpg'}, {'end': 153.285, 'src': 'embed', 'start': 119.325, 'weight': 2, 'content': [{'end': 120.926, 'text': 'And so building really strong,', 'start': 119.325, 'duration': 1.601}, {'end': 131.53, 'text': 'close collaborations with folks on the social science side has been really critical for our lab in actually getting tools adopted by them and making sure that they are actually useful and usable for them.', 'start': 120.926, 'duration': 10.604}, {'end': 134.871, 'text': "So that's going to be a thread that runs through today's talk.", 'start': 132.529, 'duration': 2.342}, {'end': 139.394, 'text': "A little quick background on me and sort of what has shaped my lab's direction.", 'start': 135.892, 'duration': 3.502}, {'end': 146.62, 'text': 'I started from a primarily programming 
languages background, both at sort of the undergrad level and for the first part of my grad school,', 'start': 139.714, 'duration': 6.906}, {'end': 153.285, 'text': 'but eventually realized that I really was going to be needing a bunch of HCI techniques in order to do the work I wanted to do well.', 'start': 146.62, 'duration': 6.665}], 'summary': 'Building strong collaborations with social science side critical for tool adoption and usability.', 'duration': 33.96, 'max_score': 119.325, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/KvMnpVMp0jk/pics/KvMnpVMp0jk119325.jpg'}, {'end': 194.848, 'src': 'embed', 'start': 171.684, 'weight': 4, 'content': [{'end': 178.585, 'text': "So first, I mentioned that we've been thinking a lot in my lab about how PL should be integrating techniques from HCI.", 'start': 171.684, 'duration': 6.901}, {'end': 181.906, 'text': "I'm then going to do a somewhat deep dive on Helena,", 'start': 179.125, 'duration': 2.781}, {'end': 187.087, 'text': "which is a particular tool that we've developed that's focused on basically being usable and useful for social scientists.", 'start': 181.906, 'duration': 5.181}, {'end': 194.848, 'text': "And I'm going to talk about how that tool and various other tools in the general space have actually shaped our approach for integrating HCI into PL.", 'start': 187.727, 'duration': 7.121}], 'summary': 'Lab integrating hci techniques into pl, focus on helena tool for social scientists.', 'duration': 23.164, 'max_score': 171.684, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/KvMnpVMp0jk/pics/KvMnpVMp0jk171684.jpg'}], 'start': 51.371, 'title': 'Inclusive programming and hci integration', 'summary': "Discusses the plate lab's goal of creating inclusive programming languages and tools to bridge the gap with social science fields, enabling them to adopt new techniques and automation processes, and emphasizes the integration of human-computer interaction 
(hci) techniques into programming languages (pl), highlighting the need for collaboration and adoption of hci techniques to create effective programming tools.", 'chapters': [{'end': 119.225, 'start': 51.371, 'title': 'Inclusive programming for social sciences', 'summary': "Discusses the plate lab's goal of creating inclusive programming languages and tools to bridge the gap with social science fields, enabling them to adopt new techniques and automation processes.", 'duration': 67.854, 'highlights': ['The Plate Lab focuses on creating approachable and inclusive programming languages and tools to bridge the gap with social sciences.', "The lab's primary goal is to enable social science visionaries to try new techniques and automation processes.", 'The lab aims to support folks from the social sciences who have various needs but may not know traditional programming processes.']}, {'end': 431.076, 'start': 119.325, 'title': 'Integrating hci techniques into programming languages', 'summary': 'Discusses the integration of techniques from human-computer interaction (hci) into programming languages (pl), emphasizing the need for collaboration and the adoption of hci techniques to create effective programming tools.', 'duration': 311.751, 'highlights': ['Building strong collaborations with social scientists is critical for getting tools adopted and ensuring their usefulness, as emphasized throughout the talk. The speaker highlights the importance of close collaborations with social scientists to ensure the adoption and usefulness of tools, emphasizing its critical nature for the lab.', "The speaker's transition from a primarily programming languages background to realizing the need for Human-Computer Interaction (HCI) techniques in their work. 
The speaker shares their transition from a programming languages background to recognizing the necessity of HCI techniques in their work, reflecting on their experience and the evolution of their lab's direction.", "Discussing the integration of HCI techniques into programming languages, particularly focusing on the development of the tool 'Helena' aimed at social scientists. The talk delves into the integration of HCI techniques into programming languages, with a specific focus on the tool 'Helena' designed to be usable and useful for social scientists."]}], 'duration': 379.705, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/KvMnpVMp0jk/pics/KvMnpVMp0jk51371.jpg', 'highlights': ['The Plate Lab focuses on creating approachable and inclusive programming languages and tools to bridge the gap with social sciences.', "The lab's primary goal is to enable social science visionaries to try new techniques and automation processes.", 'Building strong collaborations with social scientists is critical for getting tools adopted and ensuring their usefulness, as emphasized throughout the talk.', "The speaker's transition from a primarily programming languages background to realizing the need for Human-Computer Interaction (HCI) techniques in their work.", "Discussing the integration of HCI techniques into programming languages, particularly focusing on the development of the tool 'Helena' aimed at social scientists."]}, {'end': 776.056, 'segs': [{'end': 461.638, 'src': 'embed', 'start': 431.576, 'weight': 0, 'content': [{'end': 438.778, 'text': "And we're really excited about people starting to basically go out and look for the kinds of things that programming languages and tools can tackle out in the world.", 'start': 431.576, 'duration': 7.202}, {'end': 446.414, 'text': 'Another thing is we figured out that not every programming languages researcher is willing to actually run formative user studies.', 'start': 440.512, 'duration': 5.902}, {'end': 
454.456, 'text': 'And so, if we want to make it practical for these folks to actually make good design choices in the process of developing languages,', 'start': 446.954, 'duration': 7.502}, {'end': 458.537, 'text': 'of developing tools, we really are going to have to give them some kind of a shortcut.', 'start': 454.456, 'duration': 4.081}, {'end': 461.638, 'text': "And we think that's probably going to end up having to be behavioral theory,", 'start': 458.577, 'duration': 3.061}], 'summary': 'Researchers need shortcuts for user studies to improve language and tool design.', 'duration': 30.062, 'max_score': 431.576, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/KvMnpVMp0jk/pics/KvMnpVMp0jk431576.jpg'}, {'end': 692.02, 'src': 'embed', 'start': 667.3, 'weight': 3, 'content': [{'end': 672.725, 'text': "We have learned all of these things about what makes programs synthesis tools that actually won't get adopted,", 'start': 667.3, 'duration': 5.425}, {'end': 678.369, 'text': "that users don't actually particularly like and, unfortunately, not all that much about what they actually do like.", 'start': 672.725, 'duration': 5.644}, {'end': 683.453, 'text': "I'm going to provide a quick call out to some work by these two students that is specifically studying.", 'start': 679.07, 'duration': 4.383}, {'end': 692.02, 'text': 'OK, if we put a program synthesizer in front of a programming novice maybe someone from CS1, what are they going to actually be able to do with it?', 'start': 683.453, 'duration': 8.567}], 'summary': 'Research highlights challenges in program synthesis adoption and user preferences, lacking insights on user preferences.', 'duration': 24.72, 'max_score': 667.3, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/KvMnpVMp0jk/pics/KvMnpVMp0jk667300.jpg'}, {'end': 776.056, 'src': 'embed', 'start': 747.876, 'weight': 5, 'content': [{'end': 754.04, 'text': "But the trend that's actually most interesting 
to me is how the breakdown of who's in those pies is actually changing right?", 'start': 747.876, 'duration': 6.164}, {'end': 755.221, 'text': 'So right now,', 'start': 754.18, 'duration': 1.041}, {'end': 760.405, 'text': 'a lot of the people who are actually getting web data and using it for things are folks who have the programming skills of the people in this room.', 'start': 755.221, 'duration': 5.184}, {'end': 766.549, 'text': "They can sit down and actually write a web automation script with one of the web automation libraries that's out there.", 'start': 761.025, 'duration': 5.524}, {'end': 776.056, 'text': 'But increasingly folks from outside of a traditional computer science background are realizing how much value there is and how much you can actually learn from real-time data,', 'start': 767.27, 'duration': 8.786}], 'summary': 'More non-programmers are using web data for real-time insights.', 'duration': 28.18, 'max_score': 747.876, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/KvMnpVMp0jk/pics/KvMnpVMp0jk747876.jpg'}], 'start': 431.576, 'title': 'Challenges in programming language research and programming by example', 'summary': 'Discusses challenges in integrating theory and refinement in programming language research, emphasizing the need for practical shortcuts and increased user studies. 
it also addresses the current state and challenges of program synthesis, highlighting the need for accessible tools and the lack of adoption and user feedback, with insights from user studies and web data usage evolution.', 'chapters': [{'end': 497.986, 'start': 431.576, 'title': 'Challenges in programming language research', 'summary': 'Discusses the challenges of integrating behavioral theory and iterative refinement in programming language research, emphasizing the need for practical shortcuts and increased formative user studies to make good design choices.', 'duration': 66.41, 'highlights': ['The need for practical shortcuts such as behavioral theory in order to enable programming language researchers to make good design choices without running user studies.', 'The importance of integrating iterative refinement in programming language research to ensure formative input and iterate on design choices.', 'The lack of formative user studies and iterative refinement in mainstream programming language research, highlighting the need for more emphasis on these practices.']}, {'end': 776.056, 'start': 499.133, 'title': 'Programming by example and program synthesis', 'summary': 'Discusses the current state and challenges of program synthesis, highlighting the need for tools to make tasks easier and accessible to non-programmers, while emphasizing the lack of adoption and user feedback, with insights from user studies and the evolution of web data usage.', 'duration': 276.923, 'highlights': ['The need for program synthesis tools to make tasks easier and accessible to non-programmers is emphasized, with a lack of adoption and user feedback highlighted.', 'Insights from user studies and the evolution of web data usage are discussed in the context of program synthesis challenges and user adoption.', 'The breakdown of people using web data is changing, with increasing interest and value realization from non-traditional computer science backgrounds.']}], 'duration': 344.48, 
'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/KvMnpVMp0jk/pics/KvMnpVMp0jk431576.jpg', 'highlights': ['The need for practical shortcuts such as behavioral theory to enable good design choices without user studies.', 'Integrating iterative refinement in programming language research is crucial for formative input and design iteration.', 'Mainstream programming language research lacks formative user studies and iterative refinement, requiring more emphasis on these practices.', 'Emphasizing the need for program synthesis tools to be accessible to non-programmers, with highlighted lack of adoption and user feedback.', 'Insights from user studies and web data usage evolution provide context for program synthesis challenges and user adoption.', 'Changing breakdown of web data users, with increasing interest and value realization from non-traditional backgrounds.']}, {'end': 1328.056, 'segs': [{'end': 870.931, 'src': 'embed', 'start': 841.591, 'weight': 0, 'content': [{'end': 846.856, 'text': 'Pretty quickly you end up with a point where the rent threshold is set in such a way that only the low income,', 'start': 841.591, 'duration': 5.265}, {'end': 850.62, 'text': 'low opportunity neighborhoods are actually accessible below that rent threshold.', 'start': 846.856, 'duration': 3.764}, {'end': 855.925, 'text': 'And then you have unintentionally produced a housing voucher program that is funneling people into those neighborhoods only.', 'start': 851.06, 'duration': 4.865}, {'end': 857.646, 'text': "That's not what we want.", 'start': 857.006, 'duration': 0.64}, {'end': 864.309, 'text': 'And what the folks in the space realized was that the way they can actually set those thresholds in a data-guided way,', 'start': 858.206, 'duration': 6.103}, {'end': 870.931, 'text': "in a way that's actually going to adapt to fast-changing market situations, is to get real-time data from the web.", 'start': 864.309, 'duration': 6.622}], 'summary': 'Housing 
voucher program funnels people into low-income neighborhoods, prompting need for data-guided rent threshold adjustments.', 'duration': 29.34, 'max_score': 841.591, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/KvMnpVMp0jk/pics/KvMnpVMp0jk841591.jpg'}, {'end': 979.091, 'src': 'embed', 'start': 954.604, 'weight': 1, 'content': [{'end': 961.066, 'text': 'the reason all these folks were reaching out to us and happy to get involved with the project is because web automation programming is really hard.', 'start': 954.604, 'duration': 6.462}, {'end': 968.128, 'text': 'So this is data on basically folks with computer science PhDs, well, folks in computer science PhD programs.', 'start': 961.706, 'duration': 6.422}, {'end': 972.389, 'text': 'How quickly were they able to actually accomplish a particular web automation programming task?', 'start': 968.708, 'duration': 3.681}, {'end': 979.091, 'text': 'And it turns out that, even if you let them go for an hour, only about a quarter of them were able to actually complete the task.', 'start': 972.989, 'duration': 6.102}], 'summary': 'Only 25% of computer science phds completed web automation in an hour.', 'duration': 24.487, 'max_score': 954.604, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/KvMnpVMp0jk/pics/KvMnpVMp0jk954604.jpg'}, {'end': 1015.121, 'src': 'heatmap', 'start': 972.989, 'weight': 0.739, 'content': [{'end': 979.091, 'text': 'And it turns out that, even if you let them go for an hour, only about a quarter of them were able to actually complete the task.', 'start': 972.989, 'duration': 6.102}, {'end': 986.307, 'text': 'On the other hand, if you give them our tool, they can go ahead and get to 100% completion rate within 10 minutes.', 'start': 980.123, 'duration': 6.184}, {'end': 990.29, 'text': "And in fact, this isn't just let me do this task in this time.", 'start': 986.508, 'duration': 3.782}, {'end': 996.655, 'text': 'This is all the way from I have 
never heard of the tool Helena before all the way to I have completed the task.', 'start': 990.731, 'duration': 5.924}, {'end': 1002.579, 'text': 'So this is everything from watching the five-minute demo video all the way to actually doing the task.', 'start': 997.375, 'duration': 5.204}, {'end': 1009.355, 'text': "OK, so let's quickly go through why this is so hard, right?", 'start': 1004.669, 'duration': 4.686}, {'end': 1010.676, 'text': 'Why is this actually such a challenge?', 'start': 1009.415, 'duration': 1.261}, {'end': 1015.121, 'text': 'So does anyone have a sense of why this snippet is not going to work?', 'start': 1010.876, 'duration': 4.245}], 'summary': 'Using our tool, 100% completion rate within 10 minutes, from demo to task.', 'duration': 42.132, 'max_score': 972.989, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/KvMnpVMp0jk/pics/KvMnpVMp0jk972989.jpg'}, {'end': 1283.306, 'src': 'embed', 'start': 1241.559, 'weight': 2, 'content': [{'end': 1246.925, 'text': "And this would be the core idea behind programming by demonstration, which is the basis of the tool that we're going to present today.", 'start': 1241.559, 'duration': 5.366}, {'end': 1251.215, 'text': 'So PBD web automation is a longstanding dream.', 'start': 1248.413, 'duration': 2.802}, {'end': 1260.5, 'text': 'People thought about this general idea ever since it made sense to talk about Netscape Navigator and new versions of Windows 95.', 'start': 1251.515, 'duration': 8.985}, {'end': 1262.08, 'text': 'This is something that people have been thinking about.', 'start': 1260.5, 'duration': 1.58}, {'end': 1268.444, 'text': 'And the first thing that they thought was, OK, maybe people are going to want to extract tables from web pages.', 'start': 1263.021, 'duration': 5.423}, {'end': 1269.404, 'text': "I'm looking at this web page.", 'start': 1268.484, 'duration': 0.92}, {'end': 1270.365, 'text': "It looks like there's a table.", 'start': 1269.424, 'duration': 
0.941}, {'end': 1271.746, 'text': 'Maybe we want to grab that out.', 'start': 1270.845, 'duration': 0.901}, {'end': 1274.242, 'text': 'The next thing people said was OK,', 'start': 1272.881, 'duration': 1.361}, {'end': 1283.306, 'text': 'maybe we actually want to record a set of interactions that the user is doing and actually replay them for things like automating tests of web pages,', 'start': 1274.242, 'duration': 9.064}], 'summary': 'Programming by demonstration tool for web automation presented, involving table extraction and interaction recording.', 'duration': 41.747, 'max_score': 1241.559, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/KvMnpVMp0jk/pics/KvMnpVMp0jk1241559.jpg'}], 'start': 776.056, 'title': 'Real-time data and web automation', 'summary': 'Discusses the importance of real-time data for low-income families in housing voucher programs, emphasizing the need for web automation programming and the challenges it addresses, such as extracting dynamic content and server communication, along with the success rate of a web automation tool.', 'chapters': [{'end': 1015.121, 'start': 776.056, 'title': 'Real-time data for low-income families', 'summary': 'Discusses the importance of real-time data for low-income families to move to high-opportunity neighborhoods through housing voucher programs, and the challenges of setting rent thresholds using outdated census data, highlighting the need for web automation programming and the success rate of a tool in completing web automation tasks.', 'duration': 239.065, 'highlights': ['The challenges of setting rent thresholds using outdated census data and the unintended consequences of funneling low-income families into low-opportunity neighborhoods are discussed, emphasizing the need for real-time data to guide housing voucher programs.', 'The chapter highlights the real need for end user web automation, as evidenced by the flood of responses from social science domains when inquiring 
about the need for web data, leading to long-term collaborations and the difficulty of web automation programming for individuals without computer science backgrounds.', 'The success rate of a web automation tool in completing tasks, with 100% completion rate within 10 minutes for computer science PhD students compared to only a quarter completing the task within an hour without the tool, demonstrating the effectiveness of the tool in simplifying web automation programming for users.']}, {'end': 1328.056, 'start': 1017.204, 'title': 'Challenges of web automation', 'summary': 'Discusses the challenges of web automation, including the difficulty of extracting dynamic content, reverse engineering page structure, and server communication, leading to the development of programming by demonstration as a solution for web automation. It also highlights the historical context of web automation and the stalling of progress due to the increasing complexity of interactive web content.', 'duration': 310.852, 'highlights': ['Developing a solution for web automation involved addressing challenges such as extracting dynamic content, reverse engineering page structure, and understanding server communication, which led to the concept of programming by demonstration (PBD) as a solution for web automation.', 'The historical context of web automation dates back to the early days of Netscape Navigator and Windows 95, with initial ideas focusing on extracting tables from web pages and recording/replaying user interactions for automating tests of web pages.', 'The increasing complexity of interactive web content around 2009 posed significant challenges to existing web automation tools, leading to a stalling of progress in web automation due to the tools breaking down and becoming insufficient for handling the new web landscape.', 'The difficulty of determining the price of an item on a web page due to hardcoded prices within the code, dynamic content, and the need to reverse engineer the 
page structure and JavaScript code for accurate automation.']}], 'duration': 552, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/KvMnpVMp0jk/pics/KvMnpVMp0jk776056.jpg', 'highlights': ['The challenges of setting rent thresholds using outdated census data and the unintended consequences of funneling low-income families into low-opportunity neighborhoods are discussed, emphasizing the need for real-time data to guide housing voucher programs.', 'The success rate of a web automation tool in completing tasks, with 100% completion rate within 10 minutes for computer science PhD students compared to only a quarter completing the task within an hour without the tool, demonstrating the effectiveness of the tool in simplifying web automation programming for users.', 'Developing a solution for web automation involved addressing challenges such as extracting dynamic content, reverse engineering page structure, and understanding server communication, which led to the concept of programming by demonstration (PBD) as a solution for web automation.', 'The historical context of web automation dates back to the early days of Netscape Navigator and Windows 95, with initial ideas focusing on extracting tables from web pages and recording/replaying user interactions for automating tests of web pages.']}, {'end': 1904.588, 'segs': [{'end': 1355.796, 'src': 'embed', 'start': 1328.456, 'weight': 0, 'content': [{'end': 1331.457, 'text': 'So what data should we collect if we want to learn when CS researchers peak?', 'start': 1328.456, 'duration': 3.001}, {'end': 1333.659, 'text': 'Where would you all look?', 'start': 1332.978, 'duration': 0.681}, {'end': 1335.301, 'text': 'Would you look at a particular web page?', 'start': 1333.939, 'duration': 1.362}, {'end': 1338.884, 'text': 'Yeah, I would definitely look at Google Scholar.', 'start': 1336.902, 'duration': 1.982}, {'end': 1339.906, 'text': "So let's start there.", 'start': 1338.925, 'duration': 0.981}, 
{'end': 1343.95, 'text': "So what we're going to do is we're going to go ahead and open a browser extension.", 'start': 1340.486, 'duration': 3.464}, {'end': 1345.211, 'text': "That's our little tool up there.", 'start': 1343.99, 'duration': 1.221}, {'end': 1347.053, 'text': 'I guess you can see the little text that says what it is.', 'start': 1345.231, 'duration': 1.822}, {'end': 1351.675, 'text': "And then over here on the left, we're going to have sort of the control pane.", 'start': 1348.154, 'duration': 3.521}, {'end': 1355.796, 'text': "And on the right, we're going to be interacting with something that is basically just our normal browser window.", 'start': 1351.875, 'duration': 3.921}], 'summary': "Collect data on cs researchers' peak using google scholar and browser extension.", 'duration': 27.34, 'max_score': 1328.456, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/KvMnpVMp0jk/pics/KvMnpVMp0jk1328456.jpg'}, {'end': 1699.358, 'src': 'embed', 'start': 1667.845, 'weight': 1, 'content': [{'end': 1671.327, 'text': "One thing here is we're going to need to make editable programs, calling back to this quote here.", 'start': 1667.845, 'duration': 3.482}, {'end': 1673.008, 'text': 'We really do.', 'start': 1671.407, 'duration': 1.601}, {'end': 1677.31, 'text': 'if we are going to do it this way, we need to make sure that the user can actually read,', 'start': 1673.008, 'duration': 4.302}, {'end': 1680.432, 'text': "understand and modify the programs that they're getting back from the synthesizer.", 'start': 1677.31, 'duration': 3.122}, {'end': 1687.276, 'text': 'So what will make the tool usable? 
So one thing is it has to be single demonstration, has to be editable.', 'start': 1681.393, 'duration': 5.883}, {'end': 1690.518, 'text': 'We had definitely not reached that point so far in the PBD history.', 'start': 1687.616, 'duration': 2.902}, {'end': 1692.439, 'text': 'So nope, not yet.', 'start': 1691.598, 'duration': 0.841}, {'end': 1699.358, 'text': 'And so our approach here was rather than doing the traditional PBD workflow, where you put in a bunch of demonstrations,', 'start': 1693.871, 'duration': 5.487}], 'summary': 'The tool needs to provide editable programs for user readability and modification, aiming for a single-demonstration, editable approach.', 'duration': 31.513, 'max_score': 1667.845, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/KvMnpVMp0jk/pics/KvMnpVMp0jk1667845.jpg'}, {'end': 1730.646, 'src': 'embed', 'start': 1705.446, 'weight': 2, 'content': [{'end': 1711.674, 'text': "The user debugs by looking at the output and thinks about, is that the right output? 
If they're not happy with it, they put in another demonstration.", 'start': 1705.446, 'duration': 6.228}, {'end': 1719.439, 'text': 'We get rid of that workflow, and instead we do a demonstrate once, revise anytime workflow, where the idea is the user puts in one demonstration.', 'start': 1712.375, 'duration': 7.064}, {'end': 1722.141, 'text': 'The synthesis algorithm produces a program.', 'start': 1720.06, 'duration': 2.081}, {'end': 1726.623, 'text': 'It may not be exactly the program the user is going to want in the end,', 'start': 1722.481, 'duration': 4.142}, {'end': 1730.646, 'text': "but it's close enough that the user can now edit it until it reaches the point where the user is happy with it.", 'start': 1726.623, 'duration': 4.023}], 'summary': 'Users can demonstrate once, revise anytime, and edit the produced program until satisfied.', 'duration': 25.2, 'max_score': 1705.446, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/KvMnpVMp0jk/pics/KvMnpVMp0jk1705446.jpg'}, {'end': 1769.337, 'src': 'embed', 'start': 1738.914, 'weight': 3, 'content': [{'end': 1742.657, 'text': "So let's talk quickly about why one demo really is so much better than two demonstrations.", 'start': 1738.914, 'duration': 3.743}, {'end': 1746.479, 'text': "So obviously, we've already talked about the fact that it makes for happy, successful users.", 'start': 1743.557, 'duration': 2.922}, {'end': 1750.922, 'text': 'Another thing is it actually changes the constraints on us, the developers of the synthesizer.', 'start': 1747.24, 'duration': 3.682}, {'end': 1756.867, 'text': 'So one of the big challenges in synthesis is if you have to reach every single point in the program space.', 'start': 1751.223, 'duration': 5.644}, {'end': 1759.709, 'text': "that's a really, really massive search space.", 'start': 1756.867, 'duration': 2.842}, {'end': 1761.23, 'text': "And it's really hard to actually do that.", 'start': 1759.889, 'duration': 1.341}, {'end': 1769.337, 
'text': 'So we get to relax that because we can just reach some of the points in the program space as long as they are close enough via edits to the program that the user is going to want in the end.', 'start': 1761.41, 'duration': 7.927}], 'summary': 'One demo simplifies synthesis, making it easier for developers and ensuring user satisfaction.', 'duration': 30.423, 'max_score': 1738.914, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/KvMnpVMp0jk/pics/KvMnpVMp0jk1738914.jpg'}], 'start': 1328.456, 'title': 'Automating data collection from google scholar and challenges in program synthesis', 'summary': "Demonstrates using a browser extension to collect data from google scholar, showcasing end user web automation and program synthesis technology's usability. it also discusses the challenges of program synthesis, emphasizing the need for single demonstration usability, editable programs, and the ability to synthesize complex programs, while highlighting the shift to a 'demonstrate once, revise anytime' workflow.", 'chapters': [{'end': 1477.959, 'start': 1328.456, 'title': 'Automating data collection from google scholar', 'summary': "Demonstrates using a browser extension to collect data from google scholar, including information about authors and their papers, and showcases end user web automation and program synthesis technology's usability, while discussing the obstacles faced in programming by demonstration systems.", 'duration': 149.503, 'highlights': ['The chapter demonstrates using a browser extension to collect data from Google Scholar, including information about authors and their papers. The demonstration involves collecting data about authors and their papers, showing how to find information for the first row of the target data set for all tables in the target data set.', "Showcases end user web automation and program synthesis technology's usability. 
The tool is able to write the program for collecting data, loop over authors and papers, and collect sample data, demonstrating the usability of end user web automation and program synthesis technology.", "Discusses the obstacles faced in programming by demonstration systems. The chapter discusses Tessa Lau's findings that repetitive demonstrations were a significant obstacle for people using programming by demonstration systems, indicating the frustration in giving repetitive demonstrations and the need for a more efficient system."]}, {'end': 1904.588, 'start': 1478.359, 'title': 'Challenges in program synthesis', 'summary': "Discusses the challenges of program synthesis, emphasizing the need for single demonstration usability, editable programs, and the ability to synthesize complex programs. it also highlights the shift to a 'demonstrate once, revise anytime' workflow.", 'duration': 426.229, 'highlights': ['The need for single demonstration usability, editable programs, and the ability to synthesize complex programs. The chapter emphasizes the importance of being able to produce usable, editable programs from a single demonstration, addressing the need for synthesizing complex programs with multi-load data and hierarchical data.', "The shift to a 'demonstrate once, revise anytime' workflow for program synthesis. The chapter discusses the move away from the traditional program-by-demonstration workflow to a 'demonstrate once, revise anytime' approach, allowing users to edit and revise the synthesized program until satisfaction.", 'The challenges in program synthesis and the need to relax constraints for developers. 
The chapter outlines the challenges in program synthesis, such as the massive search space and infrastructure requirements, and the need to relax constraints for developers to reach usable and editable programs from a single demonstration.']}], 'duration': 576.132, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/KvMnpVMp0jk/pics/KvMnpVMp0jk1328456.jpg', 'highlights': ["The chapter demonstrates using a browser extension to collect data from Google Scholar, showcasing end user web automation and program synthesis technology's usability.", 'The need for single demonstration usability, editable programs, and the ability to synthesize complex programs is emphasized.', "The shift to a 'demonstrate once, revise anytime' workflow for program synthesis is discussed, allowing users to edit and revise the synthesized program until satisfaction.", 'The challenges in program synthesis and the need to relax constraints for developers are outlined.']}, {'end': 2303.426, 'segs': [{'end': 1935.702, 'src': 'embed', 'start': 1905.108, 'weight': 0, 'content': [{'end': 1907.469, 'text': 'We definitely know that the tools that were out there were not there yet.', 'start': 1905.108, 'duration': 2.361}, {'end': 1909.89, 'text': 'But why is it actually hard?', 'start': 1908.69, 'duration': 1.2}, {'end': 1915.132, 'text': "Well, if you are a program synthesis person,", 'start': 1909.95, 'duration': 5.182}, {'end': 1918.813, 'text': "you're probably already getting nervous when you realize that you're going to need nested loops, right?", 'start': 1915.132, 'duration': 3.681}, {'end': 1923.555, 'text': 'So nested loops are sort of one of the banes of program synthesis in general.', 'start': 1919.393, 'duration': 4.162}, {'end': 1929.818, 'text': "And if you think about it as sort of how it expands the space of programs that you're searching, this makes a lot of sense, right?", 'start': 1924.435, 'duration': 
5.383}, {'end': 1935.702, 'text': 'So if you think about only the programs that have no loops, well, that is a massive set, obviously.', 'start': 1930.379, 'duration': 5.323}], 'summary': 'Program synthesis faces challenges due to nested loops, expanding search space.', 'duration': 30.594, 'max_score': 1905.108, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/KvMnpVMp0jk/pics/KvMnpVMp0jk1905108.jpg'}, {'end': 1968.111, 'src': 'embed', 'start': 1940.204, 'weight': 1, 'content': [{'end': 1943.786, 'text': "If you start thinking about all the programs that have one loop, well, you're already up to a much bigger set.", 'start': 1940.204, 'duration': 3.582}, {'end': 1945.447, 'text': "That's starting to look a little nerve-wracking.", 'start': 1943.806, 'duration': 1.641}, {'end': 1949.69, 'text': 'And in fact, a lot of program synthesizers cannot handle programs that even have one loop.', 'start': 1945.467, 'duration': 4.223}, {'end': 1954.366, 'text': "If you're allowing multiple loops, boy, now it's really getting intense.", 'start': 1950.725, 'duration': 3.641}, {'end': 1957.467, 'text': "The search space is massive, and it's going to be pretty tough to actually search that space.", 'start': 1954.406, 'duration': 3.061}, {'end': 1964.27, 'text': 'And so we hit this little fork in the road where we could go down the path of trying to make a smarter, faster synthesizer, right?', 'start': 1958.548, 'duration': 5.722}, {'end': 1968.111, 'text': 'This is the thing that people have traditionally tried when they have tackled this problem afresh.', 'start': 1964.29, 'duration': 3.821}], 'summary': 'The challenge of synthesizing programs with loops is daunting due to the massive search space, prompting the need for smarter, faster synthesizers.', 'duration': 27.907, 'max_score': 1940.204, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/KvMnpVMp0jk/pics/KvMnpVMp0jk1940204.jpg'}, {'end': 2050.362, 'src': 'embed', 
'start': 2024.24, 'weight': 4, 'content': [{'end': 2031.585, 'text': "we really don't have any better approach than just searching through the whole space of programs until we land on one that produces the data that the user has collected.", 'start': 2024.24, 'duration': 7.345}, {'end': 2037.614, 'text': 'On the other hand, if we restrict it to demonstrations like the one on the right, we can be a bit smarter.', 'start': 2032.671, 'duration': 4.943}, {'end': 2044.079, 'text': 'So basically, we can limit the user a little bit in order to make the synthesizer design much easier.', 'start': 2039.196, 'duration': 4.883}, {'end': 2050.362, 'text': "And the idea here is basically entering into a contract with the user to get them to produce something that's synthesizer-friendly.", 'start': 2044.119, 'duration': 6.243}], 'summary': 'Limiting user input can make synthesizer design easier', 'duration': 26.122, 'max_score': 2024.24, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/KvMnpVMp0jk/pics/KvMnpVMp0jk2024240.jpg'}, {'end': 2102.237, 'src': 'embed', 'start': 2077.717, 'weight': 2, 'content': [{'end': 2085.844, 'text': 'And basically, the way that we found to enforce this is just to have a contract with the user that will allow us to make a simpler synthesis process.', 'start': 2077.717, 'duration': 8.127}, {'end': 2089.766, 'text': "So the contract with the user, we wouldn't explain it to them in quite this way, of course.", 'start': 2086.645, 'duration': 3.121}, {'end': 2094.27, 'text': 'But the basics of it is that you want to demonstrate the first iteration of each loop.', 'start': 2090.388, 'duration': 3.882}, {'end': 2099.534, 'text': "Or you can also think of it as the first row of each table that you're going to be using or collecting.", 'start': 2095.17, 'duration': 4.364}, {'end': 2102.237, 'text': 'And you want it to be ordered from outer to inner loop.', 'start': 2100.595, 'duration': 1.642}], 'summary': 'Enforcing contract for 
simpler synthesis process, demonstrating first iteration of each loop, ordered from outer to inner loop.', 'duration': 24.52, 'max_score': 2077.717, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/KvMnpVMp0jk/pics/KvMnpVMp0jk2077717.jpg'}, {'end': 2288.731, 'src': 'embed', 'start': 2262.099, 'weight': 3, 'content': [{'end': 2266.203, 'text': 'And users do indeed perceive Helena to be more usable here.', 'start': 2262.099, 'duration': 4.104}, {'end': 2273.15, 'text': 'The comparison is Selenium, which is a traditional web automation library, and also substantially more learnable.', 'start': 2266.263, 'duration': 6.887}, {'end': 2277.034, 'text': 'And also, of course, users are more successful with Helena.', 'start': 2274.732, 'duration': 2.302}, {'end': 2284.882, 'text': 'So the big thing was that we are most excited for them to be able to do these tasks that they want to do without running into too many obstacles.', 'start': 2277.054, 'duration': 7.828}, {'end': 2288.731, 'text': 'And yeah, I did want to emphasize again that these times include training time.', 'start': 2286.269, 'duration': 2.462}], 'summary': 'Helena is more usable and learnable compared to Selenium, leading to increased user success and excitement for task completion without obstacles, including training time.', 'duration': 26.632, 'max_score': 2262.099, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/KvMnpVMp0jk/pics/KvMnpVMp0jk2262099.jpg'}], 'start': 1905.108, 'title': 'Program and web data synthesis challenges and simplification', 'summary': 'Delves into the challenges of program synthesis, emphasizing the complexity and search space expansion in dealing with nested loops, and also explores simplifying web data synthesis by enforcing a user contract, resulting in more manageable and successful outcomes, with Helena being found more usable and learnable than traditional web automation libraries.', 'chapters': [{'end': 1954.366, 
'start': 1905.108, 'title': 'Challenges of program synthesis', 'summary': 'Discusses the challenges of program synthesis, highlighting the increased complexity and search space expansion when dealing with nested loops, as it progresses from no loops to multiple loops.', 'duration': 49.258, 'highlights': ['Program synthesis encounters challenges when dealing with nested loops, as it exponentially expands the search space, making it increasingly difficult to handle larger sets of programs.', 'Allowing multiple loops in program synthesis intensifies the complexity of the task, posing significant challenges in handling the expanded search space.']}, {'end': 2303.426, 'start': 1954.406, 'title': 'Simplifying web data synthesis', 'summary': 'Discusses the approach of simplifying the web data synthesis process by enforcing a contract with the user, leading to more manageable and successful results, with users finding helena more usable and learnable than traditional web automation libraries.', 'duration': 349.02, 'highlights': ['The approach of enforcing a contract with the user to simplify the web data synthesis process, making it more manageable and successful, with users perceiving Helena as more usable and learnable than traditional web automation libraries.', 'The use of demonstrations and a contract with the user to limit the search space of programs, leading to a more constrained problem of synthesizing table selectors and making the synthesis process much easier and more manageable.', 'The comparison between Helena and Selenium, showing that users find Helena more usable and learnable, leading to a smoother web data synthesis process and successful task completion.']}], 'duration': 398.318, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/KvMnpVMp0jk/pics/KvMnpVMp0jk1905108.jpg', 'highlights': ['Program synthesis faces challenges with nested loops, exponentially expanding the search space.', 'Handling multiple loops intensifies 
complexity, posing significant challenges in program synthesis.', 'Enforcing a user contract simplifies web data synthesis, resulting in more manageable and successful outcomes.', 'Users perceive Helena as more usable and learnable than traditional web automation libraries.', 'Enforcing a contract with the user limits the search space of programs, making the synthesis process easier.', 'Comparison shows users find Helena more usable and learnable than Selenium, leading to successful task completion.']}, {'end': 3007.873, 'segs': [{'end': 2396.449, 'src': 'embed', 'start': 2341.742, 'weight': 0, 'content': [{'end': 2344.964, 'text': "That's something that the user might actually be comfortable looking at and editing.", 'start': 2341.742, 'duration': 3.222}, {'end': 2349.627, 'text': 'So how do we actually deal with the fact that we need this low-level language for robust replay?', 'start': 2345.345, 'duration': 4.282}, {'end': 2352.569, 'text': 'But we really want to show the user something more like this.', 'start': 2350.168, 'duration': 2.401}, {'end': 2362.059, 'text': "Who thinks we should sacrifice robustness? Who thinks we should sacrifice readability? 
Yeah, that's not very satisfying.", 'start': 2353.37, 'duration': 8.689}, {'end': 2363.561, 'text': 'We would much rather just have both.', 'start': 2362.119, 'duration': 1.442}, {'end': 2369.946, 'text': 'And so the way that we actually have both is by showing the users a high-level view where it looks like this,', 'start': 2363.781, 'duration': 6.165}, {'end': 2374.19, 'text': 'but still maintaining the low-level view that we have in order to actually run it.', 'start': 2369.946, 'duration': 4.244}, {'end': 2376.372, 'text': 'And this is this bi-level DSL idea.', 'start': 2374.31, 'duration': 2.062}, {'end': 2382.077, 'text': 'Sorry, DSL is domain-specific language, where what we have is this representation that looks quite friendly.', 'start': 2376.472, 'duration': 5.605}, {'end': 2385.34, 'text': "But it's actually being mapped from a low level language.", 'start': 2382.938, 'duration': 2.402}, {'end': 2389.544, 'text': "We have a reverse compiler that's going ahead and taking it to the version that we want to show users.", 'start': 2385.38, 'duration': 4.164}, {'end': 2392.226, 'text': 'And then, as the user makes changes at the upper level,', 'start': 2390.064, 'duration': 2.162}, {'end': 2396.449, 'text': 'it can go ahead and be propagated back down to the low level code via our custom program editor.', 'start': 2392.226, 'duration': 4.223}], 'summary': 'Balancing robustness and readability in bi-level dsl for user-friendly editing and maintaining low-level language.', 'duration': 54.707, 'max_score': 2341.742, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/KvMnpVMp0jk/pics/KvMnpVMp0jk2341742.jpg'}, {'end': 2602.011, 'src': 'embed', 'start': 2574.528, 'weight': 5, 'content': [{'end': 2581.334, 'text': 'Doing this iterative process is what produced the tool that actually a bunch of people are using quite regularly,', 'start': 2574.528, 'duration': 6.806}, {'end': 2583.756, 'text': 'like the sociology example that I gave 
earlier.', 'start': 2581.334, 'duration': 2.422}, {'end': 2585.577, 'text': 'They are now, at this point,', 'start': 2584.256, 'duration': 1.321}, {'end': 2596.307, 'text': 'collecting rental data from 100 major metropolitan areas around the US every single day around the clock and using this to shape housing voucher programs.', 'start': 2585.577, 'duration': 10.73}, {'end': 2602.011, 'text': 'So this kind of thing works because we actually did this iterative process with folks to get at.', 'start': 2596.327, 'duration': 5.684}], 'summary': 'Iterative process shaped housing voucher programs using rental data from 100 major metropolitan areas.', 'duration': 27.483, 'max_score': 2574.528, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/KvMnpVMp0jk/pics/KvMnpVMp0jk2574528.jpg'}, {'end': 2839.191, 'src': 'embed', 'start': 2812.984, 'weight': 3, 'content': [{'end': 2817.885, 'text': 'you may see me doing all this work on the social sciences side and the non-technical domain experts side.', 'start': 2812.984, 'duration': 4.901}, {'end': 2820.946, 'text': 'But look, this also applies equally to functional programmers.', 'start': 2818.385, 'duration': 2.561}, {'end': 2827.427, 'text': 'Whatever audience it is that you are trying to build programming tools for, you can use these same approaches to figure out their needs as well.', 'start': 2821.026, 'duration': 6.401}, {'end': 2831.605, 'text': "And then also the fact that it's just quite a diverse array of non-coders.", 'start': 2828.683, 'duration': 2.922}, {'end': 2839.191, 'text': "So we are figuring out that audiences that we wouldn't have even necessarily thought of as having automation challenges absolutely are running into them.", 'start': 2831.685, 'duration': 7.506}], 'summary': 'Diverse audience needs for programming tools, including non-coders, revealed unexpected automation challenges.', 'duration': 26.207, 'max_score': 2812.984, 'thumbnail': 
'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/KvMnpVMp0jk/pics/KvMnpVMp0jk2812984.jpg'}, {'end': 2896.574, 'src': 'embed', 'start': 2865.877, 'weight': 4, 'content': [{'end': 2868.598, 'text': 'I also wanted to talk a bit about some of the tools that are already getting built up.', 'start': 2865.877, 'duration': 2.721}, {'end': 2875.481, 'text': 'So in particular, Helena here is working on tooling for journalists, and Rebecca is working on tooling for lawyers.', 'start': 2868.958, 'duration': 6.523}, {'end': 2886.286, 'text': 'Both of these tools so far are coming out of a larger project on processing really large, really messy, unstructured data about police misconduct.', 'start': 2876.161, 'duration': 10.125}, {'end': 2891.35, 'text': 'So basically, this is part of a collaboration with folks from the National Association of Criminal Defense Lawyers.', 'start': 2886.986, 'duration': 4.364}, {'end': 2896.574, 'text': "One of the things they're dealing with is the sites that they work with, the defense lawyers that they work with.", 'start': 2891.95, 'duration': 4.624}], 'summary': 'Tools for journalists and lawyers to handle unstructured police misconduct data.', 'duration': 30.697, 'max_score': 2865.877, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/KvMnpVMp0jk/pics/KvMnpVMp0jk2865877.jpg'}, {'end': 3006.031, 'src': 'heatmap', 'start': 2967.177, 'weight': 1, 'content': [{'end': 2970.078, 'text': "But I'm happy to discuss whatever is exciting to this crowd.", 'start': 2967.177, 'duration': 2.901}, {'end': 2981.641, 'text': 'We will have questions.', 'start': 2980.201, 'duration': 1.44}, {'end': 2984.406, 'text': 'and we will also have some questions online.', 'start': 2982.54, 'duration': 1.866}, {'end': 2987.615, 'text': 'Shawn can help us with the questions that come online.', 'start': 2984.747, 'duration': 2.868}, {'end': 2988.437, 'text': 'That sounds great.', 'start': 2987.735, 'duration': 0.702}, {'end': 
2988.999, 'text': 'That sounds great.', 'start': 2988.538, 'duration': 0.461}, {'end': 2993.603, 'text': 'Oh, so I think I saw your hand first.', 'start': 2992.082, 'duration': 1.521}, {'end': 2996.105, 'text': 'Do you want to? So great talk.', 'start': 2993.643, 'duration': 2.462}, {'end': 2997.085, 'text': 'It was really excellent.', 'start': 2996.125, 'duration': 0.96}, {'end': 2997.786, 'text': 'I love your work.', 'start': 2997.125, 'duration': 0.661}, {'end': 3006.031, 'text': "One thing that I want to ask more about is when you're working in the iterative design process with all these non-coders and developing your synthesizers,", 'start': 2998.206, 'duration': 7.825}], 'summary': 'Discussion on iterative design process with non-coders and synthesizers.', 'duration': 38.854, 'max_score': 2967.177, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/KvMnpVMp0jk/pics/KvMnpVMp0jk2967177.jpg'}], 'start': 2305.123, 'title': 'Implementing user-friendly programming tools', 'summary': 'Discusses implementing a bi-level dsl to provide a user-friendly high-level view while maintaining a low-level view for robustness in programming tools, emphasizing the benefits of hci techniques in pl and the creation of tools like helena for journalists and rebecca for lawyers, catering to a spectrum of users.', 'chapters': [{'end': 2464.087, 'start': 2305.123, 'title': 'Bi-level dsl for usable drafting tools', 'summary': 'Discusses the implementation of a bi-level dsl to provide a user-friendly high-level view while maintaining a low-level view for robustness in programming tools for non-technical domains, highlighting the need for both usability and robustness.', 'duration': 158.964, 'highlights': ['The implementation of a bi-level DSL to provide a user-friendly high-level view while maintaining a low-level view for robustness in programming tools for non-technical domains', 'The challenge of balancing robustness and usability in programming tools, where 
the need for a low-level language conflicts with the desire to present a more user-friendly interface', 'The use of a reverse compiler to map a user-friendly representation to a low-level language, allowing for both friendly display and actual program execution']}, {'end': 3007.873, 'start': 2465.64, 'title': 'Hci-pl hybrid work: enriching programming tools and languages', 'summary': 'Discusses the use of hci techniques in pl, emphasizing the benefits of need finding, iterative refinement, and diverse user input, leading to the creation of tools like helena for journalists and rebecca for lawyers from unstructured data, and the application of techniques across a spectrum of users, from novice coders to climate economists.', 'duration': 542.233, 'highlights': ['The iterative refinement process has led to the creation of tools like Helena for journalists and Rebecca for lawyers from unstructured data, through diverse user input, iterating and augmenting the tools over a number of years. Creation of tools like Helena for journalists and Rebecca for lawyers, iterative refinement process, augmenting the tools over a number of years', 'The application of techniques across a spectrum of users, from novice coders to climate economists, has shown the effectiveness of the same techniques for users with varying levels of programming experience. Application of techniques across a spectrum of users, effectiveness for users with varying levels of programming experience', 'The use of HCI-PL hybrid work has been successful in collecting rental data from 100 major metropolitan areas around the US every single day to shape housing voucher programs. 
Collecting rental data from 100 major metropolitan areas around the US every single day']}], 'duration': 702.75, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/KvMnpVMp0jk/pics/KvMnpVMp0jk2305123.jpg', 'highlights': ['The use of a reverse compiler to map a user-friendly representation to a low-level language, allowing for both friendly display and actual program execution', 'The challenge of balancing robustness and usability in programming tools, where the need for a low-level language conflicts with the desire to present a more user-friendly interface', 'The implementation of a bi-level DSL to provide a user-friendly high-level view while maintaining a low-level view for robustness in programming tools for non-technical domains', 'The application of techniques across a spectrum of users, from novice coders to climate economists, has shown the effectiveness of the same techniques for users with varying levels of programming experience', 'The iterative refinement process has led to the creation of tools like Helena for journalists and Rebecca for lawyers from unstructured data, through diverse user input, iterating and augmenting the tools over a number of years', 'The use of HCI-PL hybrid work has been successful in collecting rental data from 100 major metropolitan areas around the US every single day to shape housing voucher programs']}, {'end': 3748.821, 'segs': [{'end': 3323.938, 'src': 'embed', 'start': 3279.87, 'weight': 0, 'content': [{'end': 3287.602, 'text': "Again, purely anecdotal, I don't want to say that this is a really well-supported claim about this tool.", 'start': 3279.87, 'duration': 7.732}, {'end': 3295.293, 'text': 'But it definitely gives me hope that this is a direction that will actually make it possible for folks who really have no programming background at all.', 'start': 3288.063, 'duration': 7.23}, {'end': 3297.496, 'text': 'to get up and running with this kind of tool.', 'start': 3295.854, 'duration': 
1.642}, {'end': 3303.403, 'text': 'I really do think, if we can surface the program that comes out and make it in a language that they can actually understand,', 'start': 3297.596, 'duration': 5.807}, {'end': 3306.346, 'text': "with really no training at all, I think we're on a good path.", 'start': 3303.403, 'duration': 2.943}, {'end': 3310.411, 'text': 'But yeah, I completely agree that this is one of the huge obstacles to this style of tool.', 'start': 3306.967, 'duration': 3.444}, {'end': 3318.814, 'text': 'I will go.', 'start': 3317.413, 'duration': 1.401}, {'end': 3323.938, 'text': 'Could you talk a little bit more about how you find collaborators who you think have interesting needs?', 'start': 3318.914, 'duration': 5.024}], 'summary': "There's hope for non-programmers to use a tool without any training, making it accessible and beneficial.", 'duration': 44.068, 'max_score': 3279.87, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/KvMnpVMp0jk/pics/KvMnpVMp0jk3279870.jpg'}, {'end': 3410.044, 'src': 'embed', 'start': 3384.44, 'weight': 2, 'content': [{'end': 3390.707, 'text': 'And then, once we have entered into a particular space and are working with a few collaborators, in terms of building trust,', 'start': 3384.44, 'duration': 6.267}, {'end': 3393.31, 'text': 'the biggest thing is again the open communication.', 'start': 3390.707, 'duration': 2.603}, {'end': 3403.661, 'text': 'The second biggest thing is if we have some kind of prototype or can build them some kind of hand-tuned thing, something specifically for their needs.', 'start': 3393.831, 'duration': 9.83}, {'end': 3410.044, 'text': "even if it hasn't solved the general problem yet, that is often enough to make the process worthwhile to them, even in the short term.", 'start': 3403.661, 'duration': 6.383}], 'summary': 'Open communication and customized prototypes build trust and make the process worthwhile for collaborators.', 'duration': 25.604, 'max_score': 3384.44, 
'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/KvMnpVMp0jk/pics/KvMnpVMp0jk3384440.jpg'}, {'end': 3530.234, 'src': 'embed', 'start': 3503.28, 'weight': 3, 'content': [{'end': 3509.302, 'text': 'Does the goal of this particular collaborator align with something that the students who are working on this project are really passionate about? Right.', 'start': 3503.28, 'duration': 6.022}, {'end': 3515.084, 'text': "So they're going to be most excited to do work on something that sort of aligns with their values and their goals in life.", 'start': 3509.402, 'duration': 5.682}, {'end': 3516.905, 'text': "So that's huge for us, for sure.", 'start': 3515.684, 'duration': 1.221}, {'end': 3527.112, 'text': "Another big thing that we're filtering for is that the tools that we are competent to produce are actually going to be useful to them.", 'start': 3517.905, 'duration': 9.207}, {'end': 3530.234, 'text': 'Right, like it would be great to work with someone who,', 'start': 3527.112, 'duration': 3.122}], 'summary': "Collaborator's goal aligns with students' passion and values, ensuring useful tools for them.", 'duration': 26.954, 'max_score': 3503.28, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/KvMnpVMp0jk/pics/KvMnpVMp0jk3503280.jpg'}, {'end': 3658.553, 'src': 'embed', 'start': 3624.702, 'weight': 4, 'content': [{'end': 3630.245, 'text': 'We have personally chosen not to do a lot of recording of what they are doing with our programming tool.', 'start': 3624.702, 'duration': 5.543}, {'end': 3632.025, 'text': 'This is for a number of reasons.', 'start': 3630.905, 'duration': 1.12}, {'end': 3638.627, 'text': "So one is the expectations around how much data is collected about you when you're using a programming tool compared to when you're using, say,", 'start': 3632.065, 'duration': 6.562}, {'end': 3640.667, 'text': 'a web page are quite different at this point.', 'start': 3638.627, 
'duration': 2.04}, {'end': 3644.908, 'text': "It would be pretty unusual to have data collected about you when you're just using a tool.", 'start': 3640.707, 'duration': 4.201}, {'end': 3654.01, 'text': 'We have started to go down the road of: if you opt in, we will collect some data, not for this particular tool but for another tool in a related study.', 'start': 3645.848, 'duration': 8.162}, {'end': 3658.553, 'text': "But really only if, you know, you're opting in pretty regularly.", 'start': 3655.491, 'duration': 3.062}], 'summary': 'Limited data collection for programming tool, opt-in required for related study, uncommon to collect data while using a tool.', 'duration': 33.851, 'max_score': 3624.702, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/KvMnpVMp0jk/pics/KvMnpVMp0jk3624702.jpg'}], 'start': 3007.973, 'title': 'Tool development challenges', 'summary': 'Discusses challenges in developing tools for non-coders, emphasizing the need for alignment of goals, open communication, and potential impact on usability. It also highlights the importance of collaboration and progress monitoring for producing useful programming tools.', 'chapters': [{'end': 3403.661, 'start': 3007.973, 'title': 'Balancing tool development for non-coders', 'summary': 'The chapter discusses the challenges of creating tools for non-coders, emphasizing the need for alignment of goals, open communication, and potential impact on usability, with anecdotal evidence suggesting potential success in enabling non-programmers to use the tool.', 'duration': 395.688, 'highlights': ['The chapter discusses the challenges of creating tools for non-coders, emphasizing the need for alignment of goals, open communication, and potential impact on usability. 
The chapter highlights the challenges of creating tools for non-coders and stresses the importance of aligning goals and open communication to ensure potential impact on usability.', 'Anecdotal evidence suggests potential success in enabling non-programmers to use the tool. Anecdotal evidence suggests that the tool has been successful in enabling non-programmers to use it, providing hope for its potential to cater to individuals with no programming background.', 'The importance of open communication and aligning goals with collaborators is emphasized in building trust and finding the right partners. Open communication and aligning goals with collaborators are highlighted as crucial factors in building trust and finding suitable partners for collaboration.']}, {'end': 3748.821, 'start': 3403.661, 'title': 'Collaboration and monitoring progress in programming tools', 'summary': "Discusses how collaboration with collaborators aligning with the team's values and goals, and monitoring progress through selective data collection are crucial for producing useful programming tools.", 'duration': 345.16, 'highlights': ["Collaborators are chosen based on alignment with the team's values and goals, ensuring they are excited to work on projects in line with their passions and that the tools produced will be useful to them.", 'Selective data collection is practiced, with opt-in options and respect for user privacy, to monitor progress and gather information about tool usage, although the data collected is limited and not representative of all users.', "The expectation of data collection when using programming tools is different from web pages, and the team has chosen not to extensively record user activities, respecting the user's privacy and needs.", "Collaboration with collaborators aligning with the team's values and goals is prioritized, ensuring the team's passion aligns with the projects and that the tools produced will be useful to them.", 'The team practices selective 
data collection, respecting user privacy and offering opt-in options, to monitor progress and gather information about tool usage, although the data collected is limited and not representative of all users.']}], 'duration': 740.848, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/KvMnpVMp0jk/pics/KvMnpVMp0jk3007973.jpg', 'highlights': ['The chapter discusses challenges in developing tools for non-coders, emphasizing the need for alignment of goals, open communication, and potential impact on usability.', 'Anecdotal evidence suggests potential success in enabling non-programmers to use the tool, providing hope for its potential to cater to individuals with no programming background.', 'The importance of open communication and aligning goals with collaborators is emphasized in building trust and finding the right partners.', "Collaborators are chosen based on alignment with the team's values and goals, ensuring they are excited to work on projects in line with their passions and that the tools produced will be useful to them.", 'Selective data collection is practiced, with opt-in options and respect for user privacy, to monitor progress and gather information about tool usage, although the data collected is limited and not representative of all users.', "The expectation of data collection when using programming tools is different from web pages, and the team has chosen not to extensively record user activities, respecting the user's privacy and needs."]}], 'highlights': ['The PLAIT Lab focuses on creating approachable and inclusive programming languages and tools to bridge the gap with social sciences.', "The lab's primary goal is to enable social science visionaries to try new techniques and automation processes.", 'Building strong collaborations with social scientists is critical for getting tools adopted and ensuring their usefulness, as emphasized throughout the talk.', "The speaker's transition from a primarily programming languages 
background to realizing the need for Human-Computer Interaction (HCI) techniques in their work.", "Discussing the integration of HCI techniques into programming languages, particularly focusing on the development of the tool 'Helena' aimed at social scientists.", 'The use of HCI-PL hybrid work has been successful in collecting rental data from 100 major metropolitan areas around the US every single day to shape housing voucher programs.', 'The challenges of setting rent thresholds using outdated census data and the unintended consequences of funneling low-income families into low-opportunity neighborhoods are discussed, emphasizing the need for real-time data to guide housing voucher programs.', 'The success rate of a web automation tool in completing tasks, with 100% completion rate within 10 minutes for computer science PhD students compared to only a quarter completing the task within an hour without the tool, demonstrating the effectiveness of the tool in simplifying web automation programming for users.', 'Developing a solution for web automation involved addressing challenges such as extracting dynamic content, reverse engineering page structure, and understanding server communication, which led to the concept of programming by demonstration (PBD) as a solution for web automation.', 'The need for single demonstration usability, editable programs, and the ability to synthesize complex programs is emphasized.', "The shift to a 'demonstrate once, revise anytime' workflow for program synthesis is discussed, allowing users to edit and revise the synthesized program until satisfaction.", 'The challenges in program synthesis and the need to relax constraints for developers are outlined.', 'Program synthesis faces challenges with nested loops, exponentially expanding the search space.', 'Handling multiple loops intensifies complexity, posing significant challenges in program synthesis.', 'Enforcing a user contract simplifies web data synthesis, resulting in more 
manageable and successful outcomes.', 'Users perceive Helena as more usable and learnable than traditional web automation libraries.', 'Enforcing a contract with the user limits the search space of programs, making the synthesis process easier.', 'Comparison shows users find Helena more usable and learnable than Selenium, leading to successful task completion.', 'The use of a reverse compiler to map a user-friendly representation to a low-level language, allowing for both friendly display and actual program execution', 'The challenge of balancing robustness and usability in programming tools, where the need for a low-level language conflicts with the desire to present a more user-friendly interface', 'The implementation of a bi-level DSL to provide a user-friendly high-level view while maintaining a low-level view for robustness in programming tools for non-technical domains', 'The application of techniques across a spectrum of users, from novice coders to climate economists, has shown the effectiveness of the same techniques for users with varying levels of programming experience', 'The iterative refinement process has led to the creation of tools like Helena for journalists and Rebecca for lawyers from unstructured data, through diverse user input, iterating and augmenting the tools over a number of years', 'The chapter discusses challenges in developing tools for non-coders, emphasizing the need for alignment of goals, open communication, and potential impact on usability.', 'Anecdotal evidence suggests potential success in enabling non-programmers to use the tool, providing hope for its potential to cater to individuals with no programming background.', 'The importance of open communication and aligning goals with collaborators is emphasized in building trust and finding the right partners.', "Collaborators are chosen based on alignment with the team's values and goals, ensuring they are excited to work on projects in line with their passions and that the tools 
produced will be useful to them.", 'Selective data collection is practiced, with opt-in options and respect for user privacy, to monitor progress and gather information about tool usage, although the data collected is limited and not representative of all users.', "The expectation of data collection when using programming tools is different from web pages, and the team has chosen not to extensively record user activities, respecting the user's privacy and needs."]}