title
Lecture 16 | Adversarial Examples and Adversarial Training
description
In Lecture 16, guest lecturer Ian Goodfellow discusses adversarial examples in deep learning. We discuss why deep networks and other machine learning models are susceptible to adversarial examples, and how adversarial examples can be used to attack machine learning systems. We then cover potential defenses against adversarial examples, and how adversarial examples can be used to improve machine learning systems even when no explicit adversary is present.
Keywords: Adversarial examples, fooling images, fast gradient sign method, Clever Hans, adversarial defenses, adversarial examples in the physical world, adversarial training, virtual adversarial training, model-based optimization
Slides: http://cs231n.stanford.edu/slides/2017/cs231n_2017_lecture16.pdf
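The keywords above mention the fast gradient sign method (FGSM), which the lecture describes as taking the sign of the gradient of the training loss with respect to the input and moving each pixel by at most epsilon. Below is a minimal sketch of that idea, assuming a PyTorch classifier that outputs logits and images scaled to [0, 1]; the names `model`, `x`, `y`, and `epsilon` are illustrative placeholders rather than code from the course.

```python
import torch.nn.functional as F

def fgsm(model, x, y, epsilon=0.007):
    """Sketch of the fast gradient sign method: move each input value by
    at most epsilon in the direction that increases the training loss."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)    # the cost used to train the classifier
    loss.backward()                        # backprop gives the gradient w.r.t. x
                                           # (parameter grads are also filled, unused here)
    x_adv = x + epsilon * x.grad.sign()    # the sign enforces the max-norm constraint
    return x_adv.clamp(0.0, 1.0).detach()  # keep pixel values in a valid range
```

As a usage example, `fgsm(net, images, labels, epsilon=8/255)` would produce a max-norm-bounded adversarial batch for a hypothetical network `net`.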
--------------------------------------------------------------------------------------
Convolutional Neural Networks for Visual Recognition
Instructors:
Fei-Fei Li: http://vision.stanford.edu/feifeili/
Justin Johnson: http://cs.stanford.edu/people/jcjohns/
Serena Yeung: http://ai.stanford.edu/~syyeung/
Computer Vision has become ubiquitous in our society, with applications in search, image understanding, apps, mapping, medicine, drones, and self-driving cars. Core to many of these applications are visual recognition tasks such as image classification, localization and detection. Recent developments in neural network (aka “deep learning”) approaches have greatly advanced the performance of these state-of-the-art visual recognition systems. This lecture collection is a deep dive into the details of deep learning architectures, with a focus on learning end-to-end models for these tasks, particularly image classification. From this lecture collection, students will learn to implement, train and debug their own neural networks and gain a detailed understanding of cutting-edge research in computer vision.
Website:
http://cs231n.stanford.edu/
For additional learning opportunities please visit:
http://online.stanford.edu/
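The chapter summaries in the detail record below also describe adversarial training: generating fresh adversarial examples after every weight update and training on them alongside clean data. Here is a hedged sketch of one such training step, under the same assumptions as the FGSM snippet above (PyTorch, logits, inputs in [0, 1]); the equal weighting of clean and adversarial losses is an illustrative choice, not necessarily the one used in the lecture.

```python
import torch.nn.functional as F

def adversarial_training_step(model, optimizer, x, y, epsilon=0.007):
    # Build FGSM examples against the current weights; adversarial training
    # regenerates them at every step because the attack depends on the model.
    x_pert = x.clone().detach().requires_grad_(True)
    F.cross_entropy(model(x_pert), y).backward()
    x_adv = (x_pert + epsilon * x_pert.grad.sign()).clamp(0.0, 1.0).detach()

    # Train on an equal mix of clean and adversarial examples.
    optimizer.zero_grad()   # also clears parameter grads left over from the attack pass
    loss = 0.5 * (F.cross_entropy(model(x), y) + F.cross_entropy(model(x_adv), y))
    loss.backward()
    optimizer.step()
    return loss.item()
```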
detail
{'title': 'Lecture 16 | Adversarial Examples and Adversarial Training', 'heatmap': [{'end': 1182.084, 'start': 1077.546, 'weight': 1}, {'end': 1573.675, 'start': 1518.257, 'weight': 0.875}, {'end': 4027.946, 'start': 3973.597, 'weight': 0.723}, {'end': 4126.56, 'start': 4072.708, 'weight': 0.72}], 'summary': "Covers adversarial examples in machine learning, discussing their impact, vulnerabilities, and potential defenses, such as the fgsm attack's high success rate of over 99% on regular neural networks and the effectiveness of adversarial training for neural nets in reducing test error rates, within the context of the rapid advancement of deep learning in the early 2010s.", 'chapters': [{'end': 162.964, 'segs': [{'end': 72.801, 'src': 'embed', 'start': 29.292, 'weight': 1, 'content': [{'end': 38.214, 'text': "I'll talk a little bit about how adversarial examples pose real world security threats that they can actually be used to compromise systems built on machine learning.", 'start': 29.292, 'duration': 8.922}, {'end': 46.976, 'text': "I'll tell you what the defenses are so far, but mostly defenses are an open research problem that I hope some of you will move on to tackle.", 'start': 39.474, 'duration': 7.502}, {'end': 54.357, 'text': "And then, finally, I'll tell you how to use adversarial examples to improve other machine learning algorithms,", 'start': 48.136, 'duration': 6.221}, {'end': 59.118, 'text': "even if you want to build a machine learning algorithm that won't face a real world adversary.", 'start': 54.357, 'duration': 4.761}, {'end': 66.237, 'text': 'Looking at the big picture and the context for this lecture,', 'start': 62.094, 'duration': 4.143}, {'end': 72.801, 'text': "I think most of you are probably here because you've heard how incredibly powerful and successful machine learning is.", 'start': 66.237, 'duration': 6.564}], 'summary': 'Adversarial examples pose security threats in machine learning systems, with open research on defenses and potential for algorithm improvement.', 'duration': 43.509, 'max_score': 29.292, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/CIfsB_EYsVI/pics/CIfsB_EYsVI29292.jpg'}, {'end': 115.821, 'src': 'embed', 'start': 92.043, 'weight': 0, 'content': [{'end': 99.028, 'text': 'In about 2013, we started to see that deep learning achieved human level performance at a lot of different tasks.', 'start': 92.043, 'duration': 6.985}, {'end': 107.475, 'text': 'We saw that convolutional nets could recognize objects and images and score about the same as people in those benchmarks.', 'start': 99.729, 'duration': 7.746}, {'end': 115.821, 'text': "With the caveat that part of the reason that algorithms score as well as people is that people can't tell Alaskan Huskies from Siberian Huskies very well.", 'start': 107.995, 'duration': 7.826}], 'summary': 'Deep learning reached human-level performance in object recognition around 2013.', 'duration': 23.778, 'max_score': 92.043, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/CIfsB_EYsVI/pics/CIfsB_EYsVI92043.jpg'}], 'start': 4.858, 'title': 'Adversarial examples and machine learning', 'summary': 'Explores adversarial examples in machine learning, their real-world security threats, the open research problem of defenses, and the potential use of adversarial examples to improve other machine learning algorithms, within the context of the rapid advancement of deep learning in the early 2010s.', 'chapters': [{'end': 162.964, 'start': 4.858, 'title': 
'Adversarial examples and machine learning', 'summary': 'Explores adversarial examples in machine learning, their real-world security threats, the open research problem of defenses, and the potential use of adversarial examples to improve other machine learning algorithms, within the context of the rapid advancement of deep learning in the early 2010s.', 'duration': 158.106, 'highlights': ['Adversarial examples pose real world security threats and can be used to compromise systems built on machine learning, highlighting the urgency of developing effective defenses and solutions.', 'Deep learning achieved human level performance at various tasks around 2013, including object recognition, face recognition, and reading typewritten fonts in photos, marking a significant advancement in machine learning capabilities.', 'The limitations of human performance in distinguishing certain objects contributed to the perception of algorithms reaching human-level performance, emphasizing the need for comprehensive and nuanced evaluation metrics in machine learning benchmarks.', 'The potential for using adversarial examples to improve other machine learning algorithms presents an opportunity for further advancement and optimization in the field of machine learning, encouraging future research and exploration in this area.']}], 'duration': 158.106, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/CIfsB_EYsVI/pics/CIfsB_EYsVI4858.jpg', 'highlights': ['Deep learning achieved human level performance at various tasks around 2013, marking a significant advancement in machine learning capabilities.', 'Adversarial examples pose real world security threats and can be used to compromise systems built on machine learning, highlighting the urgency of developing effective defenses and solutions.', 'The potential for using adversarial examples to improve other machine learning algorithms presents an opportunity for further advancement and optimization in the field of machine learning, encouraging future research and exploration in this area.', 'The limitations of human performance in distinguishing certain objects contributed to the perception of algorithms reaching human-level performance, emphasizing the need for comprehensive and nuanced evaluation metrics in machine learning benchmarks.']}, {'end': 580.342, 'segs': [{'end': 211.76, 'src': 'embed', 'start': 162.964, 'weight': 0, 'content': [{'end': 171.806, 'text': 'It even got to the point that we can no longer use CAPTCHAs to tell whether a user of a webpage is human or not,', 'start': 162.964, 'duration': 8.842}, {'end': 176.267, 'text': 'because the convolutional network is better at reading obfuscated text than a human is.', 'start': 171.806, 'duration': 4.461}, {'end': 182.409, 'text': 'So, with this context today of deep learning working really well, especially for computer vision,', 'start': 177.401, 'duration': 5.008}, {'end': 187.664, 'text': "It's a little bit unusual to think about the computer making a mistake.", 'start': 183.581, 'duration': 4.083}, {'end': 192.567, 'text': 'Before about 2013, nobody was ever surprised if the computer made a mistake.', 'start': 188.404, 'duration': 4.163}, {'end': 194.989, 'text': 'That was the rule, not the exception.', 'start': 193.007, 'duration': 1.982}, {'end': 201.513, 'text': "And so today's topic, which is all about unusual mistakes that deep learning algorithms make.", 'start': 195.549, 'duration': 5.964}, {'end': 208.177, 'text': "this topic wasn't really a serious avenue of study 
until the algorithms started to work well most of the time.", 'start': 201.513, 'duration': 6.664}, {'end': 211.76, 'text': 'And now we will study the way that they break.', 'start': 208.998, 'duration': 2.762}], 'summary': 'Deep learning algorithms make unusual mistakes due to their high success rate.', 'duration': 48.796, 'max_score': 162.964, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/CIfsB_EYsVI/pics/CIfsB_EYsVI162964.jpg'}, {'end': 267.031, 'src': 'embed', 'start': 238.357, 'weight': 1, 'content': [{'end': 243.8, 'text': 'and a convolutional network trained on the ImageNet dataset is able to recognize it as being a panda.', 'start': 238.357, 'duration': 5.443}, {'end': 248.461, 'text': "One interesting thing is that the model doesn't have a whole lot of confidence in that decision.", 'start': 244.74, 'duration': 3.721}, {'end': 253.223, 'text': 'It assigns about 60% probability to this image being a panda.', 'start': 249.002, 'duration': 4.221}, {'end': 260.906, 'text': 'If we then compute exactly the way that we could modify the image to cause the convolutional network to make a mistake,', 'start': 254.624, 'duration': 6.282}, {'end': 267.031, 'text': 'we find that the optimal direction to move all the pixels is given by this image in the middle.', 'start': 262.007, 'duration': 5.024}], 'summary': 'A convolutional network identifies a panda with 60% probability from an imagenet dataset.', 'duration': 28.674, 'max_score': 238.357, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/CIfsB_EYsVI/pics/CIfsB_EYsVI238357.jpg'}, {'end': 428.57, 'src': 'embed', 'start': 397.287, 'weight': 5, 'content': [{'end': 400.93, 'text': "A lot of what I'll be telling you about today is my own follow-up work on this topic,", 'start': 397.287, 'duration': 3.643}, {'end': 410.798, 'text': "but I've spent a lot of my career over the past few years understanding why these attacks are possible and why it's so easy to fool these convolutional networks.", 'start': 400.93, 'duration': 9.868}, {'end': 420.164, 'text': 'When my colleague Christian first discovered this phenomenon, independently from Battista Biggio,', 'start': 413.539, 'duration': 6.625}, {'end': 428.57, 'text': 'but around the same time he found that it was actually a result of a visualization he was trying to make.', 'start': 420.164, 'duration': 8.406}], 'summary': 'Research focuses on understanding and addressing vulnerabilities in convolutional networks.', 'duration': 31.283, 'max_score': 397.287, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/CIfsB_EYsVI/pics/CIfsB_EYsVI397287.jpg'}, {'end': 511.191, 'src': 'embed', 'start': 479.804, 'weight': 4, 'content': [{'end': 483.966, 'text': 'So each of these panels here shows an animation that you read left to right, top to bottom.', 'start': 479.804, 'duration': 4.162}, {'end': 494.291, 'text': 'Each panel is another step of gradient ascent on the log probability that the input is an airplane, according to a convolutional net model.', 'start': 484.746, 'duration': 9.545}, {'end': 499.156, 'text': 'and then we follow the gradient on the input to the image.', 'start': 495.391, 'duration': 3.765}, {'end': 502.46, 'text': "You're probably used to following the gradient on the parameters of a model.", 'start': 499.657, 'duration': 2.803}, {'end': 511.191, 'text': 'You can use the back propagation algorithm to compute the gradient on the input image using exactly the same procedure that you would use to compute the 
gradient on the parameters.', 'start': 502.941, 'duration': 8.25}], 'summary': 'Demonstrates using back propagation to compute gradient on input image for airplane detection.', 'duration': 31.387, 'max_score': 479.804, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/CIfsB_EYsVI/pics/CIfsB_EYsVI479804.jpg'}], 'start': 162.964, 'title': 'Unusual mistakes and adversarial attacks in deep learning', 'summary': 'Discusses the rise of deep learning in computer vision, surpassing human ability in tasks such as reading obfuscated text. it delves into studying unusual mistakes made by deep learning algorithms and explores adversarial attacks on convolutional networks, demonstrating how small modifications can cause misclassification, with examples showing surprising results of fooling neural networks.', 'chapters': [{'end': 211.76, 'start': 162.964, 'title': 'Unusual mistakes in deep learning', 'summary': 'Discusses the rise of deep learning, particularly in computer vision, and how it has surpassed human ability in tasks like reading obfuscated text, leading to a shift from expecting mistakes to studying the unusual mistakes made by deep learning algorithms.', 'duration': 48.796, 'highlights': ['Deep learning algorithms now outperform humans in tasks like reading obfuscated text.', 'Before 2013, it was the norm for computers to make mistakes, but now they are expected to work well most of the time.', 'The topic of studying unusual mistakes made by deep learning algorithms was not taken seriously until the algorithms started to work well most of the time.']}, {'end': 580.342, 'start': 212.942, 'title': 'Adversarial attacks on convolutional networks', 'summary': 'Explores adversarial attacks on convolutional networks, demonstrating how small, carefully computed modifications can cause a convolutional network trained on the imagenet dataset to misclassify an image, with examples showing how a panda can be manipulated to be recognized as a gibbon with 99.9% probability, highlighting the ease and history of fooling neural networks and the surprising results of gradient ascent on the log probability of an input being an airplane.', 'duration': 367.4, 'highlights': ['An example demonstrates how a panda can be manipulated to be recognized as a gibbon with 99.9% probability, showcasing the ease of fooling convolutional networks.', 'Discussing the history of fooling machine learning models, dating back to at least 2004, and the ease of fooling convolutional networks, highlighting the long-standing nature and simplicity of these attacks.', 'Surprising results are shown from gradient ascent on the log probability of an input being an airplane, with minimal perceptible changes to images causing the network to be completely confident in misclassification, emphasizing the unexpected outcomes of this experiment.', 'The chapter delves into the process of using carefully computed modifications to cause a convolutional network to misclassify an image, with an example demonstrating how a panda can be manipulated to be recognized as a gibbon with 99.9% probability, showcasing the ease of fooling convolutional networks.']}], 'duration': 417.378, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/CIfsB_EYsVI/pics/CIfsB_EYsVI162964.jpg', 'highlights': ['Deep learning algorithms now outperform humans in tasks like reading obfuscated text.', 'An example demonstrates how a panda can be manipulated to be recognized as a gibbon with 99.9% probability, showcasing the ease 
of fooling convolutional networks.', 'Before 2013, it was the norm for computers to make mistakes, but now they are expected to work well most of the time.', 'The topic of studying unusual mistakes made by deep learning algorithms was not taken seriously until the algorithms started to work well most of the time.', 'Surprising results are shown from gradient ascent on the log probability of an input being an airplane, with minimal perceptible changes to images causing the network to be completely confident in misclassification, emphasizing the unexpected outcomes of this experiment.', 'Discussing the history of fooling machine learning models, dating back to at least 2004, and the ease of fooling convolutional networks, highlighting the long-standing nature and simplicity of these attacks.', 'The chapter delves into the process of using carefully computed modifications to cause a convolutional network to misclassify an image, with an example demonstrating how a panda can be manipulated to be recognized as a gibbon with 99.9% probability, showcasing the ease of fooling convolutional networks.']}, {'end': 1136.125, 'segs': [{'end': 641.584, 'src': 'embed', 'start': 580.342, 'weight': 0, 'content': [{'end': 585.487, 'text': 'has found an image that fools the network into thinking that the input is an airplane.', 'start': 580.342, 'duration': 5.145}, {'end': 591.352, 'text': "And if we were malicious attackers, we didn't even have to work very hard to figure out how to fool the network.", 'start': 586.728, 'duration': 4.624}, {'end': 598.478, 'text': 'We just asked the network to give us an image of an airplane and it gave us something that fools it into thinking that the input is an airplane.', 'start': 591.952, 'duration': 6.526}, {'end': 608.8, 'text': 'When Christian first published this work, a lot of articles came out with titles like the flaw lurking in every deep neural network,', 'start': 601.454, 'duration': 7.346}, {'end': 610.461, 'text': 'or deep learning has deep flaws.', 'start': 608.8, 'duration': 1.661}, {'end': 618.007, 'text': "It's important to remember that these vulnerabilities apply to essentially every machine learning algorithm that we've studied so far.", 'start': 611.422, 'duration': 6.585}, {'end': 625.513, 'text': 'Some of them like RBF networks and pars and density estimators are able to resist this effect somewhat.', 'start': 619.368, 'duration': 6.145}, {'end': 630.977, 'text': 'But even very simple machine learning algorithms are highly vulnerable to adversarial examples.', 'start': 626.173, 'duration': 4.804}, {'end': 637.482, 'text': 'In this image, I show an animation of what happens when we attack a linear model.', 'start': 632.799, 'duration': 4.683}, {'end': 641.584, 'text': "So it's not a deep algorithm at all, it's just a shallow softmax model.", 'start': 638.042, 'duration': 3.542}], 'summary': 'Machine learning algorithms are highly vulnerable to adversarial examples, including deep neural networks and simple models.', 'duration': 61.242, 'max_score': 580.342, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/CIfsB_EYsVI/pics/CIfsB_EYsVI580342.jpg'}, {'end': 883.886, 'src': 'embed', 'start': 857.331, 'weight': 4, 'content': [{'end': 862.312, 'text': 'we would expect it to make different random mistakes on these points that are off the training set.', 'start': 857.331, 'duration': 4.981}, {'end': 865.473, 'text': 'But that was actually not what we found at all.', 'start': 863.253, 'duration': 2.22}, {'end': 873.536, 'text': 
'We found that many different models would misclassify the same adversarial examples and they would assign the same class to them.', 'start': 865.994, 'duration': 7.542}, {'end': 883.886, 'text': 'We also found that if we took the difference between an original example and an adversarial example, then we had a direction in input space.', 'start': 874.52, 'duration': 9.366}], 'summary': 'Various models misclassify same adversarial examples with consistent results.', 'duration': 26.555, 'max_score': 857.331, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/CIfsB_EYsVI/pics/CIfsB_EYsVI857331.jpg'}, {'end': 1012.439, 'src': 'embed', 'start': 984.079, 'weight': 5, 'content': [{'end': 990.684, 'text': "We've seen that linear models can actually assign really unusual confidence as you move very far from the decision boundary,", 'start': 984.079, 'duration': 6.605}, {'end': 991.945, 'text': "even if there isn't any data there.", 'start': 990.684, 'duration': 1.261}, {'end': 996.868, 'text': 'But are deep neural networks actually anything like linear models?', 'start': 992.845, 'duration': 4.023}, {'end': 1001.171, 'text': 'Could linear models actually explain anything about how it is that deep neural nets fail?', 'start': 996.968, 'duration': 4.203}, {'end': 1005.894, 'text': 'It turns out that modern deep neural nets are actually very piecewise linear.', 'start': 1002.132, 'duration': 3.762}, {'end': 1012.439, 'text': "So rather than being a single linear function, they're piecewise linear with maybe not that many linear pieces.", 'start': 1006.755, 'duration': 5.684}], 'summary': 'Deep neural networks are piecewise linear with not many linear pieces.', 'duration': 28.36, 'max_score': 984.079, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/CIfsB_EYsVI/pics/CIfsB_EYsVI984079.jpg'}], 'start': 580.342, 'title': 'Adversarial vulnerabilities in machine learning', 'summary': 'Discusses the vulnerability of various machine learning algorithms to adversarial examples, including deep neural networks, and highlights the ease with which linear models can be fooled, the systematic effects leading to misclassifications, and the piecewise linear nature of modern deep neural networks.', 'chapters': [{'end': 699.006, 'start': 580.342, 'title': 'Adversarial vulnerabilities in machine learning', 'summary': 'Discusses the vulnerability of various machine learning algorithms to adversarial examples, including deep neural networks, and presents an image that successfully fools a network into misclassifying input as an airplane.', 'duration': 118.664, 'highlights': ["Christian's work highlighted vulnerabilities in deep neural networks, with articles describing flaws in every deep neural network and emphasizing that these vulnerabilities apply to essentially every studied machine learning algorithm.", 'Even simple machine learning algorithms are highly vulnerable to adversarial examples, as demonstrated by an animation showing the attack on a linear model, which successfully misclassifies images of different digits.', 'The image presented successfully fools the network into thinking the input is an airplane, showcasing the ease with which malicious attackers can exploit vulnerabilities in machine learning algorithms.']}, {'end': 1136.125, 'start': 700.532, 'title': 'Adversarial examples in machine learning', 'summary': 'Discusses the phenomenon of adversarial examples, highlighting the ease with which linear models can be fooled, the systematic effects leading 
to misclassifications, and the piecewise linear nature of modern deep neural networks.', 'duration': 435.593, 'highlights': ['Linear models can be easily fooled by adversarial examples', 'Systematic effects in misclassifications of adversarial examples', 'Piecewise linear nature of modern deep neural networks']}], 'duration': 555.783, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/CIfsB_EYsVI/pics/CIfsB_EYsVI580342.jpg', 'highlights': ["Christian's work highlighted vulnerabilities in deep neural networks, emphasizing flaws in every deep neural network (relevance: 5)", 'Even simple machine learning algorithms are highly vulnerable to adversarial examples, as demonstrated by an animation showing the attack on a linear model (relevance: 4)', 'The image presented successfully fools the network into thinking the input is an airplane, showcasing the ease with which malicious attackers can exploit vulnerabilities in machine learning algorithms (relevance: 3)', 'Linear models can be easily fooled by adversarial examples (relevance: 2)', 'Systematic effects in misclassifications of adversarial examples (relevance: 1)', 'Piecewise linear nature of modern deep neural networks (relevance: 0)']}, {'end': 1854.742, 'segs': [{'end': 1276.925, 'src': 'embed', 'start': 1249.853, 'weight': 2, 'content': [{'end': 1256.597, 'text': 'You might think that maybe a deep net is going to represent some extremely wiggly, complicated function with lots and lots of linear pieces,', 'start': 1249.853, 'duration': 6.744}, {'end': 1258.179, 'text': 'no matter which cross-section you look at.', 'start': 1256.597, 'duration': 1.582}, {'end': 1264.103, 'text': 'Or we might find that it has more or less two pieces for each function we look at.', 'start': 1259.52, 'duration': 4.583}, {'end': 1270.241, 'text': 'Each of the different curves on this plot is the logits for a different class.', 'start': 1265.578, 'duration': 4.663}, {'end': 1276.925, 'text': 'We see that out at the tails of the plot that the frog class is the most likely.', 'start': 1271.222, 'duration': 5.703}], 'summary': 'Deep net represents complex functions with multiple linear pieces, each curve shows logits for a different class, frog class is most likely at the tails.', 'duration': 27.072, 'max_score': 1249.853, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/CIfsB_EYsVI/pics/CIfsB_EYsVI1249853.jpg'}, {'end': 1477.922, 'src': 'embed', 'start': 1443.737, 'weight': 3, 'content': [{'end': 1446.379, 'text': 'you make perturbations that have even larger L2 norm.', 'start': 1443.737, 'duration': 2.642}, {'end': 1450.261, 'text': "What's going on is that there are several different pixels in the image,", 'start': 1447.239, 'duration': 3.022}, {'end': 1455.305, 'text': 'and so small changes to individual pixels can add up to relatively large vectors.', 'start': 1450.261, 'duration': 5.044}, {'end': 1459.647, 'text': "For larger data sets like ImageNet, where there's even more pixels.", 'start': 1456.385, 'duration': 3.262}, {'end': 1466.272, 'text': 'you can make very small changes to each pixel but travel very far in vector space as measured by the L2 norm.', 'start': 1459.647, 'duration': 6.625}, {'end': 1477.922, 'text': 'That means that you can actually make changes that are almost imperceptible but actually move you really far and get a large dot product with the coefficients of the linear function that the model represents.', 'start': 1467.452, 'duration': 10.47}], 'summary': 'Small changes to pixels 
in ImageNet lead to large vectors in L2 norm space.', 'duration': 34.185, 'max_score': 1443.737, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/CIfsB_EYsVI/pics/CIfsB_EYsVI1443737.jpg'}, {'end': 1573.675, 'src': 'heatmap', 'start': 1507.376, 'weight': 0, 'content': [{'end': 1511.758, 'text': "we want to make sure that we're actually fooling it and not just changing the input class.", 'start': 1507.376, 'duration': 4.382}, {'end': 1516.68, 'text': "And if we're an attacker, we actually want to make sure that we're causing misbehavior in the system.", 'start': 1512.238, 'duration': 4.442}, {'end': 1524.38, 'text': 'To do that, when we build adversarial examples, we use the max norm to constrain the perturbation.', 'start': 1518.257, 'duration': 6.123}, {'end': 1529.023, 'text': 'Basically this says that no pixel can change by more than some amount epsilon.', 'start': 1525.361, 'duration': 3.662}, {'end': 1536.386, 'text': "So the L2 norm can get really big, but you can't concentrate all the changes for that L2 norm to erase pieces of the digit.", 'start': 1529.543, 'duration': 6.843}, {'end': 1539.268, 'text': 'Like in the bottom row here, we erased the top of a three.', 'start': 1536.947, 'duration': 2.321}, {'end': 1551.641, 'text': 'One very fast way to build an adversarial example is just to take the gradient of the cost that you used to train the network with respect to the input and then take the sign of that gradient.', 'start': 1541.298, 'duration': 10.343}, {'end': 1556.023, 'text': 'The sign is essentially enforcing the max norm constraint.', 'start': 1552.942, 'duration': 3.081}, {'end': 1561.344, 'text': "You're only allowed to change the input by up to epsilon at each pixel.", 'start': 1556.563, 'duration': 4.781}, {'end': 1566.526, 'text': 'So if you just take the sign, it tells you whether you want to add epsilon or subtract epsilon in order to hurt the network.', 'start': 1561.724, 'duration': 4.802}, {'end': 1573.675, 'text': 'You can view this as taking the observation that the network is more or less linear, as we showed on this slide,', 'start': 1567.711, 'duration': 5.964}], 'summary': 'Creating adversarial examples using max norm perturbation and the sign of the gradient to cause misbehavior in the system.', 'duration': 44.265, 'max_score': 1507.376, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/CIfsB_EYsVI/pics/CIfsB_EYsVI1507376.jpg'}, {'end': 1658.409, 'src': 'embed', 'start': 1626.942, 'weight': 1, 'content': [{'end': 1635.346, 'text': "when you bring out a more powerful method that takes longer to evaluate, they find that they can't overcome the more computationally expensive attack.", 'start': 1626.942, 'duration': 8.404}, {'end': 1642.285, 'text': "I've told you that adversarial examples happen because the model is very linear.", 'start': 1638.84, 'duration': 3.445}, {'end': 1648.494, 'text': 'And then I told you that we could use this linearity assumption to build this attack, the fast gradient sign method.', 'start': 1642.966, 'duration': 5.528}, {'end': 1658.409, 'text': "This method when applied to a regular neural network that doesn't have any special defenses will get over a 99% attack success rate.", 'start': 1649.767, 'duration': 8.642}], 'summary': 'Adversarial attack success rate over 99% using fast gradient sign method on regular neural network.', 'duration': 31.467, 'max_score': 1626.942, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/CIfsB_EYsVI/pics/CIfsB_EYsVI1626942.jpg'}],
'start': 1138.087, 'title': 'Adversarial attacks on cnns', 'summary': "Discusses the impact of adversarial examples on cnns, showcasing drastic changes in model output through small perturbations and explaining the fgsm attack's high success rate of over 99% on regular neural networks.", 'chapters': [{'end': 1551.641, 'start': 1138.087, 'title': 'Adversarial examples and cnn interpretation', 'summary': "Explains how perturbing input images in the direction of the network's misclassification causes drastic changes in model output, as evidenced by the creation of adversarial examples with small l2 norm perturbations and the interpretation of the logits for different classes in a convolutional network.", 'duration': 413.554, 'highlights': ['Creation of Adversarial Examples', 'Interpretation of Logits in Convolutional Network', 'Importance of Constraining Perturbations in Adversarial Examples']}, {'end': 1854.742, 'start': 1552.942, 'title': 'Fgsm attack and adversarial examples', 'summary': 'Explains the fast gradient sign method (fgsm) attack, which leverages the linearity assumption of neural networks to quickly generate adversarial examples, achieving over a 99% attack success rate on regular neural networks without defenses.', 'duration': 301.8, 'highlights': ['The Fast Gradient Sign Method (FGSM) attack can achieve over a 99% attack success rate on regular neural networks without defenses.', 'The FGSM attack leverages the linearity assumption of neural networks to quickly generate adversarial examples.', 'Adversarial examples live more or less in linear subspaces, near linear decision boundaries, and can be rapidly generated using the FGSM attack.']}], 'duration': 716.655, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/CIfsB_EYsVI/pics/CIfsB_EYsVI1138087.jpg', 'highlights': ['The Fast Gradient Sign Method (FGSM) attack can achieve over a 99% attack success rate on regular neural networks without defenses.', 'Adversarial examples live more or less in linear subspaces, near linear decision boundaries, and can be rapidly generated using the FGSM attack.', 'Interpretation of Logits in Convolutional Network', 'Importance of Constraining Perturbations in Adversarial Examples', 'Creation of Adversarial Examples', 'The FGSM attack leverages the linearity assumption of neural networks to quickly generate adversarial examples.']}, {'end': 2571.612, 'segs': [{'end': 1930.08, 'src': 'embed', 'start': 1898.749, 'weight': 4, 'content': [{'end': 1901.11, 'text': "You can add a lot of noise to a clean example, and it'll stay clean.", 'start': 1898.749, 'duration': 2.361}, {'end': 1905.773, 'text': 'Here we make random cross sections where both axes are randomly chosen directions.', 'start': 1901.751, 'duration': 4.022}, {'end': 1913.998, 'text': "And you see that on CIFAR-10, most of the cells are completely white, meaning that they're correctly classified to start with, and when you add noise,", 'start': 1906.494, 'duration': 7.504}, {'end': 1915.299, 'text': 'they stay correctly classified.', 'start': 1913.998, 'duration': 1.301}, {'end': 1919.047, 'text': 'We also see that the model makes some mistakes because this is the test set.', 'start': 1915.839, 'duration': 3.208}, {'end': 1924.139, 'text': "And generally if a test example starts out misclassified, adding the noise doesn't change it.", 'start': 1919.729, 'duration': 4.41}, {'end': 1930.08, 'text': 'There are a few exceptions where, if you look in the third row and third column,', 'start': 1924.636, 'duration': 
5.444}], 'summary': "Adding noise to clean examples doesn't change correct classifications; misclassified examples remain unchanged.", 'duration': 31.331, 'max_score': 1898.749, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/CIfsB_EYsVI/pics/CIfsB_EYsVI1898749.jpg'}, {'end': 2167.141, 'src': 'embed', 'start': 2118.894, 'weight': 0, 'content': [{'end': 2125.236, 'text': 'A lot of the time in the adversarial example research community, we refer back to the story of Clever Hans.', 'start': 2118.894, 'duration': 6.342}, {'end': 2131.137, 'text': 'This comes from an essay by Bob Sturm called Clever Hans, Clever Algorithms,', 'start': 2126.076, 'duration': 5.061}, {'end': 2135.759, 'text': "because Clever Hans is a pretty good metaphor for what's happening with machine learning algorithms.", 'start': 2131.137, 'duration': 4.622}, {'end': 2139.941, 'text': 'So Clever Hans was a horse that lived in the early 1900s.', 'start': 2136.559, 'duration': 3.382}, {'end': 2143.602, 'text': 'His owner trained him to do arithmetic problems.', 'start': 2140.841, 'duration': 2.761}, {'end': 2149.665, 'text': "So you could ask him, Clever Hans, what's two plus one? And he would answer by tapping his hoof.", 'start': 2144.263, 'duration': 5.402}, {'end': 2159.63, 'text': "And after the third tap, everybody would start cheering and clapping and looking excited because he'd actually done an arithmetic problem.", 'start': 2153.307, 'duration': 6.323}, {'end': 2163.78, 'text': "Well, it turned out that he hadn't actually learned to do arithmetic.", 'start': 2160.939, 'duration': 2.841}, {'end': 2167.141, 'text': 'But it was actually pretty hard to figure out what was going on.', 'start': 2164.26, 'duration': 2.881}], 'summary': 'Clever hans, a horse trained for arithmetic, serves as a metaphor for machine learning challenges.', 'duration': 48.247, 'max_score': 2118.894, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/CIfsB_EYsVI/pics/CIfsB_EYsVI2118894.jpg'}, {'end': 2393.867, 'src': 'embed', 'start': 2367.549, 'weight': 2, 'content': [{'end': 2374.391, 'text': 'If you have a softmax classifier, it has to give you a distribution over the N different classes that you train it on.', 'start': 2367.549, 'duration': 6.842}, {'end': 2379.355, 'text': "So there's a few ways that you can argue that the model is telling you that there's something rather than nothing.", 'start': 2375.091, 'duration': 4.264}, {'end': 2386.801, 'text': 'One is you can say if it assigns something like 90% to one particular class, that seems to be voting for that class being there.', 'start': 2379.955, 'duration': 6.846}, {'end': 2388.282, 'text': "We'd much rather see it.", 'start': 2387.261, 'duration': 1.021}, {'end': 2393.867, 'text': "give us something like a uniform distribution, saying this noise doesn't look like anything in the training set,", 'start': 2388.282, 'duration': 5.585}], 'summary': 'Softmax classifier provides distribution over n classes, favoring 90% for a class to indicate presence.', 'duration': 26.318, 'max_score': 2367.549, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/CIfsB_EYsVI/pics/CIfsB_EYsVI2367549.jpg'}], 'start': 1855.182, 'title': 'Adversarial examples in ml and subspace', 'summary': 'Explores the generation of adversarial examples in machine learning, their impact on classification decisions, and the discovery of adversarial regions, with an average of 25 dimensions, while showcasing the vulnerability of machine learning 
algorithms to adversarial examples.', 'chapters': [{'end': 1958.025, 'start': 1855.182, 'title': 'Adversarial examples in ml', 'summary': 'Discusses the generation of adversarial examples in machine learning, demonstrating that adding noise to clean examples has little effect on classification decisions compared to adversarial examples, with the majority of cells correctly classified even after adding noise, and that adversarial examples are not noise, as they can cross into a two-dimensional subspace of adversarial examples on cifar-10.', 'duration': 102.843, 'highlights': ['Adversarial examples are not noise, as they can cross into a two-dimensional subspace on CIFAR-10, with the majority of cells staying correctly classified even after adding noise.', 'Adding noise to clean examples has little effect on classification decisions compared to adversarial examples, as demonstrated on CIFAR-10.', 'In some cases, noise can make the model misclassify the example for especially large noise values, but for the most part, noise has very little effect on the classification decision compared to adversarial examples.', "In high dimensional spaces, adversarial examples can be generated by manipulating the model's decision boundaries, leading to misclassifications and demonstrating the vulnerability of machine learning models."]}, {'end': 2571.612, 'start': 1958.025, 'title': 'Adversarial subspace and machine learning', 'summary': 'Discusses the discovery of adversarial regions with an average of 25 dimensions, the implications of subspace dimensionality on model transferability, and the analogy of machine learning algorithms to the story of clever hans, showcasing their vulnerability to adversarial examples.', 'duration': 613.587, 'highlights': ['Adversarial region discovered to have an average of about 25 dimensions, impacting the likelihood of finding adversarial examples by generating random noise.', 'Subspace dimensionality of adversarial regions influences model transferability, with larger dimensions increasing the likelihood of intersecting subspaces and enabling transfer of adversarial examples between models.', 'Analogy drawn between machine learning algorithms and the story of Clever Hans, illustrating how algorithms, like the horse, can be easily fooled by adversarial examples due to their focus on linear patterns and susceptibility to distribution shifts.']}], 'duration': 716.43, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/CIfsB_EYsVI/pics/CIfsB_EYsVI1855182.jpg', 'highlights': ['Adversarial region discovered with an average of about 25 dimensions', "In high dimensional spaces, adversarial examples manipulate model's decision boundaries", 'Subspace dimensionality influences model transferability', 'Adversarial examples can cross into a two-dimensional subspace on CIFAR-10', 'Noise has little effect on classification decisions compared to adversarial examples']}, {'end': 3451.561, 'segs': [{'end': 2882.914, 'src': 'embed', 'start': 2851.551, 'weight': 3, 'content': [{'end': 2856.374, 'text': "So they're extremely difficult to train, even with batch normalization and methods like that.", 'start': 2851.551, 'duration': 4.823}, {'end': 2864.81, 'text': "I haven't managed to train a deep RBF network yet, but I think if somebody comes up with better hyperparameters or a new,", 'start': 2857.549, 'duration': 7.261}, {'end': 2867.151, 'text': 'more powerful optimization algorithm,', 'start': 2864.81, 'duration': 2.341}, {'end': 2882.914, 'text': "it might be 
possible to solve the adversarial example problem by training a deep RBF network where the model is so nonlinear and has such wide flat areas that the adversary is not able to push the cost uphill just by making small changes to the model's input.", 'start': 2867.151, 'duration': 15.763}], 'summary': 'Training a deep rbf network may solve the adversarial example problem with better hyperparameters or a more powerful optimization algorithm.', 'duration': 31.363, 'max_score': 2851.551, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/CIfsB_EYsVI/pics/CIfsB_EYsVI2851551.jpg'}, {'end': 3185.003, 'src': 'embed', 'start': 3157.008, 'weight': 1, 'content': [{'end': 3159.369, 'text': "Don Song's group at UC Berkeley studied this.", 'start': 3157.008, 'duration': 2.361}, {'end': 3171.696, 'text': 'They found that if they take an ensemble of different models and they use gradient descent to search for an adversarial example that will fool every member of their ensemble,', 'start': 3160.39, 'duration': 11.306}, {'end': 3177.199, 'text': "then it's extremely likely that it will transfer and fool a new machine learning model.", 'start': 3171.696, 'duration': 5.503}, {'end': 3185.003, 'text': "So if you have an ensemble of five models, you can get it to the point where there's essentially a 100% chance that you'll fool a sixth model.", 'start': 3177.699, 'duration': 7.304}], 'summary': 'Uc berkeley study shows 100% success in fooling new ml model with ensemble of 5 models', 'duration': 27.995, 'max_score': 3157.008, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/CIfsB_EYsVI/pics/CIfsB_EYsVI3157008.jpg'}, {'end': 3242.928, 'src': 'embed', 'start': 3218.008, 'weight': 0, 'content': [{'end': 3225.792, 'text': 'If you make an ensemble that omits ResNet-152, in their experiments they found that there was a 0% chance of ResNet-152 resisting that attack.', 'start': 3218.008, 'duration': 7.784}, {'end': 3234.679, 'text': 'That probably indicates they should have run some more adversarial examples until they found a non-zero success rate.', 'start': 3229.794, 'duration': 4.885}, {'end': 3242.928, 'text': 'But it does show that the attack is very powerful and that when you go looking to intentionally cause the transfer effect,', 'start': 3235.38, 'duration': 7.548}], 'summary': 'Omitting resnet-152 in ensemble leads to 0% resistance against attack, indicating need for more adversarial examples.', 'duration': 24.92, 'max_score': 3218.008, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/CIfsB_EYsVI/pics/CIfsB_EYsVI3218008.jpg'}, {'end': 3441.053, 'src': 'embed', 'start': 3409.597, 'weight': 2, 'content': [{'end': 3415.399, 'text': 'Oh, so if we look at, for example, this picture of the panda.', 'start': 3409.597, 'duration': 5.802}, {'end': 3416.76, 'text': 'To us, it looks like a panda.', 'start': 3415.619, 'duration': 1.141}, {'end': 3419.081, 'text': 'To most machine learning models, it looks like a gibbon.', 'start': 3417.18, 'duration': 1.901}, {'end': 3426.764, 'text': "And so this change isn't interfering with our brains, but it fools reliably with lots of different machine learning models.", 'start': 3420.641, 'duration': 6.123}, {'end': 3441.053, 'text': 'I saw somebody actually took this image of the perturbation out of our paper and they pasted it on their Facebook profile picture to see if it could interfere with Facebook recognizing them.', 'start': 3429.744, 'duration': 11.309}], 'summary': 'Adversarial 
perturbations can fool machine learning models, e.g., panda image recognized as gibbon.', 'duration': 31.456, 'max_score': 3409.597, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/CIfsB_EYsVI/pics/CIfsB_EYsVI3409597.jpg'}], 'start': 2572.553, 'title': 'Adversarial examples in machine learning', 'summary': 'Discusses the vulnerability of machine learning models to adversarial examples, demonstrating how logistic regression models and deep rbf networks can be fooled, the transferability of adversarial examples, and the potential for improving machine learning algorithms.', 'chapters': [{'end': 2654.174, 'start': 2572.553, 'title': 'Adversarial example research', 'summary': 'Discusses the simplicity of high dimensional linear models and the use of the fast gradient sign method to attack a logistic regression model, demonstrating how the sign of the weights can be manipulated to create adversarial examples.', 'duration': 81.621, 'highlights': ['The fast gradient sign method can be used to attack a logistic regression model by manipulating the sign of the weights, demonstrating the simplicity and vulnerability of high dimensional linear models.', "Logistic regression model can be described by a weight vector and a single scalar bias term, simplifying the model's structure and making it vulnerable to attacks.", "The sign of the weights in a logistic regression model gives the sign of the gradient, which can be manipulated to create adversarial examples, showcasing the model's susceptibility to attacks.", "The weights used to discriminate sevens and threes in the logistic regression model should resemble the difference between the average seven and three, providing insight into the model's decision-making process."]}, {'end': 3451.561, 'start': 2655.578, 'title': 'Adversarial examples in machine learning', 'summary': 'Discusses the vulnerability of machine learning models to adversarial examples, showcasing how logistic regression models and deep rbf networks can be fooled, the transferability of adversarial examples across different models and datasets, and the potential for improving machine learning algorithms through the study of adversarial examples.', 'duration': 795.983, 'highlights': ['Logistic regression and deep RBF networks are vulnerable to adversarial examples, with the ability to be fooled into misclassifying images by perturbing the weights or using template matching, respectively.', 'Adversarial examples generalize across different datasets and models, with transfer rates ranging from 60 to 80%, and the possibility of intentionally causing the transfer effect by leveraging ensemble models.', 'Research indicates that adversarial examples do not have a significant impact on human perception, suggesting a fundamental difference between human and machine learning algorithms, potentially leading to insights for improving machine learning models.', 'The transfer effect can be leveraged to fool machine learning classifiers hosted by various companies, and the vulnerability of these classifiers to adversarial examples has been demonstrated through practical attacks.']}], 'duration': 879.008, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/CIfsB_EYsVI/pics/CIfsB_EYsVI2572553.jpg', 'highlights': ['The transfer effect can be leveraged to fool machine learning classifiers hosted by various companies, and the vulnerability of these classifiers to adversarial examples has been demonstrated through practical attacks.', 'Adversarial 
examples generalize across different datasets and models, with transfer rates ranging from 60 to 80%, and the possibility of intentionally causing the transfer effect by leveraging ensemble models.', 'The fast gradient sign method can be used to attack a logistic regression model by manipulating the sign of the weights, demonstrating the simplicity and vulnerability of high dimensional linear models.', 'Logistic regression and deep RBF networks are vulnerable to adversarial examples, with the ability to be fooled into misclassifying images by perturbing the weights or using template matching, respectively.']}, {'end': 4039.996, 'segs': [{'end': 3589.097, 'src': 'embed', 'start': 3536.517, 'weight': 2, 'content': [{'end': 3539.818, 'text': "we're also showing that there's transfer across the model that you use.", 'start': 3536.517, 'duration': 3.301}, {'end': 3546.681, 'text': "So the attacker could conceivably fool a system that's deployed in a physical agent,", 'start': 3540.538, 'duration': 6.143}, {'end': 3554.684, 'text': "even if they don't have access to the model on that agent and even if they can't interface directly with the agent but just modify,", 'start': 3546.681, 'duration': 8.003}, {'end': 3558.486, 'text': 'subtly modify objects that it can see in its environment.', 'start': 3554.684, 'duration': 3.802}, {'end': 3573.785, 'text': 'So I think a lot of that comes back to the maps that I showed earlier.', 'start': 3569.642, 'duration': 4.143}, {'end': 3583.772, 'text': "That if you cross over the boundary into the realm of adversarial examples, they occupy a pretty wide space and they're very densely packed in there.", 'start': 3574.285, 'duration': 9.487}, {'end': 3589.097, 'text': "So if you jostle around a little bit, you're not going to recover from the adversarial attack.", 'start': 3584.172, 'duration': 4.925}], 'summary': 'Transferability of adversarial attacks across models can compromise physical agents, even without direct access.', 'duration': 52.58, 'max_score': 3536.517, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/CIfsB_EYsVI/pics/CIfsB_EYsVI3536517.jpg'}, {'end': 3685.221, 'src': 'embed', 'start': 3661.091, 'weight': 0, 'content': [{'end': 3667.439, 'text': 'Nicholas Carlini at Berkeley just released a paper where he shows that 10 of those defenses are broken.', 'start': 3661.091, 'duration': 6.348}, {'end': 3670.482, 'text': 'So this is a really, really hard problem.', 'start': 3668.74, 'duration': 1.742}, {'end': 3674.908, 'text': "You can't just make it go away by using traditional regularization techniques.", 'start': 3670.883, 'duration': 4.025}, {'end': 3680.016, 'text': 'Particular generative models are not enough to solve the problem.', 'start': 3676.193, 'duration': 3.823}, {'end': 3685.221, 'text': "A lot of people say, oh, the problem that's going on here is that you don't know anything about the distribution over the input pixels.", 'start': 3680.377, 'duration': 4.844}], 'summary': 'Carlini at berkeley finds 10 broken defenses against attacks on input pixels.', 'duration': 24.13, 'max_score': 3661.091, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/CIfsB_EYsVI/pics/CIfsB_EYsVI3661091.jpg'}, {'end': 4027.946, 'src': 'heatmap', 'start': 3973.597, 'weight': 0.723, 'content': [{'end': 3975.957, 'text': 'Training on adversarial examples is a good regularizer.', 'start': 3973.597, 'duration': 2.36}, {'end': 3978.738, 'text': "If you're overfitting, it can make you overfit less.", 'start': 
3976.498, 'duration': 2.24}, {'end': 3981.199, 'text': "If you're underfitting, it'll just make you underfit worse.", 'start': 3979.098, 'duration': 2.101}, {'end': 3987.243, 'text': "Other kinds of models besides deep neural nets don't benefit as much from adversarial training.", 'start': 3982.859, 'duration': 4.384}, {'end': 3993.608, 'text': 'So when we started this whole topic of study, we thought that deep neural nets might be uniquely vulnerable to adversarial examples.', 'start': 3988.023, 'duration': 5.585}, {'end': 3998.973, 'text': "But it turns out that actually, they're one of the few models that has a clear path to resisting them.", 'start': 3994.189, 'duration': 4.784}, {'end': 4001.455, 'text': 'Linear models are just always going to be linear.', 'start': 3999.673, 'duration': 1.782}, {'end': 4004.297, 'text': "They don't have much hope of resisting adversarial examples.", 'start': 4001.875, 'duration': 2.422}, {'end': 4006.979, 'text': 'Deep neural nets can be trained to be nonlinear.', 'start': 4005.018, 'duration': 1.961}, {'end': 4010.042, 'text': "And so it seems like there's a path to a solution for them.", 'start': 4007.54, 'duration': 2.502}, {'end': 4019.659, 'text': "Even with adversarial training, we still find that we aren't able to make models where, if you optimize the input to belong to different classes,", 'start': 4011.673, 'duration': 7.986}, {'end': 4021.041, 'text': 'you get examples of those classes.', 'start': 4019.659, 'duration': 1.382}, {'end': 4027.946, 'text': 'Here I start out with a CIFAR 10 truck, and I turn it into each of the 10 different CIFAR 10 classes.', 'start': 4021.861, 'duration': 6.085}], 'summary': 'Adversarial training is effective for deep neural nets, providing a clear path to resistance and improved nonlinearity.', 'duration': 54.349, 'max_score': 3973.597, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/CIfsB_EYsVI/pics/CIfsB_EYsVI3973597.jpg'}], 'start': 3455.319, 'title': 'Adversarial attacks and defenses', 'summary': 'Discusses practical significance of fooling malware detectors and object recognition systems using gan, challenges of adversarial attacks on machine learning models, and the effectiveness of adversarial training for neural nets, showcasing significant reduction in test error rates.', 'chapters': [{'end': 3536.517, 'start': 3455.319, 'title': 'Fooling malware detectors and object recognition systems', 'summary': 'Discusses the practical significance of fooling malware detectors and object recognition systems, highlighting the use of gan to generate adversarial examples for malware detectors and the successful fooling of an object recognition system running on a phone through printed adversarial examples.', 'duration': 81.198, 'highlights': ['Using GAN to generate adversarial examples for malware detectors, as demonstrated by the model called MalGAN, has high practical significance.', 'Demonstrating the successful fooling of an object recognition system running on a phone through printed adversarial examples, despite differences between the system on the camera and the model used to generate the adversarial examples.', 'The study by Catherine Gross at the University of Starland and the paper co-authored by Alex Karakin and Sammy Bengio, indicating the increasing interest in fooling malware detectors and object recognition systems for real-world applications.']}, {'end': 3767.18, 'start': 3536.517, 'title': 'Adversarial attacks and defenses', 'summary': 'Discusses the challenges of 
adversarial attacks on machine learning models, showcasing the difficulty of defending against them and highlighting the limitations of traditional defenses, with examples of broken defenses and the importance of posterior distribution over class labels.', 'duration': 230.663, 'highlights': ["Adversarial attacks on machine learning models can transfer across different models and physical agents, allowing attackers to subtly modify objects in the agent's environment to fool the system.", 'Defending against adversarial attacks is challenging, as even published defenses have been shown to be broken, presenting a significant obstacle in making systems more robust.', 'The posterior distribution over class labels y given inputs x is crucial in defending against adversarial attacks, emphasizing the limitations of using generative models alone to solve the problem.']}, {'end': 4039.996, 'start': 3767.341, 'title': 'Adversarial training for neural nets', 'summary': 'Discusses the effectiveness of adversarial training for neural nets, indicating that training on adversarial examples can significantly reduce test error rates and help in resisting attacks, particularly for deep neural nets.', 'duration': 272.655, 'highlights': ['Training on adversarial examples significantly reduces test error rates, with the error dropping to less than 1% when trained on adversarial examples compared to clean examples only.', "Deep neural nets can be trained to resist adversarial examples and have a clear path to resisting them, unlike linear models which are always linear and don't have much hope of resisting adversarial examples.", 'Adversarial training acts as a good regularizer, reducing overfitting and helping in doing the original task better.', 'The choice of the family of generative model has a big effect on whether the posterior becomes deterministic or uniform, as the model extrapolates, and designing a rich, deep generative model that can generate realistic ImageNet images and calculate its posterior distribution accurately may improve the approach.']}], 'duration': 584.677, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/CIfsB_EYsVI/pics/CIfsB_EYsVI3455319.jpg', 'highlights': ['Training on adversarial examples reduces test error rates to less than 1% compared to clean examples only.', 'Adversarial training acts as a good regularizer, reducing overfitting and improving the original task.', 'Adversarial attacks on machine learning models can transfer across different models and physical agents.', 'Using GAN to generate adversarial examples for malware detectors, as demonstrated by MalGAN, has high practical significance.', 'Defending against adversarial attacks is challenging, as even published defenses have been shown to be broken.']}, {'end': 4880.327, 'segs': [{'end': 4126.56, 'src': 'heatmap', 'start': 4072.708, 'weight': 0.72, 'content': [{'end': 4075.709, 'text': "Then we make an adversarial perturbation that's intended to change the guess.", 'start': 4072.708, 'duration': 3.001}, {'end': 4079.711, 'text': 'And we just try to make it say, this is a truck or something like that.', 'start': 4076.67, 'duration': 3.041}, {'end': 4081.192, 'text': "It's not whatever you believed it was before.", 'start': 4079.871, 'duration': 1.321}, {'end': 4087.114, 'text': 'You can then train it to say that the distribution of our classes should still be the same as it was before.', 'start': 4082.092, 'duration': 5.022}, {'end': 4090.276, 'text': 'That this should still be considered 
probably a bird or a plane.', 'start': 4087.615, 'duration': 2.661}, {'end': 4095.221, 'text': 'This technique is called virtual adversarial training and it was invented by Takeru Miyato.', 'start': 4091.178, 'duration': 4.043}, {'end': 4098.904, 'text': 'He was my intern at Google after he did this work.', 'start': 4096.142, 'duration': 2.762}, {'end': 4100.145, 'text': 'At Google.', 'start': 4099.625, 'duration': 0.52}, {'end': 4106.991, 'text': 'we invited him to come and apply his invention to text classification,', 'start': 4100.145, 'duration': 6.846}, {'end': 4116.999, 'text': 'because this ability to learn from unlabeled examples makes it possible to do semi-supervised learning where you learn from both unlabeled and labeled examples,', 'start': 4106.991, 'duration': 10.008}, {'end': 4119.12, 'text': "and there's quite a lot of unlabeled text in the world.", 'start': 4116.999, 'duration': 2.121}, {'end': 4126.56, 'text': 'So we were able to bring down the error rate on several different text classification tasks by using this virtual adversarial training.', 'start': 4119.921, 'duration': 6.639}], 'summary': 'Virtual adversarial training reduces error rates on text classification tasks by using unlabeled data.', 'duration': 53.852, 'max_score': 4072.708, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/CIfsB_EYsVI/pics/CIfsB_EYsVI4072708.jpg'}, {'end': 4376.437, 'src': 'embed', 'start': 4348.468, 'weight': 1, 'content': [{'end': 4352.031, 'text': 'Yeah, so the question is, is it possible to identify which layer contributes the most to this issue?', 'start': 4348.468, 'duration': 3.563}, {'end': 4368.773, 'text': "One thing is that the last layer is somewhat important, because say that you made a feature extractor that's completely robust to adversarial perturbations and can shrink them to be very,", 'start': 4354.252, 'duration': 14.521}, {'end': 4369.253, 'text': 'very small.', 'start': 4368.773, 'duration': 0.48}, {'end': 4371.114, 'text': 'And then the last layer is still linear.', 'start': 4369.553, 'duration': 1.561}, {'end': 4376.437, 'text': 'Then it has all the problems that are typically associated with linear models.', 'start': 4372.555, 'duration': 3.882}], 'summary': 'Even with a fully robust feature extractor, a linear last layer still has all the problems associated with linear models.', 'duration': 27.969, 'max_score': 4348.468, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/CIfsB_EYsVI/pics/CIfsB_EYsVI4348468.jpg'}, {'end': 4488.22, 'src': 'embed', 'start': 4462.834, 'weight': 3, 'content': [{'end': 4468.97, 'text': "So you're gonna actually see, like in this one here, every time we do an epoch,", 'start': 4462.834, 'duration': 6.136}, {'end': 4473.492, 'text': "we've generated the same number of adversarial examples as there are training examples.", 'start': 4468.97, 'duration': 4.522}, {'end': 4478.235, 'text': 'So every epoch here is 50,000 adversarial examples.', 'start': 4474.133, 'duration': 4.102}, {'end': 4488.22, 'text': 'You can see that adversarial training is a very data-hungry process, in that you need to make new adversarial examples every time you update the weights.', 'start': 4478.835, 'duration': 9.385}], 'summary': 'Adversarial training generates 50,000 examples per epoch, indicating its data-intensive nature.', 'duration': 25.386, 'max_score': 4462.834, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/CIfsB_EYsVI/pics/CIfsB_EYsVI4462834.jpg'}, {'end': 
4780.951, 'src': 'embed', 'start': 4683.248, 'weight': 0, 'content': [{'end': 4692.511, 'text': "And in some of the other slides, like these maps, you don't get that effect where subtracting epsilon off eventually boosts the adversarial class.", 'start': 4683.248, 'duration': 9.263}, {'end': 4700.305, 'text': "Part of what's going on is I think I'm using larger epsilon here and so you might eventually see that effect if I made these maps wider.", 'start': 4693.581, 'duration': 6.724}, {'end': 4708.971, 'text': "I made the maps narrower because it's like quadratic time to build a 2D map and it's linear time to build a 1D cross section.", 'start': 4700.746, 'duration': 8.225}, {'end': 4715.716, 'text': "So I just didn't afford the GPU time to make the maps quite as wide.", 'start': 4710.672, 'duration': 5.044}, {'end': 4720.38, 'text': 'I also think that this might just be a weird effect that happened randomly on this one example.', 'start': 4716.116, 'duration': 4.264}, {'end': 4723.983, 'text': "It's not something that I remember being used to seeing a lot of the time.", 'start': 4721.06, 'duration': 2.923}, {'end': 4732.75, 'text': "Most things that I observed don't happen perfectly consistently, but if they happen like 80% of the time, then I'll put them in my slide.", 'start': 4725.003, 'duration': 7.747}, {'end': 4736.453, 'text': "A lot of what we're doing is just trying to figure out more or less what's going on.", 'start': 4734.071, 'duration': 2.382}, {'end': 4743.9, 'text': "And so if we find that something happens 80% of the time, then I consider it to be the dominant phenomenon that we're trying to explain.", 'start': 4736.473, 'duration': 7.427}, {'end': 4750.406, 'text': "And after we've got a better explanation for that, then I might start to try to explain some of the weirder things that happen,", 'start': 4744.701, 'duration': 5.705}, {'end': 4752.468, 'text': 'like the frog happening with negative epsilon.', 'start': 4750.406, 'duration': 2.062}, {'end': 4764.669, 'text': "I didn't fully understand the question.", 'start': 4763.444, 'duration': 1.225}, {'end': 4775.689, 'text': "It's about the dimensionality of the.", 'start': 4764.689, 'duration': 11}, {'end': 4780.951, 'text': 'Oh, okay, so the question is how is the dimension of the adversarial subspace related to the dimension of the input?', 'start': 4775.689, 'duration': 5.262}], 'summary': 'The speaker discusses how consistently adversarial effects appear across examples and how the dimension of the adversarial subspace relates to the dimension of the input.', 'duration': 97.703, 'max_score': 4683.248, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/CIfsB_EYsVI/pics/CIfsB_EYsVI4683248.jpg'}], 'start': 4042.526, 'title': 'Virtual adversarial training and adversarial examples', 'summary': 'The section delves into virtual adversarial training, a label-free regularization technique by Takeru Miyato, enabling semi-supervised learning and reducing error rates in text classification. 
Additionally, it explores adversarial perturbations, resistance at different layers, the data-hungry nature of adversarial training, and challenges in model-based optimization.', 'chapters': [{'end': 4219.347, 'start': 4042.526, 'title': 'Virtual adversarial training', 'summary': 'The chapter discusses virtual adversarial training, a technique for training models without labels, invented by Takeru Miyato, which enables semi-supervised learning, reduces error rates in text classification tasks, and holds potential for advancing technological innovations.', 'duration': 176.821, 'highlights': ['Virtual adversarial training enables semi-supervised learning by utilizing both labeled and unlabeled examples, leading to a reduction in error rates for various text classification tasks', 'Takeru Miyato invented virtual adversarial training and applied it to text classification tasks at Google, effectively reducing error rates', 'The potential of virtual adversarial training to unlock technological advances, such as designing new genes, molecules for medicinal drugs, and faster circuits for GPUs, is highlighted']}, {'end': 4880.327, 'start': 4219.347, 'title': 'Adversarial examples and model-based optimization', 'summary': 'The chapter discusses the concept of adversarial perturbations, the impact of different layers in resisting perturbations, the data-hungry nature of adversarial training, and the challenges of model-based optimization in machine learning.', 'duration': 660.98, 'highlights': ['The subspace of adversarial perturbations is only about 50-dimensional, even if the input dimension is 3,000-dimensional.', 'Adversarial training involves generating the same number of adversarial examples as there are training examples for each epoch, making it a data-hungry process.', 'Model-based optimization aims to find inputs that maximize the output of the model, which can lead to finding adversarial examples that fool the model into perceiving the input as something desirable.']}], 'duration': 837.801, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/CIfsB_EYsVI/pics/CIfsB_EYsVI4042526.jpg', 'highlights': ['Virtual adversarial training enables semi-supervised learning by utilizing both labeled and unlabeled examples, leading to a reduction in error rates for various text classification tasks', 'Takeru Miyato invented virtual adversarial training and applied it to text classification tasks at Google, effectively reducing error rates', 'The potential of virtual adversarial training to unlock technological advances, such as designing new genes, molecules for medicinal drugs, and faster circuits for GPUs, is highlighted', 'The subspace of adversarial perturbations is only about 50-dimensional, even if the input dimension is 3,000-dimensional', 'Adversarial training involves generating the same number of adversarial examples as there are training examples for each epoch, making it a data-hungry process', 'Model-based optimization aims to find inputs that maximize the output of the model, which can lead to finding adversarial examples that fool the model into perceiving the input as something desirable']}], 'highlights': ['Deep learning achieved human-level performance around 2013 (Ch.1)', 'The FGSM attack achieves a success rate of over 99% on regular neural networks (Ch.4)', 'Adversarial training reduces test error rates to less than 1% (Ch.7)', 'Adversarial examples can be used to compromise systems built on machine learning (Ch.1)', 'Adversarial examples pose real-world security threats (Ch.1)', 
'Adversarial examples can transfer across different datasets and models (Ch.6)', 'Virtual adversarial training reduces error rates for various text classification tasks (Ch.8)', 'Adversarial examples live more or less in linear subspaces (Ch.4)', "Adversarial examples can manipulate a model's decision boundaries in high-dimensional spaces (Ch.5)", 'The potential for using adversarial examples to improve other machine learning algorithms presents an opportunity for further advancement (Ch.1)']}
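A note on the FGSM and adversarial-training highlights above: the following is a rough, illustrative sketch only, not code from the lecture or the course materials; the function names, the default epsilon, and the equal weighting of the clean and adversarial losses are assumptions chosen for clarity. It shows the fast gradient sign method and the "generate fresh adversarial examples at every weight update" recipe described in the transcript segments, written in PyTorch.

```python
import torch
import torch.nn.functional as F

def fgsm_perturb(model, x, y, epsilon):
    # Fast gradient sign method: x_adv = x + epsilon * sign(grad_x J(theta, x, y)).
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad = torch.autograd.grad(loss, x)[0]   # gradient w.r.t. the input only
    x_adv = x + epsilon * grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()    # assumes inputs are scaled to [0, 1]

def adversarial_training_step(model, optimizer, x, y, epsilon=0.25):
    # One update on a mix of clean and freshly generated adversarial examples.
    # The adversarial batch is regenerated from the current weights at every step,
    # which is what makes adversarial training such a data-hungry procedure.
    x_adv = fgsm_perturb(model, x, y, epsilon)
    optimizer.zero_grad()
    loss = 0.5 * (F.cross_entropy(model(x), y) + F.cross_entropy(model(x_adv), y))
    loss.backward()
    optimizer.step()
    return loss.item()
```

The 50/50 loss weighting and epsilon=0.25 are placeholders rather than settings taken from the lecture; in practice both are tuned, and too large a perturbation budget can hurt clean accuracy, consistent with the "underfitting" caveat in the first transcript segment of this section.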