title
Geoffrey Hinton Unpacks The Forward-Forward Algorithm
description
In this episode, Geoffrey Hinton, a renowned computer scientist and a leading expert in deep learning, provides an in-depth exploration of his new learning algorithm: the forward-forward algorithm. Hinton argues that this algorithm is a more plausible model for how the cerebral cortex might learn, and that it could be the key to unlocking new possibilities in artificial intelligence.
Throughout the episode, Hinton discusses the mechanics of the forward-forward algorithm, including how it differs from backpropagation-based deep learning and why he considers it more biologically plausible. He also discusses potential applications of the algorithm, such as low-energy analog hardware that can learn even without a precise model of its own forward computation.
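To make the mechanics concrete, here is a minimal sketch (in NumPy) of a single layer trained the forward-forward way, following the scheme described in the episode: each layer has a purely local objective, a "goodness" equal to the sum of its squared activities, which is pushed up on positive (real) data and down on negative data; for supervised learning, the label is supplied as part of the input. The threshold, learning rate, and layer sizes below are illustrative assumptions, not values from the episode or the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

class FFLayer:
    """One hidden layer trained with the local forward-forward rule."""

    def __init__(self, n_in, n_out, theta=2.0, lr=0.03):
        # theta is the goodness threshold; it and lr are illustrative.
        self.W = rng.normal(0.0, 1.0 / np.sqrt(n_in), size=(n_out, n_in))
        self.b = np.zeros(n_out)
        self.theta = theta
        self.lr = lr

    def forward(self, x):
        # Normalize the incoming vector so a layer cannot pass its own
        # goodness downstream just through the length of its activities.
        x = x / (np.linalg.norm(x) + 1e-8)
        z = self.W @ x + self.b
        return x, z, np.maximum(z, 0.0)  # ReLU activities

    def train_step(self, x, positive):
        x, z, h = self.forward(x)
        goodness = np.sum(h ** 2)  # the layer-local objective
        # p(input is positive) = sigmoid(goodness - theta)
        p = 1.0 / (1.0 + np.exp(-(goodness - self.theta)))
        target = 1.0 if positive else 0.0
        # Logistic loss: d(loss)/d(goodness) = p - target; chain through
        # goodness = sum(h^2) and h = relu(z). No backward pass through
        # any other layer is needed.
        dz = (p - target) * 2.0 * h * (z > 0)
        self.W -= self.lr * np.outer(dz, x)
        self.b -= self.lr * dz
        return h

# Supervised use, as described in the episode: the label is part of the
# input. A correct (image, label) pair is positive data; the same image
# with a deliberately wrong label is negative data.
def with_label(image, label, n_classes=10):
    one_hot = np.zeros(n_classes)
    one_hot[label] = 1.0
    return np.concatenate([one_hot, image])

layer = FFLayer(n_in=10 + 784, n_out=256)
image = rng.random(784)  # stand-in for a real image
layer.train_step(with_label(image, 3), positive=True)   # correct label
layer.train_step(with_label(image, 7), positive=False)  # incorrect label
```

In a full network, the normalized activities of one layer simply become the input to the next, and every layer applies the same local rule, so neither phase requires propagating errors backwards through the stack.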
Hinton shares his thoughts on the current state of deep learning and its future prospects, particularly in neuroscience. He explores how advances in deep learning may help us gain a better understanding of our own brains and how we can use this knowledge to create more intelligent machines.
Overall, this podcast provides a fascinating glimpse into the latest developments in artificial intelligence and the cutting-edge research being conducted by one of its leading pioneers.
Craig Smith Twitter: https://twitter.com/craigss
Eye on A.I. Twitter: https://twitter.com/EyeOn_AI
detail
{'title': 'Geoffrey Hinton Unpacks The Forward-Forward Algorithm', 'heatmap': [{'end': 568.77, 'start': 138.672, 'weight': 0.992}, {'end': 854.616, 'start': 636.857, 'weight': 0.781}, {'end': 1064.008, 'start': 881.175, 'weight': 0.722}, {'end': 2130.474, 'start': 2083.215, 'weight': 0.782}, {'end': 2440.038, 'start': 2401.422, 'weight': 0.744}, {'end': 3009.165, 'start': 2967.136, 'weight': 0.812}, {'end': 3258.366, 'start': 3217.102, 'weight': 0.72}], 'summary': 'Geoffrey Hinton introduces the forward-forward algorithm, addressing challenges of backpropagation and discussing evolving hidden layers, learning phases, discrimination, and scaling algorithms in neural nets, emphasizing potential applications for information processing in the cerebral cortex.', 'chapters': [{'end': 313.892, 'segs': [{'end': 70.896, 'src': 'embed', 'start': 38.833, 'weight': 0, 'content': [{'end': 49.527, 'text': 'While his application of the back propagation of error algorithm to deep networks set off a revolution in artificial intelligence,', 'start': 38.833, 'duration': 10.694}, {'end': 56.091, 'text': "he doesn't believe that it explains how the brain processes information.", 'start': 50.268, 'duration': 5.823}, {'end': 65.275, 'text': 'Late last year, he introduced a new learning algorithm, which he calls the forward-forward algorithm,', 'start': 56.931, 'duration': 8.344}, {'end': 70.896, 'text': 'that he believes is a more plausible model for how the cerebral cortex might learn.', 'start': 65.275, 'duration': 5.621}], 'summary': 'Backpropagation revolutionized AI, but the new forward-forward algorithm may better model cerebral cortex learning.', 'duration': 32.063, 'max_score': 38.833, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/NWqy_b1OvwQ/pics/NWqy_b1OvwQ38833.jpg'}, {'end': 167.035, 'src': 'embed', 'start': 100.921, 'weight': 2, 'content': [{'end': 109.646, 'text': "Before we begin, I'd like to mention our sponsor, ClearML, an open source end-to-end ML ops solution.", 'start': 100.921, 'duration': 8.725}, {'end': 120.596, 'text': "You can try it for free at clear.ml, that's C-L-E-A-R.ML. Tell them Eye on AI sent you.", 'start': 110.246, 'duration': 10.35}, {'end': 122.18, 'text': "Now here's Geoff.", 'start': 121.198, 'duration': 0.982}, {'end': 127.232, 'text': 'I hope you find the conversation as fascinating as I did.', 'start': 122.781, 'duration': 4.451}, {'end': 150.021, 'text': "I wanted to have you explain to listeners forward-forward networks and why you're looking for something beyond backpropagation,", 'start': 138.672, 'duration': 11.349}, {'end': 152.443, 'text': 'despite its tremendous success.', 'start': 150.021, 'duration': 2.422}, {'end': 156.927, 'text': "Let me start with explaining why I don't believe the brain is doing backpropagation.", 'start': 152.843, 'duration': 4.084}, {'end': 160.79, 'text': 'One thing about backpropagation is you need to have a perfect model of the forward system.', 'start': 157.007, 'duration': 3.783}, {'end': 167.035, 'text': "That is, in backpropagation, it's easiest to think about for a layered net, but it also works for recurrent nets.", 'start': 161.61, 'duration': 5.425}], 'summary': 'Discussion on forward-forward networks and limitations of backpropagation in neural networks.', 'duration': 66.114, 'max_score': 100.921, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/NWqy_b1OvwQ/pics/NWqy_b1OvwQ100921.jpg'}], 'start': 0.009, 'title': 'The forward-forward algorithm and challenges of 
backpropagation', 'summary': "Delves into Geoffrey Hinton's introduction of the forward-forward algorithm. It discusses the challenges of applying backpropagation, highlighting the need for a perfect model of the forward system, the problem of running recurrent nets backwards through time, and the lack of evidence that the brain uses backpropagation.", 'chapters': [{'end': 150.021, 'start': 0.009, 'title': 'The forward-forward algorithm', 'summary': "Delves into Geoffrey Hinton's introduction of the forward-forward algorithm as a more plausible model for how the brain processes information, after his work on backpropagation set off a revolution in artificial intelligence, and provides insights into the algorithm and the journey that led to it.", 'duration': 150.012, 'highlights': ["Geoffrey Hinton introduced the forward-forward algorithm as a more plausible model for how the cerebral cortex might learn. This highlights Hinton's contribution to a new learning algorithm, emphasizing its potential in understanding brain processes.", "The application of the back propagation of error algorithm to deep networks set off a revolution in artificial intelligence. This emphasizes the significant impact of Hinton's work on deep learning and its influence on the field of artificial intelligence.", 'ClearML is mentioned as the sponsor, an open source end-to-end ML ops solution. This highlights the sponsor of the conversation, providing information about an open source ML ops solution.']}, {'end': 313.892, 'start': 150.021, 'title': 'Backpropagation in neural networks', 'summary': "Discusses the challenges of applying backpropagation in neural networks, highlighting the need for a perfect model of the forward system, the issue of running recurrent nets backwards through time, and the lack of evidence supporting the brain's use of backpropagation.", 'duration': 163.871, 'highlights': ['Backpropagation requires a perfect model of the forward system and is challenging for recurrent nets, as running them backwards through time is problematic, especially for processing video. Running recurrent nets backwards through time to derive the gradients for backpropagation is particularly problematic for processing video, as it interrupts the pipelining of input through multiple stages of processing.', 'The lack of evidence supporting the notion that the brain uses backpropagation raises doubts about its plausibility as a model of learning in the brain. There is no evidence supporting the idea that the brain utilizes backpropagation.', "Backpropagation involves determining how a change in weight affects the system's error and adjusting the weights accordingly to minimize the error. 
Backpropagation entails figuring out the gradient to understand the impact of weight changes on reducing the system's error, adjusting the weights in proportion to their contribution."]}], 'duration': 313.883, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/NWqy_b1OvwQ/pics/NWqy_b1OvwQ9.jpg', 'highlights': ['Geoffrey Hinton introduced the forward-forward algorithm as a more plausible model for how the cerebral cortex might learn, emphasizing its potential in understanding brain processes.', "The application of the back propagation of error algorithm to deep networks set off a revolution in artificial intelligence, emphasizing the significant impact of Hinton's work on deep learning and its influence on the field of artificial intelligence.", 'The lack of evidence supporting the notion that the brain uses backpropagation raises doubts about its plausibility as a model of learning in the brain.', 'Backpropagation requires a perfect model of the forward system and is challenging for recurrent nets, as running them backwards through time is problematic, especially for processing video, emphasizing the challenges and limitations of backpropagation in recurrent nets and video processing.', 'ClearML is mentioned as the sponsor, an open source end-to-end ML ops solution, providing information about an open source ML ops solution and highlighting the sponsor of the conversation.']}, {'end': 1260.689, 'segs': [{'end': 348.817, 'src': 'embed', 'start': 315.973, 'weight': 0, 'content': [{'end': 327.453, 'text': 'And so the idea of the forward algorithm is that if you can divide the learning, the process of getting the gradients you need,', 'start': 315.973, 'duration': 11.48}, {'end': 331.514, 'text': 'into two separate phases, you can do one of them online and one of them offline.', 'start': 327.453, 'duration': 4.061}, {'end': 338.055, 'text': 'And the one you do online can be very simple and will allow you to just pipeline stuff through.', 'start': 332.594, 'duration': 5.461}, {'end': 346.837, 'text': 'So the online phase, which is meant to correspond to wake,', 'start': 339.655, 'duration': 7.182}, {'end': 348.817, 'text': 'you put input into the network.', 'start': 346.837, 'duration': 1.98}], 'summary': 'The forward algorithm allows for separate online and offline learning phases in neural networks.', 'duration': 32.844, 'max_score': 315.973, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/NWqy_b1OvwQ/pics/NWqy_b1OvwQ315973.jpg'}, {'end': 626.531, 'src': 'embed', 'start': 604.827, 'weight': 2, 'content': [{'end': 614.023, 'text': "So in all those phases, it's just trying to get higher activity in the hidden layers but only if it's not already got high activity.", 'start': 604.827, 'duration': 9.196}, {'end': 624.249, 'text': 'And you can predict like quarter of a million characters in the positive phase and then switch to the negative phase,', 'start': 616.945, 'duration': 7.304}, {'end': 626.531, 'text': "where the network's generating its own string of characters.", 'start': 624.249, 'duration': 2.282}], 'summary': 'The model predicts a quarter of a million characters in the positive phase, then switches to a negative phase in which the network generates its own string of characters.', 'duration': 21.704, 'max_score': 604.827, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/NWqy_b1OvwQ/pics/NWqy_b1OvwQ604827.jpg'}, {'end': 854.616, 
'src': 'heatmap', 'start': 636.857, 'weight': 0.781, 'content': [{'end': 638.358, 'text': "It's looking at a little window of characters.", 'start': 636.857, 'duration': 1.501}, {'end': 643.071, 'text': 'And then you run for a quarter of a million characters like that.', 'start': 640.809, 'duration': 2.262}, {'end': 645.852, 'text': "And it doesn't actually have to be the same number anymore.", 'start': 643.471, 'duration': 2.381}, {'end': 649.755, 'text': "With Boltzmann machines, it's very important to have the same number of things in the positive phase and negative phase.", 'start': 646.192, 'duration': 3.563}, {'end': 650.855, 'text': "But with this, it isn't.", 'start': 650.095, 'duration': 0.76}, {'end': 664.584, 'text': "And what's remarkable is that up to a few hundred thousand predictions it works almost as well if you separate the phases as opposed to interleaving.", 'start': 653.317, 'duration': 11.267}, {'end': 666.865, 'text': "And that's quite surprising.", 'start': 665.904, 'duration': 0.961}, {'end': 670.067, 'text': 'In human learning.', 'start': 668.446, 'duration': 1.621}, {'end': 680.113, 'text': "Certainly there's wake and sleep for complicated concepts that you're learning, but there's learning going on all the time.", 'start': 672.307, 'duration': 7.806}, {'end': 683.416, 'text': "that doesn't require a sleep phase.", 'start': 680.113, 'duration': 3.303}, {'end': 684.857, 'text': 'There is in this too.', 'start': 683.956, 'duration': 0.901}, {'end': 687.919, 'text': "If you're just running on positive examples,", 'start': 685.017, 'duration': 2.902}, {'end': 694.684, 'text': "it's changing the weights for all the examples where it's not completely obvious that this is positive data.", 'start': 687.919, 'duration': 6.765}, {'end': 699.348, 'text': 'So it does a lot of learning in the positive phase.', 'start': 696.105, 'duration': 3.243}, {'end': 704.615, 'text': 'But if you go on too long, it fails catastrophically.', 'start': 701.51, 'duration': 3.105}, {'end': 707.218, 'text': 'And people seem to be the same.', 'start': 705.997, 'duration': 1.221}, {'end': 710.664, 'text': "If I deprive you of sleep for a week, you'll go completely psychotic.", 'start': 707.259, 'duration': 3.405}, {'end': 715.508, 'text': "and you'll have hallucinations and you may never recover.", 'start': 712.085, 'duration': 3.423}, {'end': 717.269, 'text': 'can you explain?', 'start': 715.508, 'duration': 1.761}, {'end': 726.877, 'text': 'I think one thing that people are having trouble non-practitioners are having trouble understanding is the concept of negative data.', 'start': 717.269, 'duration': 9.608}, {'end': 735.544, 'text': "I've seen a few articles where they just put it in quotation marks out of your paper, which indicates that they don't understand them.", 'start': 726.877, 'duration': 8.667}, {'end': 744.3, 'text': "okay, What I mean by negative data is data that you give to the system when it's running in the negative phase, that is,", 'start': 735.544, 'duration': 8.756}, {'end': 747.202, 'text': "when it's trying to get low activity in all the hidden layers.", 'start': 744.3, 'duration': 2.902}, {'end': 752.187, 'text': 'And there are many ways of generating negative data.', 'start': 749.564, 'duration': 2.623}, {'end': 755.51, 'text': "In the end, you'd like the model itself to generate the negative data.", 'start': 752.567, 'duration': 2.943}, {'end': 758.732, 'text': 'So this is just like it was in Boltzmann machines.', 'start': 756.791, 'duration': 1.941}, {'end': 766.698, 
'text': "The data that the model itself generates is negative data and real data is what you're trying to model.", 'start': 758.973, 'duration': 7.725}, {'end': 773.342, 'text': "And once you've got a really good model, the negative data looks just like real data, so no learning can take place.", 'start': 768.059, 'duration': 5.283}, {'end': 777.144, 'text': "But negative data doesn't have to be produced by the model.", 'start': 774.523, 'duration': 2.621}, {'end': 790.883, 'text': 'So, for example, you can train it to do supervised learning by inputting both an image and the label.', 'start': 777.724, 'duration': 13.159}, {'end': 800.827, 'text': "so now the label is part of the input, not part of the output, and what you're asking it to do is, when i input an image with the correct label,", 'start': 790.883, 'duration': 9.944}, {'end': 802.708, 'text': "that's going to be the positive data.", 'start': 800.827, 'duration': 1.881}, {'end': 804.569, 'text': 'you want to have high activity.', 'start': 802.708, 'duration': 1.861}, {'end': 813.072, 'text': "when i input an image with the incorrect label, which i just put in by hand as the incorrect, as an incorrect label, that's negative data.", 'start': 804.569, 'duration': 8.503}, {'end': 820.698, 'text': "now it works best if you get the model to predict the label and you put in the best of the model's predictions.", 'start': 813.072, 'duration': 7.626}, {'end': 821.578, 'text': "it's not correct.", 'start': 820.698, 'duration': 0.88}, {'end': 827.1, 'text': "Because then you're giving it the mistakes it's most likely to make as negative data.", 'start': 821.598, 'duration': 5.502}, {'end': 831.221, 'text': 'But you can put in negative data by hand and it works fine.', 'start': 828.82, 'duration': 2.401}, {'end': 836.842, 'text': 'And the reconciliation.', 'start': 833.081, 'duration': 3.761}, {'end': 846.105, 'text': "then at the end is it as in Boltzmann machines, where you're subtracting the negative data from the positive data?", 'start': 836.842, 'duration': 9.263}, {'end': 854.616, 'text': 'In both machines, what you do is you give it positive data, real data, and you let it settle to equilibrium,', 'start': 847.054, 'duration': 7.562}], 'summary': 'The model can learn from positive and negative data, and sleep is not always required for learning, but prolonged deprivation can lead to catastrophic failure, similar to human response to sleep deprivation.', 'duration': 217.759, 'max_score': 636.857, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/NWqy_b1OvwQ/pics/NWqy_b1OvwQ636857.jpg'}, {'end': 683.416, 'src': 'embed', 'start': 650.095, 'weight': 1, 'content': [{'end': 650.855, 'text': "But with this, it isn't.", 'start': 650.095, 'duration': 0.76}, {'end': 664.584, 'text': "And what's remarkable is that up to a few hundred thousand predictions it works almost as well if you separate the phases as opposed to interleaving.", 'start': 653.317, 'duration': 11.267}, {'end': 666.865, 'text': "And that's quite surprising.", 'start': 665.904, 'duration': 0.961}, {'end': 670.067, 'text': 'In human learning.', 'start': 668.446, 'duration': 1.621}, {'end': 680.113, 'text': "Certainly there's wake and sleep for complicated concepts that you're learning, but there's learning going on all the time.", 'start': 672.307, 'duration': 7.806}, {'end': 683.416, 'text': "that doesn't require a sleep phase.", 'start': 680.113, 'duration': 3.303}], 'summary': 'Up to a few hundred thousand predictions, separating phases works 
almost as well as interleaving, which is surprising and parallels wake and sleep in human learning.', 'duration': 33.321, 'max_score': 650.095, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/NWqy_b1OvwQ/pics/NWqy_b1OvwQ650095.jpg'}, {'end': 1064.008, 'src': 'heatmap', 'start': 881.175, 'weight': 0.722, 'content': [{'end': 886.359, 'text': 'And you measure the same statistics and you take the difference of those pairwise statistics,', 'start': 881.175, 'duration': 5.184}, {'end': 888.901, 'text': 'and that is the correct learning signal for a Boltzmann machine.', 'start': 886.359, 'duration': 2.542}, {'end': 891.343, 'text': 'But the problem is you have to let the model settle.', 'start': 889.481, 'duration': 1.862}, {'end': 893.445, 'text': "And there just isn't time for that.", 'start': 892.164, 'duration': 1.281}, {'end': 897.808, 'text': 'Also, you have to have all sorts of other conditions, like the connections have to be symmetric.', 'start': 894.245, 'duration': 3.563}, {'end': 900.25, 'text': "There's no evidence connections in the brain are symmetric.", 'start': 898.189, 'duration': 2.061}, {'end': 913.793, 'text': 'Can you give a concrete example of positive and negative data in a very simple learning exercise.', 'start': 901.791, 'duration': 12.002}, {'end': 915.294, 'text': 'You were working on digits.', 'start': 913.893, 'duration': 1.401}, {'end': 919.917, 'text': "This example, I think, is if you're predicting a string of characters.", 'start': 916.535, 'duration': 3.382}, {'end': 925.801, 'text': "The positive data, you'd see a little window of characters and you'd have some hidden layers.", 'start': 920.318, 'duration': 5.483}, {'end': 931.926, 'text': "And because that's a positive window of characters, you try and make the activity high in all the hidden layers.", 'start': 926.962, 'duration': 4.964}, {'end': 940.386, 'text': 'But also from those hidden layers, from the activity in those hidden layers, you would try to predict the next character.', 'start': 932.981, 'duration': 7.405}, {'end': 943.048, 'text': "That's a very simple generative model.", 'start': 941.687, 'duration': 1.361}, {'end': 947.071, 'text': "But notice the generative model isn't having to learn its own representations.", 'start': 943.888, 'duration': 3.183}, {'end': 953.915, 'text': 'The representations are learned just to make positive strings of characters give you high activity in all the hidden layers.', 'start': 947.111, 'duration': 6.804}, {'end': 955.696, 'text': "That's the objective of the learning.", 'start': 954.416, 'duration': 1.28}, {'end': 957.778, 'text': "The objective isn't to predict the next character.", 'start': 956.117, 'duration': 1.661}, {'end': 966.27, 'text': 'But having done that learning and got the right representations for these strings of characters, these windows of characters,', 'start': 959.246, 'duration': 7.024}, {'end': 968.591, 'text': 'you also learn to predict the next character.', 'start': 966.27, 'duration': 2.321}, {'end': 972.393, 'text': "And that's what you're doing in the positive phase.", 'start': 970.432, 'duration': 1.961}, {'end': 975.175, 'text': 'Seeing windows of characters,', 'start': 973.494, 'duration': 1.681}, {'end': 980.718, 'text': "you're changing the weights so that all the hidden layers have high activity for those windows of characters,", 'start': 975.175, 'duration': 5.543}, {'end': 987.4, 'text': "but you're also changing top-down weights that are trying to predict the next character from the activity in the hidden layers.", 
'start': 980.718, 'duration': 6.682}, {'end': 990.802, 'text': "That's what's sometimes called a linear classifier.", 'start': 988.24, 'duration': 2.562}, {'end': 994.545, 'text': "So that's the positive phase.", 'start': 990.822, 'duration': 3.723}, {'end': 1002.832, 'text': 'In the negative phase, as input, you use characters that have been predicted already.', 'start': 994.946, 'duration': 7.886}, {'end': 1012.674, 'text': "So you've got this window and you're going along and just predicting the next character and then moving the window along one to include the next character you predicted and to drop off the oldest character.", 'start': 1003.252, 'duration': 9.422}, {'end': 1014.554, 'text': 'You just keep going like that.', 'start': 1013.334, 'duration': 1.22}, {'end': 1021.836, 'text': "And for each of those frames, you try and get low activity in the hidden areas because it's negative data.", 'start': 1015.454, 'duration': 6.382}, {'end': 1031.952, 'text': 'And I think you can see that if your predictions were perfect and you start from a string, a real string,', 'start': 1023.836, 'duration': 8.116}, {'end': 1038.156, 'text': "then what's happening in the negative phase will be exactly like what's happening in the positive phase.", 'start': 1031.952, 'duration': 6.204}, {'end': 1040.339, 'text': 'And so the two will cancel out.', 'start': 1039.018, 'duration': 1.321}, {'end': 1050.308, 'text': "But if there's a difference, then you'll be learning to make things more like the positive phase and less like the negative phase.", 'start': 1041.741, 'duration': 8.567}, {'end': 1053.111, 'text': "And so it'll get better and better at predicting.", 'start': 1051.489, 'duration': 1.622}, {'end': 1062.146, 'text': 'As I understood, backpropagation on static data.', 'start': 1055.66, 'duration': 6.486}, {'end': 1064.008, 'text': 'there are inputs,', 'start': 1062.146, 'duration': 1.862}], 'summary': 'Learning signal for a Boltzmann machine involves settling the model and symmetric connections; the positive phase aims to generate high activity in hidden layers for given characters while the negative phase aims to reduce activity for predicted characters.', 'duration': 182.833, 'max_score': 881.175, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/NWqy_b1OvwQ/pics/NWqy_b1OvwQ881175.jpg'}], 'start': 315.973, 'title': 'Learning phases and discrimination', 'summary': 'Covers the forward algorithm dividing learning into online and offline phases, with layer-specific objectives, logistic functions, and discrimination between real and fake data. It discusses Boltzmann machine learning, the wake-sleep algorithm, positive and negative data, and their impact.', 'chapters': [{'end': 538.063, 'start': 315.973, 'title': 'Forward algorithm and training phases', 'summary': 'Discusses the forward algorithm which divides learning into online and offline phases, aiming to achieve high activity for real data and low activity for fake data at each layer, with specific objectives and logistic functions. 
It further explains the positive and negative phases for the network to learn a generative model and discriminate between real and fake data.', 'duration': 222.09, 'highlights': ['The forward algorithm divides learning into online and offline phases, aiming to achieve high activity for real data and low activity for fake data at each layer, with specific objectives and logistic functions.', 'The online phase focuses on achieving high sum of squared activities in every layer to identify real data, while the offline phase aims to generate its own data and have low activity in every layer to learn a generative model and discriminate between real and fake data.', 'The discriminative net and the generative model use the same hidden units, overcoming problems faced by generative adversarial networks.']}, {'end': 900.25, 'start': 538.904, 'title': 'Boltzmann machine learning', 'summary': 'Discusses the wake-sleep algorithm used in Boltzmann machine learning, explaining the phases, the use of positive and negative data, and the surprising results of separating the phases, with a mention of the difference from human learning and the concept of negative data.', 'duration': 361.346, 'highlights': ['The algorithm separates the positive and negative phases and surprisingly works almost as well as when the phases are interleaved.', "Explanation of negative data and its role in the negative phase of the model. The concept of negative data and its role in the negative phase is explained, emphasizing the generation of negative data and its importance in the model's learning process.", "The use of positive and negative data in supervised learning and the reconciliation process. The utilization of positive and negative data in supervised learning is explained, along with the reconciliation process, emphasizing the effectiveness of using negative data in the model's learning."]}, {'end': 1260.689, 'start': 901.791, 'title': 'Learning positive and negative data', 'summary': 'Discusses a simple generative model that learns to predict characters by ensuring high activity in hidden layers for positive data and low activity for negative data, with the insight that backpropagation models are not suitable for the brain due to the absence of evidence of information flowing backward through neurons.', 'duration': 358.898, 'highlights': ['The positive phase involves ensuring high activity in all hidden layers for positive strings of characters and learning to predict the next character, while the negative phase focuses on achieving low activity in the hidden layers for predicted characters. Positive phase involves high activity in hidden layers, learning to predict next character; Negative phase involves low activity in hidden layers for predicted characters.', "The insight that backpropagation on static data is not a good model for the brain due to the absence of evidence of information flowing backward through neurons, and the explanation of the brain's top-down connections in the perceptual system. Backpropagation on static data not suitable for brain; Lack of evidence for information flowing backward through neurons; Description of brain's top-down connections in the perceptual system.", "The concept of turning a static image into a 'boring' video to enable top-down effects and the explanation of how each layer can receive inputs from a higher layer in the previous time step. 
Turning static image into 'boring' video for top-down effects; Explanation of how each layer can receive inputs from a higher layer in the previous time step."]}], 'duration': 944.716, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/NWqy_b1OvwQ/pics/NWqy_b1OvwQ315973.jpg', 'highlights': ['The forward algorithm divides learning into online and offline phases, aiming to achieve high activity for real data and low activity for fake data at each layer, with specific objectives and logistic functions.', 'The algorithm separates the positive and negative phases and surprisingly works almost as well as when the phases are interleaved.', 'The positive phase involves ensuring high activity in all hidden layers for positive strings of characters and learning to predict the next character, while the negative phase focuses on achieving low activity in the hidden layers for predicted characters.']}, {'end': 1980.829, 'segs': [{'end': 1313.68, 'src': 'embed', 'start': 1285.463, 'weight': 0, 'content': [{'end': 1289.346, 'text': 'So typical LSTMs and so on would have one hidden layer.', 'start': 1285.463, 'duration': 3.883}, {'end': 1295.23, 'text': 'And then Alex Graves pioneered the idea of having multiple hidden layers and showed that it was a winner.', 'start': 1289.846, 'duration': 5.384}, {'end': 1300.613, 'text': "So that idea has been around, but it's always been paired with backpropagation as the learning algorithm.", 'start': 1296.25, 'duration': 4.363}, {'end': 1308.477, 'text': 'And in that case it was backpropagation through time, which was completely unrealistic, by no And the brain.', 'start': 1300.973, 'duration': 7.504}, {'end': 1313.68, 'text': "real life is not static, so you're not perceiving in a truly static fashion.", 'start': 1308.477, 'duration': 5.203}], 'summary': 'Alex Graves pioneered multiple hidden layers in LSTMs and showed it was a winner, though the idea was always paired with backpropagation through time.', 'duration': 28.217, 'max_score': 1285.463, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/NWqy_b1OvwQ/pics/NWqy_b1OvwQ1285463.jpg'}, {'end': 1362.305, 'src': 'embed', 'start': 1334.588, 'weight': 3, 'content': [{'end': 1339.831, 'text': "In fact, there's something a bit like SimCLR that Sue Becker and I published in about 1992 in Nature.", 'start': 1334.588, 'duration': 5.243}, {'end': 1343.974, 'text': "But we didn't use negative examples.", 'start': 1341.472, 'duration': 2.502}, {'end': 1349.117, 'text': "We tried to analytically compute the negative phase and that wasn't, there was a mistake.", 'start': 1344.034, 'duration': 5.083}, {'end': 1350.878, 'text': 'It just, that would never work.', 'start': 1349.377, 'duration': 1.501}, {'end': 1357.002, 'text': 'Once you start using negative examples, then you get things like SimCLR.', 'start': 1352.879, 'duration': 4.123}, {'end': 1362.305, 'text': "And I discovered that you could separate the phases that you didn't.", 'start': 1358.582, 'duration': 3.723}], 'summary': 'A 1992 Becker-Hinton model was a bit like SimCLR but did not use negative examples; adding negative examples yields methods like SimCLR.', 'duration': 27.717, 'max_score': 1334.588, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/NWqy_b1OvwQ/pics/NWqy_b1OvwQ1334588.jpg'}, {'end': 1844.452, 'src': 'embed', 'start': 1805.95, 'weight': 2, 'content': [{'end': 1813.896, 'text': "That's very good, for example, making smooth 
videos from frames that are taken at quite long time intervals.', 'start': 1805.95, 'duration': 7.946}, {'end': 1818.96, 'text': 'But in the forward-forward algorithm,', 'start': 1814.737, 'duration': 4.223}, {'end': 1822.303, 'text': "what's your intuition that this is?", 'start': 1818.96, 'duration': 3.343}, {'end': 1831.329, 'text': 'if indeed everything works out, that this is a model for information processing in the cerebral cortex,', 'start': 1822.303, 'duration': 9.026}, {'end': 1840.131, 'text': 'and that perception of depth and the 3D nature of reality would emerge?', 'start': 1831.329, 'duration': 8.802}, {'end': 1844.452, 'text': "Yes, that's the hope, yes.", 'start': 1841.051, 'duration': 3.401}], 'summary': 'Exploring a model for information processing in the cerebral cortex to perceive depth and 3D reality.', 'duration': 38.502, 'max_score': 1805.95, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/NWqy_b1OvwQ/pics/NWqy_b1OvwQ1805950.jpg'}], 'start': 1260.689, 'title': 'Evolving hidden layers and information processing', 'summary': 'Covers the evolution of hidden layers in recurrent nets, pioneered by Alex Graves, and the unrealistic nature of backpropagation through time. It also delves into the comparison with models like SimCLR and the potential application of the forward-forward algorithm as a model for information processing in the cerebral cortex.', 'chapters': [{'end': 1308.477, 'start': 1260.689, 'title': 'Multi-layer hidden layers in recurrent nets', 'summary': 'Discusses the evolution of hidden layers in recurrent nets, from a single layer to multiple layers as pioneered by Alex Graves, and the unrealistic nature of backpropagation through time as a learning algorithm.', 'duration': 47.788, 'highlights': ['Alex Graves pioneered the idea of having multiple hidden layers in recurrent nets.', 'Typical LSTMs and other recurrent nets used to have only one hidden layer.', 'Backpropagation through time was considered completely unrealistic for learning in the brain.']}, {'end': 1980.829, 'start': 1308.477, 'title': 'Information processing in cerebral cortex', 'summary': 'Discusses the comparison with SimCLR and the potential application of the forward-forward algorithm as a model for information processing in the cerebral cortex, with the hope of representing 3D structure in hidden layers.', 'duration': 672.352, 'highlights': ['The forward-forward algorithm as a model for information processing in the cerebral cortex The algorithm is considered as a potential model for information processing in the cerebral cortex, with the hope of representing 3D structure in hidden layers.', "Comparison with SimCLR and NGRAD. The forward-forward algorithm is discussed in comparison to SimCLR and to NGRAD's activity differences, highlighting the use of negative examples to measure agreement.", 'Potential application of forward-forward algorithm for capturing 3D nature of reality There is potential for the forward-forward algorithm to capture the perception of depth and the 3D nature of reality, particularly in representing 3D structure in hidden layers.']}], 'duration': 720.14, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/NWqy_b1OvwQ/pics/NWqy_b1OvwQ1260689.jpg', 'highlights': ['Alex Graves pioneered the idea of having multiple hidden layers in recurrent nets.', 'Backpropagation through time was considered completely unrealistic for learning in 
the brain.', 'The forward-forward algorithm is considered as a potential model for information processing in the cerebral cortex, with the hope of representing 3D structure in hidden layers.', "Comparison of the forward-forward algorithm with SimCLR and NGRAD's activity differences, highlighting the use of negative examples to measure agreement.", 'Potential application of forward-forward algorithm for capturing 3D nature of reality, particularly in representing 3D structure in hidden layers.']}, {'end': 2603.365, 'segs': [{'end': 2130.474, 'src': 'heatmap', 'start': 2083.215, 'weight': 0.782, 'content': [{'end': 2088.297, 'text': "And then when they see things that don't fit the physics, they'll have high activity.", 'start': 2083.215, 'duration': 5.082}, {'end': 2089.337, 'text': "That'll be the negative data.", 'start': 2088.337, 'duration': 1}, {'end': 2091.378, 'text': "So that's called a constraint.", 'start': 2090.197, 'duration': 1.181}, {'end': 2099.176, 'text': "And so if you make your objective function be to have low activity for real things and high activity for things that aren't real,", 'start': 2092.911, 'duration': 6.265}, {'end': 2102.338, 'text': "you'll find constraints in the data as opposed to features.", 'start': 2099.176, 'duration': 3.162}, {'end': 2106.98, 'text': 'So features are things that have high variance and constraints are things that have low variance.', 'start': 2103.278, 'duration': 3.702}, {'end': 2111.803, 'text': "A feature is something that's got higher variance than it should and constraint has lower variance than it should.", 'start': 2108.021, 'duration': 3.782}, {'end': 2116.486, 'text': "Now, there's no reason why you shouldn't have two types of neurons.", 'start': 2112.463, 'duration': 4.023}, {'end': 2118.687, 'text': "One's looking for features and one's looking for constraints.", 'start': 2116.606, 'duration': 2.081}, {'end': 2121.449, 'text': 'And we know, with just linear models,', 'start': 2119.848, 'duration': 1.601}, {'end': 2130.474, 'text': 'that a method like principal components analysis looks for the directions in the space that have the highest variance.', 'start': 2122.591, 'duration': 7.883}], 'summary': 'Identifying constraints and features using high and low activity data and variance levels for two types of neurons.', 'duration': 47.259, 'max_score': 2083.215, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/NWqy_b1OvwQ/pics/NWqy_b1OvwQ2083215.jpg'}, {'end': 2184.807, 'src': 'embed', 'start': 2150.888, 'weight': 0, 'content': [{'end': 2154.271, 'text': 'And so that, for example, is a direction that might make things work better.', 'start': 2150.888, 'duration': 3.383}, {'end': 2156.273, 'text': "but there's lots.", 'start': 2154.271, 'duration': 2.002}, {'end': 2157.774, 'text': "there's about 20 things like that.", 'start': 2156.273, 'duration': 1.501}, {'end': 2158.655, 'text': 'I need to investigate.', 'start': 2157.774, 'duration': 0.881}, {'end': 2169.725, 'text': "And my feeling is, until I've got a good recipe for whether you should use features or constraints or both,", 'start': 2159.496, 'duration': 10.229}, {'end': 2175.13, 'text': "what's the most effective way to generate negative data and so on is premature to investigate really big systems.", 'start': 2169.725, 'duration': 5.405}, {'end': 2184.807, 'text': 'with regard to really big systems, one of the things you talk about is the need for a new kind of computer.', 'start': 2176.283, 'duration': 8.524}], 'summary': 'Need to investigate 
20 directions for better system performance and new computer for big systems.', 'duration': 33.919, 'max_score': 2150.888, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/NWqy_b1OvwQ/pics/NWqy_b1OvwQ2150888.jpg'}, {'end': 2250.345, 'src': 'embed', 'start': 2205.734, 'weight': 1, 'content': [{'end': 2214.077, 'text': 'This is for things that, where we want computers to be like people, to process natural language, to process vision.', 'start': 2205.734, 'duration': 8.343}, {'end': 2221.04, 'text': "All those things that some years ago Bill Gates said computers couldn't do, like they're blind and deaf.", 'start': 2214.877, 'duration': 6.163}, {'end': 2230.383, 'text': "They're not blind and deaf anymore, but for processing natural language or doing motor control or doing common sense reasoning.", 'start': 2221.88, 'duration': 8.503}, {'end': 2236.041, 'text': 'We probably want a different kind of computer if we want to do it with very low energy.', 'start': 2231.98, 'duration': 4.061}, {'end': 2239.582, 'text': 'We need to make much better use of all the properties of the hardware.', 'start': 2237.021, 'duration': 2.561}, {'end': 2243.263, 'text': 'Your interest is understanding the brain.', 'start': 2240.162, 'duration': 3.101}, {'end': 2246.304, 'text': 'But I have a side interest in getting low energy computation going.', 'start': 2243.303, 'duration': 3.001}, {'end': 2250.345, 'text': "And the point about the forward-forward is it works when you don't have a good model of the hardware.", 'start': 2246.764, 'duration': 3.581}], 'summary': 'Computers can now process natural language and vision, with a need for low energy computation.', 'duration': 44.611, 'max_score': 2205.734, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/NWqy_b1OvwQ/pics/NWqy_b1OvwQ2205734.jpg'}, {'end': 2307.213, 'src': 'embed', 'start': 2272.919, 'weight': 3, 'content': [{'end': 2276.142, 'text': 'It learns something different because the black box is changing what happens on the forward path.', 'start': 2272.919, 'duration': 3.223}, {'end': 2280.642, 'text': "But the point is, it's changing it in exactly the same way for both forward passes.", 'start': 2277.401, 'duration': 3.241}, {'end': 2281.722, 'text': 'So it all cancels out.', 'start': 2280.702, 'duration': 1.02}, {'end': 2285.364, 'text': "Whereas in back propagation, you're completely sunk if there's a black box.", 'start': 2282.563, 'duration': 2.801}, {'end': 2289.145, 'text': 'The best you can do is try and learn a differentiable model of the black box.', 'start': 2285.644, 'duration': 3.501}, {'end': 2292.666, 'text': "And that's not going to be very good if the black box is wandering in its behavior.", 'start': 2289.425, 'duration': 3.241}, {'end': 2298.565, 'text': "So the forward algorithm doesn't need to have a perfect model of the forward system.", 'start': 2293.78, 'duration': 4.785}, {'end': 2307.213, 'text': 'It needs to have a good enough model of what one neuron is doing so that it can change the incoming weights of that neuron to make it more active or less active.', 'start': 2298.965, 'duration': 8.248}], 'summary': 'In back propagation, a black box can hinder learning, while the forward algorithm requires a good enough model of individual neurons to adjust weights.', 'duration': 34.294, 'max_score': 2272.919, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/NWqy_b1OvwQ/pics/NWqy_b1OvwQ2272919.jpg'}, {'end': 2440.038, 'src': 'heatmap', 'start': 2375.868, 
'weight': 4, 'content': [{'end': 2383.073, 'text': 'And if it did scale up very well to the degree that large language models have been successful,', 'start': 2375.868, 'duration': 7.205}, {'end': 2392.06, 'text': 'do you think that its abilities would eclipse those of models based on backpropagation?', 'start': 2383.073, 'duration': 8.987}, {'end': 2393.72, 'text': "I'm not at all sure.", 'start': 2392.74, 'duration': 0.98}, {'end': 2394.961, 'text': 'I think they may not.', 'start': 2393.74, 'duration': 1.221}, {'end': 2401.422, 'text': 'So I think backpropagation might be a better algorithm in the sense that for a given number of connections,', 'start': 2395.661, 'duration': 5.761}, {'end': 2406.324, 'text': 'you can get more knowledge into those connections using backpropagation than you can with the forward algorithm.', 'start': 2401.422, 'duration': 4.902}, {'end': 2413.606, 'text': "So the networks with forward work better if they're somewhat bigger than the best-sized networks for backpropagation.", 'start': 2407.244, 'duration': 6.362}, {'end': 2418.227, 'text': "It's not good at squeezing a lot of information into a few connections.", 'start': 2414.545, 'duration': 3.682}, {'end': 2423.009, 'text': 'Backpropagation will squeeze lots of information into a few connections if you force it to.', 'start': 2419.067, 'duration': 3.942}, {'end': 2428.052, 'text': "It's much more happy not having to do that, but it'll do it if you force it to.", 'start': 2424.31, 'duration': 3.742}, {'end': 2430.113, 'text': "And the forward algorithm isn't good at that.", 'start': 2428.432, 'duration': 1.681}, {'end': 2440.038, 'text': 'So if you take these large language models, but take something with a trillion connections, which is about the largest language model,', 'start': 2431.733, 'duration': 8.305}], 'summary': 'Backpropagation may be better at squeezing knowledge into connections than the forward algorithm.', 'duration': 47.141, 'max_score': 2375.868, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/NWqy_b1OvwQ/pics/NWqy_b1OvwQ2375868.jpg'}, {'end': 2562.604, 'src': 'embed', 'start': 2532.328, 'weight': 6, 'content': [{'end': 2535.572, 'text': 'and Oh, I see the potential for reasoning sure.', 'start': 2532.328, 'duration': 3.244}, {'end': 2539.213, 'text': 'But consciousness is a different kind of question.', 'start': 2537.152, 'duration': 2.061}, {'end': 2547.537, 'text': "So I think people, I'm amazed that anybody thinks they understand what they're talking about when they talk about consciousness.", 'start': 2539.233, 'duration': 8.304}, {'end': 2550.618, 'text': 'They talk about it as if we can define it.', 'start': 2548.538, 'duration': 2.08}, {'end': 2554.26, 'text': "And it's really a jumble of a whole bunch of different concepts.", 'start': 2551.299, 'duration': 2.961}, {'end': 2562.604, 'text': "And they're all mixed together into this attempt to explain a really complicated mechanism in terms of an essence.", 'start': 2555.221, 'duration': 7.383}], 'summary': 'Challenging the understanding of consciousness and the complexity of defining it.', 'duration': 30.276, 'max_score': 2532.328, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/NWqy_b1OvwQ/pics/NWqy_b1OvwQ2532328.jpg'}], 'start': 1981.329, 'title': 'Challenges in scaling and refining algorithms and forward algorithm and black box in neural nets', 'summary': 'Discusses the challenges of scaling algorithms, emphasizing the importance of refining basic properties and 
investigating various aspects before scaling, such as generating negative data effectively and determining the most effective way to use features or constraints in big systems. It also covers the potential of the forward algorithm in neural nets, challenges posed by black boxes, and the comparison between forward algorithm and backpropagation, stressing the potential of reasoning and addressing consciousness in neural nets.', 'chapters': [{'end': 2250.345, 'start': 1981.329, 'title': 'Challenges in scaling and refining algorithms', 'summary': 'Discusses the challenges of scaling algorithms, emphasizing the importance of refining basic properties and investigating various aspects before scaling, such as generating negative data effectively and determining the most effective way to use features or constraints in big systems.', 'duration': 269.016, 'highlights': ['The importance of refining basic properties and investigating various aspects before scaling, such as generating negative data effectively and determining the most effective way to use features or constraints in big systems.', 'The need for a new kind of computer for processing natural language, vision, motor control, and common sense reasoning with low energy consumption.', 'Exploring the potential of low energy computation and the significance of making better use of hardware properties for efficient processing.']}, {'end': 2603.365, 'start': 2250.505, 'title': 'Forward algorithm and black box in neural nets', 'summary': 'Discusses the potential of the forward algorithm in neural nets and the challenges posed by black boxes, as well as the comparison between forward algorithm and backpropagation, stressing the potential of reasoning and addressing consciousness in neural nets.', 'duration': 352.86, 'highlights': ['The forward algorithm in neural nets can learn effectively even with a black box inserted in a layer, as it changes the forward path in the same way for both forward passes, thus cancelling out its effect.', "Speculation about low power computer architecture handling forward algorithms and scaling up, potentially competing with large language models based on backpropagation, but may not eclipse their abilities due to the latter's capacity to squeeze more knowledge into connections.", 'Comparison between forward algorithm and backpropagation indicates that backpropagation can squeeze more information into connections for a given number of connections, making it better suited for networks with a smaller number of connections, while forward algorithm works better for somewhat larger networks.', 
'The potential for reasoning is acknowledged in the forward algorithm, but consciousness is considered a different kind of question, with the complexity of defining consciousness likened to the historical challenge of defining vital force in the context of life.']}], 'duration': 622.036, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/NWqy_b1OvwQ/pics/NWqy_b1OvwQ1981329.jpg', 'highlights': ['The importance of refining basic properties and investigating various aspects before scaling, such as generating negative data effectively and determining the most effective way to use features or constraints in big systems.', 'The need for a new kind of computer for processing natural language, vision, motor control, and common sense reasoning with low energy consumption.', 'Exploring the potential of low energy computation and the significance of making better use of hardware properties for efficient processing.', 'The forward algorithm in neural nets can learn effectively even with a black box inserted in a layer, as it changes the forward path in the same way for both forward passes, thus cancelling out its effect.', "Speculation about low power computer architecture handling forward algorithms and scaling up, potentially competing with large language models based on backpropagation, but may not eclipse their abilities due to the latter's capacity to squeeze more knowledge into connections.", 'Comparison between forward algorithm and backpropagation indicates that backpropagation can squeeze more information into connections for a given number of connections, making it better suited for networks with a smaller number of connections, while forward algorithm works better for somewhat larger networks.', 'The potential for reasoning is acknowledged in the forward algorithm, but consciousness is considered a different kind of question, with the complexity of defining consciousness likened to the historical challenge of defining vital force in the context of life.']}, {'end': 2967.136, 'segs': [{'end': 2635.05, 'src': 'embed', 'start': 2603.365, 'weight': 0, 'content': [{'end': 2613.411, 'text': "it's just that it's not a useful concept, because it's an attempt to explain something complicated in terms of some simple essence.", 'start': 2603.365, 'duration': 10.046}, {'end': 2621.396, 'text': 'so another model like that is So sports cars have oomph and some have a lot of oomph.', 'start': 2613.411, 'duration': 7.985}, {'end': 2629.8, 'text': 'Like an Aston Martin with big noisy exhausts and lots of acceleration and bucket seats has lots of oomph.', 'start': 2622.357, 'duration': 7.443}, {'end': 2635.05, 'text': 'Oomph is an intuitive concept.', 'start': 2632.87, 'duration': 2.18}], 'summary': 'Oomph is an intuitive concept for explaining sports car performance.', 'duration': 31.685, 'max_score': 2603.365, 'thumbnail': 
'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/NWqy_b1OvwQ/pics/NWqy_b1OvwQ2603365.jpg'}, {'end': 2727.334, 'src': 'embed', 'start': 2703.234, 'weight': 2, 'content': [{'end': 2709.757, 'text': 'I think it would be nice to show them the way out of the trap they make for themselves, which is, I think,', 'start': 2703.234, 'duration': 6.523}, {'end': 2718.081, 'text': 'most people have a radical misunderstanding of how terms about perception and experience and sensation and feelings actually work,', 'start': 2709.757, 'duration': 8.324}, {'end': 2719.142, 'text': 'of how the language works.', 'start': 2718.081, 'duration': 1.061}, {'end': 2727.334, 'text': "If, for example, I say I'm seeing a pink elephant, notice the words pink and elephant refer to things in the world.", 'start': 2720.31, 'duration': 7.024}], 'summary': 'Most people have a radical misunderstanding of perception and sensation terms.', 'duration': 24.1, 'max_score': 2703.234, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/NWqy_b1OvwQ/pics/NWqy_b1OvwQ2703234.jpg'}, {'end': 2836.678, 'src': 'embed', 'start': 2804.31, 'weight': 3, 'content': [{'end': 2812.373, 'text': "I'm giving you a hypothetical statement, but if this hypothetical thing were out there in the world, that would explain this brain state.", 'start': 2804.31, 'duration': 8.063}, {'end': 2816.915, 'text': "And so I'm giving you insight into my brain state by talking about a hypothetical world.", 'start': 2812.753, 'duration': 4.162}, {'end': 2821.166, 'text': "What's not real about experience is that it's a hypothetical I'm giving you.", 'start': 2817.884, 'duration': 3.282}, {'end': 2823.708, 'text': "It's not that it lives in some other spooky world.", 'start': 2821.687, 'duration': 2.021}, {'end': 2825.97, 'text': "And it's the same for feelings.", 'start': 2824.849, 'duration': 1.121}, {'end': 2836.678, 'text': "If I say, I feel like hitting you, what I'm doing is I'm giving you a sense of what's going on in my head.", 'start': 2827.231, 'duration': 9.447}], 'summary': 'Hypothetical examples are used to explain brain states and feelings.', 'duration': 32.368, 'max_score': 2804.31, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/NWqy_b1OvwQ/pics/NWqy_b1OvwQ2804310.jpg'}], 'start': 2603.365, 'title': "Perceptions and 'umph'", 'summary': "Explores the concept of 'umph' in sports cars and its limitations, as well as delves into the misconception around perception and experience, offering insights into the internal states and brain processes of individuals.", 'chapters': [{'end': 2702.434, 'start': 2603.365, 'title': "Understanding the concept of 'umph'", 'summary': "Discusses the concept of 'umph' in relation to sports cars, highlighting the intuitive nature of 'umph' and its limitations in explaining complex mechanisms, while also drawing parallels to the concept of consciousness and its perception by humans.", 'duration': 99.069, 'highlights': ["The concept of 'umph' is illustrated through the example of sports cars, with the Aston Martin described as having a lot more 'umph' than a Toyota Corolla, highlighting the intuitive nature of 'umph' as a concept.", "The limitations of the concept of 'umph' are highlighted, stating that it does not effectively explain the mechanics of how a car accelerates, emphasizing the need to delve into the actual working mechanisms rather than relying solely on intuitive concepts.", 'Drawing parallels to the concept of consciousness, the chapter emphasizes 
that what truly matters is the human perception of consciousness, rather than an absolute definition, highlighting the subjective nature of such concepts.']}, {'end': 2967.136, 'start': 2703.234, 'title': 'Misunderstanding perception and experience', 'summary': "Discusses the misconception around perception and experience, explaining that the words used to describe internal states actually refer to hypothetical situations in the external world, offering insights into the speaker's brain state and actions through hypothetical scenarios.", 'duration': 263.902, 'highlights': ["The language used to describe internal states refers to hypothetical situations in the external world, offering insights into the speaker's brain state and actions through hypothetical scenarios.", "The concept of 'experience' and 'feeling' is explained as denoting hypothetical situations and actions that provide insights into the speaker's internal state.", 'The discussion of perception and experience is related to the misconception that these terms refer to some special internal essence, whereas they actually denote hypothetical situations and actions.']}], 'duration': 363.771, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/NWqy_b1OvwQ/pics/NWqy_b1OvwQ2603365.jpg', 'highlights': ["The concept of 'umph' is illustrated through the example of sports cars, emphasizing the intuitive nature of 'umph' as a concept.", "The limitations of the concept of 'umph' are highlighted, emphasizing the need to delve into the actual working mechanisms rather than relying solely on intuitive concepts.", 'The discussion of perception and experience is related to the misconception that these terms refer to some special internal essence, whereas they actually denote hypothetical situations and actions.', "The language used to describe internal states refers to hypothetical situations in the external world, offering insights into the speaker's brain state and actions through hypothetical scenarios.", "The concept of 'experience' and 'feeling' is explained as denoting hypothetical situations and actions that provide insights into the speaker's internal state."]}, {'end': 3535.012, 'segs': [{'end': 2995.735, 'src': 'embed', 'start': 2967.136, 'weight': 6, 'content': [{'end': 2971.64, 'text': "there you go, I think it's got just as much perceptual sensations as we have,", 'start': 2967.136, 'duration': 4.504}, {'end': 2980.832, 'text': "Although the current state of large language models don't exhibit that kind of cohesive internal logic.", 'start': 2972.87, 'duration': 7.962}, {'end': 2982.592, 'text': 'No, but they will.', 'start': 2981.732, 'duration': 0.86}, {'end': 2983.652, 'text': 'They will.', 'start': 2983.192, 'duration': 0.46}, {'end': 2986.913, 'text': 'You think they will? 
Oh yeah.', 'start': 2983.812, 'duration': 3.101}, {'end': 2988.993, 'text': "I don't think consciousness is..", 'start': 2987.533, 'duration': 1.46}, {'end': 2995.735, 'text': "people treat it like it's like the sound barrier, that you're either below the speed of sound or you're above the speed of sound.", 'start': 2988.993, 'duration': 6.742}], 'summary': 'Large language models lack cohesive logic, but will gain it in the future.', 'duration': 28.599, 'max_score': 2967.136, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/NWqy_b1OvwQ/pics/NWqy_b1OvwQ2967136.jpg'}, {'end': 3009.165, 'src': 'heatmap', 'start': 2967.136, 'weight': 0.812, 'content': [{'end': 2971.64, 'text': "There you go, I think it's got just as many perceptual sensations as we have,", 'start': 2967.136, 'duration': 4.504}, {'end': 2980.832, 'text': "Although the current state of large language models doesn't exhibit that kind of cohesive internal logic.", 'start': 2972.87, 'duration': 7.962}, {'end': 2982.592, 'text': 'No, but they will.', 'start': 2981.732, 'duration': 0.86}, {'end': 2983.652, 'text': 'They will.', 'start': 2983.192, 'duration': 0.46}, {'end': 2986.913, 'text': 'You think they will? Oh yeah.', 'start': 2983.812, 'duration': 3.101}, {'end': 2988.993, 'text': "I don't think consciousness is..", 'start': 2987.533, 'duration': 1.46}, {'end': 2995.735, 'text': "people treat it like it's like the sound barrier, that you're either below the speed of sound or you're above the speed of sound.", 'start': 2988.993, 'duration': 6.742}, {'end': 2998.736, 'text': "You've either got a model that hasn't yet got consciousness or you've got there.", 'start': 2995.755, 'duration': 2.981}, {'end': 3000.436, 'text': "It's not like that at all.", 'start': 2999.396, 'duration': 1.04}, {'end': 3006.564, 'text': 'I think a lot of people were impressed by you talking about using MATLAB.', 'start': 3002.543, 'duration': 4.021}, {'end': 3009.165, 'text': "I'm not sure impressed is the right word.", 'start': 3007.464, 'duration': 1.701}], 'summary': 'Discussion on large language models and the potential for consciousness.', 'duration': 42.029, 'max_score': 2967.136, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/NWqy_b1OvwQ/pics/NWqy_b1OvwQ2967136.jpg'}, {'end': 3044.067, 'src': 'embed', 'start': 3010.445, 'weight': 4, 'content': [{'end': 3011.485, 'text': 'They were surprised.', 'start': 3010.445, 'duration': 1.04}, {'end': 3015.466, 'text': 'But what is your day-to-day work like?', 'start': 3012.085, 'duration': 3.381}, {'end': 3017.586, 'text': 'You have other responsibilities,', 'start': 3015.846, 'duration': 1.74}, {'end': 3031.803, 'text': 'but do you spend more time on conceptualizing, and that could happen while taking a walk or taking a shower, or do you spend more time on experimenting,', 'start': 3017.586, 'duration': 14.217}, {'end': 3039.105, 'text': 'like on MATLAB, or do you spend more time on running large experiments?', 'start': 3031.803, 'duration': 7.302}, {'end': 3044.067, 'text': 'Okay, it varies a lot over time.', 'start': 3040.386, 'duration': 3.681}], 'summary': 'Work varies with time, involving conceptualizing, experimenting, and running large experiments.', 'duration': 33.622, 'max_score': 3010.445, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/NWqy_b1OvwQ/pics/NWqy_b1OvwQ3010445.jpg'}, {'end': 3088.561, 'src': 'embed', 'start': 3065.247, 'weight': 3, 'content': [{'end': 3074.798, 'text': "I spent a lot of time trying to
think about more biologically plausible learning algorithms and then programming little systems in MATLAB and discovering why they don't work.", 'start': 3065.247, 'duration': 9.551}, {'end': 3078.082, 'text': "So the point about most original ideas is they're wrong.", 'start': 3075.319, 'duration': 2.763}, {'end': 3083.957, 'text': "And MATLAB is very convenient for quickly showing that they're wrong.", 'start': 3080.374, 'duration': 3.583}, {'end': 3088.561, 'text': 'And very small toy problems like recognizing handwritten digits.', 'start': 3084.858, 'duration': 3.703}], 'summary': 'Exploring biologically plausible learning algorithms in MATLAB for small toy problems like recognizing handwritten digits.', 'duration': 23.314, 'max_score': 3065.247, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/NWqy_b1OvwQ/pics/NWqy_b1OvwQ3065247.jpg'}, {'end': 3279.732, 'src': 'heatmap', 'start': 3217.102, 'weight': 1, 'content': [{'end': 3219.843, 'text': "And I'm lucky that I've got lots of good people to talk to.", 'start': 3217.102, 'duration': 2.741}, {'end': 3223.424, 'text': 'I talk to Terry Sejnowski and he tells me about all sorts of neuroscience things.', 'start': 3220.143, 'duration': 3.281}, {'end': 3226.985, 'text': 'I talk to Josh Tenenbaum and he tells me about all sorts of cognitive science things.', 'start': 3223.964, 'duration': 3.021}, {'end': 3232.007, 'text': 'I talk to Jay McClelland and he tells me lots of cognitive science and psychology things.', 'start': 3228.706, 'duration': 3.301}, {'end': 3235.808, 'text': 'So I get most of my knowledge just from talking to people.', 'start': 3233.067, 'duration': 2.741}, {'end': 3239.229, 'text': 'Your talk at NeurIPS, you mentioned Yann.', 'start': 3236.668, 'duration': 2.561}, {'end': 3242.51, 'text': 'He corrected my pronunciation of his name, LeCun.', 'start': 3239.569, 'duration': 2.941}, {'end': 3251.663, 'text': 'Why did you reference him in that talk?
Oh, because for many years he was pushing convolutional neural networks.', 'start': 3242.89, 'duration': 8.773}, {'end': 3258.366, 'text': "And the vision community said, okay, they're fine for little things like handwritten digits, but they'll never work for real images.", 'start': 3252.663, 'duration': 5.703}, {'end': 3267.051, 'text': 'And there was a famous paper he submitted to a conference where he and his co-workers,', 'start': 3259.767, 'duration': 7.284}, {'end': 3272.831, 'text': 'where he actually did better than any other system on a particular benchmark.', 'start': 3267.95, 'duration': 4.881}, {'end': 3275.711, 'text': "I think it was segmenting pedestrians, but I'm not quite sure.", 'start': 3273.191, 'duration': 2.52}, {'end': 3276.872, 'text': 'It was something like that.', 'start': 3275.771, 'duration': 1.101}, {'end': 3279.732, 'text': 'And the paper got rejected, even though it had the best results.', 'start': 3277.352, 'duration': 2.38}], 'summary': "Gained knowledge from conversations with experts in neuroscience, cognitive science, and psychology, including insights on Yann LeCun's breakthrough in convolutional neural networks at NeurIPS.", 'duration': 67.113, 'max_score': 3217.102, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/NWqy_b1OvwQ/pics/NWqy_b1OvwQ3217102.jpg'}, {'end': 3401.58, 'src': 'embed', 'start': 3372.413, 'weight': 0, 'content': [{'end': 3378.477, 'text': 'And then when Fei-Fei Li and her collaborators produced the ImageNet competition, finally,', 'start': 3372.413, 'duration': 6.064}, {'end': 3383.061, 'text': 'we had a big enough data set to show that neural networks would really work well.', 'start': 3378.477, 'duration': 4.584}, {'end': 3393.953, 'text': 'And Yann actually tried to get several different students to make a serious attempt to do the ImageNet with convolutional nets,', 'start': 3384.562, 'duration': 9.391}, {'end': 3395.995, 'text': "but he couldn't find a student who was interested in doing it.", 'start': 3393.953, 'duration': 2.042}, {'end': 3401.58, 'text': 'At the same time, Ilya became very interested in doing it, and I was interested in doing it,', 'start': 3397.256, 'duration': 4.324}], 'summary': "ImageNet competition proved neural networks' success with big data.", 'duration': 29.167, 'max_score': 3372.413, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/NWqy_b1OvwQ/pics/NWqy_b1OvwQ3372413.jpg'}], 'start': 2967.136, 'title': 'Learning, programming, and neural networks', 'summary': "Explores the development of large language models, MATLAB programming, and the influence of learning through conversations. It also delves into the historical resistance and eventual acceptance of convolutional neural networks in computer vision, including Yann LeCun's work and the emergence of the ImageNet competition.", 'chapters': [{'end': 3232.007, 'start': 2967.136, 'title': 'Learning, programming, and perceptual systems', 'summary': 'Discusses the development of large language models, the use of MATLAB for programming and experimentation, the process of conceptualizing and experimenting, and the influence of learning through conversations with experts in various fields.', 'duration': 264.871, 'highlights': ['The chapter discusses the development of large language models. It mentions that current large language models lack cohesive internal logic but will eventually exhibit it.', 'The use of MATLAB for programming and experimentation is emphasized. 
The speaker extensively uses MATLAB for programming and experimenting, citing its convenience for dealing with vectors and matrices.', 'The process of conceptualizing and experimenting is described. The speaker spends time conceptualizing a more neurally realistic perceptual system and experimenting with biologically plausible learning algorithms.', 'The influence of learning through conversations with experts in various fields is highlighted. The speaker mentions learning most things from conversations, particularly from individuals knowledgeable in neuroscience, cognitive science, and psychology.']}, {'end': 3535.012, 'start': 3233.067, 'title': 'Convolutional neural networks in computer vision', 'summary': "Discusses the historical resistance and eventual acceptance of convolutional neural networks in computer vision, highlighting Yann LeCun's groundbreaking work and the challenges faced in convincing the computer vision community, culminating in the emergence of the ImageNet competition.", 'duration': 301.945, 'highlights': ["Yann LeCun's pioneering efforts in pushing convolutional neural networks and the resistance faced from the vision community before the emergence of the ImageNet competition.", "The rejection of Yann LeCun's paper despite its achieving the best results on a benchmark, showcasing the initial skepticism towards the effectiveness of neural networks in computer vision.", 'The emergence of the ImageNet competition as a breakthrough, demonstrating the potential of neural networks in computer vision and eventually convincing the computer vision community.']}], 'duration': 567.876, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/NWqy_b1OvwQ/pics/NWqy_b1OvwQ2967136.jpg', 'highlights': ['The emergence of the ImageNet competition as a breakthrough showcasing the potential of neural networks in computer vision', "The rejection of Yann LeCun's paper despite achieving the best results on a benchmark", 'The influence of learning through conversations with experts in various fields is highlighted', 'The use of MATLAB for programming and experimentation is emphasized', 'The process of conceptualizing and experimenting is described', "Yann LeCun's groundbreaking work on convolutional neural networks and the historical resistance from the vision community", 'The chapter discusses the development of large language models. 
It mentions that current large language models lack cohesive internal logic but will eventually exhibit it.']}], 'highlights': ['The forward-forward algorithm is considered a potential model for information processing in the cerebral cortex, with the hope of representing 3D structure in hidden layers.', "The application of the back propagation of error algorithm to deep networks set off a revolution in artificial intelligence, underscoring the significant impact of Hinton's work on deep learning.", 'The forward-forward algorithm divides learning into online and offline phases, aiming to achieve high activity for real data and low activity for fake data at each layer, with specific objectives and logistic functions.', 'The importance of refining basic properties and investigating various aspects before scaling, such as generating negative data effectively and determining the most effective way to use features or constraints in big systems.', "The concept of 'oomph' is illustrated through the example of sports cars, emphasizing the intuitive nature of 'oomph' as a concept.", 'The emergence of the ImageNet competition as a breakthrough showcasing the potential of neural networks in computer vision', "The rejection of Yann LeCun's paper despite achieving the best results on a benchmark", 'The forward-forward algorithm in neural nets can learn effectively even with a black box inserted in a layer, because the black box changes the forward path in the same way for both forward passes, cancelling out its effect.', 'The lack of evidence supporting the notion that the brain uses backpropagation raises doubts about backpropagation as a plausible model of learning in biological neural networks.']}
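
To make the forward-forward highlights above concrete, here is a minimal sketch of a single layer's local learning rule. Hinton describes prototyping such ideas in MATLAB; this sketch uses Python/NumPy purely for illustration. It follows the logistic objective the highlights describe, pushing a layer's "goodness" above a threshold for real (positive) data and below it for fake (negative) data. The sum-of-squared-activities goodness measure, the threshold theta, the learning rate, and all function names are illustrative assumptions, not details confirmed in the episode.

import numpy as np

def normalize(x):
    # Pass only the direction of an activity vector to the next layer,
    # so a layer cannot look good merely because its input was large.
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + 1e-8)

def layer_goodness(h):
    # One plausible "goodness" measure: the sum of squared activities.
    return np.sum(h * h, axis=-1)

def ff_layer_update(W, x_pos, x_neg, theta=2.0, lr=0.03):
    # One layer-local step: raise goodness above `theta` for positive
    # (real) data and push it below `theta` for negative (fake) data,
    # by gradient ascent on log sigmoid(sign * (goodness - theta)).
    for x, sign in ((x_pos, 1.0), (x_neg, -1.0)):
        xn = normalize(x)
        h = np.maximum(0.0, xn @ W)                    # ReLU activities
        g = layer_goodness(h)                          # shape: (batch,)
        p = 1.0 / (1.0 + np.exp(-sign * (g - theta)))  # p(correct decision)
        # Hand-derived gradient for this tiny model:
        # d(log p)/dg = sign * (1 - p); dg/dh = 2h; dh/dW = xn where h > 0.
        dh = (sign * (1.0 - p))[:, None] * 2.0 * h     # already zero where h == 0
        W = W + lr * (xn.T @ dh) / len(x)
    return W

# Hypothetical usage with random stand-ins for real and negative inputs:
rng = np.random.default_rng(0)
W = 0.01 * rng.standard_normal((784, 500))
x_pos = rng.random((8, 784))   # would be real data, e.g. handwritten digits
x_neg = rng.random((8, 784))   # would be generated "negative" data
for _ in range(10):
    W = ff_layer_update(W, x_pos, x_neg)

A full network would stack several such layers, normalizing the activity vector between layers so goodness cannot simply be passed upward, and the positive and negative passes could be separated into the online and offline phases mentioned above rather than interleaved as in this sketch. Because each layer's objective is purely local, an unmodeled black box between layers does not break learning, which is the property the final highlight describes.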