title
Kafka Tutorial | Apache Kafka Tutorial For Beginners | Kafka Architecture |What Is Kafka|Simplilearn

description
This Apache Kafka tutorial will help you master the basics of Apache Kafka, including the concepts of Kafka Cluster, Kafka Data Model, Kafka Topic, Kafka Architecture and the Use Case of Kafka at LinkedIn. Apache Kafka is an open-source stream processing platform developed by the Apache Software Foundation and written in Scala and Java. The project aims to provide a unified, high-throughput, low-latency platform for handling real-time data feeds. Its storage layer is essentially a massively scalable pub/sub message queue architected as a distributed transaction log, making it highly valuable for enterprise infrastructures to process streaming data.

detail
{'title': 'Kafka Tutorial | Apache Kafka Tutorial For Beginners | Kafka Architecture |What Is Kafka|Simplilearn', 'heatmap': [{'end': 159.475, 'start': 145.215, 'weight': 0.775}, {'end': 290.125, 'start': 272.981, 'weight': 0.735}, {'end': 370.228, 'start': 297.588, 'weight': 0.759}, {'end': 620.322, 'start': 604.549, 'weight': 0.77}, {'end': 663.328, 'start': 632.385, 'weight': 0.969}, {'end': 701.957, 'start': 667.052, 'weight': 0.82}, {'end': 743.445, 'start': 729.112, 'weight': 0.731}], 'summary': 'Covers kafka as a high-performance real-time messaging system, capable of processing millions of messages per second. it outlines its use cases, data model, including messages and topics, and processes involving producers, consumers, and brokers. additionally, it details kafka topic partitions, replication, messaging system organization, architecture, and its role in distributed applications.', 'chapters': [{'end': 290.125, 'segs': [{'end': 81.317, 'src': 'embed', 'start': 58.715, 'weight': 0, 'content': [{'end': 66.603, 'text': 'The characteristics of Kafka are, Kafka is a distributed and partitioned messaging system that is highly fault tolerant and scalable.', 'start': 58.715, 'duration': 7.888}, {'end': 72.53, 'text': 'It has been tested to process and send millions of messages per second to several receivers.', 'start': 67.444, 'duration': 5.086}, {'end': 81.317, 'text': 'Apache Kafka was originally developed by LinkedIn to handle their log files and later handed over to the open source community in early 2011.', 'start': 73.672, 'duration': 7.645}], 'summary': 'Kafka is a fault-tolerant messaging system, processing millions of messages per second, developed by linkedin.', 'duration': 22.602, 'max_score': 58.715, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/U4y2R3v9tlY/pics/U4y2R3v9tlY58715.jpg'}, {'end': 169.661, 'src': 'heatmap', 'start': 145.215, 'weight': 1, 'content': [{'end': 150.519, 'text': 'Event Sourcing A time-ordered 
sequence of events can be maintained through Kafka.', 'start': 145.215, 'duration': 5.304}, {'end': 159.475, 'text': 'Kafka can be used to aggregate user activity data, such as clicks, navigation, and searches from different websites of an organization.', 'start': 151.59, 'duration': 7.885}, {'end': 166.679, 'text': "Such users' activities can be sent to a real-time monitoring system and Hadoop system for offline processing.", 'start': 160.095, 'duration': 6.584}, {'end': 169.661, 'text': 'An example is illustrated in the image.', 'start': 167.84, 'duration': 1.821}], 'summary': 'Kafka enables maintaining time-ordered user activity events for real-time monitoring and offline processing.', 'duration': 52.58, 'max_score': 145.215, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/U4y2R3v9tlY/pics/U4y2R3v9tlY145215.jpg'}, {'end': 266.758, 'src': 'embed', 'start': 194.423, 'weight': 3, 'content': [{'end': 197.584, 'text': 'The Kafka data model consists of messages and topics.', 'start': 194.423, 'duration': 3.161}, {'end': 205.927, 'text': 'Messages represent information such as lines in a log file, a row of stock market data, or an error message from a system.', 'start': 198.544, 'duration': 7.383}, {'end': 209.515, 'text': 'Messages are grouped into categories called topics.', 'start': 207.054, 'duration': 2.461}, {'end': 212.477, 'text': 'For example, log message and stock message.', 'start': 209.976, 'duration': 2.501}, {'end': 218.641, 'text': 'The processes that publish messages into a topic in Kafka are known as producers.', 'start': 213.698, 'duration': 4.943}, {'end': 224.264, 'text': 'The processes that receive the messages from a topic in Kafka are known as consumers.', 'start': 219.701, 'duration': 4.563}, {'end': 229.967, 'text': 'The processes or servers within Kafka that process the messages are known as brokers.', 'start': 225.184, 'duration': 4.783}, {'end': 234.69, 'text': 'A Kafka cluster consists of a set of brokers that 
process the messages.', 'start': 230.747, 'duration': 3.943}, {'end': 237.677, 'text': 'The image illustrates the Kafka data model.', 'start': 235.595, 'duration': 2.082}, {'end': 241.48, 'text': 'It shows a Kafka cluster that consists of three brokers.', 'start': 238.417, 'duration': 3.063}, {'end': 248.266, 'text': 'There are two producers sending messages to the Kafka cluster and two consumers receiving the messages from the cluster.', 'start': 242.181, 'duration': 6.085}, {'end': 254.691, 'text': 'Producer one creates messages for topic one, whereas producer two sends messages for topic two.', 'start': 249.227, 'duration': 5.464}, {'end': 260.454, 'text': 'These messages are processed by the three brokers in parallel and sent to the consumers.', 'start': 255.692, 'duration': 4.762}, {'end': 266.758, 'text': 'Consumer 1 is interested in topic 2, so it receives the messages for topic 2.', 'start': 261.415, 'duration': 5.343}], 'summary': 'Kafka data model: messages, topics, producers, consumers, brokers. 
cluster has 3 brokers, 2 producers, and 2 consumers.', 'duration': 72.335, 'max_score': 194.423, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/U4y2R3v9tlY/pics/U4y2R3v9tlY194423.jpg'}, {'end': 296.888, 'src': 'heatmap', 'start': 272.981, 'weight': 0.735, 'content': [{'end': 279.985, 'text': 'The brokers in the Kafka cluster handle the process of receiving, storing, and forwarding the messages to the interested consumers.', 'start': 272.981, 'duration': 7.004}, {'end': 284.027, 'text': 'A topic is a category of messages in Kafka.', 'start': 281.385, 'duration': 2.642}, {'end': 290.125, 'text': 'The producers publish the messages into topics and the consumers read the messages from topics.', 'start': 284.963, 'duration': 5.162}, {'end': 293.807, 'text': 'A topic is divided into one or more partitions.', 'start': 291.046, 'duration': 2.761}, {'end': 296.888, 'text': 'A partition is also known as a commit log.', 'start': 294.607, 'duration': 2.281}], 'summary': 'Kafka brokers handle message processing, topics are categories, with one or more partitions, also known as commit logs.', 'duration': 23.907, 'max_score': 272.981, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/U4y2R3v9tlY/pics/U4y2R3v9tlY272981.jpg'}], 'start': 31.184, 'title': "Kafka's data model and processes", 'summary': 'Introduces kafka as a high-performance real-time messaging system, capable of processing and sending millions of messages per second. 
it outlines its use cases and explains its data model, consisting of messages and topics, and processes involving producers, consumers, and brokers in a kafka cluster.', 'chapters': [{'end': 193.562, 'start': 31.184, 'title': 'Introduction to kafka', 'summary': 'Introduces kafka as a high-performance real-time messaging system, part of apache projects, capable of processing and sending millions of messages per second, and outlines its use cases including real-time stream processing, log aggregation, commit log service, event sourcing, and user activity data aggregation.', 'duration': 162.378, 'highlights': ["Kafka is a distributed and partitioned messaging system that is highly fault tolerant and scalable, tested to process and send millions of messages per second to several receivers. Kafka's characteristics include being distributed, highly fault-tolerant, and scalable, with the capability to process and send millions of messages per second.", "Kafka can be used for various purposes in an organization, such as messaging service, real-time stream processing, log aggregation, commit log service, and event sourcing. Kafka's versatile use cases include messaging service, real-time stream processing, log aggregation, commit log service, and event sourcing.", 'Kafka can be used to aggregate user activity data, such as clicks, navigation, and searches from different websites of an organization, and send it to a real-time monitoring system and Hadoop system for offline processing. Kafka enables the aggregation and real-time processing of user activity data, including clicks, navigation, and searches, from different websites, for monitoring and offline processing.']}, {'end': 290.125, 'start': 194.423, 'title': 'Kafka data model and processes', 'summary': 'Introduces the kafka data model, explaining that it consists of messages and topics, with messages being grouped into categories called topics. 
it also illustrates a kafka cluster with producers sending messages to the cluster and consumers receiving the messages, all of which are processed by brokers.', 'duration': 95.702, 'highlights': ['Kafka data model consists of messages and topics, with messages representing information like lines in a log file or stock market data. This highlights the fundamental components of the Kafka data model, emphasizing the types of information messages can represent.', 'A Kafka cluster consists of a set of brokers that process the messages, with the illustrated example showing a Kafka cluster with three brokers. This illustrates the structure of a Kafka cluster, underlining the role of brokers and the parallel processing of messages.', 'Producers publish messages into topics, and consumers read the messages from topics, with the example showing two producers sending messages to the Kafka cluster and two consumers receiving the messages. This highlights the interaction between producers, topics, and consumers within a Kafka cluster, emphasizing the flow of messages and the role of producers and consumers.']}], 'duration': 258.941, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/U4y2R3v9tlY/pics/U4y2R3v9tlY31184.jpg', 'highlights': ['Kafka is a distributed and partitioned messaging system that is highly fault tolerant and scalable, tested to process and send millions of messages per second to several receivers.', 'Kafka can be used for various purposes in an organization, such as messaging service, real-time stream processing, log aggregation, commit log service, and event sourcing.', 'Kafka can be used to aggregate user activity data, such as clicks, navigation, and searches from different websites of an organization, and send it to a real-time monitoring system and Hadoop system for offline processing.', 'Kafka data model consists of messages and topics, with messages representing information like lines in a log file or stock market data.', 'A 
Kafka cluster consists of a set of brokers that process the messages, with the illustrated example showing a Kafka cluster with three brokers.', 'Producers publish messages into topics, and consumers read the messages from topics, with the example showing two producers sending messages to the Kafka cluster and two consumers receiving the messages.']}, {'end': 532.771, 'segs': [{'end': 373.129, 'src': 'heatmap', 'start': 291.046, 'weight': 0, 'content': [{'end': 293.807, 'text': 'A topic is divided into one or more partitions.', 'start': 291.046, 'duration': 2.761}, {'end': 296.888, 'text': 'A partition is also known as a commit log.', 'start': 294.607, 'duration': 2.281}, {'end': 300.089, 'text': 'Each partition contains an ordered set of messages.', 'start': 297.588, 'duration': 2.501}, {'end': 303.891, 'text': 'Each message is identified by its offset in the partition.', 'start': 300.79, 'duration': 3.101}, {'end': 308.493, 'text': 'Messages are added at one end of the partition and consumed at the other.', 'start': 304.731, 'duration': 3.762}, {'end': 313.815, 'text': 'The image illustrates a topic simple that is divided into two partitions.', 'start': 309.613, 'duration': 4.202}, {'end': 318.256, 'text': 'The writes are completed at one end and the reads are completed at the other.', 'start': 314.875, 'duration': 3.381}, {'end': 324.418, 'text': 'It shows 6 messages in partition 0 and 5 messages in partition 1.', 'start': 319.296, 'duration': 5.122}, {'end': 329.16, 'text': 'The offset of message 1 in partition 0 is 0 as it is the first message.', 'start': 324.418, 'duration': 4.742}, {'end': 334.093, 'text': 'The offset of message 6 in partition 0 is 5.', 'start': 330.14, 'duration': 3.953}, {'end': 344.583, 'text': 'The messages are written in the order 1, 2, 3, 4, 5 and 6, whereas they are read in the same order as 1, 2, 3, 4, 5 and 6.', 'start': 334.093, 'duration': 10.49}, {'end': 350.441, 'text': 'The next message in partition 0 will be message 7, which 
will be written at offset 6.', 'start': 344.583, 'duration': 5.858}, {'end': 356.643, 'text': 'The next message for partition 1 will be message 6, which will be written at offset 5.', 'start': 350.441, 'duration': 6.202}, {'end': 361.205, 'text': 'Topics are divided into partitions, which are the unit of parallelism in Kafka.', 'start': 356.643, 'duration': 4.562}, {'end': 370.228, 'text': 'Partitions allow messages in a topic to be distributed to multiple servers or brokers so that the messages in a topic can be processed in parallel.', 'start': 362.045, 'duration': 8.183}, {'end': 373.129, 'text': 'A topic can have any number of partitions.', 'start': 371.068, 'duration': 2.061}], 'summary': 'Kafka topics are divided into partitions to enable parallel processing, with each partition containing an ordered set of messages and unique offsets.', 'duration': 82.083, 'max_score': 291.046, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/U4y2R3v9tlY/pics/U4y2R3v9tlY291046.jpg'}, {'end': 384.886, 'src': 'embed', 'start': 362.045, 'weight': 1, 'content': [{'end': 370.228, 'text': 'Partitions allow messages in a topic to be distributed to multiple servers or brokers so that the messages in a topic can be processed in parallel.', 'start': 362.045, 'duration': 8.183}, {'end': 373.129, 'text': 'A topic can have any number of partitions.', 'start': 371.068, 'duration': 2.061}, {'end': 376.43, 'text': 'Each partition should fit on a single Kafka server.', 'start': 373.789, 'duration': 2.641}, {'end': 379.422, 'text': 'The number of partitions in a topic', 'start': 377.541, 'duration': 1.881}, {'end': 382.485, 'text': 'decides the parallelism of the topic.', 'start': 379.422, 'duration': 3.063}, {'end': 384.886, 'text': 'The image illustrates two partitions of a topic.', 'start': 382.485, 'duration': 2.401}], 'summary': 'Partitions enable parallel processing in Kafka topics; the number of partitions determines the parallelism of a topic.', 
'duration': 22.841, 'max_score': 362.045, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/U4y2R3v9tlY/pics/U4y2R3v9tlY362045.jpg'}, {'end': 483.198, 'src': 'embed', 'start': 410.153, 'weight': 2, 'content': [{'end': 414.854, 'text': 'The leader controls the reads and writes for the partition, whereas the followers replicate the data.', 'start': 410.153, 'duration': 4.701}, {'end': 419.435, 'text': 'If a leader fails, one of the followers automatically becomes the leader.', 'start': 415.714, 'duration': 3.721}, {'end': 424.277, 'text': 'Zookeeper is used for the leader selection as explained in the previous lesson.', 'start': 420.456, 'duration': 3.821}, {'end': 428.094, 'text': 'The image illustrates the partition of a topic simple.', 'start': 425.492, 'duration': 2.602}, {'end': 435.219, 'text': 'Here, the partition 0 is assigned to server 1 and partition 1 is assigned to server 2.', 'start': 428.855, 'duration': 6.364}, {'end': 438.722, 'text': 'These servers process the messages in parallel to increase throughput.', 'start': 435.219, 'duration': 3.503}, {'end': 442.845, 'text': 'The producer is the creator of the message in Kafka.', 'start': 440.163, 'duration': 2.682}, {'end': 449.31, 'text': 'Producers place the message to a particular topic and decide what partition to place the message into.', 'start': 443.525, 'duration': 5.785}, {'end': 454.374, 'text': 'For example, a producer may place a message into partition 0 of topic simple.', 'start': 450.01, 'duration': 4.364}, {'end': 459.141, 'text': 'Another producer may place a message into partition 1 of topic simple.', 'start': 455.399, 'duration': 3.742}, {'end': 463.803, 'text': 'Topics should already exist before a message is placed by the producer.', 'start': 460.121, 'duration': 3.682}, {'end': 467.705, 'text': 'Messages are added at one end of the partition by Kafka.', 'start': 464.664, 'duration': 3.041}, {'end': 474.609, 'text': 'The image illustrates a producer that creates 
three messages and sends them to different topics and partitions in Kafka.', 'start': 468.766, 'duration': 5.843}, {'end': 479.595, 'text': 'Message 1 is sent to partition 0 of topic test 1.', 'start': 475.731, 'duration': 3.864}, {'end': 483.198, 'text': 'Message 2 is sent to partition 1 of topic test 1.', 'start': 479.595, 'duration': 3.603}], 'summary': 'Kafka partitions enable parallel processing, with leaders controlling read/write and automatic failover, leveraging zookeeper for leader selection.', 'duration': 73.045, 'max_score': 410.153, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/U4y2R3v9tlY/pics/U4y2R3v9tlY410153.jpg'}], 'start': 291.046, 'title': 'Kafka topic partitions, partitioning, replication, and messaging system', 'summary': 'Details kafka topic partitions, covering the division and ordering of messages, parallel processing, fault tolerance, replication, leader-follower model, and the messaging system organization into topics, partitions, and consumer groups.', 'chapters': [{'end': 361.205, 'start': 291.046, 'title': 'Kafka topic partitions', 'summary': 'Explains the concept of kafka topic partitions, illustrating the division of topics into partitions, the ordering of messages within each partition, and the role of partitions as the unit of parallelism in kafka.', 'duration': 70.159, 'highlights': ['Each partition in Kafka contains an ordered set of messages, identified by their offsets, and they are added at one end and consumed at the other.', 'The image illustrates a topic divided into two partitions, with 6 messages in partition 0 and 5 messages in partition 1, showing the ordering of messages and their offsets.', 'Topics are divided into partitions, which serve as the unit of parallelism in Kafka, allowing for concurrent processing and scalability.']}, {'end': 435.219, 'start': 362.045, 'title': 'Kafka partitioning and replication', 'summary': 'Explains how partitions allow parallel processing of messages in a 
topic, with each partition being handled by a kafka server, and how replication and leader-follower model ensure fault tolerance and data availability in kafka clusters.', 'duration': 73.174, 'highlights': ['Partitions enable parallel processing of messages in a topic by distributing them across multiple servers or brokers, thus determining the parallelism of the topic. Partitions allow messages in a topic to be distributed to multiple servers or brokers so that the messages in a topic can be processed in parallel.', 'Leader-follower model and data replication across servers ensure fault tolerance and high availability, with automatic leader selection in case of leader failure. For fault tolerance, one server is marked as a leader for the partition and the others are marked as followers. If a leader fails, one of the followers automatically becomes the leader.', 'Illustrates the distribution of partitions across Kafka servers and the role of Zookeeper in leader selection for partitions. The image illustrates the partition of a topic simple, where the partition 0 is assigned to server 1 and partition 1 is assigned to server 2. Zookeeper is used for the leader selection as explained in the previous lesson.']}, {'end': 532.771, 'start': 435.219, 'title': 'Kafka messaging system', 'summary': 'Explains how messages are produced and consumed in kafka, with messages being placed in specific topics and partitions, and consumers organized into consumer groups to control the messaging system.', 'duration': 97.552, 'highlights': ['Messages are added at one end of the partition by Kafka. This highlights the process of adding messages to partitions by Kafka, showcasing the functionality and efficiency of the system.', 'The producer is the creator of the message in Kafka. This emphasizes the role of the producer in creating and placing messages in Kafka, indicating their pivotal role in the messaging process.', 'The consumer groups are used to control the messaging system. 
This highlights the purpose of consumer groups in controlling the messaging system, emphasizing the organizational aspect of managing message consumption.']}], 'duration': 241.725, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/U4y2R3v9tlY/pics/U4y2R3v9tlY291046.jpg', 'highlights': ['Each partition in Kafka contains an ordered set of messages, identified by their offsets, and they are added at one end and consumed at the other.', 'Partitions enable parallel processing of messages in a topic by distributing them across multiple servers or brokers, thus determining the parallelism of the topic.', 'Leader-follower model and data replication across servers ensure fault tolerance and high availability, with automatic leader selection in case of leader failure.', 'Messages are added at one end of the partition by Kafka. This highlights the process of adding messages to partitions by Kafka, showcasing the functionality and efficiency of the system.', 'The producer is the creator of the message in Kafka. 
This emphasizes the role of the producer in creating and placing messages in Kafka, indicating their pivotal role in the messaging process.']}, {'end': 988.688, 'segs': [{'end': 573.795, 'src': 'embed', 'start': 533.991, 'weight': 0, 'content': [{'end': 540.195, 'text': 'Kafka architecture consists of brokers that take messages from the producers and add to a partition of a topic.', 'start': 533.991, 'duration': 6.204}, {'end': 544.518, 'text': 'Brokers provide the messages to the consumers from the partitions.', 'start': 541.176, 'duration': 3.342}, {'end': 551.283, 'text': 'The producers create the messages and send them to a particular topic and a partition of a Kafka cluster.', 'start': 545.639, 'duration': 5.644}, {'end': 554.805, 'text': 'A topic is divided into multiple partitions.', 'start': 552.404, 'duration': 2.401}, {'end': 559.689, 'text': 'The messages are added to the partitions at one end and consumed in the same order.', 'start': 555.606, 'duration': 4.083}, {'end': 562.43, 'text': 'Each partition acts as a message queue.', 'start': 560.509, 'duration': 1.921}, {'end': 565.513, 'text': 'Consumers are divided into consumer groups.', 'start': 563.373, 'duration': 2.14}, {'end': 569.374, 'text': 'Each message is delivered to one consumer in each consumer group.', 'start': 566.114, 'duration': 3.26}, {'end': 573.795, 'text': 'Zookeeper is used for coordination among the Kafka brokers.', 'start': 570.514, 'duration': 3.281}], 'summary': 'Kafka architecture includes brokers, partitions, and zookeeper for coordination.', 'duration': 39.804, 'max_score': 533.991, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/U4y2R3v9tlY/pics/U4y2R3v9tlY533991.jpg'}, {'end': 624.344, 'src': 'heatmap', 'start': 597.383, 'weight': 2, 'content': [{'end': 603.408, 'text': 'The image also illustrates that the Kafka cluster or brokers interact with Zookeeper for distributed coordination.', 'start': 597.383, 'duration': 6.025}, {'end': 610.474, 
'text': 'Kafka architecture supports two types of messaging systems known as Publish-Subscribe and Queue system.', 'start': 604.549, 'duration': 5.925}, {'end': 614.377, 'text': 'The Publish-Subscribe system is also called Pub-Sub.', 'start': 611.655, 'duration': 2.722}, {'end': 620.322, 'text': 'In this system, one system broadcasts the messages and the consumers subscribe to receive the messages.', 'start': 615.058, 'duration': 5.264}, {'end': 624.344, 'text': 'Each message is received by all the subscribers.', 'start': 621.403, 'duration': 2.941}], 'summary': 'Kafka architecture supports pub-sub and queue systems for messaging, with messages received by all subscribers.', 'duration': 26.961, 'max_score': 597.383, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/U4y2R3v9tlY/pics/U4y2R3v9tlY597383.jpg'}, {'end': 663.328, 'src': 'heatmap', 'start': 632.385, 'weight': 0.969, 'content': [{'end': 636.806, 'text': 'In the queue system, each message has to be consumed by only one consumer.', 'start': 632.385, 'duration': 4.421}, {'end': 644.528, 'text': 'If there are multiple consumers, each message is consumed by any one of the available consumers in the same order that they are received.', 'start': 637.647, 'duration': 6.881}, {'end': 649.221, 'text': 'The image illustrates the implementation of a queue system.', 'start': 646.28, 'duration': 2.941}, {'end': 654.044, 'text': 'Consumer 1, Consumer 2, and Consumer 3 belong to the same consumer group.', 'start': 649.842, 'duration': 4.202}, {'end': 663.328, 'text': 'So out of the six messages, two messages are received by Consumer 1, two messages by Consumer 2, and two messages by Consumer 3.', 'start': 654.664, 'duration': 8.664}], 'summary': 'Queue system ensures one-time consumption, with 6 messages evenly distributed among 3 consumers.', 'duration': 30.943, 'max_score': 632.385, 'thumbnail': 
'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/U4y2R3v9tlY/pics/U4y2R3v9tlY632385.jpg'}, {'end': 701.957, 'src': 'heatmap', 'start': 667.052, 'weight': 0.82, 'content': [{'end': 674.48, 'text': 'So Consumer 1 receives message 1, Consumer 2 receives message 2, and Consumer 3 receives message 3.', 'start': 667.052, 'duration': 7.428}, {'end': 683.348, 'text': 'After this, Consumer 1 receives message 4, Consumer 2 receives message 5, and Consumer 3 receives message 6.', 'start': 674.48, 'duration': 8.868}, {'end': 686.85, 'text': 'The image illustrates the implementation of a publish-subscribe system.', 'start': 683.348, 'duration': 3.502}, {'end': 692.092, 'text': 'Consumer 1, Consumer 2, and Consumer 3 belong to three separate consumer groups.', 'start': 687.47, 'duration': 4.622}, {'end': 701.957, 'text': 'So all the six messages are sent to all the three consumer groups called Consumer Group 1, Consumer Group 2, and Consumer Group 3.', 'start': 692.973, 'duration': 8.984}], 'summary': 'Three consumer groups receive 6 messages in a publish-subscribe system.', 'duration': 34.905, 'max_score': 667.052, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/U4y2R3v9tlY/pics/U4y2R3v9tlY667052.jpg'}, {'end': 767.541, 'src': 'heatmap', 'start': 729.112, 'weight': 3, 'content': [{'end': 735.533, 'text': 'One broker acts as a leader for a partition and handles the delivery and persistence, whereas the others act as followers.', 'start': 729.112, 'duration': 6.421}, {'end': 740.275, 'text': 'Brokers receive the message from the producer and send it to the consumer groups.', 'start': 736.434, 'duration': 3.841}, {'end': 743.445, 'text': 'Kafka guarantees the following.', 'start': 741.704, 'duration': 1.741}, {'end': 750.19, 'text': 'Guarantee one, messages sent by a producer to a topic and a partition are appended in the same order.', 'start': 744.106, 'duration': 6.084}, {'end': 755.774, 'text': 'This ensures that the messages produced 
earlier do not get ahead of the messages produced later.', 'start': 750.93, 'duration': 4.844}, {'end': 758.836, 'text': 'The time order is maintained very strictly.', 'start': 756.554, 'duration': 2.282}, {'end': 767.541, 'text': 'Guarantee two, a consumer instance gets the messages in the same order as they are produced, which means that the messages are never out of order.', 'start': 759.956, 'duration': 7.585}], 'summary': 'Kafka ensures strict time order and message order for producers and consumers.', 'duration': 38.429, 'max_score': 729.112, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/U4y2R3v9tlY/pics/U4y2R3v9tlY729112.jpg'}, {'end': 841.578, 'src': 'embed', 'start': 811.666, 'weight': 4, 'content': [{'end': 821.454, 'text': 'Some of the uses of Kafka at LinkedIn are as follows. Monitoring: Kafka is used to collect metrics from various systems and to create monitoring dashboards.', 'start': 811.666, 'duration': 9.788}, {'end': 830.555, 'text': 'Messaging: Kafka is used as a message queue for content feeds and as a publish-subscribe system for searches.', 'start': 822.532, 'duration': 8.023}, {'end': 841.578, 'text': 'Analytics: Kafka is used to collect page views and clicks from customer-facing websites and to store the information in a central Hadoop-based analytics system.', 'start': 830.555, 'duration': 11.023}], 'summary': 'LinkedIn uses Kafka for monitoring, messaging, and analytics to collect metrics, handle content feeds, and store page views in a central Hadoop-based system.', 'duration': 29.912, 'max_score': 811.666, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/U4y2R3v9tlY/pics/U4y2R3v9tlY811666.jpg'}, {'end': 948.765, 'src': 'embed', 'start': 878.961, 'weight': 6, 'content': [{'end': 883.244, 'text': 'However, Kafka will write a copy to the machine once it is back.', 'start': 878.961, 'duration': 4.283}, {'end': 887.848, 'text': 'If the leader fails, one of the followers will be chosen as 
the new leader.', 'start': 883.244, 'duration': 4.604}, {'end': 894.913, 'text': 'This mechanism can tolerate n minus one failures if the replication factor is n, which can be specified at the topic level.', 'start': 887.848, 'duration': 7.065}, {'end': 900.718, 'text': 'Persistence means a message can be delivered even if the machine that handles the message fails.', 'start': 896.155, 'duration': 4.563}, {'end': 904.94, 'text': 'Kafka uses the Linux file system for persistence of messages.', 'start': 901.658, 'duration': 3.282}, {'end': 908.102, 'text': 'Persistence ensures no messages are lost.', 'start': 905.941, 'duration': 2.161}, {'end': 912.865, 'text': 'Kafka relies on the file system page cache for fast reads and writes.', 'start': 908.863, 'duration': 4.002}, {'end': 920.15, 'text': 'All the data is immediately written to a file in a file system so that it can be recovered even if the machine fails.', 'start': 913.926, 'duration': 6.224}, {'end': 924.758, 'text': 'Messages are grouped as message sets for more efficient writes.', 'start': 921.437, 'duration': 3.321}, {'end': 928.579, 'text': 'Message sets can be compressed to reduce network bandwidth.', 'start': 925.658, 'duration': 2.921}, {'end': 935.941, 'text': 'A standardized binary message format is used among producers, brokers, and consumers to minimize data modification.', 'start': 929.359, 'duration': 6.582}, {'end': 939.362, 'text': 'Here is a quick recap of what we have learned in this lesson.', 'start': 936.781, 'duration': 2.581}, {'end': 943.763, 'text': 'Kafka is a high-performance real-time messaging system.', 'start': 940.383, 'duration': 3.38}, {'end': 948.765, 'text': 'Kafka can be used as an external commit log for distributed systems.', 'start': 944.964, 'duration': 3.801}], 'summary': 'With a replication factor of n, Kafka can tolerate n-1 failures; it uses the Linux file system for persistence and is a high-performance messaging system.', 'duration': 69.804, 'max_score': 878.961, 'thumbnail': 
'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/U4y2R3v9tlY/pics/U4y2R3v9tlY878961.jpg'}], 'start': 533.991, 'title': 'Kafka messaging and architecture', 'summary': "Provides an overview of Kafka architecture, emphasizing the role of brokers, partitions, producers, consumers, and Zookeeper, as well as its support for publish-subscribe and queue systems. It also explores Kafka's messaging system, guarantees, and usage at LinkedIn, including monitoring, messaging, analytics, and its role in distributed applications, along with the primary backup method and the binary message format used to minimize data modification.", 'chapters': [{'end': 620.322, 'start': 533.991, 'title': 'Kafka architecture overview', 'summary': 'Introduces Kafka architecture, emphasizing the role of brokers, partitions, producers, consumers, and Zookeeper in facilitating message transmission and coordination, and highlighting the support for publish-subscribe and queue systems.', 'duration': 86.331, 'highlights': ['Kafka architecture consists of brokers, partitions, producers, consumers, and Zookeeper for coordination. The architecture includes components such as brokers, partitions, producers, consumers, and Zookeeper, emphasizing the essential elements of the Kafka framework.', 'Messages are added to partitions and consumed in the same order, functioning as message queues. The messages are added to the partitions at one end and consumed in the same order, highlighting the FIFO behavior of message consumption within partitions.', 'Kafka supports two types of messaging systems: Publish-Subscribe and Queue system. 
The Kafka architecture supports two types of messaging systems, emphasizing the flexibility and versatility of the platform in accommodating different messaging paradigms.']}, {'end': 855.723, 'start': 621.403, 'title': 'Kafka messaging system overview', 'summary': 'Explains the messaging system of Kafka, including its guarantees and usage at LinkedIn, such as monitoring, messaging, analytics, and its role as a building block for distributed applications.', 'duration': 234.32, 'highlights': ['Kafka guarantees message order and fault tolerance. Kafka guarantees the strict maintenance of time order for messages and ensures that a consumer instance receives messages in the same order as they are produced. It also provides fault tolerance, with a topic with replication factor N tolerating up to N-1 server failures.', 'Kafka usage at LinkedIn. LinkedIn uses Kafka for monitoring, messaging, analytics, and as a building block for distributed applications, including collecting metrics, serving as message queues, handling publish-subscribe systems, collecting page views and clicks, and serving as a primary backup method for replication.', "Kafka's role as a messaging system. Kafka is used as message queues for content feeds and as a publish-subscribe system for searches. It is also used to collect page views and clicks from customer-facing websites and to store the information in a central Hadoop-based analytics system."]}, {'end': 988.688, 'start': 856.604, 'title': 'Kafka messaging system overview', 'summary': 'Explains the primary backup method in Kafka, which tolerates n minus one failures when the replication factor is n (specified at the topic level), the use of the Linux file system for persistence, and the standardized binary message format that minimizes data modification.', 'duration': 132.084, 'highlights': ['Kafka uses the Linux file system for persistence of messages, ensuring no messages are lost, and relies on the file system page cache for fast reads and writes. 
Linux file system for persistence of messages, ensures no messages are lost, relies on file system page cache for fast reads and writes.', 'The primary backup method in Kafka can tolerate n minus one failures if the replication factor is n, which can be specified at the topic level. Primary backup method tolerates n minus one failures with a replication factor of n, specified at topic level.', 'Kafka employs a standardized binary message format among producers, brokers, and consumers to minimize data modification. Standardized binary message format minimizes data modification.']}], 'duration': 454.697, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/U4y2R3v9tlY/pics/U4y2R3v9tlY533991.jpg', 'highlights': ['Kafka architecture consists of brokers, partitions, producers, consumers, and Zookeeper for coordination.', 'Messages are added to partitions and consumed in the same order, functioning as message queues.', 'Kafka supports two types of messaging systems: Publish-Subscribe and Queue system.', 'Kafka guarantees message order and fault tolerance.', 'Kafka usage at LinkedIn includes monitoring, messaging, analytics, and use as a building block for distributed applications.', "Kafka's role as a messaging system includes message queues for content feeds and a publish-subscribe system for searches.", 'Kafka uses the Linux file system for persistence of messages, ensuring no messages are lost.', 'The primary backup method in Kafka can tolerate n minus one failures if the replication factor is n, which can be specified at the topic level.', 'Kafka employs a standardized binary message format among producers, brokers, and consumers to minimize data modification.']}], 'highlights': ['Kafka is a distributed and partitioned messaging system that is highly fault tolerant and scalable, tested to process and send millions of messages per second to several receivers.', 'Kafka can be used for various purposes in an organization, such as messaging service, real-time stream processing, log 
aggregation, commit log service, and event sourcing.', 'Kafka can be used to aggregate user activity data, such as clicks, navigation, and searches from different websites of an organization, and send it to a real-time monitoring system and Hadoop system for offline processing.', 'Kafka data model consists of messages and topics, with messages representing information like lines in a log file or stock market data.', 'A Kafka cluster consists of a set of brokers that process the messages, with the illustrated example showing a Kafka cluster with three brokers.', 'Producers publish messages into topics, and consumers read the messages from topics, with the example showing two producers sending messages to the Kafka cluster and two consumers receiving the messages.', 'Each partition in Kafka contains an ordered set of messages, identified by their offsets, and they are added at one end and consumed at the other.', 'Partitions enable parallel processing of messages in a topic by distributing them across multiple servers or brokers, thus determining the parallelism of the topic.', 'Leader-follower model and data replication across servers ensure fault tolerance and high availability, with automatic leader selection in case of leader failure.', 'Messages are added at one end of the partition by Kafka. This highlights the process of adding messages to partitions by Kafka, showcasing the functionality and efficiency of the system.', 'The producer is the creator of the message in Kafka. 
This emphasizes the role of the producer in creating and placing messages in Kafka, indicating their pivotal role in the messaging process.', 'Kafka architecture consists of brokers, partitions, producers, consumers, and Zookeeper for coordination.', 'Messages are added to partitions and consumed in the same order, functioning as message queues.', 'Kafka supports two types of messaging systems: Publish-Subscribe and Queue system.', 'Kafka guarantees message order and fault tolerance.', 'Kafka usage at LinkedIn includes monitoring, messaging, analytics, and use as a building block for distributed applications.', "Kafka's role as a messaging system includes message queues for content feeds and a publish-subscribe system for searches.", 'Kafka uses the Linux file system for persistence of messages, ensuring no messages are lost.', 'The primary backup method in Kafka can tolerate n minus one failures if the replication factor is n, which can be specified at the topic level.', 'Kafka employs a standardized binary message format among producers, brokers, and consumers to minimize data modification.']}