{"id":943,"date":"2020-01-24T07:24:41","date_gmt":"2020-01-24T06:24:41","guid":{"rendered":"http:\/\/www.klaushaller.net\/?page_id=943"},"modified":"2020-01-24T10:01:46","modified_gmt":"2020-01-24T09:01:46","slug":"part-1-apache-kafka-first-steps-overview-architecture","status":"publish","type":"page","link":"https:\/\/www.klaushaller.net\/?page_id=943","title":{"rendered":"Apache Kafka Tutorial, Part 1"},"content":{"rendered":"\n<h1 class=\"wp-block-heading\">Part 1: \u201cFirst Steps\u201d: Overview &amp; Architecture<\/h1>\n\n\n\n<p class=\"wp-block-paragraph\">In this tutorial, I provide a broad overview of the Kafka technology for both development and operations. After completing this first part, you have:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>A basic understanding of the Kafka architecture<\/li><li>A single node Kafka installation up and running<\/li><li>Using the command line interface, you can start a producer and send information<\/li><li>Using the command line interface, you can start a consumer and receive information<\/li><\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">Please note\nthat we do not discuss when to use Kafka. Also, the tutorial is\nbased on Windows 10. If you work with Linux, some of the commands in this part\nmight differ.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">1.1 Kafka Architecture<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">From the\nperspective of developers, Kafka is a pub\/sub (publish and subscribe) solution\nenabling various applications to talk with each other. The senders (or \u201c<strong>producers<\/strong>\u201d\nin Kafka terminology) do not have to know who might be interested in the\nmessages or events they share. They publish information related to certain\ntopics. The receivers (or \u201c<strong>consumers<\/strong>\u201d in Kafka) also do not have to know\nwho exactly creates the events and messages they are interested in. 
They just have\nto subscribe to a topic to receive the information published to it.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">From an\nadministrator's perspective, a Kafka installation consists of a <strong>Zookeeper<\/strong>\napplication as a kind of orchestrator and one or more <strong>brokers<\/strong> that\nprovide the actual functionality for producers and consumers.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">This first tutorial focuses on a simple installation: one Zookeeper instance and one broker, as illustrated in the figure below. Load balancing and fault tolerance are discussed in Part 2 <em><a href=\"http:\/\/www.klaushaller.net\/?page_id=945\">Kafka Fault Tolerance using Replica<\/a><\/em> and Part 3 <em><a href=\"http:\/\/www.klaushaller.net\/?page_id=947\">Kafka Throughput Optimization using Partitions and Consumer Groups<\/a><\/em> of my tutorial. Security topics are discussed in Part 4: Kafka Security Basics.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"551\" height=\"270\" src=\"http:\/\/www.klaushaller.net\/wp-content\/uploads\/2020\/01\/Kafka-Tutorial-Klaus-Haller-Part-1-2.png\" alt=\"\" class=\"wp-image-962\" srcset=\"https:\/\/www.klaushaller.net\/wp-content\/uploads\/2020\/01\/Kafka-Tutorial-Klaus-Haller-Part-1-2.png 551w, https:\/\/www.klaushaller.net\/wp-content\/uploads\/2020\/01\/Kafka-Tutorial-Klaus-Haller-Part-1-2-300x147.png 300w\" sizes=\"auto, (max-width: 551px) 100vw, 551px\" \/><figcaption> Figure 1: Simple Kafka architecture with one Kafka node and one Zookeeper instance as used in the first part of the tutorial. <\/figcaption><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">1.2 Prerequisite: Java Runtime Environment<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Kafka requires an installed Java Runtime Environment. You can type in \u201cjava -version\u201d in a command shell to verify this. As a result, you get the installed version number. 
This tutorial is based on java version &#8220;13.0.1&#8221; 2019-10-15. If the system returns the error message <code>'Java' is not recognized as an internal or external command, operable program or batch file.<\/code>, you have to install a Java Runtime Environment or finish the installation by configuring the PATH environment variable.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">1.3 Download and Install Kafka\nFiles<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">You can download Kafka from this webpage: <a href=\"https:\/\/kafka.apache.org\/downloads\">https:\/\/kafka.apache.org\/downloads<\/a>. This tutorial is based on Kafka version 2.4.0, released on December 16<sup>th<\/sup>, 2019, in the binary build for Scala 2.13. The name of the downloaded file is <code>kafka_2.13-2.4.0.tgz<\/code>.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">After the download is completed, unpack the file. You should now see a folder <code>kafka_2.13-2.4.0<\/code>.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">1.4 Configure and Start\nKafka<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">There are\ntwo configuration files, one for the Zookeeper instance and one for the Kafka\nserver. 
The files are:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><code>kafka_2.13-2.4.0\\config\\zookeeper.properties<\/code><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">and<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><code>kafka_2.13-2.4.0\\config\\server.properties<\/code><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The only configuration you have to do is to set the path for the log files in the <code>server.properties<\/code> file by defining the <code>log.dirs<\/code> property. Use forward slashes in the path, because a backslash is treated as an escape character in Java properties files:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># A comma separated list of directories under which to store log files\nlog.dirs=C:\/Users\/yourusername\/kafka_2.13-2.4.0\/logs_server0<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">To get a better understanding of the Kafka installation, you might be interested in checking the parameters for the connection from the Kafka server to the Zookeeper. In the Zookeeper properties file, there is a parameter that defines on which port the Zookeeper is listening for Kafka servers:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># the port at which the clients will connect\nclientPort=2181<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">The Kafka properties file defines where to connect to a Zookeeper instance:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>zookeeper.connect=localhost:2181<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Thus, in the case of this simple installation, the Kafka server will look for a Zookeeper on the local machine.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">1.5 Start the Kafka\nInstallation<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">We are now\nready to get our Kafka system up and running. This requires that we first start\nthe Zookeeper. 
This can be done as follows for a Windows system:<\/p>\n\n\n\n<ol class=\"wp-block-list\"><li>Open a new command shell<\/li><li>Change to the Kafka bin directory for Windows: <code>cd kafka_2.13-2.4.0\\bin\\windows<\/code><\/li><li>Start the Zookeeper: <code>zookeeper-server-start.bat ..\\..\\config\\zookeeper.properties<\/code><\/li><\/ol>\n\n\n\n<p class=\"wp-block-paragraph\">Note: If\nyou work with Linux, use the .sh scripts in the bin folder instead.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">It takes\na few seconds until Zookeeper is up and running. You should see the following\ntext on the shell as Zookeeper output:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><code>INFO Using checkIntervalMs=60000 maxPerMinute=10000 (org.apache.zookeeper.server.ContainerManager)<\/code><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><code>INFO Creating new log file: log.9a (org.apache.zookeeper.server.persistence.FileTxnLog)<\/code><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Note:\nThe last line might only show up the first time you start Zookeeper.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">With Zookeeper up and running, we can now start the Kafka server:<\/p>\n\n\n\n<ol class=\"wp-block-list\"><li>Open a new (!) command shell<\/li><li>Change to the Kafka bin directory for Windows: <code>cd kafka_2.13-2.4.0\\bin\\windows<\/code><\/li><li>Start the Kafka broker: <code>kafka-server-start.bat ..\\..\\config\\server.properties<\/code><\/li><\/ol>\n\n\n\n<p class=\"wp-block-paragraph\">Now we just have to make sure that the server has actually started. We should\nsee the following as the last line in the command shell:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><code>INFO [GroupMetadataManager brokerId=0] Removed 0 expired offsets in 0 milliseconds. 
(kafka.coordinator.group.GroupMetadataManager)<\/code><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">1.6 Sending a Hello Kafka\nWorld Message<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">In order to\nbe able to send our first message or event using Kafka, we need a topic to\nwhich consumers can subscribe in order to receive the messages that producers send to\nthis topic.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">We open a\nnew command shell in Windows and run the following commands:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><code>cd kafka_2.13-2.4.0\\bin\\windows<\/code><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><code>kafka-topics.bat --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic myFirstChannel<\/code><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In the same command shell, we now start a consumer:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><code>kafka-console-consumer.bat --bootstrap-server localhost:9092 --topic myFirstChannel --from-beginning<\/code><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The consumer is now ready and waits for a message or event. In order to\nsend a message, we need a producer. Thus, we open another new command shell,\nthe fourth one, and start a simple producer process:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><code>cd kafka_2.13-2.4.0\\bin\\windows<\/code><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><code>kafka-console-producer.bat --broker-list localhost:9092 --topic myFirstChannel<\/code><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">We now type in \u201cHello Kafka World!\u201d. Once we hit the return key, the\nmessage also appears in the consumer window. We now have a simple, running\nKafka installation and have sent and received a simple message. We are done with part\n1 of the tutorial: First Steps. 
<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">It is time to clean up before we begin with part 2 of the tutorial:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Keep\nthe Zookeeper and Kafka server command shell windows open and keep the processes\nrunning.<\/li><li>Stop\nthe consumer and the producer applications and close their command shells.<\/li><\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">Click <a href=\"http:\/\/www.klaushaller.net\/?page_id=945\">here<\/a> to continue with the next part of the tutorial.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Part 1: \u201cFirst Steps\u201d: Overview &amp; Architecture In this tutorial, I provide a broad overview of the Kafka technology for both development and operations. After completing this first part, you have: A basic understanding of the Kafka architecture A single node Kafka installation up and running Using the command line interface, [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"parent":940,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-943","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/www.klaushaller.net\/index.php?rest_route=\/wp\/v2\/pages\/943","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.klaushaller.net\/index.php?rest_route=\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/www.klaushaller.net\/index.php?rest_route=\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/www.klaushaller.net\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.klaushaller.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=943"}],"version-history":[{"count":9,"href":"https:\/\/www.klaushaller.net\/index.php?rest_route=\/wp\/v2\/pages\/943\/revisions"}],"predecessor-version":[{"id":966,"href":"https:\/\/www.klaushaller.net\/index.php?rest_route=\/wp\/v2\/pages\/943\/revisions\/966"}],"up":[{
"embeddable":true,"href":"https:\/\/www.klaushaller.net\/index.php?rest_route=\/wp\/v2\/pages\/940"}],"wp:attachment":[{"href":"https:\/\/www.klaushaller.net\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=943"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}