Apache Kafka Essential Commands and Concepts


Kafka Topics and Essential Commands

Apache Kafka is a powerful distributed event streaming platform designed for high-throughput and fault-tolerant data processing. Whether setting up a single-node instance for development or configuring a multi-node cluster for production, understanding Kafka’s core commands is crucial.

This post breaks down the following topics: Zookeeper, starting a Kafka broker, configuring multiple brokers, managing topics, producing and consuming events from the console, and consumer groups.

Zookeeper

Apache Zookeeper is a distributed coordination service that helps manage and synchronize distributed applications like Kafka. It provides features such as centralized configuration management, naming, distributed synchronization, and group services.

In other words, Zookeeper acts as a shared store where Kafka brokers keep cluster metadata. The brokers use it to coordinate with one another: tracking which brokers are alive, electing a controller, and storing topic configuration. In Zookeeper mode, Kafka needs Zookeeper running even if you have only a single broker.
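For a local Zookeeper-based setup, Zookeeper is started first with the script and sample configuration that ship with Kafka (paths assume a standard Windows installation):

bin\windows\zookeeper-server-start.bat config\zookeeper.properties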

⚠️ Note: Kafka no longer requires Zookeeper in newer versions (Kafka KRaft mode replaces it). 🔹 If using Kafka < 3.0, you must use Zookeeper. 🔹 If using Kafka 3.0+, you can use Kafka KRaft mode (Zookeeper-free mode).
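If you are on Kafka 3.0+ and want to try KRaft mode instead, the usual flow is to generate a cluster ID, format the storage directory, and then start the broker. A rough sketch (the exact sample config path, here config\kraft\server.properties, varies between Kafka releases):

bin\windows\kafka-storage.bat random-uuid
bin\windows\kafka-storage.bat format -t <cluster-id> -c config\kraft\server.properties
bin\windows\kafka-server-start.bat config\kraft\server.properties

Replace <cluster-id> with the UUID printed by the first command.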

Kafka Startup

The Command:

bin\windows\kafka-server-start.bat config\server.properties

This command launches a Kafka server (broker) with the settings specified in server.properties. Let’s analyze each part of it.

1. bin\windows\

This is the directory where Kafka’s Windows scripts are stored. On Linux/macOS, you would typically use the bin/ directory instead.

2. kafka-server-start.bat

This is the batch script that starts the Kafka broker. Internally, it:

  1. Checks that a properties file was passed as an argument.
  2. Sets default JVM options (heap size, log4j configuration) if they are not already set.
  3. Delegates to kafka-run-class.bat, which launches the kafka.Kafka main class with the supplied properties file.

3. config\server.properties

This is the configuration file that defines important settings for the Kafka broker, including:

  1. broker.id – the unique ID of this broker within the cluster.
  2. listeners – the host and port the broker listens on (default: PLAINTEXT://:9092).
  3. log.dirs – the directories where the broker stores partition data on disk.
  4. zookeeper.connect – the Zookeeper connection string (when running in Zookeeper mode).
  5. Topic defaults such as num.partitions and log.retention.hours.
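For illustration, a minimal single-broker server.properties might contain entries like the following (the values shown are common defaults; the log directory path is an assumption, so adjust it for your machine):

broker.id=0
listeners=PLAINTEXT://:9092
log.dirs=c:\kafka-logs
num.partitions=1
zookeeper.connect=localhost:2181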

What Happens When You Run the Command?

  1. Kafka starts using the settings in server.properties.
  2. If using Zookeeper mode, the broker registers itself with Zookeeper.
  3. Kafka opens its listener port (default: 9092) for clients to connect.
  4. It recovers existing topics and partitions from its log directories (or creates the directories on first startup).
  5. The Kafka server is now ready to handle producer and consumer requests.

Configuring Multiple Brokers

When starting a Kafka broker, we supply the server.properties file as an argument. The kafka-server-start command reads configurations from this file. If we plan to run multiple brokers, we need to:

  1. Make copies of server.properties – Each broker requires a separate configuration file with a unique name.
  2. Modify essential configurations – Each broker must have unique settings.

Key Configuration Changes

1. Unique Broker ID

# must be unique for each broker
broker.id=0

Each broker needs a distinct broker.id, which uniquely identifies it within the cluster.

2. Listener Port
listeners=PLAINTEXT://:9092

When several brokers run on the same machine, each broker must listen on its own port (for example 9092, 9093, and 9094); brokers on separate machines can keep the default port.

3. Log Directory
log.dirs=/path/to/kafka-logs-0

Each broker needs a separate log directory to prevent conflicts. For example (the paths and file names below are illustrative; each log.dirs line goes in a different broker's properties file):
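# in server.properties (broker 0)
log.dirs=c:\kafka-logs-0

# in server-1.properties (broker 1)
log.dirs=c:\kafka-logs-1

# in server-2.properties (broker 2)
log.dirs=c:\kafka-logs-2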

Running Kafka on Multiple Machines

When running multiple brokers on different machines, the only setting that must differ is broker.id, since the default port and log directory no longer conflict. Kafka can also auto-assign broker IDs (broker.id.generation.enable), eliminating the need for manual changes.

By correctly configuring multiple brokers, you can create a robust, scalable Kafka cluster ready to handle high-throughput data streams.
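Putting it together, a three-broker setup on one Windows machine could be started roughly like this (the copied file names server-1.properties and server-2.properties are assumptions for this sketch; edit broker.id, listeners, and log.dirs in each copy first):

copy config\server.properties config\server-1.properties
copy config\server.properties config\server-2.properties

bin\windows\kafka-server-start.bat config\server.properties
bin\windows\kafka-server-start.bat config\server-1.properties
bin\windows\kafka-server-start.bat config\server-2.properties

Each command is typically run in its own terminal window so the three brokers keep running side by side.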


Managing Kafka Topics

Once Kafka is running, you need to create topics for storing and processing messages. The following command creates a new topic:

bin\windows\kafka-topics.bat --create --topic my-topic --bootstrap-server localhost:9092 --partitions 1 --replication-factor 1

Command Breakdown

1. bin\windows\kafka-topics.bat

This script is used to manage Kafka topics on Windows. It’s located in the Kafka installation directory under bin\windows.

2. --create

This flag tells Kafka to create a new topic.

3. --topic my-topic

Specifies the name of the topic to be created (my-topic in this case).

4. --partitions 1

Defines the number of partitions for the topic. Partitions let Kafka spread a topic's data across brokers and let multiple consumers read in parallel.

5. --replication-factor 1

Defines the number of copies (replicas) of the topic's data kept on different brokers. The replication factor cannot exceed the number of brokers in the cluster.

6. --bootstrap-server localhost:9092

Specifies the Kafka broker(s) to connect to for topic creation.
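On the three-broker cluster configured earlier, the same script can spread a topic across all brokers; the partition and replica counts below are just an example:

bin\windows\kafka-topics.bat --create --topic my-topic --bootstrap-server localhost:9092 --partitions 3 --replication-factor 3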

Other Useful Commands

List all topics

bin\windows\kafka-topics.bat --list --bootstrap-server localhost:9092

This command lists all available topics in the Kafka cluster.

Describe a topic (Check details)

bin\windows\kafka-topics.bat --describe --topic my-topic --bootstrap-server localhost:9092

This command provides detailed information about a topic, such as its partitions, leader, replicas, and in-sync replicas (ISR).
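Two more topic operations handled by the same script are increasing the partition count and deleting a topic (note that the partition count can only be increased, never decreased):

bin\windows\kafka-topics.bat --alter --topic my-topic --partitions 3 --bootstrap-server localhost:9092

bin\windows\kafka-topics.bat --delete --topic my-topic --bootstrap-server localhost:9092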


Produce Events (Write to a Topic)

bin\windows\kafka-console-producer.bat --topic my-topic --bootstrap-server localhost:9092 < ..\data\file.csv

Command Breakdown

  1. kafka-console-producer.bat – reads lines from standard input and publishes each line as a message to the topic.
  2. --topic my-topic – the topic the messages are written to.
  3. --bootstrap-server localhost:9092 – the broker to connect to.
  4. < ..\data\file.csv – redirects the file to the producer's standard input, so each line of the CSV becomes one event. Without the redirection, the producer waits for messages to be typed interactively.
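The console producer can also attach a key to each record, which matters for partitioning since records with the same key always land on the same partition. A minimal sketch using its standard properties:

bin\windows\kafka-console-producer.bat --topic my-topic --bootstrap-server localhost:9092 --property parse.key=true --property key.separator=:

With this, typing user1:hello sends the key user1 and the value hello.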


Consume Events (Read from a Topic)

bin\windows\kafka-console-consumer.bat --topic my-topic --from-beginning --bootstrap-server localhost:9092

Command Breakdown

  1. kafka-console-consumer.bat – reads messages from the topic and prints them to the console.
  2. --from-beginning – starts reading from the earliest offset, so all existing messages are shown; without it, only messages produced after the consumer starts are printed.
  3. --bootstrap-server localhost:9092 – the broker to connect to.

These commands help manage data flow within Kafka by producing and consuming messages efficiently.

Consumer Groups

Consumers that belong to the same consumer group can share the load of reading a topic: Kafka assigns each partition to exactly one consumer in the group.

Example:

Suppose a topic has three partitions spread across three brokers, and a group of two consumers reads from it. The producer's data goes to the Kafka topic and, because the topic is partitioned, the records are distributed among the three partitions: some records land on the first broker's partition, while the rest are spread across the partitions on the other two brokers.

Since the group has 2 consumers and the topic has 3 partitions, one consumer reads from 2 partitions while the other reads from the remaining one. Every partition is consumed, so all the data is processed efficiently; adding a third consumer to the group would give each consumer exactly one partition.
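You can reproduce this with the console tools by starting two consumers that share the same group name and then inspecting the group's partition assignments (the group name my-group is just an example):

bin\windows\kafka-console-consumer.bat --topic my-topic --group my-group --bootstrap-server localhost:9092

bin\windows\kafka-consumer-groups.bat --describe --group my-group --bootstrap-server localhost:9092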

Lais Ziegler

Dev in training... 👋