Kafka: Creating and Configuring Topics

In Kafka, topics are the basic unit for organizing data. Producers write data to topics, and consumers read from topics. Topics are divided into partitions, and each partition can be replicated for fault tolerance. Here, we’ll go through the process of creating and configuring topics in Kafka.

1. Creating a Kafka Topic

You can create a Kafka topic using the kafka-topics.sh command. The command allows you to specify the topic name, number of partitions, and replication factor. Here’s an example:


# Syntax for creating a topic
kafka-topics.sh --create --topic my_topic \
    --partitions 3 --replication-factor 2 \
    --bootstrap-server localhost:9092
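
In setup scripts that may run more than once, a plain --create fails when the topic already exists. Here is a minimal sketch of a wrapper that relies on the --if-not-exists flag of kafka-topics.sh. The function name create_topic and the KAFKA_TOPICS and BOOTSTRAP variables are hypothetical helpers for this example, not part of Kafka:

```shell
# Hypothetical settings; override them in the environment if needed.
# Setting KAFKA_TOPICS="echo kafka-topics.sh" turns the call into a dry run.
KAFKA_TOPICS="${KAFKA_TOPICS:-kafka-topics.sh}"
BOOTSTRAP="${BOOTSTRAP:-localhost:9092}"

# create_topic NAME PARTITIONS REPLICATION_FACTOR
# Creates the topic only if it does not exist yet (--if-not-exists).
create_topic() {
    $KAFKA_TOPICS --create --if-not-exists --topic "$1" \
        --partitions "$2" --replication-factor "$3" \
        --bootstrap-server "$BOOTSTRAP"
}

# Usage (needs a running broker):
#   create_topic my_topic 3 2
```

Keeping the command name in a variable makes it easy to print the command before running it against a real cluster.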
    

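Explanation of the parameters used with --create:

--topic: the name of the topic to create (my_topic in this example).
--partitions: the number of partitions the topic is divided into (3 in this example).
--replication-factor: the number of copies kept of each partition (2 in this example). It cannot exceed the number of brokers in the cluster.
--bootstrap-server: the host and port of a broker used to connect to the cluster.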

2. Listing Kafka Topics

After creating a topic, you can list all available topics in the Kafka cluster using the following command:


# List all topics in the cluster
kafka-topics.sh --list --bootstrap-server localhost:9092
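
The --list output prints one topic name per line, so a script can check for a specific topic by matching whole lines. This is a small sketch; topic_exists is a hypothetical helper name, and it reads the list from standard input so it can be tried without a broker:

```shell
# topic_exists NAME
# Reads topic names from stdin (one per line) and succeeds only on an exact match.
topic_exists() {
    # -F: treat NAME as a fixed string, -x: match the whole line, -q: no output
    grep -Fxq -- "$1"
}

# Usage (needs a running broker):
#   kafka-topics.sh --list --bootstrap-server localhost:9092 | topic_exists my_topic
```

Matching the whole line avoids false positives such as my_topic_v2 when looking for my_topic.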
    

3. Describing a Kafka Topic

To get more details about a specific topic, including partition count, replication factor, and the leaders of each partition, use the --describe option:


# Describe a specific topic
kafka-topics.sh --describe --topic my_topic \
    --bootstrap-server localhost:9092
    

Output example:


Topic: my_topic  PartitionCount: 3  ReplicationFactor: 2  Configs: 
    Partition: 0  Leader: 1  Replicas: 1,2  Isr: 1,2
    Partition: 1  Leader: 2  Replicas: 2,3  Isr: 2,3
    Partition: 2  Leader: 3  Replicas: 3,1  Isr: 3,1
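
In this output, Replicas lists the brokers assigned to a partition and Isr lists the replicas currently in sync with the leader. A partition whose Isr list is shorter than its Replicas list is under-replicated. kafka-topics.sh --describe also accepts an --under-replicated-partitions option for this purpose. The sketch below only illustrates how to read the columns; under_replicated is a hypothetical helper that assumes the output layout shown above:

```shell
# under_replicated
# Reads --describe output from stdin and prints the number of every
# partition whose Isr list has fewer entries than its Replicas list.
under_replicated() {
    awk '{
        part = ""; replicas = ""; isr = ""
        # Each label (e.g. "Isr:") is followed by its value in the next field.
        for (i = 1; i < NF; i++) {
            if ($i == "Partition:") part = $(i + 1)
            if ($i == "Replicas:")  replicas = $(i + 1)
            if ($i == "Isr:")       isr = $(i + 1)
        }
        if (replicas == "") next          # skip the topic summary line
        if (split(isr, b, ",") < split(replicas, a, ",")) print part
    }'
}

# Usage (needs a running broker):
#   kafka-topics.sh --describe --topic my_topic \
#       --bootstrap-server localhost:9092 | under_replicated
```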
    

4. Configuring Topic Parameters

Kafka topics can have additional configurations, such as the retention period (retention.ms), the cleanup policy (cleanup.policy), and the minimum number of in-sync replicas (min.insync.replicas). You can set or update these configurations using the kafka-configs.sh command. Below are common configurations:

Example: Retention Period

To configure the retention period for a topic (how long Kafka keeps data before it’s deleted), use the following command:


# Set retention period to 7 days (in milliseconds)
kafka-configs.sh --alter --entity-type topics --entity-name my_topic \
    --add-config retention.ms=604800000 \
    --bootstrap-server localhost:9092
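
Because retention.ms is given in milliseconds, it is easy to mistype the value. The arithmetic behind 604800000 can be sketched with shell arithmetic; days_to_ms is a hypothetical helper name:

```shell
# days_to_ms DAYS
# Converts whole days to milliseconds: days * 24 h * 60 min * 60 s * 1000 ms.
days_to_ms() {
    echo $(( $1 * 24 * 60 * 60 * 1000 ))
}

days_to_ms 7    # prints 604800000

# Usage with kafka-configs.sh (needs a running broker):
#   kafka-configs.sh --alter --entity-type topics --entity-name my_topic \
#       --add-config "retention.ms=$(days_to_ms 7)" \
#       --bootstrap-server localhost:9092
```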
    

Example: Cleanup Policy

You can set the cleanup policy to delete (discard old data once it exceeds the retention limits), compact (log compaction, which keeps the latest record for each key), or both. Here’s how to set log compaction:


# Set log compaction for the topic
kafka-configs.sh --alter --entity-type topics --entity-name my_topic \
    --add-config cleanup.policy=compact \
    --bootstrap-server localhost:9092
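
The --add-config option also accepts several key=value pairs separated by commas, so multiple settings can be changed in one call. Here is a minimal sketch of building that list; join_configs is a hypothetical helper, and the values are only illustrative:

```shell
# join_configs KEY=VALUE...
# Joins its arguments with commas, the separator --add-config expects.
join_configs() {
    # "$*" joins the arguments with the first character of IFS;
    # the subshell keeps the IFS change local to this function.
    ( IFS=,; printf '%s\n' "$*" )
}

# Usage (needs a running broker):
#   kafka-configs.sh --alter --entity-type topics --entity-name my_topic \
#       --add-config "$(join_configs retention.ms=604800000 min.insync.replicas=2)" \
#       --bootstrap-server localhost:9092
```

Values that contain commas themselves need to be wrapped in square brackets, which this helper does not do.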
    

5. Deleting a Kafka Topic

If you no longer need a topic, you can delete it using the kafka-topics.sh command:


# Delete a Kafka topic
kafka-topics.sh --delete --topic my_topic \
    --bootstrap-server localhost:9092
    

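Note that topic deletion must be enabled on the brokers: delete.topic.enable must be true in server.properties. It defaults to true since Kafka 1.0, so this only matters on older versions or where it has been explicitly disabled.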

6. Topic Configuration Best Practices

7. Conclusion

Understanding how to create and configure Kafka topics is essential for optimizing Kafka’s performance and managing data retention. By using the right partition and replication settings, you can ensure scalability and fault tolerance in your Kafka environment.