Apache Kafka Overview
Key Features of Apache Kafka:
- Distributed and Scalable: Kafka is designed to be distributed across multiple nodes, providing scalability to handle large volumes of data and high throughput.
- Fault Tolerance: Kafka can continue to operate when individual brokers fail. It achieves this by replicating each partition across multiple brokers, so that another replica can take over as leader when a broker goes down.
- Durability: Kafka provides persistent storage of messages, ensuring that data is not lost even if a consumer is not able to process it immediately.
- High Throughput: Kafka is built for high-volume data streams; depending on hardware, message size, and configuration, a cluster can sustain hundreds of thousands of messages per second or more.
- Partitioning: Kafka topics are divided into partitions, allowing for parallel processing and scalability. Each partition is an ordered, append-only sequence of messages; ordering is guaranteed within a partition, not across the whole topic.
- Retention: Kafka allows you to configure the retention period for messages, determining how long messages are stored in a topic.
- Exactly-Once Semantics: Kafka supports exactly-once processing through idempotent producers and transactions, so that messages in a read-process-write pipeline between Kafka topics are neither lost nor duplicated.
- Connectivity: Through Kafka Connect, Kafka offers a variety of source and sink connectors for integrating with external systems such as databases, file systems, and search indexes.
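The partitioning and retention features above can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions, not Kafka's implementation: the Topic class and its methods are hypothetical names, CRC32 stands in for the hash used by Kafka's default partitioner, and timestamps are passed in by hand so the behavior is easy to follow.

```python
import time
import zlib


class Topic:
    """Toy in-memory topic: a list of append-only partitions (hypothetical)."""

    def __init__(self, num_partitions, retention_seconds):
        self.partitions = [[] for _ in range(num_partitions)]
        self.next_offset = [0] * num_partitions
        self.retention_seconds = retention_seconds

    def partition_for(self, key):
        # Assumption: CRC32 as a stand-in hash; Kafka's default partitioner
        # uses a different hash function.
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)

    def append(self, key, value, timestamp=None):
        """Append a record and return (partition, offset)."""
        timestamp = time.time() if timestamp is None else timestamp
        p = self.partition_for(key)
        offset = self.next_offset[p]
        self.partitions[p].append((offset, timestamp, key, value))
        self.next_offset[p] += 1
        return p, offset

    def enforce_retention(self, now):
        """Drop records older than the retention period; offsets are not reused."""
        cutoff = now - self.retention_seconds
        for p, records in enumerate(self.partitions):
            self.partitions[p] = [r for r in records if r[1] >= cutoff]


topic = Topic(num_partitions=3, retention_seconds=60)
first = topic.append("user-42", "login", timestamp=0)
second = topic.append("user-42", "logout", timestamp=100)
# Records with the same key land in the same partition, at consecutive offsets.
print(first, second)

# With now=120 the cutoff is 60, so the record written at timestamp 0 expires.
topic.enforce_retention(now=120)
print(topic.partitions[first[0]])
```

Because both records share a key, they keep their relative order inside one partition, which is the ordering guarantee described in the Partitioning bullet. After retention runs, the surviving record keeps its original offset.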
Example Kafka Configuration:
# server.properties
# Kafka Broker ID
broker.id=1
# Protocol, host, and port the broker listens on
listeners=PLAINTEXT://localhost:9092
# Log storage directory
log.dirs=/tmp/kafka-logs
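# Default number of partitions for topics created without an explicit count
num.partitions=3
# Default replication factor for automatically created topics (must not exceed the number of brokers)
default.replication.factor=2
# ZooKeeper connection string (ZooKeeper mode only; KRaft-mode brokers do not use it)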
zookeeper.connect=localhost:2181
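Before starting a broker, it can help to load the configuration and check it for obvious problems. The following is a minimal sketch, assuming a simplified key=value format with '#' comments (the full Java properties format allows more). The helper names parse_properties and check_broker_config, and the broker_count parameter, are hypothetical and not part of Kafka.

```python
def parse_properties(text):
    """Parse key=value lines, skipping blank lines and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only, so values may contain ':' or '/'.
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_broker_config(props, broker_count):
    """Return a list of human-readable warnings (hypothetical helper)."""
    warnings = []
    if int(props.get("default.replication.factor", "1")) > broker_count:
        warnings.append("default.replication.factor exceeds the number of brokers")
    if props.get("log.dirs", "").startswith("/tmp"):
        warnings.append("log.dirs is under /tmp, which may be cleared on reboot")
    return warnings


SERVER_PROPERTIES = """\
# server.properties
broker.id=1
listeners=PLAINTEXT://localhost:9092
log.dirs=/tmp/kafka-logs
num.partitions=3
default.replication.factor=2
zookeeper.connect=localhost:2181
"""

props = parse_properties(SERVER_PROPERTIES)
warnings = check_broker_config(props, broker_count=1)
for warning in warnings:
    print("warning:", warning)
```

With a single broker, the example configuration triggers both warnings: a replication factor of 2 needs at least two brokers, and /tmp is a poor place for data that is meant to be durable.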
Basic Kafka Usage:
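1. Starting Kafka Server (in ZooKeeper mode, as configured above, start ZooKeeper first):
bin/zookeeper-server-start.sh config/zookeeper.properties
bin/kafka-server-start.sh config/server.properties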
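2. Creating a Topic (a replication factor of 2 requires at least two running brokers; use --replication-factor 1 on a single broker):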
bin/kafka-topics.sh --create --topic myTopic --bootstrap-server localhost:9092 --partitions 3 --replication-factor 2
3. Producing Messages (each line typed on standard input is sent as one message):
bin/kafka-console-producer.sh --topic myTopic --bootstrap-server localhost:9092
4. Consuming Messages:
bin/kafka-console-consumer.sh --topic myTopic --bootstrap-server localhost:9092 --from-beginning
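The effect of --from-beginning in step 4 can be sketched with a plain Python list standing in for one partition. This is an in-memory illustration only, not a Kafka client: the consume function is hypothetical, and the model assumes a new consumer with no committed offsets, which by default starts at the end of the log and sees only new messages.

```python
def consume(log, start_offset):
    """Yield (offset, value) pairs from start_offset to the current end of the log."""
    for offset in range(start_offset, len(log)):
        yield offset, log[offset]


log = ["m0", "m1", "m2"]   # messages already stored in the partition

from_beginning = 0         # start position selected by --from-beginning
latest = len(log)          # default start position: end of the log

log.append("m3")           # a message produced after both consumers started

replayed = list(consume(log, from_beginning))
new_only = list(consume(log, latest))
print(replayed)
print(new_only)
```

The consumer that starts at offset 0 replays the stored history and then the new message, while the consumer that starts at the latest offset receives only the message produced after it attached.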