In this guide, we will learn how to deploy a Kafka Streams application on a Kubernetes cluster. We will deploy both Kafka itself and a Kafka Streams application using Kubernetes manifests and Helm charts.
First, we need to deploy Kafka on our Kubernetes cluster. The easiest way to do this is with Helm; we will use the Bitnami Kafka chart:
# Add Bitnami Helm repository
helm repo add bitnami https://charts.bitnami.com/bitnami
# Update Helm repositories
helm repo update
# Install Kafka
helm install kafka bitnami/kafka --set replicaCount=1 --set zookeeper.replicaCount=1
This command installs a single-node Kafka and ZooKeeper setup on the Kubernetes cluster. Note that recent versions of the Bitnami chart run Kafka in KRaft mode by default and no longer deploy ZooKeeper, so the zookeeper.* values only apply to older chart versions.
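Instead of passing --set flags on the command line, the same configuration can be kept in a values file (a minimal sketch; the key names follow the older chart versions that still ship ZooKeeper):

```yaml
# values.yaml -- minimal single-node setup (sketch)
replicaCount: 1
zookeeper:
  replicaCount: 1
```

Then install with: helm install kafka bitnami/kafka -f values.yaml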
We will now deploy the Kafka Streams application (similar to the ETL example) on Kubernetes using a Docker container and Kubernetes manifests.
Create a Dockerfile to package the Kafka Streams application:
# Dockerfile for Kafka Streams Application
FROM openjdk:17-jdk-slim
COPY target/kafka-streams-etl.jar /app/kafka-streams-etl.jar
WORKDIR /app
CMD ["java", "-jar", "kafka-streams-etl.jar"]
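If you prefer to build the jar inside the image rather than copying a pre-built target/kafka-streams-etl.jar from your host, a multi-stage variant might look like this (a sketch; a standard Maven project layout is assumed):

```dockerfile
# Stage 1: build the jar with Maven (assumes pom.xml and src/ at the build context root)
FROM maven:3.9-eclipse-temurin-17 AS build
WORKDIR /build
COPY pom.xml .
COPY src ./src
RUN mvn -q package -DskipTests

# Stage 2: slim runtime image containing only the application jar
FROM openjdk:17-jdk-slim
WORKDIR /app
COPY --from=build /build/target/kafka-streams-etl.jar kafka-streams-etl.jar
CMD ["java", "-jar", "kafka-streams-etl.jar"]
```

This keeps the build toolchain out of the runtime image, so only the JDK and the jar ship to the cluster.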
Build the Docker image:
# Build the Docker image
docker build -t yourusername/kafka-streams-etl:latest .
Push the image to a container registry (e.g., Docker Hub):
# Push the image to Docker Hub
docker push yourusername/kafka-streams-etl:latest
Now, create a Kubernetes Deployment and Service for the Kafka Streams application in a file named kafka-streams-deployment.yaml:
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kafka-streams-etl
  labels:
    app: kafka-streams-etl
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kafka-streams-etl
  template:
    metadata:
      labels:
        app: kafka-streams-etl
    spec:
      containers:
        - name: kafka-streams-etl
          image: yourusername/kafka-streams-etl:latest
          imagePullPolicy: Always
          ports:
            - containerPort: 8080
          env:
            - name: KAFKA_BROKER
              value: "kafka:9092"
            - name: INPUT_TOPIC
              value: "input-topic"
            - name: OUTPUT_TOPIC
              value: "output-topic"
---
apiVersion: v1
kind: Service
metadata:
  name: kafka-streams-etl
spec:
  selector:
    app: kafka-streams-etl
  ports:
    - protocol: TCP
      port: 8080
      targetPort: 8080
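The KAFKA_BROKER, INPUT_TOPIC, and OUTPUT_TOPIC variables in the manifest are read by the application at startup. A minimal sketch of how the application might map them into its configuration (the envOrDefault helper and the fallback values are illustrative, not part of the chart or the Kafka Streams API; the keys mirror StreamsConfig's application.id and bootstrap.servers):

```java
import java.util.Properties;

public class StreamsEnvConfig {
    // Read an environment variable, falling back to a default for local runs.
    static String envOrDefault(String key, String fallback) {
        String value = System.getenv(key);
        return (value == null || value.isEmpty()) ? fallback : value;
    }

    // Build the properties the Streams application would pass to KafkaStreams.
    public static Properties fromEnv() {
        Properties props = new Properties();
        props.put("application.id", "kafka-streams-etl");
        props.put("bootstrap.servers", envOrDefault("KAFKA_BROKER", "localhost:9092"));
        props.put("input.topic", envOrDefault("INPUT_TOPIC", "input-topic"));
        props.put("output.topic", envOrDefault("OUTPUT_TOPIC", "output-topic"));
        return props;
    }

    public static void main(String[] args) {
        System.out.println(fromEnv().getProperty("application.id")); // kafka-streams-etl
    }
}
```

Because the broker address comes from the environment, the same image runs unchanged against the in-cluster kafka:9092 service or a local broker.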
Apply the Kubernetes manifests to deploy the Kafka Streams application:
# Apply the Kafka Streams application deployment
kubectl apply -f kafka-streams-deployment.yaml
This will create a pod that runs your Kafka Streams application and a service to expose it.
Check if the Kafka Streams application is running by listing the pods:
# List running pods
kubectl get pods
You should see a pod named kafka-streams-etl-&lt;hash&gt; in the list (the Deployment appends a generated suffix to each pod name). You can check the logs of the Kafka Streams application via the Deployment:
# View logs of the Kafka Streams application
kubectl logs deployment/kafka-streams-etl
You can produce messages to the input topic using the Kafka console producer bundled in the broker pod (the Bitnami chart names its broker pods kafka-0, kafka-1, and so on):
# Produce messages to the input topic
kubectl exec -it kafka-0 -- kafka-console-producer.sh --bootstrap-server kafka:9092 --topic input-topic
> message1
> message2
Then, consume messages from the output topic:
# Consume messages from the output topic
kubectl exec -it kafka-0 -- kafka-console-consumer.sh --bootstrap-server kafka:9092 --topic output-topic --from-beginning
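What appears on the output topic depends on the transformation your ETL application performs. As a stand-in, here is the kind of per-record transform the earlier ETL example might apply (the trim-and-uppercase rule is purely illustrative), written as a plain function so it can be unit-tested without a broker:

```java
public class EtlTransform {
    // Hypothetical per-record transform: trim whitespace and uppercase the value.
    // In the Streams app this would run inside mapValues() on the input stream.
    public static String transform(String value) {
        if (value == null) {
            return "";
        }
        return value.trim().toUpperCase();
    }

    public static void main(String[] args) {
        System.out.println(transform("  message1 ")); // MESSAGE1
    }
}
```

With this rule in place, producing "message1" on input-topic would surface as "MESSAGE1" on output-topic in the console consumer above.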
In this guide, we learned how to deploy a Kafka Streams application on a Kubernetes cluster. We used Helm to install Kafka and Kubernetes manifests to deploy our Kafka Streams ETL pipeline. This setup allows for scalable, resilient, real-time data streaming and transformation in a cloud-native environment.