Kafka Streams on Kubernetes

In this guide, we will learn how to deploy a Kafka Streams application on a Kubernetes cluster. We will install Kafka itself with a Helm chart and deploy the Kafka Streams application with Kubernetes manifests.

Prerequisites

To follow along, you need a running Kubernetes cluster with kubectl configured against it, Helm, Docker, an account on a container registry such as Docker Hub, and the Kafka Streams application built as target/kafka-streams-etl.jar.

Step 1: Deploy Kafka on Kubernetes

First, we need to deploy Kafka on our Kubernetes cluster. A Helm chart is a convenient way to do this, because it packages the broker, its configuration, and its dependencies into a single install command.

Using Helm to Install Kafka

We will use the Bitnami Kafka Helm chart to deploy Kafka:

# Add Bitnami Helm repository
helm repo add bitnami https://charts.bitnami.com/bitnami

# Update Helm repositories
helm repo update

# Install Kafka
helm install kafka bitnami/kafka --set replicaCount=1 --set zookeeper.replicaCount=1

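This command installs a single-broker Kafka cluster with a single ZooKeeper node on the Kubernetes cluster. The replicaCount and zookeeper.replicaCount values apply to chart versions that bundle ZooKeeper; newer chart versions may default to KRaft mode and name these values differently, so check the output of helm show values bitnami/kafka for the version you install.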
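
Before moving on, it helps to wait until the Kafka pods report Ready. The following is a minimal sketch, assuming the release is named kafka as in the install command and that the chart labels its pods with app.kubernetes.io/instance set to the release name (the usual Bitnami convention, but worth confirming with kubectl get pods --show-labels). The function name wait_for_release and the 300-second default timeout are illustrative choices, not part of the chart.

```shell
# Block until every pod of a Helm release is Ready, or the timeout expires.
# Assumption: the chart labels its pods with app.kubernetes.io/instance=<release>.
wait_for_release() {
  local release="$1"
  local timeout="${2:-300s}"   # illustrative default
  kubectl wait pod \
    --for=condition=Ready \
    --selector "app.kubernetes.io/instance=${release}" \
    --timeout="${timeout}"
}

# Usage: wait_for_release kafka
```

kubectl wait exits with a non-zero status on timeout, so the function can gate later steps in a script.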

Step 2: Deploy the Kafka Streams Application

We will now deploy the Kafka Streams application (similar to the ETL example) on Kubernetes using a Docker container and Kubernetes manifests.

Dockerizing the Kafka Streams Application

Create a Dockerfile to package the Kafka Streams application:

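# Dockerfile for Kafka Streams Application

# Base image providing a Java 17 runtime
FROM openjdk:17-jdk-slim
# Copy the application jar produced by the build into the image
COPY target/kafka-streams-etl.jar /app/kafka-streams-etl.jar
# Run from /app so the relative jar path in CMD resolves
WORKDIR /app
CMD ["java", "-jar", "kafka-streams-etl.jar"]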

Build the Docker image:

# Build the Docker image
docker build -t yourusername/kafka-streams-etl:latest .

Push the image to a container registry such as Docker Hub, replacing yourusername with your own account or registry path:

# Push the image to Docker Hub
docker push yourusername/kafka-streams-etl:latest
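
The :latest tag is mutable, which makes it hard to tell which build a cluster is actually running. As a hedged alternative, the sketch below builds and pushes the same image under an explicit version tag. The function name build_and_push and the example version 1.0.0 are hypothetical; yourusername is the placeholder account used throughout this guide.

```shell
# Build and push the image under an explicit version tag instead of :latest.
# "yourusername" is the placeholder registry account used in this guide.
build_and_push() {
  local version="$1"
  local image="yourusername/kafka-streams-etl:${version}"
  # Push only if the build succeeded
  docker build -t "${image}" . && docker push "${image}"
}

# Usage: build_and_push 1.0.0
```

If you tag images this way, the image field of the Deployment needs to reference the same tag.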

Creating Kubernetes Deployment and Service for the Kafka Streams Application

Now, create a file named kafka-streams-deployment.yaml that defines a Kubernetes Deployment and Service for the Kafka Streams application:

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kafka-streams-etl
  labels:
    app: kafka-streams-etl
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kafka-streams-etl
  template:
    metadata:
      labels:
        app: kafka-streams-etl
    spec:
      containers:
      - name: kafka-streams-etl
        image: yourusername/kafka-streams-etl:latest
        imagePullPolicy: Always
        ports:
        - containerPort: 8080
        env:
        - name: KAFKA_BROKER
          value: "kafka:9092"
        - name: INPUT_TOPIC
          value: "input-topic"
        - name: OUTPUT_TOPIC
          value: "output-topic"
---
apiVersion: v1
kind: Service
metadata:
  name: kafka-streams-etl
spec:
  selector:
    app: kafka-streams-etl
  ports:
  - protocol: TCP
    port: 8080
    targetPort: 8080
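
The manifest refers to input-topic and output-topic. Depending on broker settings, topics may be created automatically on first use, but creating them explicitly before the application starts avoids relying on that. The following is a sketch under stated assumptions: the broker pod is named kafka-0 (confirm with kubectl get pods), kafka-topics.sh is on the PATH inside it, and one partition with replication factor 1 is enough for the single-broker setup. The function name create_topic is hypothetical.

```shell
# Create a topic if it does not already exist.
# Assumptions: the broker pod is named kafka-0 and kafka-topics.sh is on its PATH.
create_topic() {
  local topic="$1"
  kubectl exec kafka-0 -- kafka-topics.sh \
    --bootstrap-server kafka:9092 \
    --create --if-not-exists \
    --topic "${topic}" \
    --partitions 1 --replication-factor 1
}

# Usage:
#   create_topic input-topic
#   create_topic output-topic
```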

Step 3: Applying the Manifests

Apply the Kubernetes manifests to deploy the Kafka Streams application:

# Apply the Kafka Streams application deployment
kubectl apply -f kafka-streams-deployment.yaml

This creates a Deployment that runs one pod with your Kafka Streams application, and a Service that exposes port 8080 of that pod inside the cluster.
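
kubectl apply returns as soon as the objects are accepted, not when the pod is actually running. A small sketch for waiting on the rollout is shown here; the function name wait_for_app and the 120-second default timeout are illustrative, while the Deployment name comes from the manifest.

```shell
# Wait for the Deployment's pods to become available.
# Returns a non-zero status if the rollout does not finish within the timeout.
wait_for_app() {
  local timeout="${1:-120s}"   # illustrative default
  kubectl rollout status deployment/kafka-streams-etl --timeout="${timeout}"
}

# Usage: wait_for_app
```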

Step 4: Verifying the Deployment

Check if the Kafka Streams application is running by listing the pods:

# List running pods
kubectl get pods

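You should see a pod whose name starts with kafka-streams-etl, followed by a suffix generated by the Deployment, with status Running. Because the full pod name is generated, the simplest way to view the application logs is to address the Deployment:

# View logs of Kafka Streams application
kubectl logs deployment/kafka-streams-etl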
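
Pods created by a Deployment get generated names, so selecting them by label avoids copying a name from the pod list. The sketch below uses the app=kafka-streams-etl label from the manifest; the function names app_pods and app_logs and the default of 50 log lines are illustrative.

```shell
# List only the application's pods.
# The app=kafka-streams-etl label comes from the Deployment manifest in this guide.
app_pods() {
  kubectl get pods --selector app=kafka-streams-etl
}

# Print the most recent log lines from those pods and keep streaming new ones.
app_logs() {
  local lines="${1:-50}"   # illustrative default
  kubectl logs --selector app=kafka-streams-etl --tail="${lines}" --follow
}

# Usage:
#   app_pods
#   app_logs 100
```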

Step 5: Producing and Consuming Messages

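You can produce messages to the input topic with the Kafka console producer, run inside the broker pod. The command below assumes the broker pod is named kafka-0; confirm the name with kubectl get pods. Type one message per line and press Ctrl+C to exit:

# Produce messages to Kafka topic
kubectl exec -it kafka-0 -- kafka-console-producer.sh --bootstrap-server kafka:9092 --topic input-topic
> message1
> message2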

Then, consume messages from the output topic (assuming the broker pod is named kafka-0; confirm the name with kubectl get pods):

# Consume messages from Kafka output topic
kubectl exec -it kafka-0 -- kafka-console-consumer.sh --bootstrap-server kafka:9092 --topic output-topic --from-beginning
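
For a scripted check, the interactive commands can be replaced with non-interactive ones. This is a sketch, assuming the broker pod is named kafka-0 and the topic names from the manifest; the function names produce and consume, the default of two messages, and the 10-second timeout are illustrative.

```shell
# Send each argument as one message to input-topic (non-interactive producer).
# Assumption: the broker pod is named kafka-0.
produce() {
  printf '%s\n' "$@" | kubectl exec -i kafka-0 -- kafka-console-producer.sh \
    --bootstrap-server kafka:9092 --topic input-topic
}

# Read up to N messages from output-topic, giving up after 10 s without data.
consume() {
  local count="${1:-2}"   # illustrative default
  kubectl exec kafka-0 -- kafka-console-consumer.sh \
    --bootstrap-server kafka:9092 --topic output-topic \
    --from-beginning --max-messages "${count}" --timeout-ms 10000
}

# Usage:
#   produce message1 message2
#   consume 2
```

The --max-messages and --timeout-ms options make the consumer exit on its own, which is what allows the pair to run in a script.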

Conclusion

In this guide, we learned how to deploy a Kafka Streams application on a Kubernetes cluster. We used Helm to install Kafka and Kubernetes manifests to deploy our Kafka Streams ETL pipeline. The single-replica configuration shown here is a starting point; increasing the Kafka and application replica counts is what makes the setup scalable and resilient for real-time data streaming and transformation in a cloud-native environment.