Kafka Hands-On Candidate Assignment
Task 1: Kafka Cluster Setup
1. Scenario: You're tasked with setting up a Kafka cluster with the following requirements:
o Minimum of 3 brokers.
o 1 Zookeeper node (if using Kafka versions requiring Zookeeper).
o The cluster should be configured for high availability and fault tolerance.
2. Deliverables:
o A step-by-step guide explaining how you set up the Kafka cluster, including all
necessary configurations (log retention, replication factor, partitioning, etc.).
o Provide scripts or automation you used for setting up the cluster (if any, such as
Ansible, Terraform, Docker, Kubernetes).
o Include screenshots showing the running Kafka brokers and Zookeeper instances.
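A submission might start from a per-broker `server.properties` along these lines (values are illustrative; each of the three brokers needs its own `broker.id`, listener port, and `log.dirs`):

```properties
# server-1.properties -- one of three brokers (repeat with broker.id=2 and 3)
broker.id=1
listeners=PLAINTEXT://:9092
log.dirs=/var/lib/kafka/broker-1
zookeeper.connect=localhost:2181

# High-availability / fault-tolerance defaults
default.replication.factor=3
min.insync.replicas=2
offsets.topic.replication.factor=3
unclean.leader.election.enable=false

# Retention and partitioning defaults
log.retention.hours=168
num.partitions=6
```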
Task 2: Kafka Topic Configuration and Data Flow
1. Scenario: Create two Kafka topics:
o Topic A: Replication factor of 3, with 6 partitions.
o Topic B: Replication factor of 1, with 2 partitions.
After creating the topics, simulate a simple producer-consumer data flow:
o Write a script or program (in any language) to produce 1,000 messages to Topic A
and Topic B.
o Implement consumers for both topics that read and log the messages.
Using Kafka Connect, connect to a database (PostgreSQL or Oracle) with the appropriate source connectors, read data from different tables, aggregate it, and publish the results to topics for consumers.
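The producer side of the flow above can be sketched in Python. This is a minimal sketch, not a reference implementation: it assumes the third-party `confluent-kafka` client, a broker at `localhost:9092`, and illustrative topic names `topic-a`/`topic-b`. Message building is kept separate from sending so it can be tested without a running cluster.

```python
import json


def build_messages(topic, count=1000):
    """Generate `count` keyed JSON messages destined for `topic`."""
    return [
        {
            "topic": topic,
            "key": str(i),
            "value": json.dumps({"seq": i, "payload": f"msg-{i}"}),
        }
        for i in range(count)
    ]


def produce(messages, bootstrap="localhost:9092"):
    """Send pre-built messages; requires the confluent-kafka package
    and a reachable broker (assumptions, not part of the assignment text)."""
    from confluent_kafka import Producer  # third-party client, assumed installed
    p = Producer({"bootstrap.servers": bootstrap})
    for m in messages:
        p.produce(m["topic"], key=m["key"], value=m["value"])
    p.flush()  # block until all messages are delivered


if __name__ == "__main__":
    for topic in ("topic-a", "topic-b"):
        produce(build_messages(topic))
```

A consumer would mirror this with `confluent_kafka.Consumer`, subscribing to both topics and logging each record as it is read.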
2. Deliverables:
o Kafka topic creation commands with appropriate configurations.
o Producer and consumer code (or scripts).
o Screenshots showing the successful creation of topics, producers sending data, and
consumers consuming data.
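The topic-creation deliverable could look like the following commands (run from the Kafka installation directory; assumes a broker reachable at `localhost:9092` and the illustrative names `topic-a`/`topic-b`):

```shell
bin/kafka-topics.sh --bootstrap-server localhost:9092 --create \
  --topic topic-a --partitions 6 --replication-factor 3

bin/kafka-topics.sh --bootstrap-server localhost:9092 --create \
  --topic topic-b --partitions 2 --replication-factor 1

# Verify partition count and replica placement
bin/kafka-topics.sh --bootstrap-server localhost:9092 --describe --topic topic-a
```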
Task 3: Kafka Monitoring and Performance Tuning through Confluent Control Center
1. Scenario: You need to monitor the Kafka cluster and optimize its performance. Assume your
Kafka cluster is under heavy load, and you notice latency issues with consumers. Implement
the following:
o Enable Kafka metrics for broker performance (e.g., disk I/O, network throughput).
o Set up an alert system for key performance metrics using tools like Prometheus,
Grafana, or JMX exporters.
o Tune the Kafka cluster configuration (e.g., log.retention.ms, num.io.threads,
message.max.bytes) to improve performance under load.
2. Deliverables:
o A list of the key Kafka metrics you chose to monitor and why.
o Instructions or scripts to set up the monitoring system.
o Configuration changes and explanations for performance improvements.
o Screenshots of the monitoring dashboard displaying Kafka broker health and key
metrics.
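The tuning deliverable might include a broker configuration fragment like this (values are illustrative starting points, not recommendations; the right numbers depend on workload and hardware, and broker JMX metrics can be exposed to Prometheus via the JMX exporter run as a Java agent):

```properties
# Illustrative broker tuning under heavy load
num.io.threads=16          # more threads for disk I/O
num.network.threads=8      # more threads for request handling
message.max.bytes=1048576  # 1 MiB max message size
log.retention.ms=604800000 # 7 days; shorten to reduce disk pressure
```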
Task 4: Kafka Troubleshooting
1. Scenario: During production, one of your Kafka brokers goes down, and messages stop being
consumed from a partition. As a Kafka engineer, you need to identify and resolve the issue.
Steps:
o Investigate why the broker went down (logs, errors, etc.).
o Recover the broker and ensure the messages in the affected partition are consumed
correctly.
o Document any partition reassignment or replication steps you took.
2. Deliverables:
o A summary of the issue, logs, and root cause.
o Detailed steps for recovering the broker.
o Steps or commands used for partition reassignment, if applicable.
o Evidence (logs, screenshots) showing that the partition is back online and messages
are being consumed.
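If partition reassignment is needed after recovery, it can be sketched with the stock tooling (topic name, partition number, and broker ids below are illustrative):

```shell
# reassign.json maps the affected partition back onto healthy brokers
cat > reassign.json <<'EOF'
{"version":1,"partitions":[{"topic":"topic-a","partition":3,"replicas":[1,2,3]}]}
EOF

bin/kafka-reassign-partitions.sh --bootstrap-server localhost:9092 \
  --reassignment-json-file reassign.json --execute

# Confirm the reassignment completed
bin/kafka-reassign-partitions.sh --bootstrap-server localhost:9092 \
  --reassignment-json-file reassign.json --verify
```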
Task 5: Kafka Security
1. Scenario: Secure your Kafka cluster by:
o Enabling SSL encryption for communication between brokers and between clients
(producers/consumers) and brokers.
o Configuring simple authentication using SASL (Simple Authentication and Security
Layer).
2. Deliverables:
o Step-by-step instructions for enabling SSL and SASL.
o Configuration files (with sensitive data redacted).
o Screenshots showing secure communication between clients and the Kafka cluster.
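A broker-side configuration for this task might resemble the fragment below (paths are illustrative, passwords redacted; client `producer.properties`/`consumer.properties` need matching `security.protocol`, truststore, and SASL JAAS settings):

```properties
# Broker listener secured with SSL encryption + SASL authentication
listeners=SASL_SSL://:9093
security.inter.broker.protocol=SASL_SSL
ssl.keystore.location=/etc/kafka/ssl/kafka.server.keystore.jks
ssl.keystore.password=<redacted>
ssl.truststore.location=/etc/kafka/ssl/kafka.server.truststore.jks
ssl.truststore.password=<redacted>
sasl.enabled.mechanisms=PLAIN
sasl.mechanism.inter.broker.protocol=PLAIN
```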
Submission Guidelines:
Ensure that all deliverables are organized in a single document or repository (preferably a
GitHub repo).
Include any scripts, code, and screenshots as part of the submission.
Document each step clearly, and provide explanations for any decisions or configurations
made.
Deadline: [3-4 days].