Apache Kafka is a distributed streaming platform used for building real-time data
pipelines and streaming applications. It follows a publish-subscribe messaging
pattern and is known for its scalability, reliability, and fault tolerance. Here’s
a detailed look at Kafka’s architecture:
---
### 1. **Core Components**
Kafka’s architecture includes the following core components:
#### a. **Broker**
- A **Kafka broker** is a server that stores data and serves client requests.
- Kafka is designed to be distributed, so a **cluster** consists of multiple
brokers.
- Each broker is identified by a unique **ID**.
- Brokers handle:
- **Message storage**: Persisting data on disk.
- **Message retrieval**: Serving producer and consumer requests.
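To make the broker's role concrete, here is a minimal sketch using the Java `AdminClient` to list the brokers in a cluster and their IDs. The bootstrap address `localhost:9092` is an illustrative assumption, not a requirement:

```java
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.common.Node;

import java.util.Collection;
import java.util.Properties;

public class ListBrokers {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Assumed address of one reachable broker in the cluster.
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // describeCluster() returns metadata about every live broker.
            Collection<Node> nodes = admin.describeCluster().nodes().get();
            for (Node node : nodes) {
                System.out.printf("Broker id=%d host=%s port=%d%n",
                        node.id(), node.host(), node.port());
            }
        }
    }
}
```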
#### b. **Topic**
- A **topic** is a category or stream to which records are sent.
- **Producers** write data to topics, and **consumers** read from them.
- Topics are:
- **Partitioned** for scalability.
- **Replicated** for fault-tolerance.
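As a sketch of how topics are created programmatically, the Java `AdminClient` call below creates the `topic-A` topic used in the cluster example later in this document; the partition and replication counts are illustrative choices:

```java
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

import java.util.List;
import java.util.Properties;

public class CreateTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // "topic-A" with 3 partitions, each replicated to 2 brokers.
            NewTopic topic = new NewTopic("topic-A", 3, (short) 2);
            admin.createTopics(List.of(topic)).all().get();
        }
    }
}
```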
#### c. **Partition**
- Each topic is divided into one or more **partitions**.
- A **partition** is an ordered, **append-only** log of messages (stored on disk as a sequence of segment files).
- Messages in a partition have a sequential **offset**.
- Partitioning provides parallelism by spreading data across brokers.
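The following is a deliberately simplified model of key-based partition selection, not Kafka's actual implementation (the default partitioner hashes the serialized key with murmur2). It only illustrates the invariant that matters: the same key always maps to the same partition, which preserves per-key ordering.

```java
public class PartitionSketch {
    // Simplified stand-in for Kafka's default partitioner, which really
    // uses a murmur2 hash of the serialized key bytes.
    static int partitionFor(String key, int numPartitions) {
        // Mask off the sign bit so the result is a valid partition index.
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        int partitions = 3; // e.g., P0, P1, P2 as in the cluster example below
        System.out.println(partitionFor("user-42", partitions)); // some partition
        System.out.println(partitionFor("user-42", partitions)); // same partition
    }
}
```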
#### d. **Producer**
- Producers are clients that publish messages to Kafka topics.
- Producers:
- Choose the target partition (key-based hashing, or round-robin/sticky distribution when no key is set).
- Write data asynchronously for high throughput.
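A minimal Java producer sketch; the broker address, topic, keys, and values are illustrative assumptions:

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;

public class SimpleProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Keyed record: the key determines the partition, so all events
            // for "user-42" land in the same partition, preserving their order.
            producer.send(new ProducerRecord<>("topic-A", "user-42", "clicked"));
            // Unkeyed record: the producer spreads these across partitions.
            producer.send(new ProducerRecord<>("topic-A", "anonymous page view"));
        } // close() flushes any batched, asynchronously sent records
    }
}
```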
#### e. **Consumer**
- Consumers are clients that read messages from Kafka topics.
- They use **consumer groups**:
- Within a group, each partition is read by exactly one consumer at a time.
- Multiple consumers in a group can process different partitions in parallel.
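A minimal consumer-group sketch in Java; the group id `analytics-group` and topic name are illustrative. Running a second copy of this program with the same `group.id` would split the topic's partitions between the two instances:

```java
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

import java.time.Duration;
import java.util.List;
import java.util.Properties;

public class SimpleConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        // All consumers sharing this group.id split the topic's partitions.
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "analytics-group");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("topic-A"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("partition=%d offset=%d value=%s%n",
                            record.partition(), record.offset(), record.value());
                }
            }
        }
    }
}
```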
#### f. **ZooKeeper / KRaft Quorum Controller**
- Historically, **ZooKeeper** managed cluster coordination (e.g., controller election and
metadata storage).
- Newer versions run in **KRaft** mode, where a built-in Raft-based **quorum controller**
eliminates the ZooKeeper dependency.
- This simplifies deployment and management and improves scalability.
#### g. **Replication**
- Kafka ensures data availability via **replication**.
- Each partition has one **leader** and multiple **followers**.
- The leader handles all writes and, by default, all reads.
- Followers replicate the leader’s data and take over if the leader fails.
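To see replication in action, the sketch below asks the cluster for each partition's leader, replica set, and in-sync replicas (ISR). It assumes a broker at `localhost:9092` and a kafka-clients version of 3.1 or later for `allTopicNames()`:

```java
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.TopicDescription;
import org.apache.kafka.common.TopicPartitionInfo;

import java.util.List;
import java.util.Properties;

public class ShowReplication {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            TopicDescription desc = admin.describeTopics(List.of("topic-A"))
                    .allTopicNames().get().get("topic-A");
            for (TopicPartitionInfo p : desc.partitions()) {
                // isr() lists the replicas currently caught up with the leader;
                // one of these takes over if the leader fails.
                System.out.printf("partition=%d leader=%s replicas=%s isr=%s%n",
                        p.partition(), p.leader(), p.replicas(), p.isr());
            }
        }
    }
}
```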
---
### 2. **Key Features**
#### a. **Log-Based Storage**
- Kafka stores messages as logs.
- Each partition maintains an immutable sequence of messages.
#### b. **Offset Management**
- Each message in a partition is assigned a unique **offset**.
- Consumers keep track of their progress using these offsets.
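A sketch of committing offsets manually (i.e., with `enable.auto.commit` turned off); the helper name is hypothetical. Note that the committed value is the offset of the *next* record to read, hence the `+ 1`:

```java
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

import java.util.Map;

public class OffsetCommit {
    // Commit "everything up to and including this record" for one partition.
    static void commitThrough(KafkaConsumer<String, String> consumer,
                              String topic, int partition, long lastProcessedOffset) {
        consumer.commitSync(Map.of(
                new TopicPartition(topic, partition),
                new OffsetAndMetadata(lastProcessedOffset + 1)));
    }
}
```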
#### c. **Durability**
- Kafka persists data to disk, ensuring reliability.
- Configurable **retention policies** allow users to control how long messages are
stored.
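As a sketch, retention can be changed per topic at runtime through the `AdminClient`; the 7-day value here is an illustrative choice:

```java
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.Properties;

public class SetRetention {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "topic-A");
            // Keep messages for 7 days (retention.ms is in milliseconds).
            AlterConfigOp op = new AlterConfigOp(
                    new ConfigEntry("retention.ms", "604800000"),
                    AlterConfigOp.OpType.SET);
            Map<ConfigResource, Collection<AlterConfigOp>> updates = Map.of(topic, List.of(op));
            admin.incrementalAlterConfigs(updates).all().get();
        }
    }
}
```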
#### d. **High Throughput**
- Kafka achieves high throughput by batching and compressing data.
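The producer settings that control batching and compression can be layered onto the configuration from the earlier producer sketch; the values below are illustrative starting points, not recommendations:

```java
import org.apache.kafka.clients.producer.ProducerConfig;

import java.util.Properties;

public class ThroughputTuning {
    static Properties batchingProps() {
        Properties props = new Properties();
        // Wait up to 20 ms so more records accumulate into one batch.
        props.put(ProducerConfig.LINGER_MS_CONFIG, "20");
        // Target batch size per partition, in bytes.
        props.put(ProducerConfig.BATCH_SIZE_CONFIG, "65536");
        // Compress whole batches; lz4 trades a little CPU for much less I/O.
        props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");
        return props;
    }
}
```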
#### e. **Scalability**
- Adding brokers and partitions enables horizontal scaling.
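A sketch of the partition side of scaling out (brokers are added at the infrastructure level); the target count is illustrative. Note the caveat in the comment about keyed ordering:

```java
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewPartitions;

import java.util.Map;
import java.util.Properties;

public class ScaleOut {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // Grow topic-A from 3 to 6 partitions. Existing key-to-partition
            // mappings change, so keyed ordering only holds from this point on.
            admin.createPartitions(Map.of("topic-A", NewPartitions.increaseTo(6)))
                 .all().get();
        }
    }
}
```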
---
### 3. **Data Flow in Kafka**
#### a. **Producer Workflow**
1. Producers send records to a topic.
2. Partitions are chosen based on:
- A hash of the record key, if one is set.
- Round-robin (or sticky) distribution otherwise.
3. The leader broker for the chosen partition appends the records to its log.
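To observe this workflow end to end, a send can take a callback that fires once the partition leader acknowledges the write. This sketch reuses a producer configured as in section 1d:

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class CallbackSend {
    // The callback reports exactly where the record landed: which
    // partition was chosen and the offset it was assigned.
    static void sendWithCallback(KafkaProducer<String, String> producer) {
        producer.send(new ProducerRecord<>("topic-A", "user-42", "clicked"),
                (metadata, exception) -> {
                    if (exception != null) {
                        exception.printStackTrace();
                    } else {
                        System.out.printf("partition=%d offset=%d%n",
                                metadata.partition(), metadata.offset());
                    }
                });
    }
}
```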
#### b. **Broker Workflow**
1. Messages are stored in partitions on the broker.
2. The leader broker replicates data to follower brokers.
3. Metadata (e.g., topic configuration) is shared among brokers.
#### c. **Consumer Workflow**
1. Consumers poll the broker for new messages.
2. Each consumer in a group gets assigned specific partitions.
3. Consumers commit their offsets to track consumption progress.
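Step 2 can be observed directly by subscribing with a rebalance listener; this sketch assumes a consumer configured as in section 1e:

```java
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

import java.util.Collection;
import java.util.List;

public class AssignmentLogger {
    static void subscribeWithListener(KafkaConsumer<String, String> consumer) {
        consumer.subscribe(List.of("topic-A"), new ConsumerRebalanceListener() {
            @Override
            public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
                // The group coordinator has handed this consumer its share.
                System.out.println("assigned: " + partitions);
            }

            @Override
            public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
                // A good place to commit offsets before losing the partitions.
                System.out.println("revoked: " + partitions);
            }
        });
    }
}
```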
---
### 4. **Kafka Cluster Example**
```
[ Producer ] ----> [ Kafka Broker Cluster ] ----> [ Consumer Group ]
(Partitions spread across Brokers)
```
- **Producer** writes data to topic `topic-A`.
- Topic `topic-A` is divided into three partitions: `P0`, `P1`, `P2`.
- In a cluster:
- Broker 1 might handle `P0` (leader), replicate `P1` (follower).
- Broker 2 might handle `P1` (leader), replicate `P2` (follower).
- Broker 3 might handle `P2` (leader), replicate `P0` (follower).
---
### 5. **Advanced Features**
#### a. **Kafka Connect**
- Used to integrate Kafka with external systems (databases, object stores, search indexes, etc.) through reusable source and sink connectors.
#### b. **Kafka Streams**
- A client library for building stream-processing applications (filtering, transforming, joining, and aggregating topics) directly on top of Kafka.
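A minimal Kafka Streams sketch, assuming a local broker: it reads `topic-A`, upper-cases each value, and writes the result to a hypothetical `topic-A-upper` topic.

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

import java.util.Properties;

public class UppercaseStream {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "uppercase-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        // Consume topic-A, transform each value, write to an output topic.
        KStream<String, String> source = builder.stream("topic-A");
        source.mapValues(value -> value.toUpperCase()).to("topic-A-upper");

        new KafkaStreams(builder.build(), props).start();
    }
}
```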
#### c. **Schema Registry**
- Manages message schemas (e.g., Avro, JSON Schema, Protobuf) so producers and consumers can evolve compatibly; provided by the Confluent ecosystem rather than core Kafka itself.
#### d. **Security**
- Kafka supports:
- **Authentication** (SASL mechanisms such as PLAIN, SCRAM, and Kerberos/GSSAPI, or mutual TLS).
- **Encryption** in transit (SSL/TLS).
- **Authorization** (ACLs).
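As a sketch, a client enables these features purely through configuration; the mechanism, username, and password below are illustrative placeholders:

```java
import org.apache.kafka.clients.CommonClientConfigs;
import org.apache.kafka.common.config.SaslConfigs;

import java.util.Properties;

public class SecureClientConfig {
    static Properties secureProps() {
        Properties props = new Properties();
        // TLS-encrypted connections plus SASL authentication.
        props.put(CommonClientConfigs.SECURITY_PROTOCOL_CONFIG, "SASL_SSL");
        props.put(SaslConfigs.SASL_MECHANISM, "SCRAM-SHA-512");
        // Placeholder credentials; supply real ones via your secret store.
        props.put(SaslConfigs.SASL_JAAS_CONFIG,
                "org.apache.kafka.common.security.scram.ScramLoginModule required "
                + "username=\"alice\" password=\"alice-secret\";");
        return props;
    }
}
```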
---