Shubham Sharma

Apache Kafka Deep Dive: Broker Internals and Best Practices

Introduction

Apache Kafka was developed at LinkedIn in 2010 by engineers Jay Kreps, Neha Narkhede, and Jun Rao to address the company’s need for a scalable, low-latency system capable of handling vast amounts of real-time data. At the time, LinkedIn’s existing infrastructure struggled to efficiently process the growing volume of user activity events and operational metrics generated across its platform. The team aimed to create a system that could ingest and distribute this data in real time, facilitating both immediate processing and long-term storage for analytics.

The result was Kafka, a distributed publish-subscribe messaging system designed around a commit log architecture. This design choice enabled Kafka to offer high throughput, fault tolerance, and scalability, making it well-suited for real-time data pipelines. Kafka was open-sourced in early 2011 and entered the Apache Incubator later that year, achieving top-level project status in 2012.

Recognizing Kafka’s potential beyond LinkedIn, its creators founded Confluent in 2014, a company dedicated to providing enterprise-grade solutions and support for Kafka deployments. Today, Apache Kafka is widely adopted across various industries for building real-time data pipelines and streaming applications.

Apache Kafka’s journey from a simple log aggregation tool to a core component of modern event-driven architectures is rooted in both its technical design and the evolution of data-driven application needs.

Core Concepts: Understanding Kafka’s Building Blocks


Producers and Consumers

Producers are client applications that publish (write) messages to Kafka, and consumers are applications that subscribe to topics and read those messages, often as part of a consumer group that splits a topic’s partitions among its members.
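To make the two roles concrete, here is a minimal sketch using Kafka’s Java client. The broker address (localhost:9092), topic name (user-activity), group id, and class names are placeholders chosen for this example.

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;

import java.time.Duration;
import java.util.List;
import java.util.Properties;

public class ProducerConsumerSketch {
    public static void main(String[] args) {
        // Producer: publishes a message to the "user-activity" topic (placeholder name).
        Properties producerProps = new Properties();
        producerProps.put("bootstrap.servers", "localhost:9092");
        producerProps.put("key.serializer", StringSerializer.class.getName());
        producerProps.put("value.serializer", StringSerializer.class.getName());
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(producerProps)) {
            producer.send(new ProducerRecord<>("user-activity", "user-42", "page_view:/home"));
        }

        // Consumer: subscribes to the same topic as part of a consumer group and polls for records.
        Properties consumerProps = new Properties();
        consumerProps.put("bootstrap.servers", "localhost:9092");
        consumerProps.put("group.id", "analytics-service");
        consumerProps.put("auto.offset.reset", "earliest");
        consumerProps.put("key.deserializer", StringDeserializer.class.getName());
        consumerProps.put("value.deserializer", StringDeserializer.class.getName());
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps)) {
            consumer.subscribe(List.of("user-activity"));
            for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofSeconds(1))) {
                System.out.printf("key=%s value=%s partition=%d offset=%d%n",
                        record.key(), record.value(), record.partition(), record.offset());
            }
        }
    }
}
```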

Topics

A Kafka topic acts as a logical channel to which producers publish messages. Consumers subscribe to these topics to receive messages. Topics help organize data streams logically.

Partitions

Partitions are subdivisions within a topic, enabling parallelism and scaling. Kafka stores messages in these partitions, allowing multiple consumers to read data simultaneously, enhancing throughput.
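A topic’s partition count and replication factor are set when the topic is created; the sketch below uses the Java AdminClient, with the broker address, topic name, and counts chosen purely for illustration.

```java
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;

import java.util.List;
import java.util.Properties;

public class CreateTopicSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // 6 partitions let up to 6 consumers in one group read in parallel;
            // replication factor 3 keeps a copy of each partition on three brokers.
            NewTopic topic = new NewTopic("user-activity", 6, (short) 3);
            admin.createTopics(List.of(topic)).all().get();
        }
    }
}
```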

Brokers

Kafka brokers are servers responsible for storing and managing data within partitions. They facilitate message reads and writes, handle replication, and ensure fault tolerance.

Internal Architecture: How Kafka Works Under the Hood

Message (Record)

Every message, or record, that a producer sends consists of:

Topic: the topic the record is written to

Key (optional): used to decide which partition the record lands in and to keep related records together

Value (payload): the actual data being transmitted

Headers (optional): key-value metadata, such as tracing or schema information

The producer decides which partition to send the record to. For example:

If the record explicitly names a partition, that partition is used.

If the record has a key, the producer hashes the key to pick a partition, so all records with the same key end up in the same partition and keep their relative order.

If there is no key, the producer spreads records across partitions (round-robin, or the sticky partitioner in newer clients).
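Here is how those fields look in the Java producer API; the topic, key, value, and header names are illustrative.

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

import java.nio.charset.StandardCharsets;
import java.util.Properties;

public class RecordAnatomySketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        // Topic + key + value: the key is hashed to pick the partition,
        // so every event for "user-42" lands in the same partition.
        ProducerRecord<String, String> keyed =
                new ProducerRecord<>("user-activity", "user-42", "login");

        // Optional headers carry metadata without touching the payload.
        keyed.headers().add("trace-id", "abc-123".getBytes(StandardCharsets.UTF_8));

        // Topic + value only: with no key, the producer spreads records
        // across partitions instead of pinning them to one.
        ProducerRecord<String, String> unkeyed =
                new ProducerRecord<>("user-activity", "anonymous_page_view");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(keyed);
            producer.send(unkeyed);
        }
    }
}
```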


Message Flow: From Producer to Broker

Socket Receive and Network Threads

When a producer sends a message, it travels over the network and lands in the broker’s socket receive buffer. Kafka assigns each client connection to one of a small pool of network threads; that thread handles the socket I/O for the connection throughout its lifecycle, ensuring that requests from a given client are processed in order. The non-blocking I/O design allows the broker to multiplex dozens of client connections across a small number of threads, keeping each thread’s work light: reading request bytes, forming internal request objects, and enqueuing them for processing.

Request Queue and I/O Threads

Once the network thread enqueues a request, a pool of I/O threads picks up pending requests from a shared request queue. Each I/O thread validates the CRC checksum of the payload to ensure data integrity and then appends the record to the partition’s commit log for storage and replication.

Commit Log and Segments

Kafka’s core storage mechanism is the commit log, chosen for its simplicity, durability, and performance:

Records are only ever appended to the end of the log, never modified in place.

Each record receives a monotonically increasing offset that uniquely identifies it within its partition.

Appends translate to sequential disk writes, which are far faster than random I/O and let Kafka lean heavily on the OS page cache.

Each log is organized on disk into sequential segments:

The newest (active) segment receives all writes; once it reaches a configured size or age, it is closed and a new segment is rolled.

Older, closed segments are immutable and can be deleted or compacted according to the topic’s retention policy.

Each segment consists of the log file itself plus index files that map offsets (and timestamps) to byte positions within the segment.

Replication and Acknowledgment

By default, a broker acknowledges the producer only after the message is durably replicated to all in-sync replicas. During this replication window, the I/O thread adds the request to an in-memory map of pending acknowledgments (commonly called “purgatory”), allowing the thread to free itself for other work.

Once replication completes, the broker generates a produce response and enqueues it in the response queue. A network thread then picks up the response, writes it to the socket send buffer, and sends it back to the producer, completing the request–response cycle.
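On the client side, this durability guarantee is requested through the producer’s acks setting; the sketch below shows a durability-focused configuration, with the broker address and topic name as placeholders.

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;

public class DurableProducerSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        // acks=all: the leader responds only after every in-sync replica has the record.
        // Pair with a topic-level min.insync.replicas (e.g., 2) so "all" means at least two copies.
        props.put(ProducerConfig.ACKS_CONFIG, "all");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            RecordMetadata metadata = producer
                    .send(new ProducerRecord<>("user-activity", "user-42", "checkout"))
                    .get(); // blocks until the broker's acknowledgment arrives
            System.out.printf("acked at partition=%d offset=%d%n",
                    metadata.partition(), metadata.offset());
        }
    }
}
```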

Ordered Processing

Kafka enforces strict ordering on a per-connection basis: a network thread processes only one request per client at a time and does not read the next request until it has successfully sent the response for the previous one. This mechanism guarantees that each producer’s messages are processed in the exact sequence they were sent.
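On the producer side, retries can still reorder batches when several requests are in flight at once, so end-to-end ordering also depends on client settings; the sketch below shows one way to preserve it (broker address, topic, and values are illustrative).

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;

public class OrderedProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        // Idempotence gives each batch a sequence number, so a retried batch
        // cannot jump ahead of a later one even with several requests in flight.
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");
        props.put("max.in.flight.requests.per.connection", "5");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Same key => same partition => strictly ordered within that partition.
            for (int i = 1; i <= 3; i++) {
                producer.send(new ProducerRecord<>("user-activity", "user-42", "step-" + i));
            }
        }
    }
}
```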

Consumer Flow: From Fetch Request to Zero-Copy Response

Fetch Request: Topic, Partition & Offset

The consumer issues a fetch request specifying:

The topic and partition it wants to read from

The offset at which to start reading

Optionally, how much data it wants and how long it is willing to wait for it
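As a sketch, this is what such a fetch looks like through the Java consumer when a partition is assigned manually and the read position is set to a specific offset; the topic name and offset 150 are illustrative and match the example that follows.

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;

import java.time.Duration;
import java.util.List;
import java.util.Properties;

public class FetchFromOffsetSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // Read partition 0 of the topic, starting at offset 150.
            TopicPartition partition = new TopicPartition("user-activity", 0);
            consumer.assign(List.of(partition));
            consumer.seek(partition, 150L);

            // poll() sends the fetch request (topic, partition, offset) to the broker.
            for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofMillis(500))) {
                System.out.printf("offset=%d value=%s%n", record.offset(), record.value());
            }
        }
    }
}
```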

Socket Receive & Network Threads

The OS kernel places the incoming TCP packet into the broker’s socket receive buffer. A dedicated network thread reads one request at a time per client, preserving per-client ordering, and enqueues it into Kafka’s internal request queue.

I/O Threads & Index Lookup

An I/O thread pulls the fetch request from the queue. It uses the partition’s index file to map the requested offset (150) directly to the byte position in the log segment, skipping unnecessary file scans.
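Kafka’s offset index is sparse: it stores byte positions for only some offsets, and the broker scans forward from the nearest entry. The following is a simplified, in-memory sketch of that idea, not Kafka’s actual implementation.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Simplified stand-in for Kafka's sparse offset index: it maps a subset of
// offsets to their byte positions in the segment file.
public class OffsetIndexSketch {
    private final NavigableMap<Long, Long> offsetToPosition = new TreeMap<>();

    // Called periodically as records are appended (Kafka indexes roughly
    // every index.interval.bytes of data, not every record).
    public void addEntry(long offset, long bytePosition) {
        offsetToPosition.put(offset, bytePosition);
    }

    // Find the byte position of the closest indexed offset at or below the
    // target; the broker then scans forward from there in the log file.
    public long lookup(long targetOffset) {
        Map.Entry<Long, Long> entry = offsetToPosition.floorEntry(targetOffset);
        return entry == null ? 0L : entry.getValue();
    }

    public static void main(String[] args) {
        OffsetIndexSketch index = new OffsetIndexSketch();
        index.addEntry(0L, 0L);
        index.addEntry(100L, 8_192L);
        index.addEntry(200L, 16_384L);
        // A fetch at offset 150 starts scanning from the position of offset 100.
        System.out.println(index.lookup(150L)); // prints 8192
    }
}
```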

Long-Polling & Purgatory

If no new data exists at that offset, Kafka parks the request in a purgatory map and waits for either:

A minimum bytes threshold (e.g., 1 KB of new data), or

A maximum wait time (e.g., 500 ms).

Once either condition is met, the request is resumed and a response is generated.
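These thresholds correspond to the consumer’s fetch settings; the sketch below uses illustrative values that match the numbers above, with placeholder broker address, group id, and topic.

```java
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

import java.time.Duration;
import java.util.List;
import java.util.Properties;

public class LongPollingConsumerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "analytics-service");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        // The broker parks the fetch in purgatory until at least 1 KB of new data
        // is available or 500 ms have passed, whichever comes first.
        props.put(ConsumerConfig.FETCH_MIN_BYTES_CONFIG, "1024");
        props.put(ConsumerConfig.FETCH_MAX_WAIT_MS_CONFIG, "500");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("user-activity"));
            consumer.poll(Duration.ofSeconds(1));
        }
    }
}
```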

Response Generation & Network Thread

The processed data is moved into a response object and queued for the network thread. The network thread then writes the response into the socket send buffer and returns it to the consumer.

Zero-Copy Transfer

Kafka leverages the OS sendfile syscall to copy data directly from the page cache (or log file) to the network socket, bypassing user-space copies:

Without zero-copy, each batch would be read from the page cache into a user-space buffer, copied into a socket buffer, and only then handed to the network card.

With sendfile, the bytes move from the page cache to the socket entirely in kernel space, eliminating the extra copies and context switches and letting a broker serve consumers at close to line rate.
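On the JVM this technique maps to FileChannel.transferTo, which delegates to sendfile on Linux. The following is a simplified sketch of the mechanism, not Kafka’s actual code.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.SocketChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ZeroCopySketch {
    // Send `count` bytes of a log segment, starting at `position`, straight to the socket.
    static long sendSegmentSlice(Path segmentFile, SocketChannel socket,
                                 long position, long count) throws IOException {
        try (FileChannel log = FileChannel.open(segmentFile, StandardOpenOption.READ)) {
            // transferTo lets the kernel move bytes from the page cache to the
            // socket directly, so nothing is copied through a user-space buffer.
            return log.transferTo(position, count, socket);
        }
    }
}
```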

Tiered Storage

To prevent cold-data reads from degrading performance, Kafka can offload older log segments to a cheaper, external tier (e.g., object storage). Hot segments remain on local SSD for ultra-fast access, while cold segments are fetched on demand without evicting critical pages or blocking the main I/O threads.

Why Kafka’s Architecture Sets It Apart

The combination described above, an append-only commit log, sequential disk I/O, zero-copy transfers, per-connection ordering, and a small pool of network and I/O threads, is what lets Kafka sustain very high throughput while remaining durable and horizontally scalable.

Cluster Metadata Management: Zookeeper vs KRaft


In earlier Kafka versions, cluster metadata (broker lists, topic configurations, partition assignments, and leader elections) was managed by an external Apache ZooKeeper ensemble. While powerful, this introduced operational overhead:

A second distributed system to deploy, secure, monitor, and upgrade alongside the Kafka brokers

Metadata changes and leader elections that had to round-trip through ZooKeeper, slowing failover

Practical limits on how many partitions a cluster could manage, since all metadata lived outside Kafka


Kafka’s new KRaft (Kafka Raft) mode embeds metadata management directly into Kafka brokers using the Raft consensus algorithm:

A quorum of controller nodes replicates all cluster metadata in an internal metadata log, so there is no separate ZooKeeper ensemble to operate

Controller failover and metadata propagation are faster, because standby controllers already hold the latest metadata

Deployments are simpler to secure and monitor, and clusters can scale to far more partitions

Important Design Choices and Tradeoffs

Partitioning Strategy

The number of partitions caps how many consumers in a group can read in parallel, and the choice of message key determines how evenly data is spread and which records share ordering guarantees. Too few partitions limit throughput; too many inflate metadata and slow down recovery and rebalances.

Replication Factor

A higher replication factor improves durability and availability, since partitions survive more broker failures, but it costs additional disk space and inter-broker network traffic. A replication factor of 3 is a common production baseline.

Throughput vs. Latency Tuning

Adjusting configurations such as compression, message batching, and acknowledgment settings lets you optimize for either throughput or latency.
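As a sketch, these are the producer settings that usually move the needle; the values below are illustrative starting points rather than recommendations.

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;

public class ThroughputTuningSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        // Throughput-leaning settings: wait briefly to build bigger, compressed batches.
        props.put(ProducerConfig.LINGER_MS_CONFIG, "20");          // batch for up to 20 ms
        props.put(ProducerConfig.BATCH_SIZE_CONFIG, "65536");      // 64 KB batches
        props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");  // fewer bytes on the wire

        // Latency-leaning alternative: send immediately, skip compression,
        // and accept an acknowledgment from the leader only.
        // props.put(ProducerConfig.LINGER_MS_CONFIG, "0");
        // props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "none");
        // props.put(ProducerConfig.ACKS_CONFIG, "1");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // ... send records as usual
        }
    }
}
```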

Common Mistakes and Anti-Patterns to Avoid

Treating Kafka as a permanent system of record without planning retention (or tiered storage) for the data that accumulates

Creating far more partitions than the workload needs, which inflates metadata and slows recovery

Publishing related events without a key and then expecting them to be consumed in order

Ignoring consumer lag and replication health until an incident exposes them

When Not to Use Kafka

While powerful, Kafka isn’t the solution for every scenario:

Simple, low-volume messaging, where a lightweight queue would do and a full Kafka cluster is operational overkill

Request-response (RPC-style) communication that needs an immediate reply rather than an event stream

Workloads that rely on per-message routing, priorities, or delayed delivery, which traditional message brokers support natively

Very small datasets or occasional batch jobs, where streaming infrastructure adds complexity without benefit

Understanding Kafka’s strengths and limitations can help build effective, scalable systems tailored to specific needs.