
Kafka Partition Strategy Guide

Master Kafka partitioning with comprehensive coverage of partition count selection, key strategies, rebalancing optimization, and scaling best practices for production workloads.

Published: December 12, 2025 • 16 min read

Why Partitioning Matters

Partitions are Kafka's unit of parallelism. They determine how data is distributed across brokers, how consumers can scale, and ultimately the throughput and latency characteristics of your streaming applications.

  • Parallelism: More partitions allow more consumers to process in parallel.
  • Throughput: Partition count caps the maximum achievable throughput.
  • Ordering: Messages are ordered only within a partition, not across partitions.

Choosing the Right Partition Count

Partition Count Formula

Partitions = max(Target Throughput / Producer Throughput per Partition,
                 Target Throughput / Consumer Throughput per Partition)

Start by measuring single-partition throughput for both producers and consumers, then calculate the minimum partitions needed to meet your target throughput.
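
For example, with illustrative numbers (the throughput figures below are assumptions, not benchmarks):

target_mb_s = 200      # peak target throughput in MB/s, including headroom
producer_mb_s = 25     # measured per-partition producer throughput
consumer_mb_s = 10     # measured per-partition consumer throughput

# The slower side (here, the consumers) determines the partition count.
partitions = max(target_mb_s / producer_mb_s, target_mb_s / consumer_mb_s)
print(int(partitions))  # 20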

Factors to Consider

  • Expected throughput: Plan for peak load, not average, and leave 2-3x headroom.
  • Consumer parallelism: Partitions should be >= the maximum number of consumer instances you'll run in a group.
  • Broker resources: Each partition carries overhead. Aim for fewer than 4,000 partitions per broker.
  • Future growth: Adding partitions later can be disruptive. Plan for 2-3 years of growth.

Warning: Too Many Partitions

While more partitions increase parallelism, they also increase memory usage, rebalance time, and end-to-end latency. Over-partitioning is a common mistake.

Partition Key Strategies

1. Entity-Based Keys

Use a business entity ID (user_id, order_id, device_id) as the partition key. This ensures all events for an entity go to the same partition, maintaining order.
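
The snippets below assume a producer configured along these lines; a minimal kafka-python setup, where the broker address and serializers are illustrative assumptions:

import json

from kafka import KafkaProducer

# Placeholder broker; point bootstrap_servers at your cluster.
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    key_serializer=str.encode,                                 # str keys -> bytes
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),  # dicts -> JSON bytes
)

event = {"user_id": "user_123", "action": "login"}  # hypothetical payload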

# All events for user_123 go to the same partition
producer.send("user-events", key="user_123", value=event)

Best for: Event sourcing, user activity streams, order processing

2. Composite Keys

Combine multiple fields to create a more granular partitioning scheme. Useful when you need ordering within a subset of data.

# Partition by tenant + entity for multi-tenant systems
key = f"{tenant_id}:{user_id}"
producer.send("multi-tenant-events", key=key, value=event)

Best for: Multi-tenant systems, hierarchical data, complex ordering requirements

3. Round-Robin (No Key)

When ordering doesn't matter, send messages without a key for even distribution across partitions. Since Kafka 2.4, the default partitioner handles keyless messages with sticky partitioning, filling one batch per partition before moving to the next for better batching efficiency.

# No key = round-robin distribution
producer.send("logs", value=log_entry)

Best for: Logs, metrics, events where order doesn't matter

4. Custom Partitioner

Implement a custom partitioner when the default hash-based partitioning doesn't meet your needs, such as geo-based routing or priority-based assignment.

# Route high-priority messages to a dedicated partition.
# kafka-python invokes a custom partitioner as (key_bytes, all_partitions, available).
from zlib import crc32  # stable hash; Python's built-in hash() is salted per process

def partition(key, all_partitions, available):
    if is_high_priority(key):          # is_high_priority: your own routing check
        return all_partitions[0]       # partition 0 reserved for priority traffic
    # Spread everything else across the remaining partitions.
    return all_partitions[1 + crc32(key) % (len(all_partitions) - 1)]
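
With kafka-python, a custom partitioner is passed in the producer config; a sketch, where the broker address is a placeholder and is_high_priority is your own hypothetical check:

from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",    # placeholder broker
    key_serializer=str.encode,
    partitioner=partition,                 # the custom partitioner defined above
)
producer.send("orders", key="order_42", value=b'{"priority": "high"}')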

Avoiding Hot Partitions

A "hot partition" occurs when one partition receives significantly more traffic than others, creating a bottleneck that limits overall throughput.

Common Causes

  • Skewed key distribution (power-law popularity, "celebrity" users)
  • Temporal keys (dates, hour buckets) that funnel whole time windows into one partition
  • Low-cardinality keys with uneven frequency
  • Burst traffic from specific sources

Solutions

  • Add a random suffix to hot keys to spread them out (see the sketch after this list)
  • Use composite keys with a high-cardinality component
  • Move hot entities to dedicated topics
  • Monitor per-partition size and throughput metrics continuously
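
Key salting is the most common of these fixes. It trades per-entity ordering for balance: events for a salted key spread across several partitions, so consumers must tolerate out-of-order delivery for that entity and aggregate across the salted sub-keys. A minimal sketch, where the hot-key set and salt count are assumptions:

import random

HOT_KEYS = {"user_celebrity"}   # hypothetical set of known-hot entities
NUM_SALTS = 8                   # spread each hot key over 8 sub-keys

def salted_key(key: str) -> str:
    if key in HOT_KEYS:
        # "user_celebrity" becomes "user_celebrity#0" .. "user_celebrity#7",
        # which hash to up to 8 different partitions.
        return f"{key}#{random.randrange(NUM_SALTS)}"
    return key

producer.send("user-events", key=salted_key("user_celebrity"), value=event)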

Scaling Partitions

Important Limitation

Kafka only supports increasing partitions, not decreasing. Adding partitions also breaks key-based ordering guarantees for existing keys. Plan carefully!

When to Add Partitions

  • Consumer lag is growing despite healthy consumers
  • You need more consumer parallelism than current partition count
  • Individual partition size is growing too large
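
When you do add partitions, the change goes through the admin API. A sketch with kafka-python's admin client, where the topic name and target count are placeholders:

from kafka.admin import KafkaAdminClient, NewPartitions

admin = KafkaAdminClient(bootstrap_servers="localhost:9092")  # placeholder broker

# Raise "user-events" to 24 partitions total; Kafka rejects the request
# if 24 is not strictly greater than the current count.
admin.create_partitions({"user-events": NewPartitions(total_count=24)})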

Best Practices for Scaling

  • Double the partition count when scaling (easier math, better distribution)
  • Scale during low-traffic periods to minimize rebalance impact
  • Make sure consumer applications are ready to absorb the new partitions before you add them
  • Monitor partition distribution after scaling (see the sketch below)
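
One way to verify the distribution is to compare per-partition end offsets after scaling; a roughly even spread suggests keys are balancing. A sketch using kafka-python, where the broker address and topic name are placeholders:

from kafka import KafkaConsumer, TopicPartition

consumer = KafkaConsumer(bootstrap_servers="localhost:9092")  # placeholder broker
topic = "user-events"                                         # placeholder topic

partitions = [TopicPartition(topic, p) for p in consumer.partitions_for_topic(topic)]
# Note: end offsets count records ever written, so retention can skew
# comparisons on old topics; watch the deltas over time instead.
for tp, end_offset in sorted(consumer.end_offsets(partitions).items()):
    print(f"partition {tp.partition}: {end_offset} records")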

Monitor Partition Health with KLogic

KLogic provides real-time partition metrics including size distribution, throughput per partition, and automatic hot partition detection with alerting.
