
Kafka Cluster Configuration

Master Kafka cluster configuration with this comprehensive guide. Learn essential broker settings, topic configurations, and production-ready setup strategies for reliable streaming infrastructure.

Published: January 10, 2026 • 20 min read • Configuration Guide

Kafka Cluster Architecture Basics

A Kafka cluster consists of multiple brokers working together to provide fault-tolerant, scalable message streaming. Proper cluster configuration is essential for performance, reliability, and operational efficiency.

Cluster Sizing Guidelines

Development

1-3 brokers, minimal resources

Staging

3-5 brokers, mirrors production

Production

5+ brokers, high availability

Essential Broker Configuration

Core Broker Settings

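# server.properties - Essential settings

# Cluster coordination (KRaft mode, combined broker + controller)
process.roles=broker,controller
node.id=1
controller.quorum.voters=1@controller-1:9093,2@controller-2:9093,3@controller-3:9093
controller.listener.names=CONTROLLER

# Client listener plus the controller listener that combined mode requires
listeners=PLAINTEXT://0.0.0.0:9092,CONTROLLER://0.0.0.0:9093
advertised.listeners=PLAINTEXT://broker-1.example.com:9092
listener.security.protocol.map=PLAINTEXT:PLAINTEXT,CONTROLLER:PLAINTEXT

# Or ZooKeeper mode (legacy, removed in Kafka 4.0)
# broker.id=0
# zookeeper.connect=zk-1:2181,zk-2:2181,zk-3:2181/kafka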

Each broker needs a unique node.id (broker.id in legacy ZooKeeper mode), and advertised.listeners must be an address that clients can resolve and reach.

Log and Storage Settings

# Log directories - use multiple disks for performance
log.dirs=/data/kafka-logs-1,/data/kafka-logs-2

# Retention settings
# (comments sit on their own lines: a trailing "# ..." would be read as part of the value)
# 7 days (the default)
log.retention.hours=168
# No size limit (retention is time-based)
log.retention.bytes=-1
# 1 GB segments
log.segment.bytes=1073741824

# Cleanup: delete, compact, or delete,compact
log.cleanup.policy=delete
log.cleaner.enable=true

Replication Settings

# Default replication for new topics
default.replication.factor=3
min.insync.replicas=2

# Replication performance
num.replica.fetchers=4
replica.fetch.max.bytes=1048576
replica.socket.receive.buffer.bytes=65536

# Leadership
auto.leader.rebalance.enable=true
leader.imbalance.check.interval.seconds=300
leader.imbalance.per.broker.percentage=10

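For production, use a replication factor of 3 (default.replication.factor=3) and min.insync.replicas=2 for durability. min.insync.replicas only takes effect for producers that send with acks=all.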
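To make the durability trade-off concrete, here is a minimal Python sketch of what a producer using acks=all would see for a single partition as brokers fail. It is a simplified model, not Kafka code: the function name is hypothetical, and it assumes every failed broker hosted a replica of the partition and every surviving replica is in sync.

```python
def write_availability(replication_factor: int,
                       min_insync_replicas: int,
                       failed_brokers: int) -> str:
    """Classify one partition's state for an acks=all producer (simplified model)."""
    if not 1 <= min_insync_replicas <= replication_factor:
        raise ValueError("min.insync.replicas must be between 1 and the replication factor")

    # Assumption: each failed broker removes one in-sync replica of this partition.
    live_replicas = max(replication_factor - failed_brokers, 0)

    if live_replicas >= min_insync_replicas:
        return "writes accepted"
    if live_replicas > 0:
        # Too few in-sync replicas: acks=all writes are refused, committed data stays readable.
        return "writes rejected, reads still served"
    return "partition offline"


for failed in range(4):
    print(failed, "failed ->", write_availability(3, 2, failed))
```

With a replication factor of 3 and min.insync.replicas=2, the model tolerates one broker failure without refusing writes, which is why the pair is a common production default.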

Network and Threading

# Threads
# Handles network requests
num.network.threads=8
# Handles request processing, including disk I/O
num.io.threads=16

# Socket settings
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
# 100 MB
socket.request.max.bytes=104857600

# Request handling
queued.max.requests=500
request.timeout.ms=30000

Topic Configuration Best Practices

Creating Production Topics

# Create topic with production settings
kafka-topics.sh --create \
  --bootstrap-server localhost:9092 \
  --topic orders \
  --partitions 12 \
  --replication-factor 3 \
  --config min.insync.replicas=2 \
  --config retention.ms=604800000 \
  --config segment.bytes=1073741824

Partition Count Guidelines

| Throughput | Partitions | Considerations |
|---|---|---|
| Low (<10 MB/s) | 3-6 | Minimum for HA with 3 brokers |
| Medium (10-100 MB/s) | 6-24 | Match to consumer parallelism |
| High (100 MB/s-1 GB/s) | 24-100 | Scale with brokers and consumers |
| Very High (>1 GB/s) | 100+ | Consider multiple topics |
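One common rule of thumb for picking a number inside these ranges is to divide the target throughput by what a single partition can sustain on the producer side and on the consumer side, then take the larger result. The Python sketch below illustrates that idea. The function name and the per-partition throughput figures are hypothetical, and real values should be measured on your own hardware and workload.

```python
import math


def estimate_partitions(target_mb_s: float,
                        producer_mb_s_per_partition: float,
                        consumer_mb_s_per_partition: float,
                        brokers: int) -> int:
    """Rough partition count: the slower side (produce or consume) sets the minimum."""
    needed = max(target_mb_s / producer_mb_s_per_partition,
                 target_mb_s / consumer_mb_s_per_partition)
    raw = math.ceil(needed)
    # Round up to a multiple of the broker count so partition leaders spread evenly.
    return max(brokers, math.ceil(raw / brokers) * brokers)


# Hypothetical measurements: 10 MB/s per partition when producing, 5 MB/s when consuming.
print(estimate_partitions(60, 10, 5, brokers=3))
```

Treat the result as a starting point only. It ignores key skew, expected growth, and the per-partition overhead on brokers.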

Warning: Partition Count is Hard to Change

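Adding partitions changes which partition a given key hashes to, so records for the same key can land in a different partition than before, and per-key ordering across the change is lost. Kafka also does not support reducing a topic's partition count. Plan partition counts around expected throughput growth; moderate over-provisioning up front is usually easier than increasing later.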
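The effect on key placement can be illustrated with a short Python sketch. It uses CRC32 as a stand-in hash purely for illustration; Kafka's default partitioner uses murmur2, so the actual partition numbers will differ, but the modulo behaviour is the same. The key names and partition counts are made up.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Hash-mod placement. CRC32 is a stand-in here; Kafka's default partitioner uses murmur2."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


keys = [f"customer-{i}" for i in range(1000)]

# A key keeps its partition only if hash % 12 happens to equal hash % 18.
moved = [k for k in keys if partition_for(k, 12) != partition_for(k, 18)]

print(f"{len(moved)} of {len(keys)} keys map to a different partition after growing 12 -> 18")
```

Any consumer that relies on all records for a key arriving in one partition would see that key's history split across two partitions after such a change.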

Hardware Configuration

CPU

  • 8-16 cores per broker (production)
  • Kafka is not CPU-intensive normally
  • More cores needed for compression/encryption
  • Prefer higher clock speed over more cores

Memory

  • 32-64 GB RAM minimum for production
  • 6-8 GB for JVM heap (don't go higher)
  • Rest used for OS page cache
  • Page cache is critical for performance

Storage

  • SSDs strongly recommended (NVMe preferred)
  • Multiple disks with JBOD (no RAID)
  • Size: retention period × write throughput × replication factor, plus headroom
  • XFS filesystem recommended
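The sizing rule above can be worked through as a small Python calculation. This is a sketch under stated assumptions: the function name and workload numbers are hypothetical, "30% headroom" is read as keeping 30% of the provisioned capacity free, and decimal units are used (1 TB = 1,000,000 MB). It ignores compression and index file overhead.

```python
def required_storage_tb(write_mb_s: float,
                        retention_days: float,
                        replication_factor: int,
                        headroom: float = 0.30) -> float:
    """Cluster-wide disk needed for one retention window, with free-space headroom."""
    seconds = retention_days * 24 * 60 * 60
    logical_tb = write_mb_s * seconds / 1_000_000   # data written once
    replicated_tb = logical_tb * replication_factor  # every replica stores a full copy
    return replicated_tb / (1 - headroom)            # keep `headroom` of capacity free


# Hypothetical workload: 50 MB/s sustained writes, 7 days retention, RF 3, 6 brokers.
total_tb = required_storage_tb(50, 7, 3)
per_broker_tb = total_tb / 6
print(f"cluster: {total_tb:.1f} TB, per broker: {per_broker_tb:.1f} TB")
```

For this hypothetical workload the estimate comes to roughly 130 TB across the cluster, or about 22 TB per broker when spread over six brokers.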

Network

  • 10 Gbps minimum for production
  • Low latency between brokers (<1 ms)
  • Separate network for replication (optional)
  • Consider cross-AZ bandwidth costs

JVM Configuration

Recommended JVM Settings

# KAFKA_HEAP_OPTS - fixed heap size (min = max) so the JVM never resizes it
export KAFKA_HEAP_OPTS="-Xms6g -Xmx6g"

# KAFKA_JVM_PERFORMANCE_OPTS - G1 collector tuned for short pauses
export KAFKA_JVM_PERFORMANCE_OPTS="-server \
  -XX:+UseG1GC \
  -XX:MaxGCPauseMillis=20 \
  -XX:InitiatingHeapOccupancyPercent=35 \
  -XX:+ExplicitGCInvokesConcurrent \
  -XX:+ParallelRefProcEnabled \
  -Djava.awt.headless=true"

Keep heap size at 6-8GB. Larger heaps lead to longer GC pauses. Kafka relies heavily on OS page cache, not JVM heap.

Pro Tip: Monitor GC Metrics

Enable GC logging with -Xlog:gc*:file=/var/log/kafka/gc.log:time,tags:filecount=10,filesize=100M and monitor for pause times exceeding 100ms. Consider ZGC for Java 17+ deployments.
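As a rough illustration of the 100ms check, the Python sketch below scans GC log lines for pause entries above a threshold. The sample lines are made up to resemble JVM unified logging output with the time and tags decorators; the exact layout varies by JVM version and collector, so treat the regular expression as an assumption to adapt against your own log.

```python
import re

# Matches lines such as "[<timestamp>][gc] GC(42) Pause Young (...) 4100M->1400M(6144M) 131.250ms"
PAUSE_LINE = re.compile(
    r"^\[(?P<time>[^\]]+)\].*?\b(?P<event>Pause .*?)\s(?P<ms>\d+(?:\.\d+)?)ms\s*$"
)


def long_pauses(lines, threshold_ms=100.0):
    """Return (timestamp, event, pause_ms) for every pause longer than threshold_ms."""
    hits = []
    for line in lines:
        match = PAUSE_LINE.match(line)
        if match and float(match.group("ms")) > threshold_ms:
            hits.append((match.group("time"), match.group("event"), float(match.group("ms"))))
    return hits


# Illustrative sample lines, not taken from a real broker.
sample = [
    "[2026-01-10T09:15:02.114+0000][gc] GC(41) Pause Young (Normal) (G1 Evacuation Pause) 3512M->1210M(6144M) 18.402ms",
    "[2026-01-10T09:15:40.870+0000][gc] GC(42) Pause Young (Concurrent Start) (G1 Humongous Allocation) 4100M->1400M(6144M) 131.250ms",
    "[2026-01-10T09:15:41.002+0000][gc,marking] GC(43) Concurrent Mark 95.113ms",
    "[2026-01-10T09:16:20.333+0000][gc] GC(44) Pause Remark 2100M->2050M(6144M) 7.905ms",
]

for timestamp, event, pause_ms in long_pauses(sample):
    print(f"{timestamp}  {pause_ms:.1f} ms  {event}")
```

Concurrent phases such as the "Concurrent Mark" line are skipped on purpose: they run alongside the application and are not stop-the-world pauses.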

Kafka Cluster Setup Checklist

Minimum 3 brokers for production

Enables replication factor of 3 with fault tolerance.

Brokers in different failure domains

Use rack awareness to spread replicas across AZs or racks.

Configure min.insync.replicas=2

Prevents data loss when combined with acks=all producers.

Set unclean.leader.election.enable=false

Prevents data loss from out-of-sync replicas becoming leader.

Set up monitoring and alerting

Monitor broker health, under-replicated partitions, and disk usage.

Configure authentication and authorization

Enable SASL/SSL and ACLs for secure operations.

Plan storage capacity with 30% headroom

Account for retention, replication, and traffic growth.

Document runbooks for common operations

Broker restarts, partition reassignment, and emergency procedures.

Monitor Your Kafka Cluster with KLogic

KLogic provides comprehensive monitoring for your Kafka cluster configuration, helping you validate settings, detect configuration drift, and optimize performance.

Cluster health visualization
Configuration change tracking
Performance optimization recommendations
Partition balance monitoring