vs Apache Kafka

Apache Kafka is a distributed commit-log cluster built for horizontal scale, replicated HA, and exactly-once processing. topics is a single-machine server built for operational simplicity and an explicit in-band loss signal. The one-line framing: SQLite is to a database cluster what topics is to Kafka — much the same job, done with the operational footprint of one process and a directory instead of a fleet to run. They solve different scale problems; this page compares them dimension by dimension and is honest about where Kafka is the right tool.

The word “tombstone” means the opposite thing in each system. In Kafka a tombstone is a null-value record on a compacted topic that tells compaction to drop a key. In topics a tombstone is a consumer-facing “you missed records [gap_from, gap_to]” signal. See Tombstones — opposite meanings below; do not confuse them.

Partitions & offsets

Kafka organizes data as topics divided into partitions. Each partition is an ordered, immutable log with its own 64-bit offset, and ordering is guaranteed only within a partition — across partitions there is no global order. Consumer groups commit offsets to track progress, and partition count is the unit of both parallelism and ordering.

topics has no partitions. Each topic is a single ordered log with a monotonic $seq (a u64), and that $seq is the cursor: you read with POST /v0/topics/:topic/diff from a from_seq and advance your stored cursor to ack — there is no separate offset-commit protocol. Ordering is total within a topic. The trade is direct: Kafka buys parallelism and capacity by adding partitions (at the cost of cross-partition ordering); topics gives you total per-topic order but scales only vertically.
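The cursor model can be sketched in a few lines. Only the endpoint path and the from_seq field come from the description above; the response shape (a records list whose items carry $seq), the convention that from_seq is inclusive, and both helper names are assumptions made for illustration.

```python
import json
from typing import Any, Dict, Tuple


def build_diff_request(base_url: str, topic: str, from_seq: int) -> Tuple[str, bytes]:
    """Return (url, body) for a diff read.

    The path is taken from the docs; the JSON body shape is an assumption.
    """
    url = f"{base_url}/v0/topics/{topic}/diff"
    body = json.dumps({"from_seq": from_seq}).encode("utf-8")
    return url, body


def advance_cursor(cursor: int, response: Dict[str, Any]) -> int:
    """Compute the next from_seq after a response has been processed.

    Assumes from_seq is inclusive, so the next read starts one past the
    highest $seq seen. An empty response leaves the cursor unchanged.
    """
    records = response.get("records", [])
    if not records:
        return cursor
    return max(record["$seq"] for record in records) + 1
```

Persisting the returned cursor only after the batch has been fully processed is what plays the role of an ack here: a crash before the cursor is stored means the same records are read again, not skipped.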

Durability: replication vs per-topic fsync

Kafka’s durability comes from replication, not per-write disk sync. A topic has a replication.factor; producers choose acks=0|1|all; and min.insync.replicas sets how many replicas must hold a record before an acks=all write is acknowledged. Kafka leans on the OS page cache plus replication rather than fsync-ing every write. Durability is a function of how many machines hold the data, so with a replication factor above 1 and acks=all, a single broker can fail without losing acknowledged data.
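The interaction between replication.factor and min.insync.replicas is simple arithmetic, sketched below. The function names are illustrative only, and the model ignores leader election and unclean-leader settings.

```python
def acks_all_write_accepted(in_sync_replicas: int, min_insync_replicas: int) -> bool:
    """An acks=all write is accepted only while the in-sync replica set
    is at least min.insync.replicas; otherwise the broker rejects it."""
    return in_sync_replicas >= min_insync_replicas


def broker_failures_tolerated(replication_factor: int, min_insync_replicas: int) -> int:
    """How many replicas can be offline while acks=all writes still succeed."""
    if not 1 <= min_insync_replicas <= replication_factor:
        raise ValueError("min.insync.replicas must be between 1 and replication.factor")
    return replication_factor - min_insync_replicas
```

With the common pairing of replication.factor=3 and min.insync.replicas=2, one broker can be down and acks=all producers keep writing; a second failure makes those writes fail rather than silently weakening durability.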

topics’ durability is per-topic and single-machine, expressed as a durability class:

  • ephemeral — resident-only records: queryable while the process runs, durable config, monotonic seqs, and intentionally empty after restart.
  • memory — same group-committed WAL path as disk, but best-effort: not fsync-gated and no durability guarantee (records may survive or be lost on restart).
  • disk — WAL with adaptive group commit; ack on enqueue, survives a crash minus the un-fsynced tail.
  • fsync — ack is fsync-gated; an acknowledged write is committed and recovered by WAL replay after any crash.
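One way to read the four classes is to ask which acknowledged records are guaranteed to come back after a process crash. The sketch below is a toy model of that reading, not the server's recovery code; the fsynced_upto parameter (the highest seq known to have been fsynced) is a hypothetical input introduced for illustration.

```python
from typing import List


def guaranteed_after_crash(durability: str, acked: List[int], fsynced_upto: int) -> List[int]:
    """Return the acked seqs that are guaranteed to be recovered after a crash."""
    if durability == "ephemeral":
        return []  # intentionally empty after restart
    if durability == "memory":
        return []  # best-effort: records may survive, but nothing is guaranteed
    if durability == "disk":
        # ack happens on enqueue, so the un-fsynced tail can be lost
        return [seq for seq in acked if seq <= fsynced_upto]
    if durability == "fsync":
        return list(acked)  # the ack is fsync-gated, so every acked seq is committed
    raise ValueError(f"unknown durability class: {durability}")
```

The difference between disk and fsync shows up only in the tail: with acked seqs 1 to 5 and an fsync watermark of 3, the disk class guarantees 1 to 3 while the fsync class could not have acknowledged 4 and 5 without syncing them first.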

This is the headline trade in Kafka’s favor: Kafka’s replicated durability tolerates the loss of a whole machine; a topics fsync write tolerates a crash of its host machine but not the machine’s destruction. topics has no replication, no HA, no failover — one NVMe-backed machine is the entire durability and failure domain. If you need machine-loss tolerance, use Kafka.

Tombstones: opposite meanings

Both systems use the word “tombstone” — for opposite purposes. Getting this wrong leads to real confusion, so it is worth stating plainly.

| | Kafka tombstone | topics tombstone |
| --- | --- | --- |
| What it is | A record with a null value on a log-compacted topic | An in-band marker in a diff / SSE response |
| Who it is for | The compaction process | The consumer |
| What it means | “Drop this key from the compacted log” (data removal) | “You missed records [gap_from, gap_to]” (loss notification) |
| Effect | Eventually deletes the latest value for that key | Notifies; the data is already gone |

In Kafka, log compaction retains the latest value per key and uses null-value records to instruct removal of a key; it is a retention/cleanup mechanism. Kafka has no in-band gap-detection signal: when a consumer’s committed offset falls outside the retained range, its behavior is governed by auto.offset.reset (jump to earliest or latest, or raise an error to the application when set to none), and the fact that records were skipped is not surfaced as data.

In topics, a tombstone is exactly that gap-detection signal. When involuntary cap eviction or TTL expiry destroys records below a reader’s cursor, the next read returns tombstone: { gap_from, gap_to, reason } at HTTP 200. It is a notification to the consumer, not a retention directive.
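A consumer can check for the tombstone before touching any records. The field names tombstone, gap_from, gap_to, and reason are from the text above; treating the range as inclusive on both ends (as the bracket notation suggests) and the helper names are assumptions.

```python
from typing import Any, Dict, Optional, Tuple


def read_gap(response: Dict[str, Any]) -> Optional[Tuple[int, int, str]]:
    """Return (gap_from, gap_to, reason) if the response carries a tombstone."""
    tombstone = response.get("tombstone")
    if not tombstone:
        return None
    return tombstone["gap_from"], tombstone["gap_to"], tombstone["reason"]


def missed_count(gap_from: int, gap_to: int) -> int:
    """Number of records in the gap, assuming both bounds are inclusive."""
    return gap_to - gap_from + 1
```

Because the tombstone arrives at HTTP 200, it will not trip ordinary error handling; the consumer has to inspect the body and decide whether a gap means alerting, resyncing from another source, or carrying on.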

Gap detection

Following from the above: topics’ core differentiator over Kafka is the explicit, delivery-time gap signal. Kafka leaves “did I miss anything?” to offset arithmetic and lag metrics; a consumer that lagged past retention silently resumes from wherever auto.offset.reset points. topics emits the exact missed range in band, and distinguishes involuntary loss (tombstoned) from voluntary removal (silently filtered) so the signal stays trustworthy.
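On the Kafka side, the "offset arithmetic" has to be done by the application. A minimal sketch, assuming the committed offset and the partition's earliest retained offset have already been fetched from the client library (the function name is illustrative):

```python
def records_skipped(committed_offset: int, log_start_offset: int) -> int:
    """Records a consumer can no longer read on one partition.

    A committed offset is the next offset the consumer would read, so
    offsets in [committed_offset, log_start_offset) are the ones that
    aged out of retention before they were consumed.
    """
    return max(0, log_start_offset - committed_offset)
```

The check has to run per partition and before auto.offset.reset takes effect; once the consumer has repositioned itself, the original committed offset is the only evidence that anything was skipped.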

Delivery & exactly-once

Kafka is at-least-once by default and supports exactly-once semantics via the idempotent producer, transactions, and read_committed consumers — a genuine, production-grade capability for end-to-end exactly-once pipelines.

topics has no transactions and no Kafka-style end-to-end exactly-once pipeline. It provides idempotency on the write path, node loop-prevention, and an opt-in router guarantee:"exactly_once" that suppresses duplicate derived destination appends by stable router idempotency key. Consumers still need idempotency for external side effects. If your pipeline requires transactional exactly-once processing across stages, Kafka offers it and topics does not.
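Consumer-side idempotency for external side effects usually comes down to remembering which keys have already been applied. The sketch below keeps that memory in process, which is enough to show the idea; a real consumer would need to persist the seen set together with the side effect. The class and its names are illustrative and not part of topics.

```python
from typing import Callable, Hashable, Set


class IdempotentHandler:
    """Apply a side effect at most once per key."""

    def __init__(self, effect: Callable[[dict], None]) -> None:
        self._effect = effect
        self._seen: Set[Hashable] = set()

    def handle(self, key: Hashable, record: dict) -> bool:
        """Run the effect unless this key was already handled.

        Returns True if the effect ran, False if the record was a duplicate.
        """
        if key in self._seen:
            return False
        self._effect(record)
        # Mark as seen only after the effect succeeds, so a failed
        # effect is retried on redelivery instead of being dropped.
        self._seen.add(key)
        return True
```

A natural key is the pair of topic name and $seq, since a seq is unique within its topic.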

Queues & share groups

Kafka’s classic consumer groups distribute partitions across consumers, which means parallelism is capped at the partition count and there is no per-message acknowledgment — the unit of progress is the offset, not the individual message. KIP-932 share groups add per-message ack and redelivery (queue-like semantics), but as of Kafka 4.1 they remain a non-production preview.

topics ships production lease queues today: claim returns individual jobs with a lease_id and deadline; ack permanently removes a job (the ack is the delete); nack and extend control redelivery and visibility; and parallelism is bounded by your workers, not by a partition count. A built-in dead-letter move relocates a job to a configured topic after max_deliveries rather than redelivering it forever.
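The lease lifecycle described above can be illustrated with a small in-memory model. This is a toy under stated assumptions, not the server's implementation or its HTTP API: the class, method signatures, lease_id format, and the explicit now argument are all hypothetical, and extend is omitted for brevity.

```python
import itertools
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Job:
    job_id: int
    payload: str
    deliveries: int = 0
    lease_id: Optional[str] = None
    deadline: float = 0.0


class LeaseQueueModel:
    """Toy model of claim / ack / nack with a dead-letter move."""

    def __init__(self, max_deliveries: int, lease_seconds: float) -> None:
        self.max_deliveries = max_deliveries
        self.lease_seconds = lease_seconds
        self.jobs: Dict[int, Job] = {}
        self.dead_letter: List[Job] = []
        self._ids = itertools.count(1)

    def enqueue(self, payload: str) -> int:
        job = Job(next(self._ids), payload)
        self.jobs[job.job_id] = job
        return job.job_id

    def claim(self, now: float) -> Optional[Job]:
        """Lease the first job that is not under an unexpired lease."""
        for job in list(self.jobs.values()):
            if job.lease_id is not None and job.deadline > now:
                continue  # still leased by another worker
            if job.deliveries >= self.max_deliveries:
                # Dead-letter move instead of redelivering forever.
                del self.jobs[job.job_id]
                job.lease_id = None
                self.dead_letter.append(job)
                continue
            job.deliveries += 1
            job.lease_id = f"lease-{job.job_id}-{job.deliveries}"
            job.deadline = now + self.lease_seconds
            return job
        return None

    def ack(self, lease_id: str) -> bool:
        """The ack is the delete: the job is removed permanently."""
        for job in list(self.jobs.values()):
            if job.lease_id == lease_id:
                del self.jobs[job.job_id]
                return True
        return False

    def nack(self, lease_id: str) -> bool:
        """Release the lease so the job is immediately claimable again."""
        for job in self.jobs.values():
            if job.lease_id == lease_id:
                job.lease_id = None
                return True
        return False
```

Nothing in the model refers to a partition: any number of workers can call claim concurrently, and each one receives a distinct job under its own lease.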

Ops footprint

Kafka is a multi-node JVM cluster. KRaft, the only metadata mode since Kafka 4.0, removes the ZooKeeper dependency by moving metadata into Kafka itself. That simplifies operations, but it is still a cluster of JVM brokers to provision, balance, and monitor.

topics is one self-contained server and one data directory. No JVM, no cluster bootstrap, no quorum, no partition rebalancing. You run it, point it at a data dir, and talk to it over HTTP. Graceful shutdown writes a final snapshot; a readiness probe (/v0/ready) gates traffic during WAL replay. The simplicity is the point — and the cost is that there is no cluster to scale or fail over to.
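A deploy script or load balancer can gate traffic on the readiness probe. The sketch below polls until the probe reports ready; the /v0/ready path is from the text above, while treating HTTP 200 as "ready" and any other status or a connection error as "not yet" is an assumption, as are the function names.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_probe(url: str) -> int:
    """Return the HTTP status of a GET to url (for example .../v0/ready)."""
    try:
        with urllib.request.urlopen(url, timeout=2) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # non-2xx responses still carry a status code


def wait_until_ready(
    probe: Callable[[], int],
    attempts: int = 30,
    delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll probe() until it returns 200 or attempts run out."""
    for attempt in range(attempts):
        try:
            if probe() == 200:
                return True
        except OSError:
            pass  # connection refused or timed out: server not up yet
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

A caller would pass something like `lambda: http_probe(base_url + "/v0/ready")`, where base_url is wherever the server listens; injecting the probe and sleep functions keeps the loop testable without a network.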

Scale

Kafka scales horizontally to very high volumes: add brokers and partitions, and large clusters sustain 15M+ messages/sec. topics scales vertically only. Its ~1M events/sec figure is an aggregate design target reached by batching and sharding across many topics and connections on one NVMe-backed machine, and a single HTTP origin tops out around 0.5M records/sec on disk-class topics. See Performance for measured numbers. If you need throughput beyond one machine, Kafka is built for it and topics is not.

Ecosystem

Kafka’s ecosystem is enormous: Kafka Connect (source/sink connectors), Kafka Streams and ksqlDB (stream processing), Schema Registry, MirrorMaker 2 (cross-cluster replication), pervasive security (TLS, SASL, per-topic ACLs, quotas), and mature managed services (Confluent Cloud, Amazon MSK).

topics has no comparable ecosystem and no managed offering — you operate, back up, and secure it yourself, behind a TLS-terminating reverse proxy. What it offers instead is built in: routers with node loop-prevention for fan-out, multiplexed SSE, and lease queues, all on one substrate reachable with curl.

Verdict

topics wins when

  • You want plain HTTP/JSON, no JVM and no cluster, with curl as a client.
  • You need an explicit, in-band gap signal at delivery time, not silent offset resets.
  • You want production lease queues today (per-message ack, dead-letter move) without KIP-932 preview caveats.
  • You need targeted, permanent deletes by seq range or tag, not just key compaction.
  • You want per-topic durability classes and built-in routers with loop-prevention on one substrate.

Kafka wins when

  • You need horizontal scale and throughput beyond a single machine.
  • You need replicated HA that tolerates the loss of whole brokers.
  • You need exactly-once processing via transactions and read_committed.
  • You want partition-level parallelism and the deep ecosystem (Connect, Streams, ksqlDB, Schema Registry).
  • You want a managed service (Confluent Cloud, MSK) and log compaction (latest value per key).
