Durability Classes

Durability in topics is a per-topic decision, not a server-wide mode. Each topic picks one of four commit classes (ephemeral, memory, disk, or fsync), and they form a weak → strong spectrum that trades write latency against crash survival.

Where “ok” lands

Every write either stays resident-only in RAM or travels the disk path — from your process, to the kernel, to the physical disk. The classes differ in where the record lands and how far topics waits before it acks:

  • ephemeral publishes resident-only records and skips the WAL / segment path. The topic config persists, records are intentionally gone after restart, and seqs remain monotonic.
  • memory uses the WAL path but makes no durability promise. Fastest disk-like class; after a restart its records may survive or be gone.
  • disk acks once the write is in the group-committed write-ahead log and on its way to the kernel. It survives a process crash, minus the most recent un-fsynced tail. The default.
  • fsync waits for the data to be fsynced to stable storage before acking, so it survives any crash, including power loss.

Pick the weakest class a topic can tolerate and it pays the least latency; reach for fsync only where an acknowledgment must be a promise you can never take back. Because the choice is per topic, a throwaway cache (memory), a pub/sub feed (disk), and a financial ledger (fsync) coexist with a RAM-only live feed (ephemeral) in one process without taxing each other.

The class is resolved from the topic’s current config (durability_class()) and reported on every topic-state and topic-create response. The topic type is immutable, but durability/config can be updated in place — the resolved class always reflects the current config.

The four classes

The defining promise across all four: an acked write is published under the class you selected, and topics never claims a stronger crash guarantee than that class provides. A write that fails to commit publishes nothing. What differs between classes is where the record lands, when the ack fires, and what a crash costs.

ephemeral

Resident-only records. A durability:"ephemeral" topic publishes from RAM and intentionally skips the record WAL, snapshots, and HOT segments. It is fully queryable while the process is running, including getState, getDifference, SSE, and router destinations, but records are empty after restart by design.

The topic config always persists as a control frame, and checkpoints preserve the published head without payloads. That means post-restart writes keep moving forward and do not reuse seqs, even though the old resident records are gone. The ack is never fsync-gated, so fsync_ms is 0.
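The seq behaviour described above can be pictured with a toy in-memory model. This is only an illustrative sketch, not the engine's implementation: `EphemeralTopicModel`, `checkpoint`, and `restart` are hypothetical names, and the model assumes the head was checkpointed before the restart.

```python
from dataclasses import dataclass, field


@dataclass
class EphemeralTopicModel:
    """Toy model: payloads live only in RAM, the published head is checkpointed."""

    records: dict[int, bytes] = field(default_factory=dict)  # resident-only
    head: int = 0               # last published seq
    checkpointed_head: int = 0  # survives restart; carries no payloads

    def publish(self, payload: bytes) -> int:
        self.head += 1
        self.records[self.head] = payload
        return self.head

    def checkpoint(self) -> None:
        # Persist only the head, never the payloads.
        self.checkpointed_head = self.head

    def restart(self) -> "EphemeralTopicModel":
        # Records are gone by design; the head resumes from the checkpoint.
        return EphemeralTopicModel(
            head=self.checkpointed_head,
            checkpointed_head=self.checkpointed_head,
        )


topic = EphemeralTopicModel()
topic.publish(b"a")
topic.publish(b"b")
topic.checkpoint()

restarted = topic.restart()
next_seq = restarted.publish(b"c")
print(sorted(restarted.records), next_seq)  # prints: [3] 3
```

After the simulated restart the old payloads are gone, yet the next write receives seq 3 rather than reusing seq 1.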

Reach for ephemeral for RAM-only live fan-out where the wire contract and monotonic seqs matter, but replay across restart does not. It is reachable only by setting durability:"ephemeral" explicitly.

memory

Disk-like, but best-effort. A memory topic takes the same group-committed WAL write and recovery path as disk and is fully queryable (getState / getDifference / SSE), but it carries no durability guarantee. The ack is never fsync-gated, so fsync_ms is 0 (the fastest path).

After a restart its records MAY survive OR be lost — recovery is gradual / best-effort: it does not block readiness and does not guarantee completeness or emptiness. The topic config always persists (it is a control frame in the WAL). The one hard bound is no-fabrication / no-future-seq: a recovered record is always one that was actually written, and the head never hands out a seq past what was acked (a best-effort restart may legitimately regress the head if the un-fsynced tail was lost — there is no durable seq reservation).

Effectively disk minus the durability promise. Reach for memory for caches and scratch state where occasional loss is an acceptable trade for the lowest disk-like latency. It is reachable only by setting durability:"memory" explicitly.

disk

Records are written to the WAL and group-committed — no per-write fsync — so fsync_ms is 0 (the fast path). The write is acked as soon as its frame is enqueued to its topic’s WAL-shard writer (the WAL is sharded); the ack is not fsync-gated. The shard writer then group-commits and fdatasyncs the batch shortly after.

A crash loses the un-fsynced tail — the frames that were enqueued but not yet group-fsynced. Everything older is recovered by WAL replay. This is the default and the pub/sub workhorse: durable enough that a clean shutdown loses nothing, fast enough that the common feed isn’t paying for a guarantee it doesn’t want. disk is today’s durable:false.

fsync

The ack is fsync-gated: it is held until the WAL frame is durably synced, so the response carries a real fsync_ms. The write survives any crash — an acked write is always recovered by WAL replay.
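The difference in ack timing between disk and fsync can be sketched with a toy WAL shard. This is a simplified model under stated assumptions, not the real writer: `WalShardModel` and its methods are hypothetical, the model is synchronous, and it takes the worst case in which a crash loses the entire un-synced tail.

```python
class WalShardModel:
    """Toy WAL shard: frames are enqueued, then group-committed in batches."""

    def __init__(self) -> None:
        self.frames: list[tuple[str, str]] = []  # (class, payload) in enqueue order
        self.synced = 0                # frames[:synced] are on stable storage
        self.acked: list[str] = []     # payloads whose ack has been sent
        self.waiting: list[str] = []   # fsync-class payloads awaiting the sync

    def write(self, cls: str, payload: str) -> None:
        self.frames.append((cls, payload))
        if cls == "fsync":
            self.waiting.append(payload)  # ack is held until the group fsync
        else:
            self.acked.append(payload)    # disk: acked on enqueue

    def group_commit(self) -> None:
        # One sync covers every frame enqueued so far.
        self.synced = len(self.frames)
        self.acked.extend(self.waiting)
        self.waiting.clear()

    def crash_and_replay(self) -> list[str]:
        # Worst case: only the synced prefix is replayable.
        return [payload for _, payload in self.frames[: self.synced]]


shard = WalShardModel()
shard.write("disk", "d1")
shard.write("fsync", "f1")
shard.group_commit()
shard.write("disk", "d2")   # acked immediately, not yet synced
shard.write("fsync", "f2")  # not acked: still waiting for the next sync

acked = list(shard.acked)
recovered = shard.crash_and_replay()
lost_acked = [p for p in acked if p not in recovered]
print(acked, recovered, lost_acked)
```

In this run the only acked write that the crash costs is `d2`, a disk write in the un-synced tail. `f2` is also gone, but it was never acked, so no fsync promise was broken.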

This is the class for job queues, financial events, and anything where an acknowledgment is a promise you can never take back. It is today’s durable:true. The per-write cost is the disk’s fsync latency, but adaptive group commit amortizes one fsync across a whole batch of concurrent durable writers, so the per-event cost approaches a sequential disk append rather than one fsync per record. See WAL & group commit.

fsync durability is bounded by the disk’s fdatasync latency. On a laptop’s APFS NVMe that floor is roughly 5 ms; server-grade NVMe is typically 50–500 µs (roughly 10× to 100× faster). This is a hardware property, not a design cost: group commit hides it under concurrency, but a single serial durable writer pays the floor per write. See Performance.
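A back-of-the-envelope calculation shows what the floor means in practice. The sketch below is idealized arithmetic, not a benchmark: the function names are hypothetical, the batch size of 50 is an assumed figure chosen for illustration, and it ignores everything except the sync latency itself.

```python
def serial_writes_per_second(fsync_latency_ms: float) -> float:
    """A single serial durable writer pays the full floor on every write."""
    return 1000.0 / fsync_latency_ms


def amortized_cost_ms(fsync_latency_ms: float, batch_size: int) -> float:
    """Idealized group commit: one fsync shared equally by the whole batch."""
    return fsync_latency_ms / batch_size


laptop = serial_writes_per_second(5.0)   # 5 ms floor
server = serial_writes_per_second(0.5)   # 500 µs floor
batched = amortized_cost_ms(5.0, 50)     # 50 concurrent writers share one sync

print(laptop, server, batched)
```

Under these assumptions a serial writer manages about 200 writes per second at a 5 ms floor and about 2,000 at 500 µs, while 50 concurrent writers sharing one 5 ms sync pay roughly 0.1 ms each.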

Comparison

| Class | WAL? | Ack fires | Survives a crash? | fsync_ms |
| --- | --- | --- | --- | --- |
| ephemeral | records: no; config/control frames: yes | immediately, not fsync-gated | records: no; config persists and seqs do not reuse after checkpointed heads | 0 |
| memory | yes, group-committed (same path as disk) | immediately, not fsync-gated | best-effort: records MAY survive OR be lost (no guarantee); config always persists | 0 |
| disk | yes, group-committed (no per-write fsync) | on WAL-frame enqueue, not fsync-gated | yes, minus the un-fsynced tail | 0 |
| fsync | yes, fsync-gated | after the group fsync | yes, any crash: an acked write is always recovered | real value |

Choosing and configuring a class

Set the class with the durability field at create time:

# An fsync-gated ledger: every ack is a promise the write survives power loss.
curl -X PUT $TOPICS/v0/topics/payments \
  -H 'content-type: application/json' \
  -d '{ "durability": "fsync", "cap_records": 0, "ttl_ms": 0 }'

{
  "topic": "payments",
  "created": true,
  "config": {
    "ttl_ms": 0,
    "cap_records": 0,
    "cap_bytes": 0,
    "discard": "old",
    "durable": true,
    "durability": "fsync",
    "auto_create": true,
    "idempotency_window_ms": 120000,
    "dedupe_node": true
  },
  "performance": { "server_total_ms": 0.22 }
}
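For callers that script topic creation, the same request can be assembled programmatically. The sketch below only builds the request and parses a response shaped like the sample; it sends nothing. The helper names and the base URL `http://localhost:8080` are assumptions made for illustration, not part of the documented API.

```python
import json

VALID_CLASSES = ("ephemeral", "memory", "disk", "fsync")


def build_create_request(base_url: str, topic: str, durability: str) -> dict:
    """Build (but do not send) the PUT request that creates a topic."""
    if durability not in VALID_CLASSES:
        raise ValueError(f"unknown durability class: {durability!r}")
    body = {"durability": durability, "cap_records": 0, "ttl_ms": 0}
    return {
        "method": "PUT",
        "url": f"{base_url}/v0/topics/{topic}",
        "headers": {"content-type": "application/json"},
        "body": json.dumps(body),
    }


def resolved_class(response_text: str) -> str:
    """Read the resolved class from a create response shaped like the sample."""
    return json.loads(response_text)["config"]["durability"]


# Hypothetical base URL; substitute the value of $TOPICS in a real deployment.
request = build_create_request("http://localhost:8080", "payments", "fsync")
sample = '{"topic": "payments", "created": true, "config": {"durable": true, "durability": "fsync"}}'
print(request["method"], request["url"], resolved_class(sample))
```

Validating the class name on the client side is a design choice of this sketch; it catches a typo such as `"fysnc"` before any request is made.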

The class defaults to disk when neither durability nor durable is set, so an auto-created topic (one materialized lazily on first write) is disk. To get ephemeral or memory, you must say so explicitly.

The durable bool

durable is a shorthand alias for the common disk / fsync cases:

  • durable: true → fsync
  • durable: false → disk
  • ephemeral is reachable only via durability:"ephemeral"
  • memory is reachable only via durability:"memory"

An explicit durability always wins over durable. On the way out, durable is normalized to durable == (durability == "fsync"), so the boolean reports whether the topic is fsync-gated regardless of how the topic was configured. Internally, is_durable() is simply class == "fsync".
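The precedence rules above fit in a few lines. This is a minimal sketch of the documented behaviour, not the server's code: `resolve_class` and `normalized_durable` are hypothetical helper names.

```python
from typing import Optional

VALID_CLASSES = ("ephemeral", "memory", "disk", "fsync")


def resolve_class(durability: Optional[str] = None,
                  durable: Optional[bool] = None) -> str:
    """Resolve the commit class from the two config fields."""
    if durability is not None:
        if durability not in VALID_CLASSES:
            raise ValueError(f"unknown durability class: {durability!r}")
        return durability                      # explicit durability always wins
    if durable is not None:
        return "fsync" if durable else "disk"  # the boolean alias
    return "disk"                              # default when neither is set


def normalized_durable(cls: str) -> bool:
    """The outgoing boolean: true only for fsync-gated topics."""
    return cls == "fsync"


for args in ({}, {"durable": True}, {"durable": False},
             {"durability": "memory", "durable": True}):
    cls = resolve_class(**args)
    print(args, "->", cls, normalized_durable(cls))
```

The last case shows the normalization: a topic configured with both `durability: "memory"` and `durable: true` resolves to memory, so the reported boolean is false.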

Durability only governs persistence across a crash or restart. It is independent of retention: ttl_ms and the cap_records/cap_bytes caps still apply to every class, and involuntary eviction or expiry of live records always surfaces as a tombstone — even on ephemeral and memory topics, for as long as those topics live.

Routers and dead-letter honor the destination class

Router forwarding is async (off the source write/ack path) and derived — the forwarded copy is not separately WAL-logged, so one source append is one WAL write regardless of fan-out, and copies are re-derived on recovery by replaying from a durable per-router cursor. The destination topic’s commit class governs how/whether that re-derived copy (and a dead-lettered job) is retained and recovered — not the source’s:

  • An fsync destination retains/recovers the copy under fsync semantics.
  • A disk destination retains it group-committed, recovered minus the un-fsynced tail.
  • A memory destination keeps a best-effort copy — may survive or be lost on a restart (the dest config always persists), exactly like a direct write to that topic.
  • An ephemeral destination keeps the copy resident-only while the process is running and loses it on restart by design.

So if you route an fsync source into a memory or ephemeral destination, the forwarded copies follow that weaker destination contract while the source recovers in full. The durability of the copy follows where it lands, not where it came from. Size and class your destinations deliberately. See Routers and the Multi-master guide.

Router forwarding is at-least-once via the durable per-router cursor. A persistently failing or full discard:"reject" destination is held as backpressure (the record stays available in source, the cursor does not advance), so a stuck destination lags until it recovers. Queue lease durability is best-effort (leases_durable defaults to false); a transient WAL error on a lease append degrades to the queue’s baseline at-least-once rather than losing or duplicating work. See the maturity notes in the Introduction.
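The cursor and backpressure behaviour can be illustrated with a small synchronous model. It is a sketch under simplifying assumptions: real forwarding is asynchronous and the cursor is durable, whereas here `RejectingDestination` and `forward` are hypothetical names operating on plain lists.

```python
class RejectingDestination:
    """Hypothetical destination with a record cap and discard:"reject" behaviour."""

    def __init__(self, cap: int) -> None:
        self.cap = cap
        self.records: list[str] = []

    def append(self, record: str) -> bool:
        if len(self.records) >= self.cap:
            return False  # full: reject rather than evict
        self.records.append(record)
        return True


def forward(source: list[str], cursor: int, dest: RejectingDestination) -> int:
    """Forward source[cursor:] in order and return the new cursor."""
    while cursor < len(source):
        if not dest.append(source[cursor]):
            break         # backpressure: the cursor stays, the record stays in source
        cursor += 1       # advance only after a successful append
    return cursor


source = ["a", "b", "c"]
dest = RejectingDestination(cap=2)

stalled_at = forward(source, 0, dest)             # stops when the destination is full
dest.cap = 3                                      # the destination recovers capacity
caught_up_at = forward(source, stalled_at, dest)  # resumes from the held cursor
print(stalled_at, caught_up_at, dest.records)
```

Because the cursor advances only after a successful append, a crash between the append and the cursor update would replay that record on restart. That is the at-least-once property: a copy may be delivered twice, but it is not skipped.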

How a crash actually recovers

On restart the engine loads the latest snapshot, replays the WAL forward from the snapshot’s checkpoint, truncates a torn tail (any frame whose length runs past EOF or whose XXH3-64 checksum fails to verify), and reclaims orphaned segments. An acked durable write is, by construction, a complete, checksum-valid WAL frame — so it is never lost. The only data a crash can cost is what a class explicitly does not promise: an ephemeral topic’s resident records, a memory topic’s records, or a disk topic’s un-fsynced tail. Those losses surface to consumers as ordinary eviction-style gaps. See Recovery.
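Torn-tail truncation can be sketched as a scan that stops at the first frame that is incomplete or fails its checksum. Two things here are assumptions and not the engine's actual format: the frame layout (a 4-byte length plus a 4-byte checksum header) is invented for illustration, and CRC32 from the standard library stands in for XXH3-64 so the example stays self-contained.

```python
import struct
import zlib

# Hypothetical frame header: little-endian payload length, then checksum.
HEADER = struct.Struct("<II")


def encode_frame(payload: bytes) -> bytes:
    return HEADER.pack(len(payload), zlib.crc32(payload)) + payload


def scan_valid_prefix(log: bytes) -> tuple[list[bytes], int]:
    """Return the replayable payloads and the offset to truncate the log at."""
    records: list[bytes] = []
    offset = 0
    while offset + HEADER.size <= len(log):
        length, checksum = HEADER.unpack_from(log, offset)
        start = offset + HEADER.size
        end = start + length
        if end > len(log):
            break  # length runs past EOF: torn tail
        payload = log[start:end]
        if zlib.crc32(payload) != checksum:
            break  # checksum does not verify: torn tail
        records.append(payload)
        offset = end
    return records, offset


# Two complete frames followed by a frame cut short by a simulated crash.
log = encode_frame(b"one") + encode_frame(b"two") + encode_frame(b"three")[:-2]
records, valid_len = scan_valid_prefix(log)
print(records, valid_len, len(log))
```

The scan recovers the two complete frames and reports offset 22 as the truncation point, discarding the 11 bytes of the partial third frame.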

See also

  • Tombstones — how lost data (including a crash-dropped tail) appears to a consumer.
  • WAL & group commit — how disk and fsync writes are committed off-lock.
  • Recovery — snapshot load, WAL replay, and torn-tail truncation.
  • Configure a topic — the full config field table including durability.