Ordering & Cursors

Every record gets a per-topic seq: a u64 that is strictly increasing and gap-free at assignment, assigned at durable commit. The seq is the cursor — there is no opaque token. Advancing your stored cursor past a record acks it. This page covers exactly what the seq contract guarantees (and what it does not), the cursor model, the next_from_seq / caught_up semantics, and how seqs behave across restart and topic recreate.

Seq assignment

Each topic has its own u64 counter, next_seq, starting at seq_base (default 1; 0 is reserved to mean “no records”). On commit of a write of N records, the server atomically assigns next_seq … next_seq + N − 1, advances next_seq by N, and returns the seqs in write order.

Three facts hold at assignment time:

  • Strictly increasing. Each assigned seq is larger than the last. There is no secondary sort; ascending seq is the canonical order.
  • Gap-free. The assigned sequence has no holes — next_seq, next_seq+1, … with no skips.
  • Atomic per write. A single write request of N records either commits all N with contiguous seqs, or none. There is no partial append.

Assignment happens at commit, after WAL ordering — so seq order equals durable commit order equals delivery order. A record’s seq, like the record itself, never changes once assigned.
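The assignment rule can be illustrated with a minimal in-memory sketch. The class name `TopicSeqCounter` and its `commit` method are hypothetical stand-ins, not the server's API, and a lock stands in for whatever the server uses to make the assignment atomic:

```python
import threading


class TopicSeqCounter:
    """Hypothetical in-memory stand-in for one topic's next_seq counter."""

    def __init__(self, seq_base: int = 1) -> None:
        # 0 is reserved to mean "no records", so the base must be positive.
        if seq_base < 1:
            raise ValueError("seq_base must be >= 1")
        self.next_seq = seq_base
        self._lock = threading.Lock()

    def commit(self, n: int) -> list[int]:
        """Assign n contiguous seqs in write order, all or nothing."""
        if n < 1:
            raise ValueError("a write must contain at least one record")
        with self._lock:
            first = self.next_seq
            self.next_seq += n  # advance by N inside the same critical section
            return list(range(first, first + n))


counter = TopicSeqCounter()
first_write = counter.commit(3)   # seqs 1, 2, 3
second_write = counter.commit(2)  # seqs 4, 5
```

Because the range is reserved and the counter advanced under one lock, two concurrent writes can never interleave their seqs or leave a hole between them.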

Seq is rendered as a JSON number. Every integer up to 2^53 (about 9 quadrillion) is exactly representable in an IEEE-754 double, which is well beyond any single topic’s lifetime, so no string encoding is needed. Clients should still parse it as a 64-bit integer.
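As a quick check of the double-precision bound, the following sketch uses only the Python standard library. The payload is a made-up fragment, not a full server response:

```python
import json

MAX_EXACT = 2**53  # 9_007_199_254_740_992, about 9 quadrillion

# Up to 2**53, an integer survives a round trip through a double unchanged.
round_trips = int(float(MAX_EXACT)) == MAX_EXACT
# One past it does not: the nearest double is 2**53 again.
rounds_down = int(float(MAX_EXACT + 1)) == MAX_EXACT

# Python's json module parses integer literals as int, so the seq is never
# routed through a float on the client side.
payload = json.loads('{"next_from_seq": 480234}')
seq = payload["next_from_seq"]
```

In languages whose JSON parser yields doubles by default, the same bound is the reason a plain number is safe in practice.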

“Gap-free at assignment” vs “holes are normal”

This is the distinction that confuses people, so it is stated precisely. Assignment is gap-free. Visibility is not.

After eviction, TTL expiry, deletion, or node-filtering, the seqs a consumer observes in the retained window can have holes — 4097, 4098, 4101, …. The underlying counter never skips; what a reader sees does. A consumer:

  • MUST NOT assume received seqs are contiguous.
  • MAY assume received seqs are strictly increasing.
  • MAY assume any missing seq below head_seq was either lost involuntarily (cap/TTL — a tombstone fires if it crossed the cursor) or removed voluntarily (deletion or node-filtering — silently skipped).

The split between “you missed data” (involuntary → tombstone) and “data was intentionally removed for you” (voluntary → silent) is the core safety property of topics. A visibility hole is never ambiguous: it is one or the other, and the dual watermark keeps them structurally separate.
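A consumer-side check that follows these rules might look like the sketch below. The helper `check_batch` is hypothetical. It treats a non-increasing seq as a contract violation and reports holes for observability only, never as an error:

```python
def check_batch(prev_seq: int, seqs: list[int]) -> list[tuple[int, int]]:
    """Validate ordering; return visibility holes as inclusive (first, last) ranges."""
    holes = []
    last = prev_seq
    for seq in seqs:
        if seq <= last:
            # Strictly increasing is the one thing a consumer MAY rely on.
            raise ValueError(f"seq {seq} is not greater than {last}")
        if seq > last + 1:
            # A hole is normal: eviction, TTL expiry, deletion, or filtering.
            holes.append((last + 1, seq - 1))
        last = seq
    return holes


holes = check_batch(4096, [4097, 4098, 4101])
```

Here `holes` records that seqs 4099 through 4100 were not visible. Whether that was involuntary or voluntary is decided by the presence of a tombstone, not by the hole itself.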

The cursor model — seq is the cursor

A cursor is a plain seq, interpreted as an exclusive lower bound: a read returns records with $seq > from_seq.

  • from_seq = 0 means “from the beginning of what is currently retained” (earliest_seq).
  • A tail / only-new cursor is from_seq = head_seq at subscription — like Redis $. You can read this off topic state as next_seq - 1.

There is no opaque continuation token on topic reads. The client owns its cursor; on the diff path the server keeps no per-consumer state at all. You store the cursor, you pass it back, you advance it.
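The exclusive-lower-bound rule can be sketched against an in-memory list of retained seqs. The `read` function here is a hypothetical model of the semantics, not the server's read path, and it ignores filtering and tombstones:

```python
from bisect import bisect_right


def read(retained: list[int], from_seq: int, limit: int = 100) -> list[int]:
    """Return up to `limit` retained seqs strictly greater than from_seq."""
    start = bisect_right(retained, from_seq)  # index of first seq > from_seq
    return retained[start:start + limit]


retained = [4097, 4098, 4101]  # ascending, with a visibility hole
next_seq = 4102
head_seq = next_seq - 1        # tail cursor, read off topic state

from_start = read(retained, 0)         # everything currently retained
after_4097 = read(retained, 4097)      # 4097 itself is excluded
only_new = read(retained, head_seq)    # nothing yet: the cursor is at the head
```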

Cursor-advance is ack-all

The default consume model is cursor-advance = ack-all (the Kafka offset / NATS AckAll model): advancing your stored from_seq past seq N acks records 1..N. There is no per-message ack on this path — moving the cursor is the acknowledgment.

This is why a delete or a node-filter must still advance the cursor past the records it skips: otherwise a consumer reading a topic full of its own (filtered) events would loop forever, never able to ack past them.

Per-message explicit ack with leases and heartbeats is a separate primitive — it lives in the lease-based queue (claim / ack / nack / extend), layered on the same log. The plain diff cursor is deliberately the stateless-log primitive underneath it.

next_from_seq and caught_up

A diff read returns a continuation cursor and a “done for now” flag. They mean exactly:

| Field | Meaning |
| --- | --- |
| next_from_seq | Pass this back as from_seq. It equals the $seq of the last examined record. Filtered, deleted, and expired records still advance it, so skipped records are never re-scanned. |
| caught_up | true when next_from_seq == head_seq. The reliable “no more right now” signal. |
| head_seq | The log end (highest assigned seq). |
| earliest_seq | The retained floor (first currently-live seq). |
| lag | head_seq - next_from_seq: how many seqs your cursor is behind the head. Because some of those seqs may be holes, this is an upper bound on the records still to read. |

The critical rule:

caught_up — not records.length — is the “no more” signal. Because node-filtered, deleted, and TTL-expired records are omitted from records while still advancing next_from_seq, a response can have records.length == 0 (or fewer than limit) while the cursor advanced past many seqs. A consumer that loops “until the batch comes back empty” can spin or stall. Loop until caught_up is true.

A worked example: you read a topic where every record was written by your own node and you pass your node id as the filter. Every record is dropped, records is [], but next_from_seq jumps to head_seq and caught_up is true in one call. You are caught up — the empty batch was correct, not a stall.

{
  "topic": "orders",
  "records": [],
  "next_from_seq": 480234,
  "head_seq": 480234,
  "earliest_seq": 479101,
  "caught_up": true,
  "tombstone": null,
  "lag": 0,
  "performance": { "server_total_ms": 0.21, "records_scanned": 1130 }
}
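The loop rule can be made concrete with a small simulation. Everything here is hypothetical: `fake_diff` is a stand-in for a diff read, the record shape (`seq`, `node`) is invented for the example, and the fake deliberately examines at most `limit` records per call so that empty batches appear mid-stream. It does not model tombstones or recreate.

```python
from dataclasses import dataclass


@dataclass
class DiffResponse:
    records: list[dict]
    next_from_seq: int
    head_seq: int
    caught_up: bool


def fake_diff(log: list[dict], from_seq: int, exclude_node: str,
              limit: int = 2) -> DiffResponse:
    """Hypothetical diff read: examine up to `limit` records after from_seq."""
    head_seq = log[-1]["seq"] if log else 0
    examined = [r for r in log if r["seq"] > from_seq][:limit]
    visible = [r for r in examined if r["node"] != exclude_node]
    # Filtered records still advance the cursor.
    next_from_seq = examined[-1]["seq"] if examined else from_seq
    return DiffResponse(visible, next_from_seq, head_seq,
                        next_from_seq == head_seq)


def drain(log: list[dict], from_seq: int, exclude_node: str):
    """Read until caught_up; return (seen seqs, final cursor, call count)."""
    seen, calls = [], 0
    while True:
        resp = fake_diff(log, from_seq, exclude_node)
        calls += 1
        seen.extend(r["seq"] for r in resp.records)
        from_seq = resp.next_from_seq  # advancing the cursor is the ack
        if resp.caught_up:             # not: `if not resp.records`
            return seen, from_seq, calls


log = [{"seq": s, "node": n} for s, n in
       [(1, "a"), (2, "a"), (3, "a"), (4, "b"), (5, "a")]]
first = fake_diff(log, 0, "a")
seen, cursor, calls = drain(log, 0, "a")
```

The first call returns an empty batch with `caught_up` false. A loop that stopped on the empty batch would never see seq 4, while the `caught_up` loop reaches the head in three calls.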

Restart and recreate

Seqs are stable across the events that reset state, in well-defined ways.

Restart

After a restart, next_seq is recovered as max(committed seq) + 1. Records that were buffered in the WAL but not yet durably committed — possible only on non-durable (disk/memory) topics — are lost on a crash; their seqs are never reused, so seq stays monotonic across restarts. A gap left by a lost-but-acked non-durable write looks to a consumer exactly like eviction: a hole that, if it crosses the cursor below evict_floor, tombstones. See Durability Classes.

A memory topic is best-effort: it takes the same group-committed WAL + recovery path as disk, so on restart its records may survive or be lost, with no guarantee either way. It never resets to empty by contract, and head_seq never regresses below the acked head: at worst, a lost tail leaves head_seq at a lower value that is still monotonic and never rewound below the acked head. The topic config always persists. See Durability Classes.
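The recovery rule for next_seq, max(committed seq) + 1, reduces to a single expression. The function name and the empty-topic fallback to seq_base are assumptions made for this sketch:

```python
def recover_next_seq(committed_seqs: list[int], seq_base: int = 1) -> int:
    """Recompute next_seq from the seqs that survived in durable storage."""
    # Assumption: a topic with no committed records restarts at seq_base.
    return max(committed_seqs, default=seq_base - 1) + 1


after_restart = recover_next_seq([1, 2, 3, 7])  # holes below the max are irrelevant
empty_topic = recover_next_seq([])
```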

Delete + recreate (seq rewind)

If a topic is deleted and a new topic of the same name is created, the new topic restarts next_seq at seq_base. A stale consumer presenting a from_seq from the future relative to the new topic (i.e. from_seq > new head_seq) receives a tombstone with reason: "recreated", never silent corruption. The server detects the rewind via a per-topic-instance epoch (bumped on create); absent the epoch it treats from_seq > head_seq as the recreate signal. The read then proceeds from the new earliest_seq.

This is the one case where earliest_seq resets downward; over a single topic instance’s life it is otherwise monotonically non-decreasing.
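The two detection paths can be sketched as one decision function. This is a guess at the shape of the check, not the server's code: the function name is hypothetical, and in particular the idea that a read carries the epoch it last saw (`cursor_epoch`) is an assumption, since the page does not say how the epoch reaches the server.

```python
from typing import Optional


def recreate_tombstone(from_seq: int, head_seq: int,
                       cursor_epoch: Optional[int],
                       topic_epoch: int) -> Optional[dict]:
    """Return a tombstone if the cursor predates the current topic instance."""
    if cursor_epoch is not None:
        # Assumed path: compare the epoch the consumer last saw.
        stale = cursor_epoch != topic_epoch
    else:
        # Fallback: a cursor beyond the log end cannot belong to this instance.
        stale = from_seq > head_seq
    return {"reason": "recreated"} if stale else None


from_the_future = recreate_tombstone(480234, 12, None, topic_epoch=2)
at_head = recreate_tombstone(12, 12, None, topic_epoch=2)
old_epoch = recreate_tombstone(5, 12, cursor_epoch=1, topic_epoch=2)
```

A cursor equal to head_seq is an ordinary caught-up cursor, so the fallback comparison is strict. The epoch path also catches stale cursors that happen to fall inside the new topic's range, which the fallback cannot.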
