vs Cloud Queues
AWS SQS, SNS, and Kinesis and GCP Pub/Sub are managed, multi-AZ-replicated, pay-per-use services that scale far beyond one machine and are operated for you. topics is the opposite trade: one self-hosted binary that puts a log, a queue, and fan-out behind a single JSON/HTTP API on your own NVMe, at fixed cost, with an explicit, never-silent loss signal — and no managed HA, no multi-AZ, no elastic scale-out.
These services are excellent at what they do. This page is about the trade — managed multi-AZ scale versus a single self-hosted binary with data locality and a delivery-time gap signal — not about any of them being weak. For most spiky, low-volume, or zero-ops workloads, the managed service is the right call.
The four services, briefly
Each occupies a different point in the design space; topics overlaps all four because it folds log, queue, and fan-out onto one substrate.
AWS SQS — lease queue, no replay
SQS is a managed queue with a visibility-timeout lease (default 30 s, max 12 h):
ReceiveMessage hides a message for the timeout, DeleteMessage acks it. Standard
queues are at-least-once with loose ordering; FIFO queues add ordering and
exactly-once within a 5-minute dedup window. A native dead-letter queue kicks in
after maxReceiveCount. There is no replay — an acked message is deleted, and
retention is 4 days by default (up to 14). This maps closely to the topics
lease queue: claim → lease, ack → permanent delete. The difference is
that a topics queue sits on top of a retained log, so the underlying records can still be
read by cursor; SQS has no log behind the queue.
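The claim/ack mapping can be sketched as a toy in-memory model. This is not the API of SQS or of topics: the class name, method names, and the choice to have ack remove the record from the log are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class LeaseQueue:
    """Toy lease queue layered over a retained log (illustrative only)."""
    lease_seconds: float = 30.0
    next_seq: int = 0
    log: dict = field(default_factory=dict)     # seq -> payload (the retained log)
    leases: dict = field(default_factory=dict)  # seq -> lease expiry time

    def publish(self, payload):
        seq = self.next_seq
        self.log[seq] = payload
        self.next_seq += 1
        return seq

    def claim(self, now):
        """Lease the lowest-seq record that is not hidden by an unexpired lease."""
        for seq in sorted(self.log):
            if self.leases.get(seq, float("-inf")) > now:
                continue  # still leased by another consumer
            self.leases[seq] = now + self.lease_seconds
            return seq, self.log[seq]
        return None

    def ack(self, seq):
        """Assumed semantics: ack permanently deletes the record."""
        self.log.pop(seq, None)
        self.leases.pop(seq, None)

    def read_from(self, cursor):
        """Cursor read over the log; leases do not hide records here."""
        return [(s, self.log[s]) for s in sorted(self.log) if s >= cursor]
```

In this model a leased record is invisible to `claim` until its lease expires, yet `read_from` still returns it, which is the distinction the paragraph above draws between a queue with a log behind it and a queue without one.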
AWS SNS — push fan-out, no storage
SNS is stateless push fan-out: publish once, deliver to many endpoints (SQS, Lambda, HTTP, email, SMS) with optional filter policies. There is no storage and no replay — it routes and forgets. The topics analog is a router fanning one topic into several topics (each with its own durability), plus SSE watch for live push. SNS wins decisively on heterogeneous delivery: topics pushes to SSE clients and other topics, not to email/SMS/Lambda.
AWS Kinesis — sharded log, with replay
Kinesis Data Streams is the closest cloud analog to a log: a sharded ordered log with
per-shard sequence numbers, replay by sequence number or timestamp, and retention from
24 h to 365 days. Capacity is provisioned per shard (or on-demand). The shape — an ordered,
seq-addressable, replayable log — is what topics is, with two differences: Kinesis scales
horizontally by adding shards (ordering only within a shard), and topics replays by
its own monotonic seq cursor on a single topic with no sharding.
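The ordering difference can be illustrated with two toy logs. This is a simplified sketch, not either system's implementation: Kinesis maps the MD5 hash of a partition key onto per-shard hash-key ranges, whereas the model below just takes the hash modulo the shard count, and all class and method names are hypothetical.

```python
import hashlib


class ShardedLog:
    """Toy sharded log: sequence numbers are ordered only within a shard."""

    def __init__(self, shard_count):
        self.shards = [[] for _ in range(shard_count)]

    def shard_for(self, partition_key):
        # Simplification: modulo instead of hash-key ranges.
        digest = hashlib.md5(partition_key.encode("utf-8")).digest()
        return int.from_bytes(digest, "big") % len(self.shards)

    def put(self, partition_key, payload):
        shard_id = self.shard_for(partition_key)
        shard = self.shards[shard_id]
        shard.append(payload)
        return shard_id, len(shard) - 1  # (shard, per-shard sequence)

    def replay(self, shard_id, from_seq):
        return self.shards[shard_id][from_seq:]


class SingleLog:
    """Toy unsharded log: one monotonic seq orders every record."""

    def __init__(self):
        self.records = []

    def append(self, payload):
        self.records.append(payload)
        return len(self.records) - 1

    def replay(self, cursor):
        return self.records[cursor:]
```

A consumer of the sharded log needs one position per shard and gets no ordering guarantee across shards; a consumer of the single log needs one cursor, at the cost of being bounded by one machine.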
GCP Pub/Sub — lease + ordering + seek replay
Pub/Sub combines a lease and a log: an ack-deadline lease (max 600 s) with
modifyAckDeadline, opt-in exactly-once delivery and ordering keys, a native DLQ,
and retention up to 31 days with seek to a timestamp or snapshot for replay. It is the
most feature-complete of the four, combining lease semantics, time-based replay, and ordering.
The shared axis: managed multi-AZ vs one self-hosted binary
All four services share a profile that is the recurring “cloud wins” axis:
- Managed and zero-ops — no servers to run, patch, or back up.
- Multi-AZ / regional replicated durability with an SLA — they survive a datacenter failure.
- Pay-per-use with free tiers, and elastic scale far beyond one machine.
- Deep cloud-native integration — IAM, KMS, and the provider’s observability stack.
topics is the inverse on every one of those points:
- You operate it — one binary, one disk; you handle backup, upgrades, and security.
- Single machine is the entire durability and failure domain. No replication, no multi-AZ, no failover. An fsync topic survives any crash of that machine and replays its WAL on restart, but if the machine or disk is gone, you restore from backup. See durability.
- Vertical scale only: millions of records/sec in the engine core, ~0.5 M rec/sec at a single HTTP origin (disk-class), but you scale up, not out. See Performance.
- No native TLS — run behind a TLS-terminating reverse proxy; auth is bearer keys, hashed at rest, with per-key scopes and a prefix allowlist.
If you need multi-AZ durability that survives a datacenter outage with an SLA, you need a managed (or self-built replicated) system. topics is one machine — that is the deal it makes in exchange for being one small binary.
The no-gap-signal contrast
This is the sharpest functional difference, and it is true of all four cloud services: none expose a per-consumer in-band gap signal. When messages age out of retention or a shard trims data a consumer hasn’t reached, the loss is silent. You detect it from a sequence-number jump (Kinesis), a lag/age metric, or not at all — after the fact.
topics refuses to lose data quietly. If cap eviction or TTL expiry removed records below a
consumer’s cursor, the read returns an in-band tombstone with the
exact [gap_from, gap_to] range — at HTTP 200, on the same response that carries the
surviving records. Loss you asked for (a point-in-time delete) is
filtered silently; loss you didn’t is always announced. That asymmetry is the
load-bearing invariant of the system, and no cloud queue provides it.
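A consumer can act on the tombstone in the same step that advances its cursor. The sketch below is a guess at what that handling might look like: only the names gap_from and gap_to come from this page, while the response layout (a "tombstone" object next to a "records" list), the inclusive gap_to, and the function name are assumptions.

```python
def apply_read(cursor, response):
    """Process one read response; return (next_cursor, records, lost_range).

    Assumed response shape (hypothetical):
      {"tombstone": {"gap_from": int, "gap_to": int},  # optional
       "records": [{"seq": int, ...}, ...]}
    """
    lost_range = None
    tombstone = response.get("tombstone")
    if tombstone is not None:
        # Records in [gap_from, gap_to] were evicted or expired unread.
        lost_range = (tombstone["gap_from"], tombstone["gap_to"])
        # Assumes gap_to is inclusive, so resume just past it.
        cursor = max(cursor, tombstone["gap_to"] + 1)

    records = [r for r in response.get("records", []) if r["seq"] >= cursor]
    if records:
        cursor = records[-1]["seq"] + 1
    return cursor, records, lost_range
```

Because the gap arrives on the read path, the caller can decide at that moment whether to alert, resynchronise from another source, or carry on, rather than inferring loss later from a lag metric.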
The cloud services do offer something topics cannot: FIFO/exactly-once (SQS FIFO,
Pub/Sub exactly-once + ordering keys) and time-based replay (Kinesis, Pub/Sub seek).
topics is at-least-once only and replays by seq cursor, not by wall-clock time.
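The difference between the two replay styles can be shown on a toy log of (seq, timestamp, payload) tuples. This is an illustrative model only; the record layout and function names are assumptions and do not describe any of the services' APIs.

```python
import bisect

# Toy log: (seq, timestamp_seconds, payload), ordered by seq and by timestamp.
RECORDS = [
    (0, 100.0, "a"),
    (1, 105.0, "b"),
    (2, 105.0, "c"),
    (3, 110.0, "d"),
]


def replay_from_seq(records, cursor):
    """Seq-cursor replay: everything at or after a known position."""
    return [r for r in records if r[0] >= cursor]


def seek_to_time(records, timestamp):
    """Time-based seek: first seq whose timestamp is >= the requested time."""
    timestamps = [r[1] for r in records]
    index = bisect.bisect_left(timestamps, timestamp)
    return records[index][0] if index < len(records) else None
```

A seq cursor answers "continue from where I was"; a time-based seek answers "start from 09:00 yesterday". On a seq-only log, a client wanting the second would have to keep its own timestamp-to-seq index along the lines of `seek_to_time`, assuming its records carry a timestamp it trusts.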
Fixed cost and data locality
The other topics pull is economics and location:
- Fixed cost on your own NVMe — no per-request or per-GB metering, no egress charges, no vendor lock-in. A high, steady, predictable volume that would meter expensively in the cloud runs at the flat cost of the machine it’s on.
- Data locality — sub-millisecond local delivery (disk-class write-ack p50 around 0.06 ms on the reference machine; SSE write→deliver in the 1–5 ms range), and the data physically stays where you put it: on-prem, at the edge, or fully air-gapped. No data leaves your machine.
- Surgical deletes: remove records by seq range and/or tag (exact or prefix), permanently and point-in-time, via the delete API. Cloud queues let you ack/delete individual messages but have nothing like a tag-range purge across retained history.
The flip side: there’s no free tier and no pay-per-use. For spiky or low-volume workloads, a managed service you pay for by the request is almost always cheaper and simpler than a machine you run 24/7.
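The fixed-versus-metered trade reduces to a break-even request rate. The sketch below works through that arithmetic. The prices passed in are placeholders rather than any provider's quoted rates, and the model ignores egress, storage, and the operator time a self-hosted machine needs.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # 30-day month, an assumption


def metered_monthly_cost(requests_per_second, price_per_million_requests):
    """Monthly bill for a steady request rate under per-request pricing."""
    requests = requests_per_second * SECONDS_PER_MONTH
    return requests / 1_000_000 * price_per_million_requests


def break_even_rps(fixed_monthly_cost, price_per_million_requests):
    """Steady request rate at which metered cost equals the fixed machine cost."""
    requests = fixed_monthly_cost / price_per_million_requests * 1_000_000
    return requests / SECONDS_PER_MONTH


# Placeholder numbers: a 100-per-month machine vs 0.50 per million requests.
rate = break_even_rps(100.0, 0.50)
print(f"break-even at about {rate:.1f} requests/sec sustained")
```

Below the break-even rate the metered service is cheaper; above it, and only if the load is sustained rather than spiky, the flat cost of the machine wins.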
Verdict
topics wins when
- You want log + queue + fan-out in one self-hosted JSON/HTTP API, not three managed services stitched together.
- You need an explicit, never-silent loss signal: a tombstone with the exact [gap_from, gap_to] range at delivery time.
- You want fixed cost on your own NVMe: no per-request/GB metering, no egress, no lock-in.
- You need data locality — sub-ms local delivery and on-prem, edge, or air-gapped operation where data never leaves the machine.
- You want surgical tag/range deletes across retained history.
- You want curl as a complete client, no SDK or cloud IAM to wire up.
Cloud queues win when
- You want zero-ops, managed multi-AZ HA with an SLA that survives a datacenter failure.
- You need elastic horizontal scale far beyond one machine.
- You need exactly-once, FIFO, or ordering keys (SQS FIFO, Pub/Sub).
- You need time-based replay — seek to a timestamp or snapshot (Kinesis, Pub/Sub).
- You need heterogeneous push fan-out to SQS/Lambda/HTTP/email/SMS (SNS).
- You want deep IAM/KMS/observability integration and pay-per-use with a free tier for spiky or low volume.
See also
- Comparisons overview — the full capability matrix across all four systems.
- vs NATS JetStream — the closest single-binary peer.
- Tombstones — the explicit gap signal in depth.
- Job Queue guide — building a lease queue on topics.