vs Cloud Queues

AWS SQS, SNS, and Kinesis, along with GCP Pub/Sub, are managed, multi-AZ-replicated, pay-per-use services that scale far beyond one machine and are operated for you. topics is the opposite trade: one self-hosted binary that puts a log, a queue, and fan-out behind a single JSON/HTTP API on your own NVMe, at fixed cost, with an explicit, never-silent loss signal. In exchange, it offers no managed HA, no multi-AZ, and no elastic scale-out.

These services are excellent at what they do. This page is about the trade — managed multi-AZ scale versus a single self-hosted binary with data locality and a delivery-time gap signal — not about any of them being weak. For most spiky, low-volume, or zero-ops workloads, the managed service is the right call.

The four services, briefly

Each occupies a different point in the design space; topics overlaps all four because it folds log, queue, and fan-out onto one substrate.

AWS SQS — lease queue, no replay

SQS is a managed queue with a visibility-timeout lease (default 30 s, max 12 h): ReceiveMessage hides a message for the timeout, and DeleteMessage acks it. Standard queues are at-least-once with best-effort ordering; FIFO queues add strict ordering within a message group and exactly-once processing within a 5-minute deduplication window. A redrive policy moves a message to a dead-letter queue once its receive count exceeds maxReceiveCount. There is no replay: an acked message is deleted, and retention is 4 days by default (configurable up to 14). This maps closely to the topics lease queue: claim → lease, ack → permanent delete. The difference is that a topics queue sits on top of a retained log, so the underlying records can still be read by cursor; SQS has no log behind the queue.
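The lease pattern shared by SQS and the topics queue can be sketched as a toy in-memory model. This is an illustration only, not the SQS or topics implementation: the class name `LeaseQueue`, its methods, and the explicit `now` argument (used instead of a real clock so the behaviour is deterministic) are all assumptions made for the example.

```python
import itertools


class LeaseQueue:
    """Toy lease queue: claim hides a message for a while, ack deletes it."""

    def __init__(self, lease_seconds=30.0):
        self.lease_seconds = lease_seconds
        self._ids = itertools.count(1)
        self._messages = {}    # id -> body
        self._visible_at = {}  # id -> time at which the message is claimable again

    def send(self, body):
        msg_id = next(self._ids)
        self._messages[msg_id] = body
        self._visible_at[msg_id] = 0.0  # visible immediately
        return msg_id

    def claim(self, now):
        """Return (id, body) of the oldest visible message and start its lease."""
        for msg_id in sorted(self._messages):
            if self._visible_at[msg_id] <= now:
                self._visible_at[msg_id] = now + self.lease_seconds
                return msg_id, self._messages[msg_id]
        return None  # nothing visible right now

    def ack(self, msg_id):
        """Permanently delete the message; False if it was already gone."""
        if msg_id not in self._messages:
            return False
        del self._messages[msg_id]
        del self._visible_at[msg_id]
        return True
```

A message that is claimed but never acked becomes claimable again once its lease expires, which is where at-least-once delivery comes from: a slow consumer and a second consumer can both end up processing the same message.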

AWS SNS — push fan-out, no storage

SNS is stateless push fan-out: publish once, deliver to many endpoints (SQS, Lambda, HTTP, email, SMS) with optional filter policies. There is no storage and no replay — it routes and forgets. The topics analog is a router fanning one topic into several topics (each with its own durability), plus SSE watch for live push. SNS wins decisively on heterogeneous delivery: topics pushes to SSE clients and other topics, not to email/SMS/Lambda.

AWS Kinesis — sharded log, with replay

Kinesis Data Streams is the closest cloud analog to a log: a sharded ordered log with per-shard sequence numbers, replay by sequence number or timestamp, and retention from 24 h to 365 days. Capacity is provisioned per shard or billed on demand. topics has the same shape, an ordered, seq-addressable, replayable log, with two differences: Kinesis scales horizontally by adding shards (and guarantees ordering only within a shard), whereas topics replays by its own monotonic seq cursor on a single, unsharded topic.
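The ordering difference can be made concrete with a small model. The sketch below is not either system's implementation: `Log`, `ShardedLog`, and the CRC32-based shard choice are stand-ins invented for the example (Kinesis maps partition keys to shards with its own hashing scheme).

```python
import zlib


class Log:
    """Single ordered log: one monotonic seq, replay from any cursor."""

    def __init__(self):
        self.records = []  # index i holds the record with seq i + 1

    def append(self, payload):
        self.records.append(payload)
        return len(self.records)  # seq of the record just written

    def read(self, after_seq, limit=100):
        """Return (seq, payload) pairs with seq > after_seq, in order."""
        chunk = self.records[after_seq:after_seq + limit]
        return [(after_seq + 1 + i, p) for i, p in enumerate(chunk)]


class ShardedLog:
    """Several independent logs; order is only defined within one shard."""

    def __init__(self, shard_count):
        self.shards = [Log() for _ in range(shard_count)]

    def append(self, partition_key, payload):
        # Hypothetical shard choice: CRC32 of the key modulo the shard count.
        shard = zlib.crc32(partition_key.encode()) % len(self.shards)
        return shard, self.shards[shard].append(payload)
```

With the single `Log`, a consumer needs one integer to resume. With `ShardedLog`, it needs one cursor per shard, and two records with different partition keys have no defined relative order.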

GCP Pub/Sub — lease + ordering + seek replay

Pub/Sub combines a lease and a log: an ack-deadline lease (max 600 s) that can be extended with modifyAckDeadline, opt-in exactly-once delivery and ordering keys, a native DLQ, and retention up to 31 days with seek to a timestamp or snapshot for replay. It is the most feature-complete of the four, offering lease semantics, time-based replay, and ordering in one service.

The shared axis: managed multi-AZ vs one self-hosted binary

All four services share a profile that is the recurring “cloud wins” axis:

  • Managed and zero-ops — no servers to run, patch, or back up.
  • Multi-AZ / regional replicated durability with an SLA — they survive a datacenter failure.
  • Pay-per-use with free tiers, and elastic scale far beyond one machine.
  • Deep cloud-native integration — IAM, KMS, and the provider’s observability stack.

topics is the inverse on every one of those points:

  • You operate it — one binary, one disk; you handle backup, upgrades, and security.
  • Single machine is the entire durability and failure domain. No replication, no multi-AZ, no failover. An fsync topic survives any crash of that machine and replays its WAL on restart, but if the machine or disk is gone, you restore from backup. See durability.
  • Vertical scale only: millions of records/sec in the engine core and roughly 0.5 M records/sec through a single HTTP origin at disk-class durability, but you scale up, not out. See Performance.
  • No native TLS — run behind a TLS-terminating reverse proxy; auth is bearer keys, hashed at rest, with per-key scopes and a prefix allowlist.

If you need multi-AZ durability that survives a datacenter outage with an SLA, you need a managed (or self-built replicated) system. topics is one machine — that is the deal it makes in exchange for being one small binary.

The no-gap-signal contrast

This is the sharpest functional difference, and it holds for all four cloud services: none of them exposes a per-consumer, in-band gap signal. When messages age out of retention, or a stream trims data a consumer hasn’t reached, the loss is silent. You find out after the fact, from a sequence-number jump (Kinesis) or a lag/age metric, or you never find out.

topics refuses to lose data quietly. If cap eviction or TTL expiry removed records below a consumer’s cursor, the read returns an in-band tombstone with the exact [gap_from, gap_to] range — at HTTP 200, on the same response that carries the surviving records. Loss you asked for (a point-in-time delete) is filtered silently; loss you didn’t is always announced. That asymmetry is the load-bearing invariant of the system, and no cloud queue provides it.
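A minimal sketch of that invariant, assuming a capped in-memory log. Only the `gap_from` and `gap_to` names come from the description above; the `CappedLog` class, the `records` and `tombstone` keys, and the cursor convention (last seq already processed) are assumptions made for illustration and are not the topics wire format.

```python
from collections import deque


class CappedLog:
    """Toy log with cap eviction that announces evicted ranges on read."""

    def __init__(self, cap):
        self.cap = cap
        self.next_seq = 1
        self.records = deque()  # (seq, payload), oldest first

    def append(self, payload):
        seq = self.next_seq
        self.next_seq += 1
        self.records.append((seq, payload))
        while len(self.records) > self.cap:
            self.records.popleft()  # cap eviction drops the oldest record
        return seq

    def read(self, cursor):
        """cursor is the last seq the consumer processed (0 = nothing yet)."""
        expected = cursor + 1
        # With no retained records, the oldest available seq is the next one.
        oldest = self.records[0][0] if self.records else self.next_seq
        response = {"records": [r for r in self.records if r[0] >= expected]}
        if oldest > expected:
            # Records the consumer never saw were evicted: say exactly which.
            response["tombstone"] = {"gap_from": expected, "gap_to": oldest - 1}
        return response
```

With `cap=3` and five appends, a consumer whose cursor is 1 gets records 3 to 5 together with a tombstone covering seq 2, while a consumer whose cursor is 3 gets records 4 and 5 and no tombstone. The sketch models only cap eviction; TTL expiry and the silent filtering of requested deletes are left out.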

The cloud services do offer things topics does not: FIFO and exactly-once semantics (SQS FIFO, Pub/Sub exactly-once + ordering keys) and time-based replay (Kinesis, Pub/Sub seek). topics is at-least-once only and replays by seq cursor, not by wall-clock time.

Fixed cost and data locality

The other draw of topics is economics and location:

  • Fixed cost on your own NVMe — no per-request or per-GB metering, no egress charges, no vendor lock-in. A high, steady, predictable volume that would meter expensively in the cloud runs at the flat cost of the machine it’s on.
  • Data locality — sub-millisecond local delivery (disk-class write-ack p50 around 0.06 ms on the reference machine; SSE write→deliver in the 1–5 ms range), and the data physically stays where you put it: on-prem, at the edge, or fully air-gapped. No data leaves your machine.
  • Surgical deletes — remove records by seq range and/or tag (exact or prefix), permanently and point-in-time, via the delete API. Cloud queues let you ack/delete individual messages but have nothing like a tag-range purge across retained history.

The flip side: there’s no free tier and no pay-per-use. For spiky or low-volume workloads, a managed service you pay for by the request is almost always cheaper and simpler than a machine you run 24/7.
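The selection rule behind the surgical deletes listed above (seq range and/or tag, exact or prefix) can be sketched as a plain filter. This is not the topics delete API: the function name, its parameters, and the record shape (`seq` and `tag` keys) are hypothetical, chosen only to show how the criteria combine.

```python
def delete_matching(records, seq_from=None, seq_to=None, tag=None, tag_prefix=None):
    """Return (kept_records, deleted_count); all given criteria must match."""
    if seq_from is None and seq_to is None and tag is None and tag_prefix is None:
        # Guard against an accidental "delete everything" call.
        raise ValueError("at least one criterion is required")

    def matches(rec):
        if seq_from is not None and rec["seq"] < seq_from:
            return False
        if seq_to is not None and rec["seq"] > seq_to:
            return False
        if tag is not None and rec["tag"] != tag:
            return False
        if tag_prefix is not None and not rec["tag"].startswith(tag_prefix):
            return False
        return True

    kept = [r for r in records if not matches(r)]
    return kept, len(records) - len(kept)
```

Calling `delete_matching(records, tag_prefix="user/", seq_to=2)` removes only records tagged under `user/` with seq 2 or lower and leaves the rest of the retained history in place.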

Verdict

topics wins when

  • You want log + queue + fan-out in one self-hosted JSON/HTTP API, not three managed services stitched together.
  • You need an explicit, never-silent loss signal — a tombstone with the exact [gap_from, gap_to] range at delivery time.
  • You want fixed cost on your own NVMe — no per-request/GB metering, no egress, no lock-in.
  • You need data locality — sub-ms local delivery and on-prem, edge, or air-gapped operation where data never leaves the machine.
  • You want surgical tag/range deletes across retained history.
  • You want curl as a complete client, no SDK or cloud IAM to wire up.

Cloud queues win when

  • You want zero-ops, managed multi-AZ HA with an SLA that survives a datacenter failure.
  • You need elastic horizontal scale far beyond one machine.
  • You need exactly-once, FIFO, or ordering keys (SQS FIFO, Pub/Sub).
  • You need time-based replay — seek to a timestamp or snapshot (Kinesis, Pub/Sub).
  • You need heterogeneous push fan-out to SQS/Lambda/HTTP/email/SMS (SNS).
  • You want deep IAM/KMS/observability integration and pay-per-use with a free tier for spiky or low volume.
