Why a Single Node
topics runs on one machine, and that is the most consequential decision in the project. It isn’t a limitation we’re working around — it’s the choice that makes everything else possible: the small API, the millisecond latencies, the crisp answer to “what survives a crash.” Choose distribution instead and you inherit its complexity in every one of those places.
This page makes the case for the choice, and is honest about what it costs.
The SQLite model, for an event log
The useful comparison isn’t another message broker — it’s SQLite. SQLite became the most-deployed database in the world by making a deliberate trade: no server to run, no daemon, no cluster, no configuration — just a library and a file — in exchange for living on one machine. That trade fit an enormous fraction of real workloads, and it turned the database into something you embed and forget rather than operate.
topics takes the same trade for an append-only event log. The hard part — a correct, crash-safe, ordered log with millisecond delivery — lives inside the engine; the outside is a single process pointed at a single directory. The one difference from SQLite is deliberate: topics is a networked server, not an in-process library, so many services across your fleet can read and write the same log over HTTP. Think of it as one step out from embedded — SQLite’s operating model, with a network port.
So the mental model is “SQLite for events,” and the payoff is the same: you get the capability (here, a durable log whose engine appends millions of events a second in-process) without signing up to run the distributed system that usually comes with it.
Scaling out is usually a reflex, not a requirement
The instinct, the moment a system starts to matter, is to spread it across nodes. But “more nodes” is rarely the answer to a question anyone has actually measured — it’s a habit, and an expensive one, because the cost lands on the part of the system you operate forever, not the part you write once.
topics takes the opposite default: use one machine well, and reach for more only once you can show you’ve outgrown it. For most event workloads, that day simply never arrives.
A single machine is bigger than you think
The mental picture of “one server” is stuck a decade in the past. A box you can rent by the hour today gives you, on a single predictable bill:
- a hundred or more cores,
- multiple terabytes of RAM, and
- local NVMe sustaining millions of IOPS at sub-millisecond latency.
That is more capacity than many distributed clusters had a few years ago. topics is built to actually use it: the write-ahead log is sharded across cores so durable throughput scales with the hardware, reads are O(1) seq-indexed lookups, and a record delivered live over SSE is serialized once and shared across every watcher. In-process, the engine already appends and projects records at millions a second, and delivery targets ~1 ms: numbers that come from staying on one machine, not in spite of it. (The realistic end-to-end HTTP ceiling, lower and measured honestly, is on the Performance page.)
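The "serialized once and shared across every watcher" idea can be illustrated with a toy model. This is a minimal sketch, not the engine's actual code: the `Broadcaster` class, its method names, and the frame layout are all hypothetical, chosen only to show that the encoding cost stays constant as the watcher count grows.

```python
import json
from collections import deque


class Broadcaster:
    """Toy fan-out: encode each record once, hand the same bytes to every watcher."""

    def __init__(self):
        self.watchers = []     # one outbound queue per connected watcher
        self.encode_calls = 0  # counts how many times a record was serialized

    def subscribe(self):
        queue = deque()
        self.watchers.append(queue)
        return queue

    def publish(self, seq, payload):
        # Build one SSE-style frame, regardless of how many watchers exist.
        frame = f"id: {seq}\ndata: {json.dumps(payload)}\n\n".encode("utf-8")
        self.encode_calls += 1
        for queue in self.watchers:
            queue.append(frame)  # appends a reference, not a per-watcher copy
        return frame


hub = Broadcaster()
queues = [hub.subscribe() for _ in range(3)]
frame = hub.publish(1, {"event": "created"})
```

Every queue ends up holding the same immutable `bytes` object, so adding watchers adds queue appends but no extra serialization work.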
What you are not paying for
The real case for one node isn’t the throughput — it’s everything a distributed log spends to reach its throughput that topics simply never spends:
- No coordination on the write path. No replication round-trips, no consensus, no partition or rebalance handling. A write is a sequential append to a local disk the engine fully owns. Every microsecond a cluster spends agreeing, topics spends appending — which is precisely why the latency floor is low enough to quote in milliseconds.
- No distributed failure modes. Split brain, partial failure, stale replicas, cross-node clock skew, “which node is authoritative” — none of these can occur when there is only one node. A whole category of 3 a.m. incidents is defined out of existence.
- One thing to run. A single process pointed at a single directory: one thing to deploy, monitor, secure, and back up — not a quorum to keep healthy or a membership protocol to babysit. The complexity you don’t take on is the complexity you never debug.
None of this is free — it’s a trade. You exchange horizontal scale and built-in high availability for latency, simplicity, and a system small enough to fully understand. The next section is the bill.
The tradeoff: availability
Be clear-eyed about the cost. A single node is a single point of failure for availability: it goes away for a few seconds during a deploy, a kernel patch, or the occasional crash. topics doesn’t pretend otherwise — it makes that downtime cheap and safe rather than engineering it away:
- Acknowledged work is never lost. A disk or fsync write lands in the write-ahead log before the ack, so a restart replays it instead of dropping it. See Durability.
- Recovery is fast and gated. On start the engine loads its latest snapshot and replays the log forward; GET /v0/ready stays 503 until an acknowledged durable write is fully recovered, then flips to 200, so traffic never reaches a half-recovered server. See Recovery.
- Shutdown is clean. On SIGINT/SIGTERM the server drains and checkpoints, so the next start has almost nothing left to replay.
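A deploy script or proxy health check can lean on the readiness gate described above. The following is a minimal sketch using only the Python standard library. The `GET /v0/ready` path and its 503/200 behavior come from this page; the function name, the timeout values, the injectable `opener` parameter, and the base URL in the comment are assumptions made for illustration.

```python
import time
import urllib.error
import urllib.request


def wait_until_ready(base_url, timeout_s=30.0, interval_s=0.2,
                     opener=urllib.request.urlopen):
    """Poll GET /v0/ready until it answers 200; return False on timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            with opener(f"{base_url}/v0/ready", timeout=2.0) as resp:
                if resp.status == 200:
                    return True  # recovery finished, safe to send traffic
        except urllib.error.HTTPError as err:
            if err.code != 503:
                raise  # anything other than "still recovering" is unexpected
        except OSError:
            pass  # connection refused or similar: the server is not listening yet
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Example call (hypothetical address; use whatever your server binds to):
# wait_until_ready("http://localhost:8080")
```

Treating 503 and "connection refused" the same way keeps the caller simple: both mean "not yet", and only a 200 releases traffic.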
The honest calculus: a few seconds of rare, mostly-planned downtime on a system you understand, against the permanent complexity and subtler failures of one that is “always up.” For most workloads that trade is worth making — and where it isn’t, you mitigate at the edges (a warm standby behind a proxy, frequent backups) instead of dragging a consensus protocol into your data path.
If you genuinely need more than one node
Don’t argue yourself out of a real requirement. If you need transparent failover, multi-region writes, or throughput beyond the largest single box, topics is not that system — multi-server, replication, HA, and single-writer fencing are out of scope by design, not unfinished features.
What you can build on top of independent nodes:
- Asynchronous mirroring with routers. Because a record’s origin node rides through every forward and loop-prevention drops a node’s own events, several nodes can mirror to each other without echo — handy for read fan-out and geographic locality. It buys you copies and proximity, not consistency or failover. See the multi-master guide.
- Partitioning by topic across machines at the application layer, when separate workloads don’t need to share a single log.
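Application-layer partitioning can be as small as a stable mapping from topic name to node. The sketch below is one possible approach, not something topics provides: the node URLs and the `node_for_topic` helper are hypothetical, and a hand-maintained lookup table would work just as well.

```python
import hashlib


def node_for_topic(topic, nodes):
    """Pick one node per topic with a stable hash, so a topic always lands on the same node."""
    digest = hashlib.sha256(topic.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


# Hypothetical base URLs of two independent topics servers.
NODES = ["http://topics-a.internal", "http://topics-b.internal"]

orders_node = node_for_topic("orders", NODES)
```

Note that a modulo mapping reassigns many topics whenever the node list changes, so an explicit table is the safer choice if you expect to add machines later. Each topic still lives on exactly one node, which keeps its ordering and durability story unchanged.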
Router mirroring is asynchronous and at-least-once — it gives you copies and locality, never an availability or consistency guarantee. Don’t mistake it for HA.
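The origin-based loop prevention behind mirroring can be modeled in a few lines. This is a toy, in-memory sketch of the rule as this page states it (the origin rides along with every forward, and a node drops its own events when they come back). The `Node` class and its methods are hypothetical, delivery is synchronous for simplicity, and only a two-node pair is modeled.

```python
class Node:
    """Toy node that mirrors every accepted record to its peers."""

    def __init__(self, name):
        self.name = name
        self.log = []    # (origin, payload) tuples in arrival order
        self.peers = []  # nodes this node forwards to

    def write(self, payload):
        # A local write is stamped with this node as its origin.
        self._append(self.name, payload)

    def receive(self, origin, payload):
        # A forwarded record keeps the origin it was first written with.
        if origin == self.name:
            return  # loop prevention: our own event came back, so drop it
        self._append(origin, payload)

    def _append(self, origin, payload):
        self.log.append((origin, payload))
        for peer in self.peers:
            peer.receive(origin, payload)


# Two nodes mirroring to each other.
a, b = Node("a"), Node("b")
a.peers, b.peers = [b], [a]
a.write("x")
b.write("y")
```

Both logs end up with one copy of each record and no echo. Nothing here detects a failed node or reconciles divergent histories, which is why mirroring gives you copies and locality rather than HA.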
Where to go next
- Architecture Overview: The single-machine, sequential-disk assumptions and the layered HTTP → engine → WAL → segments shape.
- Performance: The throughput and latency targets, the measured baselines, and their honest caveats.
- Running topics: How the durable server starts, recovers, binds, and shuts down.
Why a restart never loses acknowledged work on a disk or fsync topic.