Mimori
distributed kv store written in go from scratch on raft. strong consistency, follower reads, dynamic membership, full observability stack.
Mimori is a distributed key-value store I built in Go to actually understand Raft. Not "wrap an existing library and call it learning" — the consensus protocol, persistence, membership transitions, all implemented by hand against the paper.
Related writing: raft: how distributed systems actually agree.
why build this
Distributed systems blogs make Raft sound clean. Implementing it teaches you how many edge cases the diagrams quietly skip: split votes, stale leaders, log entries replicated under one term but only committed under a later one, snapshots that arrive before the receiver has caught up. The only way to internalise the algorithm is to write the parts that hurt.
what works
- Raft from scratch: leader election (under 500ms), log replication, snapshotting
- Strongly consistent writes through the leader, plus follower reads for scale: staleness bounded at around 300ms, at roughly 72× the throughput of writes that go through Raft
- Dynamic membership: add and remove nodes without downtime
- Leader transfer for graceful maintenance
- Pebble (LSM) as the on-disk storage engine
- gRPC for all cluster RPCs, with health and readiness endpoints
interfaces
A Go client library, a CLI (mimorictl) with leader discovery and auto-retry, an embedded web dashboard, a REST + JSON API, and direct gRPC access. Pick whatever fits.
observability
Every node exposes Prometheus metrics — term, commit index, applied index, RPC latencies. The repo ships a Docker Compose stack with Prometheus + Grafana so you can actually watch the cluster: see leader churn, replication lag, the moment a partition heals.
numbers
Measured on a 3-node cluster running on Docker (M1 MacBook Pro, 8-core, 16GB):
- ~32 ops/sec writes through Raft consensus
- ~2,310 ops/sec stale reads from followers
- p95 latency: 253ms write, 10ms read
- 24-hour stress tests, plus chaos tests covering partitions and node crashes
Good enough for what it's actually for: configuration stores, metadata services, coordination tasks, learning. Not built to replace etcd in a real production deployment — and the README is honest about that.