# ADR-0004 — ClickHouse as both audit and analytical store
- Status: Accepted
- Date: 2026-04-28
## Context
The platform has two big-data shapes:
- Analytical — `tracks`, `frames`, `detections`. High-volume, append-mostly, queried by org/time/region/H3-cell. Used by Planner web (coverage, search) and Gen (federated dashboards).
- Audit — every administrative action recorded with org context for compliance. High-volume, append-only, queried by `(org_id, created_at)` + filters (a representative read is sketched below).
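For concreteness, a typical audit read might look like the following. Only `org_id` and `created_at` are named in this ADR, so the other columns and the literal values are illustrative.

```sql
-- Hypothetical audit read: one org, a 30-day window, one action type.
-- Column names other than org_id/created_at, and all values, are placeholders.
SELECT created_at, actor_id, action, target_id
FROM audit_log
WHERE org_id = toUUID('11111111-1111-1111-1111-111111111111')
  AND created_at >= now() - INTERVAL 30 DAY
  AND action = 'member.removed'
ORDER BY created_at DESC
LIMIT 100;
```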
We could:
- Postgres for both. Already there. But the analytical workload — millions of frames per org, geospatial filters, time-range scans — is not Postgres' shape. We've seen this fail at smaller scales.
- Postgres for audit, ClickHouse for analytics. Two stores, two schemas, two retention models, two backup stories.
- One OLAP engine for both. Add ClickHouse, use it for both analytics and audit.
## Decision
Use ClickHouse 26.3 (Altinity Operator-managed, 1 shard × 2 replicas) for both analytical (tracks, frames, detections) and audit (audit_log) workloads.
## Consequences
- Positive: One OLAP engine to operate, one cluster to back up, one set of skills to hire for.
- Positive: Audit queries (filter by org, by time range) are exactly ClickHouse's sweet spot. Postgres would have been workable but expensive at growth.
- Positive: Native partitioning by `toYYYYMM(...)` lets us drop old data wholesale, no `DELETE` storms (see the table sketch after this list).
- Positive: Analytical engines handle the "scan recent frames in this region" workload without bespoke indexes.
- Negative: ClickHouse is eventually consistent within a replica set (merges happen in the background). Idempotency on the analytical tables is enforced client-side via `DELETE` + `INSERT` (see below) rather than at the engine level.
- Negative: ClickHouse is operationally different from Postgres. We rely on the Altinity operator and on SigNoz dashboards for visibility.
- Negative: Two stores total in the platform (Postgres + ClickHouse) instead of one. Acceptable cost.
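The analytical side of that decision can be sketched as a single table. The engine, the `toYYYYMM(...)` partitioning, and the org-first sort key follow this ADR; the column list, sort-key details, replication path, and partition id are assumptions.

```sql
-- Minimal sketch of one analytical table under the decision above.
CREATE TABLE frames
(
    org_id     UUID,
    track_id   UUID,
    h3_cell    UInt64,
    created_at DateTime,
    payload    String
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/frames', '{replica}')
PARTITION BY toYYYYMM(created_at)
ORDER BY (org_id, track_id, created_at);

-- Retention by whole partitions: dropping a month is a metadata operation,
-- not a row-by-row DELETE.
ALTER TABLE frames DROP PARTITION 202404;
```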
## Why one cluster, two purposes (not two clusters)
Audit and analytics share the same write path (Sense Worker, Kafka-fed batch insert) and similar query shapes. Splitting them into two physical clusters would double operational cost without solving a real problem. If audit retention requirements diverge wildly (e.g., audit must live 7 years, analytics 2), we can split per-table TTL within the same cluster — no cluster split needed.
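If that divergence does happen, the per-table split might look like this. The 7-year and 2-year figures are only the example from the paragraph above, not a committed policy, and the column types are assumed.

```sql
-- Hypothetical per-table TTLs within the same cluster; expired parts are
-- removed by background merges, no cluster split required.
ALTER TABLE audit_log MODIFY TTL created_at + INTERVAL 7 YEAR;
ALTER TABLE frames    MODIFY TTL created_at + INTERVAL 2 YEAR;
```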
## Why plain ReplicatedMergeTree + client-side DELETE + INSERT
All four tables use plain `ReplicatedMergeTree`. Idempotency for analytics is handled at the writer: the worker issues a lightweight `DELETE` by primary key (with `mutations_sync = 1`, `allow_nondeterministic_mutations = 1`) before every `INSERT`, so re-deliveries from Kafka leave a single row.
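A minimal sketch of that write path, reusing the hypothetical `frames` shape from the earlier DDL; the literal values stand in for per-message values the worker binds.

```sql
-- Step 1: remove any earlier copy of this logical row. mutations_sync = 1
-- waits for the mutation to complete on the server that received it, so the
-- INSERT below cannot race a still-pending delete.
ALTER TABLE frames DELETE
WHERE org_id     = toUUID('11111111-1111-1111-1111-111111111111')
  AND track_id   = toUUID('22222222-2222-2222-2222-222222222222')
  AND created_at = toDateTime('2026-04-01 12:00:00')
SETTINGS mutations_sync = 1, allow_nondeterministic_mutations = 1;

-- Step 2: insert the new version. A Kafka redelivery repeats both steps and
-- still leaves exactly one row.
INSERT INTO frames (org_id, track_id, h3_cell, created_at, payload)
VALUES (toUUID('11111111-1111-1111-1111-111111111111'),
        toUUID('22222222-2222-2222-2222-222222222222'),
        617733122422996991,
        toDateTime('2026-04-01 12:00:00'),
        '{"detections": []}');
```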
- We tried `ReplacingMergeTree(updated_at)` first — it was attractive because the engine collapses duplicates during background merges. The problem: querying it correctly requires `SELECT … FINAL`, and `FINAL` cost killed read latency once the working set grew (the rejected shape is sketched after this list).
- Plain `ReplicatedMergeTree` keeps reads fast. The trade-off is a heavier write path (one `DELETE` mutation per upsert) and the requirement that we own the dedup logic in code.
- Audit follows the same engine, but the contract is stronger: every audit row is unique and must never be coalesced. Duplicates are kept out at the producer (Kafka idempotent producer + at-most-once acceptance per `request_id`).
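For contrast, a sketch of the rejected shape, i.e. the replicated form of `ReplacingMergeTree(updated_at)`, together with the `FINAL` read it forces; columns, the replication path, and values are illustrative.

```sql
-- Rejected alternative: duplicates collapse during background merges (the row
-- with the largest updated_at wins), but correct reads must use FINAL.
CREATE TABLE frames_replacing
(
    org_id     UUID,
    track_id   UUID,
    created_at DateTime,
    updated_at DateTime,
    payload    String
)
ENGINE = ReplicatedReplacingMergeTree('/clickhouse/tables/{shard}/frames_replacing', '{replica}', updated_at)
PARTITION BY toYYYYMM(created_at)
ORDER BY (org_id, track_id, created_at);

-- Without FINAL this may double-count rows that have not been merged yet;
-- with FINAL it pays the merge cost at read time, which is the latency
-- problem described above.
SELECT count()
FROM frames_replacing FINAL
WHERE org_id = toUUID('11111111-1111-1111-1111-111111111111')
  AND created_at >= now() - INTERVAL 7 DAY;
```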