Flow — Citylens Initial Migration¶
A one-time bulk import of legacy data (tracks, frames, detections) from Citylens, an older system that pre-dates Axion Sense, into the platform.
This flow runs once per Citylens dataset. It's followed by the Citylens realtime sync, which keeps the two stores aligned thereafter.
When to run¶
- Initial onboarding of a Citylens-hosting customer.
- Operator triggers it manually from the Hangfire Dashboard.
- The recurring cron is intentionally "0 0 31 2 *" (a date that never exists) — the job is registered but the schedule never fires.
- Gated behind CitylensSyncOptions.Enabled — this must be set in appsettings.json before the job is even visible.
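A quick way to convince yourself the cron expression can never fire (a Python check; the actual schedule is registered in Hangfire):

```python
import calendar

# "0 0 31 2 *" means: minute 0, hour 0, day-of-month 31, month 2 (February).
# February tops out at 29 days, so the day-of-month field never matches and
# the registered schedule never triggers the job.
max_feb_days = max(calendar.monthrange(year, 2)[1] for year in range(2000, 2101))
print(max_feb_days)  # 29 at most (leap years); never 31
```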
Sequence¶
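The ordering that the sections below justify can be sketched roughly in Python. All names here are hypothetical and the stores are modeled as plain dicts; the real implementation is CitylensSyncJob in the Worker:

```python
# Illustrative sketch of the run's ordering — not the actual CitylensSyncJob API.

def run_initial_migration(legacy_batches, mapping, clickhouse, flags):
    flags["consumers_paused"] = True                 # 1. pause realtime consumers
    for batch in legacy_batches:                     # 2. page through Citylens Postgres
        for row in batch:
            axion_id = mint_id(row["created_at"], row["citylens_id"])
            mapping[row["citylens_id"]] = axion_id   # 3. record the mapping row
            clickhouse.pop(axion_id, None)           # 4. writer-side DELETE by PK...
            clickhouse[axion_id] = row               # 5. ...then bulk INSERT
    flags["completed_at"] = "now"                    # 6. flag write is the LAST step

def mint_id(created_at, citylens_id):
    # Stand-in for real UUIDv7 generation (deterministic here so the sketch
    # stays reproducible).
    return f"{created_at}-{citylens_id}"
```

Note that re-running the whole function converges to the same final state, which is the idempotency argument made further down.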
Why this shape¶
Why pause realtime consumers¶
Citylens emits realtime lifecycle events on three Kafka topics. If those are running while we're bulk-importing, a realtime event for track X may arrive before the bulk import has inserted X's mapping row in CitylensTrackMapping. The realtime middleware would either fail and block the partition or, worse, create a divergent state. Pausing the consumers eliminates the race.
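The race can be made concrete with a small sketch (hypothetical names; the real lookup lives in the realtime middleware):

```python
# CitylensTrackMapping modeled as a dict: citylens_id -> axion_id.
mapping = {}

def handle_realtime_event(citylens_id):
    """Realtime middleware path: translate the legacy ID or fail."""
    axion_id = mapping.get(citylens_id)
    if axion_id is None:
        # The bulk import hasn't inserted this track's mapping row yet. The
        # only options are to fail (blocking the Kafka partition) or to invent
        # a row that later diverges from the import — hence the pause.
        raise LookupError(f"no mapping for track {citylens_id}")
    return axion_id
```

With the consumers paused for the duration of the bulk import, this failure path is never reached for not-yet-migrated tracks.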
Why generate UUIDv7 for axion_id¶
- Time-ordered → ClickHouse partitioning by toYYYYMM(UUIDv7ToDateTime(id)) works the same for migrated and natively-created tracks.
- Preserves rough chronology of the original Citylens data (UUIDv7 contains a millisecond timestamp; we set it from the legacy created_at).
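A minimal sketch of minting a UUIDv7 whose embedded timestamp comes from the legacy created_at. This hand-rolls the RFC 9562 layout for illustration; the function names are hypothetical, not the backend's actual generator:

```python
import os
import uuid
from datetime import datetime, timezone

def uuid7_from_created_at(created_at: datetime) -> uuid.UUID:
    """UUIDv7: 48-bit Unix-ms timestamp, version/variant bits, then random."""
    ms = int(created_at.timestamp() * 1000)
    b = bytearray(ms.to_bytes(6, "big") + os.urandom(10))
    b[6] = (b[6] & 0x0F) | 0x70   # version 7 in the high nibble of byte 6
    b[8] = (b[8] & 0x3F) | 0x80   # RFC variant bits
    return uuid.UUID(bytes=bytes(b))

def uuid7_ms(u: uuid.UUID) -> int:
    """Recover the millisecond timestamp (the part UUIDv7ToDateTime reads)."""
    return int.from_bytes(u.bytes[:6], "big")
```

Because the timestamp occupies the most-significant bits, IDs minted from older created_at values sort first, which is what keeps the toYYYYMM partitioning consistent across migrated and native tracks.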
Why mapping table and not in-place ID rewrite¶
The mapping table is the only translation point between legacy and current IDs. Realtime events keep arriving with citylens_id, and we keep translating. We never update the legacy IDs in flight.
Why bulk INSERT + writer-side DELETE (not UPSERT)¶
ClickHouse doesn't have UPSERT semantics in the OLTP sense. The importer issues a lightweight DELETE by primary key (with mutations_sync = 1, allow_nondeterministic_mutations = 1) before each bulk INSERT, so re-running the migration replays the same (id, …) tuples and ends in the same final state. ReplacingMergeTree(updated_at) was the earlier choice but the required SELECT … FINAL cost killed read latency at load.
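The replay property can be shown in miniature, with ClickHouse modeled as a dict keyed by primary key (names are illustrative):

```python
def apply_batch(clickhouse, rows):
    """Writer-side DELETE by primary key, then bulk INSERT — not an UPSERT.

    With mutations_sync = 1 the DELETE completes before the INSERT lands, so
    replaying the same (id, ...) tuples converges to a single row per id.
    """
    for row in rows:
        clickhouse.pop(row["id"], None)   # DELETE ... WHERE id = :id
    for row in rows:
        clickhouse[row["id"]] = row       # bulk INSERT
    return clickhouse
```

Applying the same batch twice leaves the store in the identical final state, which is what makes re-triggering the job safe.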
Failure modes¶
| Failure | What happens | Recovery |
|---|---|---|
| Job crashes mid-loop | Last paginated key is recorded in Hangfire's job state; restart picks up roughly where it left off (with overlap covered by writer-side DELETE+INSERT). | Just re-trigger the job. Consumers stay paused (flag still unset). |
| Citylens Postgres unavailable | Job fails fast; consumers stay paused. | Wait for Citylens; re-trigger. |
| ClickHouse OOM on a huge bulk insert | Insert fails; merge sort isn't affected. | Reduce CitylensSyncOptions.BatchSize; re-trigger. |
| citylens_initial_migration_completed_at flag not set after a fully-successful run | Realtime consumers stay paused indefinitely. | Manual SQL update; in practice the job sets it as the last step before reporting success. |
Idempotency invariants¶
- Mapping table is the lookup key; running again either no-ops or updates existing rows.
- ClickHouse inserts are at-least-once; the writer issues DELETE by primary key + INSERT (mutations_sync = 1) so re-runs leave a single row.
- Postgres flag write is the last step. If the job dies before that, re-running is safe.
Resource notes¶
- Heavy ClickHouse insert pressure for the duration of the run.
- Recommend isolating the Worker pod that runs the import (requests.memory: 4Gi, limits.memory: 8Gi) and using a dedicated ClickHouse user with insert-only grants for the migration window — limits the blast radius if the SQL is wrong.
- Plan the run during a low-traffic window — even with consumers paused, the API is still serving live writes that hit ClickHouse.
Code references¶
- CitylensSyncJob: axion.sense.backend/src/Axion.Sense.Worker/Jobs/CitylensSyncJob.cs
- CitylensKafkaStartupService: axion.sense.backend/src/Axion.Sense.Worker/HostedServices/
- CitylensTrackMapping entity: axion.sense.backend/src/Axion.Sense.Data/Entities/
- Step-by-step canonical narrative (28 KB): axion.sense.backend/docs/citylens-integration.md — this page is its architecture-level view.