Deployment Topology
The platform deploys to Kubernetes via Helm. Charts live in axion.infra and per-service deploy/ folders. Updates are pushed via GitLab CI (helm upgrade --install). There is no GitOps controller — pipelines are the source of truth.
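As a rough illustration of the pipeline step, the sketch below assembles a `helm upgrade --install` invocation in Python. The release name, chart path, values file, and image tag are hypothetical placeholders, and the `--atomic` flag and the `image.tag` value are assumptions, not a description of what the actual GitLab CI jobs pass.

```python
import shlex


def helm_upgrade_command(release, chart, namespace, values_file, image_tag):
    """Build the argv for an idempotent `helm upgrade --install`.

    All argument values are supplied by the caller; nothing here is taken
    from the real pipelines.
    """
    return [
        "helm", "upgrade", "--install", release, chart,
        "--namespace", namespace,
        "--values", values_file,
        # Assumption: the chart exposes an `image.tag` value.
        "--set", f"image.tag={image_tag}",
        # Assumption: roll back automatically if the upgrade fails.
        "--atomic",
    ]


if __name__ == "__main__":
    cmd = helm_upgrade_command(
        release="sense-api",               # hypothetical release name
        chart="./deploy/chart",            # hypothetical chart path
        namespace="axion",
        values_file="deploy/values.yaml",  # hypothetical values file
        image_tag="1.2.3",                 # hypothetical pinned tag
    )
    # Print a shell-safe command line for inspection in the job log.
    print(shlex.join(cmd))
```

Because `--install` creates the release when it does not exist yet, the same command could serve both the first deploy and every later update.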
Kubernetes layout
The cluster owns every workload that the platform team operates: the three product backends (Sense, Vision, Gen), the two web apps, OpenFGA, the Citylens legacy adapter, the ClickHouse cluster (Altinity-operator-managed), and the in-cluster Observability Service (OTel Collector + SigNoz).
Everything in the External group is reached over the network — whether it lives in another VPC, a managed cloud service, or a customer-side system is a deployment-time choice. The platform talks to it via env-driven connection strings; we never put PG, Kafka, S3, or any external SaaS in the same namespace as our pods.
Helm umbrella in axion.infra
The axion.infra repo is a thin umbrella over upstream charts. One Helm release per service, deployed independently from services/<name>:
| Service | Helm dependency | Chart version | App version |
|---|---|---|---|
| ClickHouse | altinity/altinity-clickhouse-operator | 0.3.11 | ClickHouse 26.3 |
| OpenFGA | openfga/openfga | 0.3.2 | OpenFGA 1.14.2 |
| SigNoz (Observability Service) | signoz/signoz | 0.119.0 | SigNoz 0.119.0 |
Postgres, Kafka, S3, the OIDC provider, and external map-matching / ML / Langfuse services are not in axion.infra. They're managed externally (cloud-managed services, customer infrastructure, or 3rd-party SaaS). The platform consumes them via env-based connection strings.
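One way to picture env-based connection strings is a small loader that reads every external endpoint from the environment and fails fast when one is missing. This is a minimal sketch, not the services' actual configuration code; the variable names (`AXION_POSTGRES_DSN` and so on) are hypothetical.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ExternalEndpoints:
    postgres_dsn: str
    kafka_bootstrap: str
    s3_endpoint: str


# Hypothetical variable names; the real services may use different ones.
REQUIRED_VARS = {
    "postgres_dsn": "AXION_POSTGRES_DSN",
    "kafka_bootstrap": "AXION_KAFKA_BOOTSTRAP",
    "s3_endpoint": "AXION_S3_ENDPOINT",
}


def load_external_endpoints(env=None):
    """Read every external connection string from the environment.

    Raises with the full list of missing variables so a misconfigured
    pod is caught at startup instead of on first use.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS.values() if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return ExternalEndpoints(
        **{field: env[name] for field, name in REQUIRED_VARS.items()}
    )
```

Keeping the lookup in one place means the same image can point at a cloud-managed service or a customer-side system by changing only the environment.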
Pipelines and migrations
Each service repo runs three Kubernetes resource flavors:
- Migration job: runs once per release to apply schema changes.
  - `dotnet Axion.Sense.Api.dll --migrate`: migrates `axion_sense`, ClickHouse, and the OpenFGA model.
  - `dotnet Axion.Sense.Worker.dll --migrate`: migrates `axion_sense_tasks` and ensures Kafka topics.
  - `dotnet Axion.Gen.Api.dll --migrate`: migrates the Gen DB.
- Long-running deployments: API + Worker + Web pods.
- CronJobs / Hangfire: Sense Worker hosts the scheduler in-process (no external CronJobs needed).
OpenFGA model versions are tracked in the Postgres openfga_model_versions table by SHA-256 of the serialized model — re-applying the same migration is a no-op.
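The hash-based no-op can be sketched as follows. This is an illustration under stated assumptions, not the migrator's real code: the model is serialized as canonical JSON (the actual serialization format is not specified here), an in-memory set stands in for the openfga_model_versions table, and `write_model` is a hypothetical callback for the upload to OpenFGA.

```python
import hashlib
import json


def model_fingerprint(model):
    """SHA-256 of a canonical serialization (sorted keys, no extra whitespace)."""
    serialized = json.dumps(model, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(serialized.encode("utf-8")).hexdigest()


def apply_model(model, applied_hashes, write_model):
    """Write the model only if its fingerprint has not been recorded yet.

    `applied_hashes` stands in for the openfga_model_versions table and
    `write_model` for the call that uploads the model to OpenFGA.
    Returns True when the model was written, False when it was skipped.
    """
    digest = model_fingerprint(model)
    if digest in applied_hashes:
        return False
    write_model(model)
    applied_hashes.add(digest)
    return True


if __name__ == "__main__":
    applied = set()
    writes = []
    model = {"schema_version": "1.1", "type_definitions": [{"type": "user"}]}
    apply_model(model, applied, writes.append)  # first run writes the model
    apply_model(model, applied, writes.append)  # second run is skipped
    print(len(writes))
```

Hashing a canonical form, instead of the raw file bytes, keeps the check stable when only key order or whitespace changes.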
Network boundaries
- All in-cluster service-to-service traffic stays inside the `axion` namespace (cluster DNS: `*.axion.svc`).
- The ingress terminates TLS for all browser-facing endpoints. gRPC for mobile uses HTTP/2; confirm the ingress controller supports it (NGINX with `nginx.ingress.kubernetes.io/backend-protocol: GRPC`).
- Postgres and Kafka are reachable only via their connection strings (typically VPC-internal); the platform never opens them to the public internet.
- S3 traffic from mobile clients uses presigned URLs — they hit the storage provider's public endpoint directly.
Resource quotas
See Resource Quotas for the per-pod CPU/RAM table.
Rollback strategy
- Schema-only rollback: revert the migration commit and redeploy. Migrations are written to be additive; `DROP COLUMN` is rare and is a separate explicit step.
- Image rollback: `helm rollback <release> <revision>` for any service. Image tags are pinned (we don't use `latest` in production).
- Topic schema rollback: KafkaFlow short-name headers mean producers and consumers can be deployed in different orders; messages with unknown short names are deserialized to `null` and silently acked instead of crashing the consumer.
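The tolerant-consumer behavior for unknown short names can be pictured with the Python sketch below. It mirrors the idea only and is not KafkaFlow's API; the message type, the short-name registry, and the `consume` loop are all hypothetical.

```python
import json
from dataclasses import dataclass


@dataclass
class TripRecorded:  # hypothetical message type
    trip_id: str


# Hypothetical registry mapping short-name header values to message types.
KNOWN_TYPES = {"trip-recorded": TripRecorded}


def deserialize(short_name, payload):
    """Return a typed message, or None when the short name is not registered."""
    message_type = KNOWN_TYPES.get(short_name)
    if message_type is None:
        return None
    return message_type(**json.loads(payload))


def consume(records, handle):
    """Process (short_name, payload) pairs and return how many were acked.

    Unknown messages are skipped but still counted as acked, so the consumer
    keeps advancing instead of crashing or stalling on them.
    """
    acked = 0
    for short_name, payload in records:
        message = deserialize(short_name, payload)
        if message is not None:
            handle(message)
        acked += 1
    return acked
```

With this shape, a consumer that is deployed before it knows a new message type simply moves past those messages, which is what lets producers and consumers roll out or roll back in either order.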