Deployment Topology

The platform deploys to Kubernetes via Helm. Charts live in axion.infra and per-service deploy/ folders. Updates are pushed via GitLab CI (helm upgrade --install). There is no GitOps controller — pipelines are the source of truth.
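A deploy job in this model might look like the following sketch. This is illustrative only — the job name, chart path, and variables are hypothetical, not copied from the actual pipelines:

```yaml
# Hypothetical GitLab CI deploy job; names and paths are illustrative.
deploy:
  stage: deploy
  image: alpine/helm:3.14.0
  script:
    # Idempotent: installs the release if absent, upgrades it otherwise.
    - helm upgrade --install "$SERVICE_NAME" ./deploy/chart
        --namespace axion
        --set image.tag="$CI_COMMIT_SHORT_SHA"
        --wait --timeout 5m
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
```

`--wait` makes the pipeline fail if the rollout doesn't become healthy, which is what makes pipelines (rather than a GitOps controller) a reliable source of truth.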

Kubernetes layout

The cluster owns every workload that the platform team operates: the three product backends (Sense, Vision, Gen), the two web apps, OpenFGA, the Citylens legacy adapter, the ClickHouse cluster (Altinity-operator-managed), and the in-cluster Observability Service (OTel Collector + SigNoz).

Everything in the External group is reached over the network — whether it lives in another VPC, a managed cloud service, or a customer-side system is a deployment-time choice. The platform talks to it via env-driven connection strings; we never put PG, Kafka, S3, or any external SaaS in the same namespace as our pods.
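A sketch of what that env-driven wiring typically looks like in a pod spec — the variable names, secret name, and endpoints below are hypothetical, not the platform's actual configuration:

```yaml
# Illustrative container env wiring; all names/values are hypothetical.
env:
  - name: ConnectionStrings__Postgres
    valueFrom:
      secretKeyRef:
        name: axion-sense-secrets
        key: postgres-connection-string
  - name: Kafka__BootstrapServers
    value: "kafka-1.internal:9092,kafka-2.internal:9092"
  - name: S3__Endpoint
    value: "https://s3.example-region.amazonaws.com"
```

Swapping a managed cloud service for a customer-side system is then purely a values change; no chart or image changes are needed.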

Helm umbrella in axion.infra

The axion.infra repo is a thin umbrella over upstream charts. One Helm release per service, deployed independently from services/<name>:

Service                          Helm dependency                          Chart version   App version
ClickHouse                       altinity/altinity-clickhouse-operator    0.3.11          ClickHouse 26.3
OpenFGA                          openfga/openfga                          0.3.2           OpenFGA 1.14.2
SigNoz (Observability Service)   signoz/signoz                            0.119.0         SigNoz 0.119.0

Postgres, Kafka, S3, the OIDC provider, and external map-matching / ML / Langfuse services are not in axion.infra. They're managed externally (cloud-managed services, customer infrastructure, or 3rd-party SaaS). The platform consumes them via env-based connection strings.
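A thin umbrella chart in this style pins each upstream dependency in its Chart.yaml. A minimal sketch for one service — the repository URL is the public upstream default and may differ from what axion.infra actually uses:

```yaml
# services/openfga/Chart.yaml — illustrative umbrella chart.
apiVersion: v2
name: openfga
version: 0.3.2
dependencies:
  - name: openfga
    version: "0.3.2"
    repository: "https://openfga.github.io/helm-charts"
```

Because each service has its own chart and release, bumping one dependency never forces a redeploy of the others.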

Pipelines and migrations

Each service repo deploys three Kubernetes resource flavors:

  1. Migration jobs — run once per release to apply schema changes:
       • dotnet Axion.Sense.Api.dll --migrate — migrates axion_sense, ClickHouse, and the OpenFGA model.
       • dotnet Axion.Sense.Worker.dll --migrate — migrates axion_sense_tasks and ensures Kafka topics.
       • dotnet Axion.Gen.Api.dll --migrate — migrates the Gen DB.
  2. Long-running deployments — API + Worker + Web pods.
  3. CronJobs / Hangfire — the Sense Worker hosts the scheduler in-process, so no external CronJobs are needed.
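One common way to run such a migration job exactly once per release is a Helm pre-upgrade hook. Whether the platform uses hooks or dedicated CI jobs is not stated here, so treat this as a sketch; the resource name, image, and registry are hypothetical:

```yaml
# Illustrative pre-upgrade migration Job; name, image, registry are hypothetical.
apiVersion: batch/v1
kind: Job
metadata:
  name: axion-sense-migrate
  annotations:
    "helm.sh/hook": pre-install,pre-upgrade
    "helm.sh/hook-delete-policy": before-hook-creation
spec:
  backoffLimit: 0
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: migrate
          image: registry.example.com/axion/sense-api:{{ .Values.image.tag }}
          command: ["dotnet", "Axion.Sense.Api.dll", "--migrate"]
```

The hook runs to completion before the new API/Worker pods roll out, so the code never starts against an unmigrated schema.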

OpenFGA model versions are tracked in the Postgres openfga_model_versions table by SHA-256 of the serialized model — re-applying the same migration is a no-op.
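The no-op behavior falls out of hashing the serialized model: identical bytes produce an identical key, so a second apply finds the hash already recorded and skips. A trivial shell sketch of the idea (the inline model text is made up, and the lookup against openfga_model_versions is elided):

```shell
# Serialize-then-hash: the same model bytes always yield the same key.
model='type user
type doc
  relations
    define viewer: [user]'

hash=$(printf '%s' "$model" | sha256sum | cut -d' ' -f1)
applied="$hash"   # pretend this was read back from openfga_model_versions

if [ "$applied" = "$hash" ]; then
  echo "no-op: model version already applied"
fi
```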

Network boundaries

  • All in-cluster service-to-service traffic stays inside the axion namespace (cluster DNS: *.axion.svc).
  • The ingress terminates TLS for all browser-facing endpoints. gRPC for mobile uses HTTP/2 — confirm the ingress controller supports it (NGINX with nginx.ingress.kubernetes.io/backend-protocol: GRPC).
  • Postgres and Kafka are reachable only via their connection strings (typically VPC-internal); the platform never opens them to the public internet.
  • S3 traffic from mobile clients uses presigned URLs — they hit the storage provider's public endpoint directly.
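For the gRPC point above, ingress-nginx selects the backend protocol per Ingress via the annotation mentioned. A minimal sketch — host, secret, and service names are hypothetical:

```yaml
# Illustrative ingress-nginx Ingress for a gRPC backend; names are hypothetical.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: sense-grpc
  namespace: axion
  annotations:
    nginx.ingress.kubernetes.io/backend-protocol: "GRPC"
spec:
  ingressClassName: nginx
  tls:
    - hosts: ["grpc.example.com"]
      secretName: grpc-tls   # gRPC through ingress-nginx requires TLS at the edge
  rules:
    - host: grpc.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: axion-sense-api
                port:
                  number: 5001
```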

Resource quotas

See Resource Quotas for the per-pod CPU/RAM table.

Rollback strategy

  • Schema-only rollback: revert the migration commit, redeploy. Migrations are written to be additive; DROP COLUMN is rare and is a separate explicit step.
  • Image rollback: helm rollback <release> <revision> for any service. Image tags are pinned (we don't use latest in production), so a rollback restores exactly the previous image.
  • Topic schema rollback: KafkaFlow short-name headers mean producers and consumers can be deployed in different orders; messages with unknown short names are deserialized to null and silently acked instead of crashing the consumer.