Skip to content

Gen — Data Model

Gen has its own Postgres database (axion_gen) for metadata: data sources, datasets, dashboards, widgets, table-profile cache, agent artifacts, OAuth tokens, and a few bookkeeping tables. The actual analytical rows live in the federated systems (ClickHouse, BigQuery, Impala, Flight SQL, S3 Parquet) and are queried in-place — Postgres is metadata storage, not analytics storage.

EF Core 10 owns the public schema; Hangfire owns its own schema in the same database. Authorization tuples live in OpenFGA, not in Postgres. Big-data tables (the actual rows users query) are not stored in Postgres.

Entity groups

Entities live under src/Axion.Gen.Data/Entities and are mapped directly as the EF model — no separate DTO/data-model layer. Naming: snake_case at the database level, PascalCase in C#. JSONB columns use JsonElement.

Group Tables
Identity & roles users, roles, user_roles. Built-in roles Admin (id=1) and Analyst (id=2) are seeded; custom roles can declare a parent_role_id for permission inheritance.
Data org data_sources, datasets, dataset_groups, dataset_maps. DataSource carries connector type, encrypted credentials, and connector-specific options. Dataset belongs to one DataSource and groups a logical set of tables.
Big-data caches & integrations table_profile_cache (column lists / stats / sample values per remote table — populated by TableProfileRefreshJob), external_apis, user_external_api_tokens (third-party OAuth integrations).
Dashboards & filters dashboards, dashboard_groups, dashboard_widgets, dashboard_filters, table_filters.
Curation, search & semantics curation_overrides, dkb_article_versions, dataset_semantic_metrics.
Agents & SDUI chat_sessions, tooltip_templates, agent_artifact_versions, agent_discovery_checkpoints, agent_audit_entries.
Favorites user_favorite_dashboards, user_favorite_datasets, user_favorite_tables.
Auth bookkeeping openfga_model_versions (records the SHA-256 of every authorization model uploaded to OpenFGA).

Encryption (data_sources and user_external_api_tokens)

Both tables share a 4-column split layout — encrypted_*, nonce, tag, key_version — so the encryption metadata is explicit and queryable (e.g. "find rows still on key version 1").

  • Algorithm: AES-256-GCM (AesGcmEncryptor); 32-byte key, 12-byte fresh nonce per call, 16-byte tag, no AAD.
  • Versioning: DataSources:Encryption:Keys is a version → base64(32 bytes) map; CurrentVersion is used for new encryptions.
  • What's encrypted: only properties marked [ConnectorSetting(IsCredential = true)] are split out into the encrypted half (data sources); refresh tokens are encrypted as raw UTF-8 bytes (no JSON wrapping).
  • Rotation: insert a new key version, bump CurrentVersion, run the rotate-keys-job Helm Job to re-encrypt rows. Decrypt with a missing version throws fail-fast.

Other secrets — OIDC client secrets, Postgres credentials, S3 keys — are mounted from Kubernetes Secrets directly and never written to Postgres.

Authorization tuples (OpenFGA)

Authorization is delegated entirely to OpenFGA; Postgres only records which authorization model is active.

system:axion-gen
  └─ datasource:{name}
       └─ dataset:{guid}
            └─ table:{datasetGuid}/{tableName}

Parent tuples are written by CreateDataSourceOperation, CreateDatasetOperation, and GetTablesOperation (the last lazily on first listing) and back-filled by OpenFgaMigrationRunner.EnsureEntityParentTuples. Roles (role:{id}) carry permission flags as wildcard tuples (has_admin_datasource, has_view_dataset, has_view_table); resource access is a single assigned_role tuple per (resource, role).

Why a separate database (not a separate cluster)

Gen and Sense share the same Postgres cluster, but separate databases. We don't need cross-database joins — Gen reaches Sense's analytics via the ClickHouse connector, not via Postgres cross-DB access. Two databases give us:

  • Independent backup schedules.
  • Independent migration cadence.
  • Per-database role grants (Gen's role can't read Sense rows, period).

What's not here

Data Where it actually lives
Tracks, frames, detections ClickHouse axion_sense — Gen reads via the ClickHouse connector.
Audit ClickHouse axion_sense.audit_log — same.
Authorization tuples OpenFGA's own Postgres database.
PMTiles bundles S3 (tiles/{datasetId:N}/{tableName}/{mapId:N}.pmtiles).
Hangfire job state Same Postgres instance, separate schema.

Gen never duplicates the analytical data. When the user asks a question that needs Sense data, the federated connector reaches across to ClickHouse and streams the result.