Gen API - Components (L3)
Inside the Gen API container.
REST controllers
| Area | Endpoints (representative) |
|---|---|
| Dashboards / widgets / filters | CRUD over dashboards, dashboard_widgets, dashboard_filters, table_filters. |
| Data sources / datasets | DataSourcesController, DatasetsController, ConnectorsController — connector-typed CRUD; credentials never leave the server in plaintext. |
| Big-data SQL | BigDataController — /bigdata/query (NDJSON rows), /bigdata/query/stream (Arrow IPC), /bigdata/query/parse (AST + table refs, no execution). |
| Maps | MapsController — PMTiles generation + status, on-demand H3 hexagons / clusters, filterIds (Arrow IPC stream of IDs matching a bbox/polygon). |
| Roles & permissions | RolesController, AccessController — system admin management, role CRUD, permission flags, role membership, role-to-resource assignments. |
| Agent surface | AgentController, ChatSessionsController, TooltipTemplatesController, OntologyController, ProxyController. |
| External APIs | ExternalApisController, ExternalApiOAuthController — third-party OAuth integrations with encrypted refresh tokens. |
| Auth & favorites | AuthController, FavoritesController. |
Controllers stay thin: each operation is injected as a [FromServices] action parameter, never as a constructor dependency. DTOs are described in the OpenAPI document at axion.gen.backend/openapi/v1.json, and the web client's codegen consumes that file directly.
Auth Middleware
OIDC JWT bearer. Uses the BFF callback proxy pattern documented in axion.gen.backend/docs/Oidc.md:
- The Next.js app initiates the OIDC code flow.
- The IdP's redirect URL points at the Gen API (not the Next.js app).
- Gen API receives the code, exchanges it, sets an HttpOnly session cookie, and redirects back to the SPA.
- Subsequent requests carry the cookie; the API validates it server-side and resolves the JWT internally.
This pattern keeps access tokens out of localStorage — a hard requirement for many enterprise SSO deployments.
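The callback steps above can be sketched roughly as follows. This is a minimal illustration, not the Gen API's code: the in-memory SESSIONS store, the exchange_code stub, and the cookie name session are all assumptions made for the example.

```python
import secrets
from http.cookies import SimpleCookie
from typing import Dict, Optional, Tuple

# Hypothetical server-side session store: opaque session id -> tokens.
SESSIONS: Dict[str, Dict[str, str]] = {}


def exchange_code(code: str) -> Dict[str, str]:
    """Stand-in for the token exchange with the IdP (assumed, not real)."""
    return {"access_token": f"jwt-for-{code}"}


def handle_callback(code: str, return_url: str) -> Tuple[int, Dict[str, str]]:
    """Exchange the code, keep tokens server-side, reply with cookie + redirect."""
    tokens = exchange_code(code)
    session_id = secrets.token_urlsafe(32)
    SESSIONS[session_id] = tokens

    cookie = SimpleCookie()
    cookie["session"] = session_id
    cookie["session"]["httponly"] = True  # not readable from browser JavaScript
    cookie["session"]["secure"] = True
    cookie["session"]["samesite"] = "Lax"
    cookie["session"]["path"] = "/"
    headers = {
        "Location": return_url,
        "Set-Cookie": cookie["session"].OutputString(),
    }
    return 302, headers


def resolve_jwt(cookie_header: str) -> Optional[str]:
    """Map the session cookie on a later request back to the stored JWT."""
    morsel = SimpleCookie(cookie_header).get("session")
    if morsel is None:
        return None
    session = SESSIONS.get(morsel.value)
    return session["access_token"] if session else None
```

The browser only ever holds the opaque session id; the token itself stays in the server-side store, which is the property the pattern is chosen for.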
Operations layer
Same IOperation<TParam, TResult> pattern as Sense. Operations live under src/Axion.Gen.Core/Operations (one folder per domain: DataSources, Datasets, BigData, Dashboards, DashboardGroups, Roles, Map, Ontology, AgentDiscovery, Artifacts, Audit, ChatSessions, Curation, Dkb, ExternalApis, Favorites, Filters, Proxy, SemanticMetrics, TooltipTemplates, Users). They call into repositories, the connector factory, the encryption layer, and IPermissionService.
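To make the pattern concrete, here is a rough Python analogue of an operation plus a thin controller action. The real code is C#; Operation, GetDashboard, InMemoryDashboardRepository, and get_dashboard_action are hypothetical names invented for this sketch, and the operation argument stands in for a [FromServices] action parameter.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Protocol, Tuple, TypeVar

TParam = TypeVar("TParam", contravariant=True)
TResult = TypeVar("TResult", covariant=True)


class Operation(Protocol[TParam, TResult]):
    """Python stand-in for IOperation<TParam, TResult>."""

    def execute(self, param: TParam) -> TResult: ...


@dataclass(frozen=True)
class Dashboard:
    id: int
    title: str


class InMemoryDashboardRepository:
    """Hypothetical repository; a real one would sit on the database context."""

    def __init__(self, rows: Dict[int, Dashboard]) -> None:
        self._rows = rows

    def find(self, dashboard_id: int) -> Optional[Dashboard]:
        return self._rows.get(dashboard_id)


class GetDashboard:
    """One operation owns one unit of domain logic and its dependencies."""

    def __init__(self, repository: InMemoryDashboardRepository) -> None:
        self._repository = repository

    def execute(self, param: int) -> Optional[Dashboard]:
        return self._repository.find(param)


def get_dashboard_action(
    dashboard_id: int,
    operation: Operation[int, Optional[Dashboard]],
) -> Tuple[int, Optional[Dashboard]]:
    """Thin controller action: it only maps the result to a status code."""
    result = operation.execute(dashboard_id)
    return (200, result) if result is not None else (404, None)
```

Because the action receives the operation as an argument, the controller class itself needs no constructor dependencies, and each action can be exercised with a fake operation.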
Authorization (OpenFGA)
IPermissionService is the only entry point for authorization checks and tuple writes; controllers and operations never call OpenFgaClient directly. Object hierarchy: system:axion-gen → datasource:{name} → dataset:{guid} → table:{datasetGuid}/{tableName}. Roles carry permission flags (has_admin_datasource, has_view_dataset, has_view_table) as wildcard tuples and inherit through parent. Granting access to a resource is a single tuple: assigned_role from the resource to the role. The computed can_* relations evaluate member ∧ has_*, so the grant takes effect immediately.
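The member ∧ has_* rule can be illustrated with a toy evaluator. This is one plausible reading of the model, not the contents of model.fga and not the OpenFGA API: the dictionaries, the sample users and roles, and the assumption that a role assigned on a parent object also covers its children are all made up for the sketch.

```python
from typing import Dict, Optional, Set

# Toy tuples mirroring the hierarchy described above (all values invented).
PARENT: Dict[str, Optional[str]] = {
    "table:ds1/trips": "dataset:ds1",
    "dataset:ds1": "datasource:warehouse",
    "datasource:warehouse": "system:axion-gen",
    "system:axion-gen": None,
}
ROLE_FLAGS: Dict[str, Set[str]] = {
    "role:analyst": {"has_view_dataset", "has_view_table"},
}
ROLE_MEMBERS: Dict[str, Set[str]] = {"role:analyst": {"user:ana"}}
# assigned_role tuples: resource -> roles assigned to it.
ASSIGNED_ROLE: Dict[str, Set[str]] = {"dataset:ds1": {"role:analyst"}}


def has_permission(user: str, flag: str, resource: str) -> bool:
    """True if some role assigned on the resource or an ancestor has the
    flag and counts the user as a member (member AND has_*)."""
    node: Optional[str] = resource
    while node is not None:
        for role in ASSIGNED_ROLE.get(node, set()):
            is_member = user in ROLE_MEMBERS.get(role, set())
            has_flag = flag in ROLE_FLAGS.get(role, set())
            if is_member and has_flag:
                return True
        node = PARENT.get(node)  # assumed: grants inherit through parent
    return False
```

With this data, assigning role:analyst to dataset:ds1 is the only tuple needed for user:ana to pass a has_view_table check on table:ds1/trips.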
Before SQL execution, QueryPermissionService extracts the table set from the parsed query and submits a single BatchCheck for can_view on every referenced table. Any denial throws UnauthorizedAccessException listing the denied tables. CTE-only queries skip authorization.
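A minimal sketch of that pre-execution gate follows. The authorize_query function and the batch_check callable are hypothetical stand-ins (the real service goes through IPermissionService and raises UnauthorizedAccessException); treating a CTE-only query as an empty table set is an assumption of this example.

```python
from typing import Callable, Dict, Iterable, List

# (user, objects) -> {object: allowed}; stands in for one BatchCheck call.
BatchCheck = Callable[[str, List[str]], Dict[str, bool]]


def authorize_query(user: str, tables: Iterable[str], batch_check: BatchCheck) -> None:
    """Raise PermissionError naming every referenced table the user cannot view."""
    objects = sorted({f"table:{t}" for t in tables})
    if not objects:
        # No physical tables referenced (assumed CTE-only case): nothing to check.
        return
    results = batch_check(user, objects)  # one round trip for all tables
    denied = [obj for obj in objects if not results.get(obj, False)]
    if denied:
        raise PermissionError("can_view denied for: " + ", ".join(denied))
```

Collecting all denials before raising lets the caller report every missing grant in one error instead of failing on the first table.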
OpenFgaMigrationRunner is the startup gate. It computes a SHA-256 of model.fga, compares it against the latest openfga_model_versions row, and uploads a new model when the hash changes (Run() in Development; Resolve() only in other environments). It also seeds built-in roles and parent tuples and stores the active store/model id in OpenFgaRuntimeOptions.
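The hash comparison at the heart of that gate is small enough to sketch. The function names below are hypothetical, and the sketch assumes the stored value is a hex digest; how the runner actually encodes the hash in openfga_model_versions is not stated here.

```python
import hashlib
from typing import Optional


def model_hash(model_text: str) -> str:
    """SHA-256 of the model source, as a lowercase hex string."""
    return hashlib.sha256(model_text.encode("utf-8")).hexdigest()


def needs_upload(model_text: str, latest_recorded_hash: Optional[str]) -> bool:
    """True when no version is recorded yet or the model text has changed."""
    return model_hash(model_text) != latest_recorded_hash
```

Comparing against only the latest recorded row means reverting model.fga to an older revision is also treated as a change.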
GenDbContext
EF Core 10 + Npgsql + PostGIS. Entities live in src/Axion.Gen.Data/Entities and are mapped directly as the EF model — no separate DTO/data-model layer. JSONB columns use JsonElement (never JsonDocument).
Domain groups: identity & roles · data org (data_sources, datasets, dataset_groups, dataset_maps) · big-data caches (table_profile_cache, external_apis, user_external_api_tokens) · dashboards & filters · curation, search & semantics · agents & SDUI · favorites · auth bookkeeping (openfga_model_versions).
DataSource encryption
Credentials and OAuth refresh tokens are stored AES-256-GCM encrypted with versioned keys via AesGcmEncryptor. Two storage modes (per DataSources.md):
- Database (default): encrypted columns in Postgres (encrypted_credentials, nonce, tag, key_version); managed via the REST API.
- Configuration: read from appsettings.json / env vars; CRUD throws.
Key rotation: insert a new key version into config, bump CurrentVersion, then run the rotate-keys-job Helm Job, which re-encrypts rows under the new version.
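The re-encryption pass can be sketched as below. This is not the rotate-keys-job implementation: EncryptedRow, rotate_rows, and the injected decrypt/encrypt callables are hypothetical, and the nonce and tag columns are folded into a single ciphertext field for brevity. In the real service the cipher would be AES-256-GCM, with a fresh nonce for every encryption.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# (key, data) -> data. Injected so the sketch stays independent of any
# particular crypto library.
Cipher = Callable[[bytes, bytes], bytes]


@dataclass
class EncryptedRow:
    ciphertext: bytes  # nonce and tag omitted here for brevity
    key_version: int


def rotate_rows(
    rows: List[EncryptedRow],
    keys: Dict[int, bytes],
    current_version: int,
    decrypt: Cipher,
    encrypt: Cipher,
) -> int:
    """Re-encrypt every row not yet under current_version; return the count."""
    if current_version not in keys:
        raise KeyError(f"no key configured for version {current_version}")
    rotated = 0
    for row in rows:
        if row.key_version == current_version:
            continue  # already rotated, so the job is safe to re-run
        plaintext = decrypt(keys[row.key_version], row.ciphertext)
        row.ciphertext = encrypt(keys[current_version], plaintext)
        row.key_version = current_version
        rotated += 1
    return rotated
```

One consequence of this shape is that older key versions have to stay in configuration until every row has been rotated, since each row is decrypted with the key named by its own key_version.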
Federated data-source connectors
Connectors implement IBigDataConnector and are auto-discovered by reflection from the [Connector(...)] attribute (AddBigDataConnectors() in DI). Each handles execution (ExecuteQuery → Arrow IPC stream), catalog (GetTables, GetTableSchema, GetTableProfile), per-column expression helpers (GetGeometryExpression, GetSelectExpression), and optional H3 aggregation.
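Attribute-driven discovery has a close Python analogue in a registering decorator, sketched below. The connector decorator, the CONNECTORS registry, create_connector, and the type names "clickhouse" and "s3parquet" are assumptions for illustration; the real mechanism scans for [Connector(...)] via reflection at DI setup rather than registering at import time.

```python
from typing import Callable, Dict, List, Type

# Registry filled as classes are defined: connector type name -> class.
CONNECTORS: Dict[str, Type] = {}


def connector(type_name: str) -> Callable[[Type], Type]:
    """Class decorator playing the role of the [Connector(...)] attribute."""

    def register(cls: Type) -> Type:
        if type_name in CONNECTORS:
            raise ValueError(f"duplicate connector type: {type_name}")
        CONNECTORS[type_name] = cls
        return cls

    return register


@connector("clickhouse")
class ClickHouseConnector:
    def get_tables(self) -> List[str]:
        return ["trips"]  # placeholder catalog for the sketch


@connector("s3parquet")
class S3ParquetConnector:
    def get_tables(self) -> List[str]:
        return []


def create_connector(type_name: str):
    """Factory lookup: unknown types fail loudly instead of returning None."""
    try:
        return CONNECTORS[type_name]()
    except KeyError:
        raise ValueError(f"unknown connector type: {type_name}") from None
```

Adding a connector then touches no central list: defining a new tagged class is enough for the factory to find it.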
| Connector | Driver |
|---|---|
| ClickHouseConnector | Native HTTP via IHttpClientFactory (Arrow Stream IPC for execution, JSONEachRow for metadata). |
| BigQueryConnector | Apache Arrow ADBC BigQuery driver via AdbcConnectionFactory. |
| ImpalaConnector | Apache Arrow ADBC Impala driver. |
| FlightSqlConnector | Apache Arrow ADBC Flight SQL driver. |
| S3ParquetConnector | DuckDB (httpfs/S3 + spatial + h3 extensions); per-bucket secret created lazily. |
AdbcConnectionFactory is the single entry point for ADBC-backed connectors: adding a new ADBC driver means adding a branch there. Adding an HTTP- or DuckDB-backed connector means adding a new [Connector(...)]-tagged class. Spatial bbox/polygon predicates plug in via ISpatialFilterBuilder keyed by connector type (today only ClickHouseSpatialFilterBuilder).
Where queries actually run
QueryExecutor has two execution paths:
- Single-dataset fast path (one dataset, no alias): the connector receives the user's SQL verbatim and streams Arrow IPC straight into the response. Nothing is materialized in DuckDB.
- Federation path (multiple datasets, or a single dataset with an alias): every query gets its own DuckDB scratch schema (q_{guid}). All-S3-Parquet queries take a fast subpath using read_parquet() views; mixed-connector queries pull each non-S3 dataset as Arrow IPC into a temp file and load it via read_arrow_ipc(...). The user's SQL then runs against the scratch schema with search_path set; the schema is dropped on exit.
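The routing decision described in the two bullets can be written down as a small function. DatasetRef, choose_path, the returned labels, and the "s3parquet" connector type string are hypothetical; only the branching rules come from the description above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DatasetRef:
    name: str
    connector_type: str  # e.g. "s3parquet" or "clickhouse" (labels assumed)
    alias: Optional[str] = None


def choose_path(datasets: List[DatasetRef]) -> str:
    """Pick the execution path for the datasets a query references."""
    if not datasets:
        raise ValueError("query references no datasets")
    if len(datasets) == 1 and datasets[0].alias is None:
        # SQL goes to the connector verbatim; nothing lands in DuckDB.
        return "single-dataset"
    if all(d.connector_type == "s3parquet" for d in datasets):
        # Scratch schema holds read_parquet() views only.
        return "federation/parquet-views"
    # Non-S3 datasets are pulled as Arrow IPC and loaded into the scratch schema.
    return "federation/arrow-ipc-load"
```

Note that a single aliased dataset still takes the federation route, because the user's SQL refers to the alias rather than to a name the connector knows.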
flowchart LR
user[Analyst] --> web[Gen Web]
web -- "ask_data tool" --> bff[Next.js BFF]
bff -- "NL→SQL via LLM" --> llm[OpenAI-compatible LLM]
bff -- "execute SQL" --> api[Gen API]
api -- "parse + sanitize" --> parser[SqlParser<br/>DuckDB polyglot]
api -- "BatchCheck can_view" --> fga[(OpenFGA)]
api --> exec{QueryExecutor}
exec -- "single dataset" --> conn[Connector]
exec -- "federation" --> duckdb[DuckDB<br/>scratch schema]
duckdb --> conn
conn --> ds[(Data source)]
api -- "Arrow IPC / NDJSON" --> web
The LLM never sees the data — only the question and the schema description from describe_table. Results stream back through the API to the browser as Arrow IPC or NDJSON.