Gen: Overview
Gen is the analytics product. Where Sense collects data, Gen uses it.
Despite the suggestive name, "Gen" is not for generation — it's the dashboards-and-LLM-agent surface for self-service analytics over Sense data, plus federated external sources (BigQuery, Flight SQL, Parquet on S3).
What ships in Gen
| Repo | What it is |
|---|---|
| axion.gen.backend | .NET 10 services. Two binaries from one solution: Gen API (REST, OIDC, federation engine) and Gen Worker (Hangfire jobs, Tippecanoe). |
| axion.gen.web | Next.js 15 app. Consumer dashboards + AI agent panel. |
Subsystems
flowchart LR
user[Analyst] --> web[Gen Web]
web --> bff["Next.js BFF<br/>/api/agent/stream"]
bff --> llm["LLM Provider<br/>OpenAI-compatible"]
bff --> tools[Tool calls]
tools --> api[Gen API]
api --> duckdb["DuckDB.NET<br/>in-process<br/>federation engine"]
api --> db[("Postgres<br/>metadata + Hangfire")]
api --> fga[(OpenFGA)]
duckdb --> ch[(ClickHouse)]
duckdb --> bq[(BigQuery)]
duckdb --> impala[(Impala)]
duckdb --> fsql[(Flight SQL)]
duckdb --> s3[(S3 Parquet)]
api --> worker["Gen Worker<br/>Hangfire + Tippecanoe"]
worker --> s3
web --> map["2GIS MapGL<br/>tiles"]
bff --> langfuse["Langfuse<br/>LLM traces"]
What's distinctive about Gen
The AI agent
The Gen Web app embeds a chat panel backed by a server-side LLM gateway at /api/agent/stream (using @ai-sdk/openai). The agent has a deterministic toolbox executed client-side via src/shared/lib/agent/tool-executor.ts:
| Tool | What it does |
|---|---|
| ask_data | Server-side natural-language → SQL via the configured LLM. |
| query | Run an arbitrary SQL query against a registered data source. |
| chart | Render a chart (ApexCharts / ECharts / Recharts / Vega) from query results. |
| create_dashboard | Persist a new dashboard with widgets. |
| update_canvas | Modify the current dashboard canvas (positions, sizes, configs). |
| map_visualization | Render a Deck.gl map layer. |
| list_dashboards | Enumerate the user's dashboards. |
| describe_table | Inspect schema for a data source. |
Tool inputs and outputs are validated by Zod schemas in src/shared/lib/agent/tool-schemas.ts.
Modes (env-controlled):
- LLM_PROVIDER: openai_compatible (production) | mock (e2e tests).
- AGENT_MODE: ask_only | full — ask_only lets the agent answer questions about data without modifying the dashboard.
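As a rough illustration of how the two environment switches could gate the toolbox, here is a minimal sketch. Only `LLM_PROVIDER`, `AGENT_MODE`, their values, and the tool names come from this page; the function names, the decision to reject missing values, and the choice of `create_dashboard` and `update_canvas` as the dashboard-modifying tools are assumptions made for the example, not a description of the actual code.

```typescript
type LlmProvider = "openai_compatible" | "mock";
type AgentMode = "ask_only" | "full";

interface AgentConfig {
  provider: LlmProvider;
  mode: AgentMode;
}

// Tool names as listed in the table above.
const ALL_TOOLS: string[] = [
  "ask_data", "query", "chart", "create_dashboard",
  "update_canvas", "map_visualization", "list_dashboards", "describe_table",
];

// Assumption: these are the tools that modify dashboard state.
const MUTATING_TOOLS = new Set<string>(["create_dashboard", "update_canvas"]);

function isProvider(v: string | undefined): v is LlmProvider {
  return v === "openai_compatible" || v === "mock";
}

function isMode(v: string | undefined): v is AgentMode {
  return v === "ask_only" || v === "full";
}

// Hypothetical helper: parse and validate the two env variables.
function parseAgentConfig(env: Record<string, string | undefined>): AgentConfig {
  const provider = env.LLM_PROVIDER;
  const mode = env.AGENT_MODE;
  if (!isProvider(provider)) {
    throw new Error(`Unsupported LLM_PROVIDER: ${String(provider)}`);
  }
  if (!isMode(mode)) {
    throw new Error(`Unsupported AGENT_MODE: ${String(mode)}`);
  }
  return { provider, mode };
}

// Hypothetical helper: in ask_only mode, drop the tools assumed to mutate dashboards.
function allowedTools(mode: AgentMode): string[] {
  return mode === "full"
    ? ALL_TOOLS.slice()
    : ALL_TOOLS.filter((tool) => !MUTATING_TOOLS.has(tool));
}
```

In a Next.js route this would typically be called once with `process.env`, and the resulting tool list handed to whatever registers tools with the LLM gateway.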
DuckDB-backed cross-source SQL (in Gen API)
Gen API hosts a single in-process DuckDB instance (DuckDbConnectionProvider, with json / httpfs / spatial / polyglot / nanoarrow / h3 extensions pre-loaded). For a user query that touches one connector with no alias, Gen takes a fast path and pushes the SQL straight to the connector. For multi-source queries it materializes each referenced dataset into a per-query DuckDB scratch schema and runs the join inside DuckDB. Results stream back as Arrow IPC.
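The routing rule above (push down when possible, federate otherwise) can be sketched as a small planning function. The real logic lives in the .NET backend; the TypeScript below is only an illustration, and the `TableRef` and `QueryPlan` shapes, the `planQuery` name, and the way an alias is represented are all hypothetical.

```typescript
// One table referenced by a user query. `alias` models the "alias" condition
// mentioned above; its exact meaning in Gen is an assumption here.
interface TableRef {
  connector: string;
  table: string;
  alias?: string;
}

type QueryPlan =
  | { kind: "pushdown"; connector: string }          // fast path: SQL goes straight to the connector
  | { kind: "federated"; materialize: TableRef[] };  // datasets are copied into a DuckDB scratch schema

function planQuery(refs: TableRef[]): QueryPlan {
  if (refs.length === 0) {
    throw new Error("query references no tables");
  }
  const connectors = new Set(refs.map((r) => r.connector));
  const usesAlias = refs.some((r) => r.alias !== undefined);

  // Exactly one connector and no alias: nothing for DuckDB to join.
  if (connectors.size === 1 && !usesAlias) {
    return { kind: "pushdown", connector: refs[0].connector };
  }
  // Otherwise every referenced dataset is materialized and joined in DuckDB.
  return { kind: "federated", materialize: refs };
}
```

Keeping the decision in a pure function like this makes the fast-path condition easy to unit test independently of any connector.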
DuckDB does not live in the browser; charting libraries (Vega/Recharts/ECharts) consume the API response directly.
Federated data sources
Gen API connects to external systems through the IBigDataConnector family:
- ClickHouse: native HTTP client (Arrow Stream IPC for execution)
- BigQuery: Apache Arrow ADBC driver
- Impala: Apache Arrow ADBC driver
- Flight SQL: Apache Arrow ADBC driver
- S3 Parquet: DuckDB httpfs + spatial + h3
Connectors are auto-discovered by reflection from the [Connector(...)] attribute. Credentials marked [ConnectorSetting(IsCredential = true)] are split off and stored encrypted at rest under AES-256-GCM with versioned keys. See axion.gen.backend/docs/Connectors.md and Encryption.md for the contract and threat model.
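To make "AES-256-GCM with versioned keys" concrete, the sketch below shows one common way to structure it: each ciphertext records the key version it was encrypted with, so older secrets stay readable after a key rotation. This is not the backend's implementation (which is .NET and documented in Encryption.md); it uses Node's built-in `node:crypto` module, and the `KeyRing` and `Envelope` shapes and function names are hypothetical.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Hypothetical key ring: key version -> 32-byte AES-256 key.
type KeyRing = Map<number, Buffer>;

// Hypothetical storage envelope; binary fields are base64-encoded.
interface Envelope {
  v: number;    // key version used for encryption
  iv: string;   // 96-bit nonce
  tag: string;  // GCM authentication tag
  data: string; // ciphertext
}

function encryptSecret(plaintext: string, keys: KeyRing, version: number): Envelope {
  const key = keys.get(version);
  if (!key) throw new Error(`unknown key version ${version}`);
  const iv = randomBytes(12); // fresh nonce per encryption
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const data = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return {
    v: version,
    iv: iv.toString("base64"),
    tag: cipher.getAuthTag().toString("base64"),
    data: data.toString("base64"),
  };
}

function decryptSecret(envelope: Envelope, keys: KeyRing): string {
  const key = keys.get(envelope.v);
  if (!key) throw new Error(`unknown key version ${envelope.v}`);
  const decipher = createDecipheriv("aes-256-gcm", key, Buffer.from(envelope.iv, "base64"));
  decipher.setAuthTag(Buffer.from(envelope.tag, "base64"));
  // final() throws if the tag does not match, i.e. the data was tampered with.
  const plain = Buffer.concat([
    decipher.update(Buffer.from(envelope.data, "base64")),
    decipher.final(),
  ]);
  return plain.toString("utf8");
}
```

Storing the version next to the ciphertext means rotation only requires adding a new key to the ring and encrypting new secrets with it; existing rows can be re-encrypted lazily.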
OpenFGA-backed authorization
Gen authorization runs through OpenFGA, sharing the cluster with Sense but using its own model. The object hierarchy is system:axion-gen → datasource:{name} → dataset:{guid} → table:{datasetGuid}/{tableName}, and roles carry permission flags (has_admin_datasource, has_view_dataset, has_view_table) as wildcard tuples. Before a query executes, QueryPermissionService issues a single BatchCheck for can_view on every referenced table — if any fails, the whole query is rejected before any connector is touched.
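The all-or-nothing check can be sketched as follows. The object id format and the `can_view` relation come from the description above; everything else (the `TableTarget` shape, the `BatchCheck` function type standing in for the OpenFGA call, and the helper names) is hypothetical, and the actual `QueryPermissionService` is part of the .NET backend.

```typescript
interface TableTarget {
  datasetGuid: string;
  tableName: string;
}

// Hypothetical stand-in for an OpenFGA batch check: given a user, a relation and
// a list of object ids, resolve to an "allowed" flag per object id.
type BatchCheck = (
  user: string,
  relation: string,
  objects: string[],
) => Promise<Map<string, boolean>>;

// Object id format from the hierarchy above: table:{datasetGuid}/{tableName}
function tableObjectId(t: TableTarget): string {
  return `table:${t.datasetGuid}/${t.tableName}`;
}

// Pure helper: which referenced tables are not explicitly allowed?
// A missing result counts as denied (fail closed).
function deniedObjects(tables: TableTarget[], results: Map<string, boolean>): string[] {
  const objects = Array.from(new Set(tables.map(tableObjectId)));
  return objects.filter((id) => results.get(id) !== true);
}

// One batch call for all tables; reject the whole query if any table is denied.
async function assertCanViewAll(
  user: string,
  tables: TableTarget[],
  batchCheck: BatchCheck,
): Promise<void> {
  const objects = Array.from(new Set(tables.map(tableObjectId)));
  const results = await batchCheck(user, "can_view", objects);
  const denied = deniedObjects(tables, results);
  if (denied.length > 0) {
    throw new Error(`access denied: ${denied.join(", ")}`);
  }
}
```

Calling `assertCanViewAll` before any connector work mirrors the ordering described above: a single denied table aborts the query before data is touched.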
The startup OpenFgaMigrationRunner SHA-hashes the embedded model.fga, uploads new versions in development, and only resolves an existing hash in production.
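The environment-dependent behaviour of the migration step can be illustrated with a short sketch. It assumes SHA-256 (the text above only says "SHA"), a synchronous in-memory `ModelStore` interface, and a `resolveModelId` function name, all of which are hypothetical; the real runner is .NET code that talks to OpenFGA.

```typescript
import { createHash } from "node:crypto";

type Environment = "development" | "production";

// Hypothetical lookup of already-uploaded models, keyed by content hash.
interface ModelStore {
  find(hash: string): string | undefined;       // returns the model id, if known
  upload(hash: string, model: string): string;  // uploads and returns a new model id
}

function resolveModelId(model: string, env: Environment, store: ModelStore): string {
  // Assumption: SHA-256 over the model text.
  const hash = createHash("sha256").update(model, "utf8").digest("hex");

  const existing = store.find(hash);
  if (existing !== undefined) {
    return existing; // same content was uploaded before, reuse it
  }
  if (env === "production") {
    // Production only resolves an existing hash; it never uploads a model.
    throw new Error(`authorization model ${hash} has not been deployed`);
  }
  return store.upload(hash, model);
}
```

Failing fast in production when the hash is unknown keeps model changes an explicit deployment step rather than a side effect of service startup.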
Auth
Gen uses the same OIDC IdP as Sense, but with a BFF callback proxy pattern (per axion.gen.backend/docs/Oidc.md): the Next.js app issues its own session cookie, and the backend validates the JWT. This keeps tokens out of the browser's local storage.
Background jobs
The Gen Worker process runs Hangfire on Postgres (the API is registered as a Hangfire client only). Three jobs today:
- TableProfileRefreshJob: refreshes table_profile_cache (column metadata, stats, sample values) via the connector.
- MapTilesGenerationJob: pages the source table, runs Tippecanoe, uploads PMTiles to S3.
- OntologyTilesGenerationJob: produces ontology PMTiles for a dataset.
Tippecanoe lives only in the Worker image. The OIDC-gated /hangfire dashboard runs on the Worker.
Where to go next
- Gen Containers (L2)
- Gen API Components (L3)
- Gen Data Model