OpenFGA

Fine-grained, relationship-based authorization. Used by both Sense and Gen.

At a glance

Property | Value
Version | OpenFGA 1.14.2
Helm chart | openfga/openfga 0.3.2 (in axion.infra/services/openfga)
Topology | StatefulSet, in-cluster
Service DNS | axion-openfga.axion.svc:8080 (HTTP), :8081 (gRPC)
Storage | Postgres-backed (axion_sense_permissions database in the platform Postgres cluster)
C# SDK | OpenFGA client, used by IPermissionService

Why OpenFGA

Plain RBAC chokes on multi-tenant, relationship-shaped questions:

  • Can user X view track Y? Not just "is X a viewer", but "is X a member of the organization that owns Y, and has that membership not expired?"
  • Can user X manage user Z? — only if they share an org AND X has manage_users in that org.
  • List all tracks visible to user X — needs a graph query, not a flat permission check.

OpenFGA models these as relationship tuples and provides a query API for both Check (single) and BatchCheck (many at once). See ADR-0005 for the full rationale.

Authorization model

The model is authored in the human-readable OpenFGA DSL at axion.sense.backend/src/Axion.Sense.Data/OpenFga/model.fga. A simplified version:

model
  schema 1.1

type user

type organization
  relations
    define member: [user]
    define admin: [user]
    define can_view_tracks: member or admin
    define can_manage_users: admin
    define can_manage_tasks: admin

type track
  relations
    define org: [organization]
    define can_view: can_view_tracks from org

(Simplified: the real model has more types, including role, project, territory, and dashboard.)
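To make the resolution of can_view concrete, here is a minimal C# sketch that evaluates the simplified model above against an in-memory set of tuples. It does not use the OpenFGA SDK and is not how OpenFGA evaluates checks internally; the RelTuple record, the MiniFga class, and the sample IDs (user:anne, organization:acme, track:t1) are all hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sample tuples (hypothetical IDs): two org members and one track owned by the org.
var store = new MiniFga(new[]
{
    new RelTuple("user:anne", "member", "organization:acme"),
    new RelTuple("user:bob", "admin", "organization:acme"),
    new RelTuple("organization:acme", "org", "track:t1"),
});

Console.WriteLine(store.CanView("user:anne", "track:t1"));   // True  (member of the owning org)
Console.WriteLine(store.CanView("user:bob", "track:t1"));    // True  (admin of the owning org)
Console.WriteLine(store.CanView("user:carol", "track:t1"));  // False (no tuple links carol to acme)

// One relationship tuple: <User> has <Relation> on <Object>.
public record RelTuple(string User, string Relation, string Object);

// Hand-written evaluator for the simplified model only.
public class MiniFga
{
    private readonly HashSet<RelTuple> _tuples;

    public MiniFga(IEnumerable<RelTuple> tuples) => _tuples = new HashSet<RelTuple>(tuples);

    private bool Has(string user, string relation, string obj) =>
        _tuples.Contains(new RelTuple(user, relation, obj));

    // define can_view_tracks: member or admin
    public bool CanViewTracks(string user, string org) =>
        Has(user, "member", org) || Has(user, "admin", org);

    // define can_view: can_view_tracks from org
    // Follow the track's "org" tuples, then evaluate can_view_tracks on each org.
    public bool CanView(string user, string track) =>
        _tuples.Where(t => t.Relation == "org" && t.Object == track)
               .Any(t => CanViewTracks(user, t.User));
}
```

The point of the sketch is that no tuple links a user directly to a track: access to track:t1 follows from the track's org tuple plus the user's membership in that organization.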

Permissions

System-scoped

  • CanManageOrganizations
  • CanManageRoles

Org-scoped

  • CanViewOrgTracks
  • CanManageUsers
  • CanManageTasks
  • CanManageProjects

(Plus Gen-specific: CanManageDashboards, CanManageDataSources.)

Model versioning

OpenFGA stores authorization models as immutable versions; you write a new model and switch over. We pin which model is "current" via:

  • Postgres table openfga_model_versions (in axion_sense), which records (model_sha256, openfga_model_id, applied_at).
  • The Sense API migration runner reads model.fgamodel.json (an embedded resource), computes its SHA-256, and checks the table:
      • If the hash matches the latest row, the runner does nothing.
      • If the hash is new, the runner writes a new authorization model to OpenFGA and records the resulting model ID.
  • At runtime, IPermissionService always uses the latest model ID from the table.

This keeps the authorization model in lockstep with the application code: a deployment that ships a changed model also applies it and pins it.
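The hash-and-compare step can be sketched as follows. This is an illustration under stated assumptions, not the actual migration runner: the ModelVersionRow record mirrors the columns named above, an in-memory list stands in for the openfga_model_versions table, and the writeModel delegate is a hypothetical stand-in for the call that writes an authorization model to OpenFGA and returns its ID.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

var table = new List<ModelVersionRow>();   // stand-in for openfga_model_versions
var writes = 0;
// Hypothetical stand-in for writing a model to OpenFGA; returns a fake model ID.
Func<string, string> writeModel = json => $"model-{++writes}";

var v1 = "{\"schema_version\":\"1.1\",\"type_definitions\":[]}";
var id1 = ModelMigration.Apply(v1, table, writeModel);        // first run: writes the model
var id2 = ModelMigration.Apply(v1, table, writeModel);        // same hash: no-op
var id3 = ModelMigration.Apply(v1 + " ", table, writeModel);  // changed bytes: new model

Console.WriteLine($"{id1} {id2} {id3} writes={writes}");      // model-1 model-1 model-2 writes=2

// Mirrors (model_sha256, openfga_model_id, applied_at).
public record ModelVersionRow(string ModelSha256, string OpenFgaModelId, DateTimeOffset AppliedAt);

public static class ModelMigration
{
    public static string Sha256Hex(string text) =>
        Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(text))).ToLowerInvariant();

    // Returns the model ID that should be treated as current after the migration.
    public static string Apply(
        string modelJson, List<ModelVersionRow> table, Func<string, string> writeModel)
    {
        var hash = Sha256Hex(modelJson);
        var latest = table.LastOrDefault();   // rows are appended in order of application

        if (latest is not null && latest.ModelSha256 == hash)
            return latest.OpenFgaModelId;     // hash matches the latest row: nothing to do

        var modelId = writeModel(modelJson);  // hash is new: write the model, record its ID
        table.Add(new ModelVersionRow(hash, modelId, DateTimeOffset.UtcNow));
        return modelId;
    }
}
```

Note that the hash in this sketch is computed over the raw bytes, so even a whitespace-only change produces a new model version. Whether the real runner normalizes the JSON first is not stated here.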

BatchCheck

Single-permission Check is too slow for list endpoints — N rows × 1 round-trip each is a non-starter. BatchCheck collapses many checks into one OpenFGA call:

// Before: N round-trips
foreach (var track in tracks)
    if (await fga.Check(user, "can_view", track)) ...

// After: 1 round-trip
var checks = tracks.Select(t => new Check(user, "can_view", t));
var results = await fga.BatchCheck(checks);

BatchCheck is used throughout the Sense API list endpoints.
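A slightly fuller sketch of the list-filtering pattern, runnable without an OpenFGA server. The IBatchChecker interface, CheckItem record, and FakeFga class are hypothetical and are not the OpenFGA SDK or IPermissionService. The sketch also assumes that the server limits how many checks one BatchCheck request may carry, so large lists are split into chunks; the limit of 50 used here is an assumption to verify against the deployed OpenFGA configuration. Results are matched to inputs by position for brevity, whereas a real client may correlate them differently (for example by a correlation ID).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

var fga = new FakeFga(new HashSet<string> { "track:1", "track:3" });
var tracks = Enumerable.Range(1, 120).Select(i => $"track:{i}").ToList();

var visible = await TrackFilter.FilterVisibleAsync(fga, "user:anne", tracks, maxPerBatch: 50);

Console.WriteLine(string.Join(",", visible));   // track:1,track:3
Console.WriteLine(fga.RoundTrips);              // 3 (120 checks in chunks of 50)

public record CheckItem(string User, string Relation, string Object);

// Hypothetical abstraction over a batch permission check.
public interface IBatchChecker
{
    Task<IReadOnlyList<bool>> BatchCheckAsync(IReadOnlyList<CheckItem> checks);
}

public static class TrackFilter
{
    // Keeps only the tracks the user can view, using one call per chunk.
    public static async Task<List<string>> FilterVisibleAsync(
        IBatchChecker fga, string user, IReadOnlyList<string> tracks, int maxPerBatch)
    {
        var visible = new List<string>();
        foreach (var chunk in tracks.Chunk(maxPerBatch))
        {
            var checks = chunk.Select(t => new CheckItem(user, "can_view", t)).ToList();
            var results = await fga.BatchCheckAsync(checks);
            for (var i = 0; i < chunk.Length; i++)
                if (results[i]) visible.Add(chunk[i]);
        }
        return visible;
    }
}

// In-memory fake that counts calls instead of talking to OpenFGA.
public class FakeFga : IBatchChecker
{
    private readonly HashSet<string> _allowed;
    public int RoundTrips { get; private set; }

    public FakeFga(HashSet<string> allowed) => _allowed = allowed;

    public Task<IReadOnlyList<bool>> BatchCheckAsync(IReadOnlyList<CheckItem> checks)
    {
        RoundTrips++;
        IReadOnlyList<bool> results = checks.Select(c => _allowed.Contains(c.Object)).ToList();
        return Task.FromResult(results);
    }
}
```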

Tuple lifecycle

  • Add tuples when a user joins an org, when a track is created, when a role is granted.
  • Remove tuples in AccessExpirationJob (Worker, every 12h), which sweeps expired memberships and removes the corresponding tuples.
  • Query relationships via Check / BatchCheck only; we never list tuples in the request path.
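The selection step of the expiration sweep might look like the sketch below. It covers only the decision about which tuples to delete and assumes memberships carry an optional expiry timestamp; the Membership and TupleKey records and the ExpirationSweep class are hypothetical, not the real AccessExpirationJob. The actual delete call against OpenFGA is left out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Fixed "now" so the example is deterministic.
var now = new DateTimeOffset(2025, 1, 10, 0, 0, 0, TimeSpan.Zero);
var memberships = new[]
{
    new Membership("user:anne", "organization:acme", "member", now.AddDays(-1)),  // expired
    new Membership("user:bob", "organization:acme", "admin", now.AddDays(30)),    // still valid
    new Membership("user:carol", "organization:acme", "member", null),            // never expires
};

var deletes = ExpirationSweep.TuplesToDelete(memberships, now);
foreach (var t in deletes)
    Console.WriteLine($"{t.User} {t.Relation} {t.Object}");   // user:anne member organization:acme

public record Membership(string User, string Org, string Relation, DateTimeOffset? ExpiresAt);
public record TupleKey(string User, string Relation, string Object);

public static class ExpirationSweep
{
    // Maps every membership that has expired as of 'now' to the tuple that granted it.
    public static List<TupleKey> TuplesToDelete(
        IEnumerable<Membership> memberships, DateTimeOffset now) =>
        memberships
            .Where(m => m.ExpiresAt is not null && m.ExpiresAt <= now)
            .Select(m => new TupleKey(m.User, m.Relation, m.Org))
            .ToList();
}
```

Because the job runs every 12h, a membership can stay effective in OpenFGA for up to that long after its expiry time unless the expiry is also enforced elsewhere.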

Where it's called

Caller | Purpose
Sense API (every request) | Authorize the action
Gen API (every request) | Authorize the action
Sense Worker (AccessExpirationJob) | Remove tuples for expired memberships
Sense Worker (other admin operations) | Mirror state changes (rare)

Configuration

In Sense API and Gen API:

"OpenFga": {
  "ApiUrl": "http://axion-openfga.axion.svc:8080",
  "StoreId": "...",            // single store per environment
  "ModelId": "...",             // looked up at startup
  "TimeoutMs": 1000
}
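One way to read and validate this section is sketched below. The OpenFgaOptions class and its Parse method are hypothetical; the real services may bind the section through the standard .NET options mechanism instead. The StoreId value is a placeholder, and ModelId is left unset here because it is looked up at startup.

```csharp
#nullable enable
using System;
using System.Text.Json;

// Same shape as the section above; the "//" comment is valid in appsettings-style JSON.
var json = @"{
  ""OpenFga"": {
    ""ApiUrl"": ""http://axion-openfga.axion.svc:8080"",
    ""StoreId"": ""example-store-id"", // hypothetical placeholder
    ""TimeoutMs"": 1000
  }
}";

var options = OpenFgaOptions.Parse(json);
Console.WriteLine(options.ApiUrl);                     // http://axion-openfga.axion.svc:8080
Console.WriteLine(options.Timeout.TotalMilliseconds);  // 1000

public sealed class OpenFgaOptions
{
    public string ApiUrl { get; set; } = "";
    public string StoreId { get; set; } = "";
    public string? ModelId { get; set; }           // resolved at startup, not from config
    public int TimeoutMs { get; set; } = 1000;

    public TimeSpan Timeout => TimeSpan.FromMilliseconds(TimeoutMs);

    // Parses the "OpenFga" section and fails fast on obviously invalid values.
    public static OpenFgaOptions Parse(string json)
    {
        var docOptions = new JsonDocumentOptions { CommentHandling = JsonCommentHandling.Skip };
        using var doc = JsonDocument.Parse(json, docOptions);
        var section = doc.RootElement.GetProperty("OpenFga");

        var options = section.Deserialize<OpenFgaOptions>()
            ?? throw new InvalidOperationException("OpenFga section is empty.");

        if (!Uri.TryCreate(options.ApiUrl, UriKind.Absolute, out _))
            throw new InvalidOperationException("OpenFga:ApiUrl must be an absolute URL.");
        if (string.IsNullOrWhiteSpace(options.StoreId))
            throw new InvalidOperationException("OpenFga:StoreId is required.");
        if (options.TimeoutMs <= 0)
            throw new InvalidOperationException("OpenFga:TimeoutMs must be positive.");

        return options;
    }
}
```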

Observability

  • OpenFGA emits its own metrics (Prometheus); scraped by the OTel collector → SigNoz.
  • Per-request authorization latency is exposed as a span attribute (fga.duration_ms).
  • Alert when p99 authorization latency exceeds 50 ms; a breach usually points to a missing index in the OpenFGA database or to model bloat.
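As an illustration of how a duration could end up on a span as fga.duration_ms, the sketch below wraps a check in a System.Diagnostics activity and tags it with the elapsed time. The source name Axion.Authz.Sketch, the span name fga.check, and the AuthzTracing helper are hypothetical; the real services may instead attach the attribute to the existing request span. The ActivityListener is included only so the example records activities when run standalone.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

// Without a listener, StartActivity returns null and nothing is recorded.
using var listener = new ActivityListener
{
    ShouldListenTo = source => source.Name == "Axion.Authz.Sketch",
    Sample = (ref ActivityCreationOptions<ActivityContext> _) => ActivitySamplingResult.AllData,
    ActivityStopped = a =>
        Console.WriteLine($"{a.OperationName} fga.duration_ms={a.GetTagItem("fga.duration_ms")}"),
};
ActivitySource.AddActivityListener(listener);

// Simulated authorization call that takes a few milliseconds.
var allowed = await AuthzTracing.TimedCheckAsync(async () =>
{
    await Task.Delay(5);
    return true;
});
Console.WriteLine(allowed);   // True

public static class AuthzTracing
{
    private static readonly ActivitySource Source = new("Axion.Authz.Sketch");

    // Runs the check inside a span and records its wall-clock duration as a tag.
    public static async Task<bool> TimedCheckAsync(Func<Task<bool>> check)
    {
        using var activity = Source.StartActivity("fga.check");
        var stopwatch = Stopwatch.StartNew();
        try
        {
            return await check();
        }
        finally
        {
            // Set in 'finally' so failed checks are timed as well.
            activity?.SetTag("fga.duration_ms", stopwatch.Elapsed.TotalMilliseconds);
        }
    }
}
```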

Operations

  • Model rollout: edit the model in DSL and ship it with a code change; on deploy, the migration writes the new model to OpenFGA and pins it in Postgres.
  • Cross-environment differences: don't. Same model in dev, staging, prod. Diverging models break audit reproducibility.
  • Tuple cleanup: routine via AccessExpirationJob. Manual cleanup via OpenFGA admin API for one-off cases.