How the runtime is built

Architecture

Three subsystems carry the load: the runtime executes, the context graph remembers, the trust kernel authorizes. Everything else — files, webhooks, evals — hangs off those three.

The runtime

A run is the unit of execution. The planner (y0-deep for complex work) compiles the prompt into an execution graph — typed steps with declared inputs — and the executor walks it under an enforced max_steps ceiling. Every step emits into the trace as it happens; the trace is written ahead of the result, so even a failed run explains itself.
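The executor loop described above can be sketched in a few lines. This is an illustration only, not the platform's implementation: `Step`, `Run`, `execute`, and the trace event names are hypothetical, and the execution graph is simplified to a list of steps already in dependency order. What it shows is the ordering guarantee: the trace entry is appended before the step runs, and the ceiling is checked before any work starts.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class Step:
    name: str
    inputs: list[str]                    # names of earlier steps this step reads
    fn: Callable[[dict[str, Any]], Any]  # receives only its declared inputs


@dataclass
class Run:
    steps: list[Step]                    # assumed to be in dependency order
    max_steps: int
    trace: list[dict[str, Any]] = field(default_factory=list)
    result: Any = None
    status: str = "pending"


def execute(run: Run) -> Run:
    outputs: dict[str, Any] = {}
    for index, step in enumerate(run.steps):
        # Ceiling is enforced before the step starts, and the refusal is traced.
        if index >= run.max_steps:
            run.trace.append({"step": step.name, "event": "ceiling_hit"})
            run.status = "failed"
            return run
        # Trace first, work second: a crash still leaves a record of the attempt.
        run.trace.append({"step": step.name, "event": "started"})
        try:
            args = {name: outputs[name] for name in step.inputs}
            outputs[step.name] = step.fn(args)
        except Exception as exc:
            run.trace.append({"step": step.name, "event": "failed", "error": repr(exc)})
            run.status = "failed"
            return run
        run.trace.append({"step": step.name, "event": "finished"})
    run.result = outputs[run.steps[-1].name] if run.steps else None
    run.status = "succeeded"
    return run


# Hypothetical two-step graph: "sort" declares "load" as its only input.
run = execute(Run(
    steps=[
        Step("load", [], lambda _: [3, 1, 2]),
        Step("sort", ["load"], lambda args: sorted(args["load"])),
    ],
    max_steps=8,
))
```

With `max_steps=1` the same graph stops before `sort`, and the last trace entry records why.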

The context graph

Everything ingested — documents, calendar items, transcripts, prior traces — is embedded on write (y0-embed) and stored as typed nodes with entity and time edges. Retrieval is hybrid: vectors rank, filters constrain, edges expand. The graph is per-project; there is no cross-tenant index of any kind.
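A minimal sketch of the three retrieval stages, under stated assumptions: `Node`, `retrieve`, the `kind` filter, and the toy two-dimensional vectors are all hypothetical, and plain cosine similarity stands in for whatever scoring the graph actually uses. The filter is applied before ranking here, which returns the same hits as ranking first and filtering after.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    id: str
    kind: str                                     # e.g. "document", "transcript"
    vector: list[float]                           # embedding computed on write
    edges: set[str] = field(default_factory=set)  # ids of linked nodes


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(graph: dict[str, Node], query: list[float], kind: str, k: int) -> list[str]:
    # Filters constrain: only nodes of the requested kind are candidates.
    candidates = [n for n in graph.values() if n.kind == kind]
    # Vectors rank: keep the k candidates most similar to the query embedding.
    ranked = sorted(candidates, key=lambda n: cosine(n.vector, query), reverse=True)[:k]
    # Edges expand: pull in each hit's neighbours, skipping duplicates.
    result = [n.id for n in ranked]
    for node in ranked:
        for neighbour in sorted(node.edges):
            if neighbour in graph and neighbour not in result:
                result.append(neighbour)
    return result


# One project's graph; nothing outside this dict is ever consulted.
graph = {
    "doc-a": Node("doc-a", "document", [1.0, 0.0], {"call-1"}),
    "doc-b": Node("doc-b", "document", [0.0, 1.0]),
    "call-1": Node("call-1", "transcript", [0.5, 0.5]),
}
hits = retrieve(graph, query=[0.9, 0.1], kind="document", k=1)
```

Here `hits` contains `doc-a` from ranking and `call-1` from edge expansion, even though the transcript would not have passed the `kind` filter on its own.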

The trust kernel

The kernel sits between the executor and every side effect. Each fetch and action presents a scope; the kernel checks it against the key, logs the decision to an append-only audit log, and blocks on mismatch. It is not advisory middleware — there is no code path from a run to your data that bypasses it.
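The check-log-block sequence can be sketched as follows. Everything named here is hypothetical (`ApiKey`, `Kernel`, `guarded`, and scope strings such as `calendar:read`); the real kernel's interface is not shown in this document. The point of the sketch is the shape: the decision is logged whether it allows or denies, and the side effect is only reachable through the check.

```python
from dataclasses import dataclass, field
from typing import Callable, TypeVar

T = TypeVar("T")


class ScopeDenied(Exception):
    """Raised when the key does not carry the scope a side effect presents."""


@dataclass(frozen=True)
class ApiKey:
    id: str
    scopes: frozenset[str]


@dataclass
class Kernel:
    # Entries are only ever appended, and only by authorize().
    _audit: list[tuple[str, str, str]] = field(default_factory=list, init=False)

    def authorize(self, key: ApiKey, scope: str) -> None:
        decision = "allow" if scope in key.scopes else "deny"
        # Log the decision before acting on it, so denials are recorded too.
        self._audit.append((key.id, scope, decision))
        if decision == "deny":
            raise ScopeDenied(f"{key.id} lacks scope {scope!r}")

    @property
    def audit_log(self) -> tuple[tuple[str, str, str], ...]:
        # Read-only snapshot for callers.
        return tuple(self._audit)


def guarded(kernel: Kernel, key: ApiKey, scope: str, effect: Callable[[], T]) -> T:
    """Run a side effect only after the kernel has authorized its scope."""
    kernel.authorize(key, scope)
    return effect()


kernel = Kernel()
key = ApiKey("key_1", frozenset({"calendar:read"}))

events = guarded(kernel, key, "calendar:read", lambda: ["standup"])
try:
    guarded(kernel, key, "calendar:write", lambda: "created")
    blocked = False
except ScopeDenied:
    blocked = True
```

After these two calls the audit log holds one `allow` and one `deny` entry, and the write effect never ran.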

Request lifecycle

POST /v1/runs
  ├─ planner: prompt → execution graph (steps, scopes needed)
  ├─ kernel:  scope check per declared fetch/action
  ├─ executor: step 1..n  (each step → trace, ceiling enforced)
  │     └─ context graph fetches, tool calls, model calls
  └─ result + trace persisted; webhooks fire; audit log appended

The platform is region-pinned: your project's graph, traces, and files live in the region you chose at creation and never leave it. Self-hosting runs the identical stack; see /docs/self-hosting.