How Venturi works

You don't need this page to onboard — but it explains what happens to your data after you connect, and why the integration is shaped the way it is.

Looking for the full architecture?

This page is the short conceptual overview. For the complete target-state architecture — the three planes, the HRE three-stage pipeline, the RAIL sidecar, the control-vs-data-plane split, the fail-open boundary, and exactly where your data lives and does not — see System architecture.

The attribution graph

Venturi is the attribution layer for enterprise AI. It takes every AI inference signal it can see, from any provider, and links it across six layers (request, service, code ownership, identity, org hierarchy, and budget) into a single attribution graph, without manual tags:

graph TD
  R[AI inference request] --> S[Service]
  S --> C[Code ownership]
  C --> I[Identity]
  I --> O[Org hierarchy]
  O --> B[Budget / cost center]

The result answers questions finance and engineering both ask: which team, which service, which person, which budget is responsible for a given slice of AI spend — with a confidence score on every attribution (capped at 0.95; Venturi never asserts certainty).
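The confidence cap can be illustrated with a minimal sketch. Everything here except the six layer names and the 0.95 cap is an assumption made for illustration: the `Link` type, the `attribute` function, and in particular the rule that per-hop confidences combine by multiplication are hypothetical, not Venturi's documented scoring method.

```python
from __future__ import annotations

from dataclasses import dataclass

CONFIDENCE_CAP = 0.95  # from the text: no attribution is reported above this

# The six layers, in the order the diagram links them.
LAYERS = ("request", "service", "code_ownership", "identity", "org_hierarchy", "budget")


@dataclass(frozen=True)
class Link:
    """One resolved hop in the chain (hypothetical type)."""
    layer: str         # which layer this hop resolves to
    value: str         # the resolved entity, e.g. a service or team name
    confidence: float  # hypothetical per-hop score in [0, 1]


def attribute(links: list[Link]) -> tuple[dict[str, str], float]:
    """Collapse a chain of hops into one attribution and a capped confidence."""
    resolved: dict[str, str] = {}
    confidence = 1.0
    for link in links:
        if link.layer not in LAYERS:
            raise ValueError(f"unknown layer: {link.layer}")
        resolved[link.layer] = link.value
        # ASSUMPTION: hops combine by multiplication; the real rule is not documented here.
        confidence *= link.confidence
    # Even a chain of perfect hops is reported at the cap, never at 1.0.
    return resolved, min(confidence, CONFIDENCE_CAP)
```

With this sketch, a chain whose hops all score 1.0 is still reported at 0.95, while any weaker chain keeps its own lower score.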

As customers operationalize chargebacks, governance, and adoption on top of that graph, the attribution layer becomes the place where those decisions are recorded and defended. In that sense Venturi can earn system-of-record status: it is a destination state you grow into, not a claim Venturi makes on day one.

Where the data comes from

| Source | Gives Venturi |
| --- | --- |
| Cloud connector (onboarding) | Billed cost (CUR / BigQuery billing export / Cost Management), provider-native usage logs (Bedrock, Vertex AI, Azure OpenAI), and identity inventory. |
| Ingestion API / proxy (optional) | Request-level events: per-call tokens, cost, latency, and the identity/service that made the call. |

The connector alone produces account/team-level attribution. Adding events gives you per-request resolution.

Deployment model

Venturi runs as a dedicated data plane inside your cloud trust boundary (see CUSTOMER_DEPLOYMENT.md). This preserves three product invariants:

  1. Your operational data stays in your trust boundary.
  2. Integrations stay read-only. Venturi reads cost/usage/identity; it never writes back.
  3. The decision-time interceptor fails open — it never depends on an external control plane to let your traffic through.

Venturi's own dev / staging / production environments exist for development, certification, and release rehearsal — not for holding your data.

Fail-open, always

The single hardest rule in the platform: no code path may block customer production traffic. The gateway interceptor runs on a 50 ms end-to-end latency budget enforced with a hardware-level timeout, not application logic. If anything is slow or down, your request proceeds and attribution is reconciled later.

No content capture

The core pipeline never stores prompt or completion text. Attribution is built from metadata — model, tokens, cost, identity, timing — which is why the ingestion schema has no content field at all.
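A metadata-only schema might look like the sketch below. The field names are assumptions chosen to mirror the metadata listed above (model, tokens, cost, identity, timing); they are not the actual ingestion schema, and `parse_event` is a hypothetical helper. The point it illustrates is that a schema with no content field has nowhere to put prompt or completion text.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class InferenceEvent:
    """Hypothetical request-level event: metadata only, no content field."""
    model: str
    input_tokens: int
    output_tokens: int
    cost_usd: float
    latency_ms: float
    identity: str
    service: str
    timestamp: str  # ISO 8601 string


ALLOWED_FIELDS = {f.name for f in fields(InferenceEvent)}


def parse_event(payload: dict) -> InferenceEvent:
    """Reject any payload carrying fields outside the metadata schema."""
    unknown = set(payload) - ALLOWED_FIELDS
    if unknown:
        raise ValueError(f"unexpected fields: {sorted(unknown)}")
    return InferenceEvent(**payload)
```

Under this sketch, a payload that includes a `prompt` or `completion` key is rejected outright instead of being silently stored.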

The full system architecture

The trust & security model in full