Optimization & governance
Attribution is the prerequisite; optimization and governance are what you do with it. Venturi identifies where AI spend can be reduced without risking output quality, surfaces duplicate and orphaned consumption, and gives you policy and budget guardrails to govern AI use across the organization. Every recommendation is advisory by default — Venturi never changes your systems without human approval.
What you get
- Quality-equivalent model-swap recommendations, each verified before it's shown.
- Duplicate-path and orphaned-spend detection with clear ownership signals.
- Budget and policy governance with a deliberate progression from passive monitoring to advisory to active — every step opt-in.
- A full governance record on every recommendation: what was concluded, with what confidence, on what evidence, and who can act on it.
- Savings you can realize and defend — once an optimization is applied, its savings are reconciled against your actual bill, not left as a projection, and the realization receipt is the only thing a success fee can bill on.
Optimization recommendations
Venturi turns the attribution graph into concrete, defensible opportunities:
| Opportunity | What Venturi does |
|---|---|
| Right-sizing (model swap) | Identifies over-provisioned workloads and recommends a cheaper model verified quality-equivalent on your own shadow data before the recommendation is ever surfaced. |
| Duplicate detection | Detects duplicate inference paths doing the same work, so you can consolidate. |
| Orphan cleanup | Surfaces deployments contributing spend with no owner, flagged as unknown-ownership / unknown-spend. |
No recommendation without equivalence verification
Venturi never recommends a model swap on the promise of savings alone. A cheaper model is only suggested after it is verified quality-equivalent on the specific workload, using shadow data. Language is conservative throughout: you see an "optimization opportunity," never a "savings guarantee."
Each recommendation carries a rationale, an estimated saving, and the full interpretation metadata — confidence, evidence basis, and the data it rests on — so an engineer can evaluate it before acting.
Savings eligibility on every recommendation
Not every opportunity rests on the same strength of evidence, so every recommendation discloses a savings-eligibility state computed from the same confidence object that gates chargeback. You always know which bucket a recommendation sits in before you act on it:
| State | Confidence | What it means for you |
|---|---|---|
| Billable | coper ≥ 0.80 | The underlying attribution is chargeback-grade. If you apply this swap, the realized savings can enter the verified-savings base. |
| Advisory only | 0.50–0.79 (the twilight band) | A real, actionable opportunity, but the attribution isn't yet chargeback-grade. Apply it if you like — it just won't be counted toward a success fee until the attribution qualifies. |
| Low confidence | coper < 0.50, or fewer than 7 days of observation | Surfaced for awareness; not a basis for savings claims yet. |
The twilight band is honest, not hidden
Recommendations in the 0.50–0.79 band can look just as confident as chargeback-grade ones. Venturi labels them explicitly so you can tell which recommendations rest on chargeback-grade attribution and which don't. An advisory-only recommendation that later improves to coper ≥ 0.80 is automatically re-qualified for the verified-savings base: you don't lose the opportunity; it becomes billable once the evidence catches up.
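The thresholds in the table can be read as a small classification rule. The sketch below is illustrative only and is not Venturi's implementation: the enum, constant, and function names are hypothetical, and it assumes that a short observation window takes precedence over a high confidence score.

```python
from enum import Enum

BILLABLE_FLOOR = 0.80      # chargeback-grade floor from the table
ADVISORY_FLOOR = 0.50      # lower edge of the twilight band
MIN_OBSERVATION_DAYS = 7   # minimum observation window from the table


class SavingsEligibility(Enum):
    BILLABLE = "billable"
    ADVISORY_ONLY = "advisory_only"
    LOW_CONFIDENCE = "low_confidence"


def classify(coper: float, observation_days: int) -> SavingsEligibility:
    """Map a confidence score and an observation window to an eligibility state."""
    # Assumption: too little observation downgrades even a high score.
    if coper < ADVISORY_FLOOR or observation_days < MIN_OBSERVATION_DAYS:
        return SavingsEligibility.LOW_CONFIDENCE
    if coper >= BILLABLE_FLOOR:
        return SavingsEligibility.BILLABLE
    return SavingsEligibility.ADVISORY_ONLY
```

Because the state is derived from the score on every evaluation, a recommendation that rises from 0.72 to 0.81 moves from advisory-only to billable without any separate re-qualification step.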
Advisory by default, active by choice
Optimization follows a deliberate, opt-in maturity progression. No mode change ever blocks production AI traffic.
graph LR
P[Passive<br/>observe & attribute] --> A[Advisory<br/>recommend with rationale]
A --> AC[Active<br/>routing, opt-in]
- Passive — Venturi observes and attributes. Nothing is recommended or changed.
- Advisory — Venturi recommends optimizations with rationale and verified equivalence. You decide what to apply.
- Active — for opted-in workloads, Venturi can route to a verified-equivalent model. This is explicit, reversible, and never the default.
Each transition is opt-in, and the interceptor remains fail-open at every stage — moving to a more active mode can never cause an AI request to be blocked.
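One way to picture the fail-open guarantee is a routing decision that can only ever fall back to the model the caller asked for. This is a minimal sketch under stated assumptions, not the interceptor's actual code: `choose_model` and the `lookup_equivalent` callback are hypothetical names.

```python
from typing import Callable, Optional


def choose_model(
    requested_model: str,
    opted_in: bool,
    lookup_equivalent: Callable[[str], Optional[str]],
) -> str:
    """Return the model to use; every failure path returns the requested model."""
    if not opted_in:
        # Passive and Advisory modes never reroute traffic.
        return requested_model
    try:
        candidate = lookup_equivalent(requested_model)
    except Exception:
        # Fail open: a routing error must never block the request.
        return requested_model
    # No verified-equivalent model known: keep the original.
    return candidate or requested_model
```

The function has no code path that raises or returns nothing, which is the property the fail-open claim depends on.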
The intervention lifecycle
Every recommended or applied action — an "intervention" — moves through an explicit, auditable lifecycle with human control at each step:
- A recommendation starts as `PROPOSED`.
- A human with the right role approves or rejects it.
- High-impact actions (for example, enabling active routing or a large budget change) require N-eyes approval and respect separation of duties — the person who proposes a change cannot be its sole approver.
- Approved actions become `ACTIVE`, then `COMPLETED`, with the full history preserved in the audit log.
Human-controlled by construction
No AI subsystem in Venturi ever makes an autonomous change to your systems. Optimization recommendations are explainable, confidence-gated, human-reviewable, auditable, and reversible. Categories that are prohibited from automation are blocked by construction — not by a setting you could accidentally flip.
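The lifecycle can be sketched as a small state machine. This is an illustration under stated assumptions rather than Venturi's data model: the class and field names are hypothetical, `REJECTED` is an assumed name for the rejected state, and N-eyes approval is modeled as a simple count of distinct approvers.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    PROPOSED = "PROPOSED"
    REJECTED = "REJECTED"    # assumed name; the text only says "rejects"
    ACTIVE = "ACTIVE"
    COMPLETED = "COMPLETED"


@dataclass
class Intervention:
    proposer: str
    required_approvals: int = 1   # hypothetical: values above 1 model N-eyes approval
    state: State = State.PROPOSED
    approvers: set = field(default_factory=set)
    audit_log: list = field(default_factory=list)  # append-only history

    def _record(self, actor: str, event: str) -> None:
        self.audit_log.append((actor, event, self.state.value))

    def approve(self, actor: str) -> None:
        if self.state is not State.PROPOSED:
            raise ValueError("only PROPOSED interventions can be approved")
        self.approvers.add(actor)
        # Separation of duties: at least one approver must not be the proposer.
        has_independent = any(a != self.proposer for a in self.approvers)
        if len(self.approvers) >= self.required_approvals and has_independent:
            self.state = State.ACTIVE
        self._record(actor, "approve")

    def reject(self, actor: str) -> None:
        if self.state is not State.PROPOSED:
            raise ValueError("only PROPOSED interventions can be rejected")
        self.state = State.REJECTED
        self._record(actor, "reject")

    def complete(self, actor: str) -> None:
        if self.state is not State.ACTIVE:
            raise ValueError("only ACTIVE interventions can be completed")
        self.state = State.COMPLETED
        self._record(actor, "complete")
```

In this sketch a proposer's own approval is recorded but never activates the intervention by itself, and every transition appends to the log instead of rewriting it.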
Savings you can realize and defend
A projected saving is a hypothesis. Once you apply an intervention, Venturi tells you what you actually saved — reconciled against your real bill — and hands you a receipt you can defend to finance, to procurement, or to an auditor without Venturi in the room.
For every applied intervention, Venturi computes realized savings as the usage-normalized counterfactual cost (what the workload would have cost under the old model) minus the reconciled actual cost after the change, over each true-up period. The result is a deterministic, versioned savings-realization receipt — the savings analogue of your chargeback receipt — rendered in the evidence drawer and linked directly from the success-fee line on your bill.
What the savings-realization receipt shows you
- The frozen baseline window and who froze it, so the counterfactual can't drift under you.
- The usage-normalization basis — savings are measured per workload volume, so organic growth in usage is never counted as savings.
- Per-period counterfactual and reconciled-actual figures, and the realized-savings delta between them.
- The share of underlying spend that is chargeback-grade (coper ≥ 0.80) — the only spend that counts toward the savings base.
- The equivalence assertion behind the swap, and the methodology version, pinned to a specific model version.
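The realized-savings arithmetic can be sketched as follows. This is a simplified reading, not the receipt's actual methodology: it assumes usage normalization means scaling the frozen baseline's unit cost by the period's volume, and the names `Baseline` and `realized_savings` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen mirrors the "frozen baseline window"
class Baseline:
    cost: float     # total cost under the old model during the baseline window
    volume: float   # hypothetical usage unit, e.g. requests or tokens


def realized_savings(
    baseline: Baseline, period_volume: float, reconciled_actual: float
) -> float:
    """Usage-normalized counterfactual cost minus reconciled actual cost."""
    # What this period's volume would have cost at the baseline unit cost.
    counterfactual = baseline.cost * period_volume / baseline.volume
    return counterfactual - reconciled_actual


baseline = Baseline(cost=1000.0, volume=100_000.0)
# Usage halved after the swap; the bill for the period reconciles to 450.0.
print(realized_savings(baseline, period_volume=50_000.0, reconciled_actual=450.0))  # 50.0
```

A naive comparison of the two bills (1000.0 against 450.0) would report 550.0, most of which is just the drop in usage. Normalizing by volume leaves only the 50.0 attributable to the cheaper model.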
Realized, not projected: reconciled against the bill
The receipt is the sole input to any savings-share billing. Venturi never bills on a projection or a dashboard assertion:
- Spend below the coper ≥ 0.80 chargeback floor is excluded from the savings base, with the reason shown on the receipt. A success fee never rests on attribution that isn't chargeback-grade.
- If the counterfactual or the volume normalization can't be resolved, the line is held from billing rather than billed on a silent default — an honest unknown, never a guess.
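The two rules above amount to a three-way split of receipt lines. The sketch below is an assumption-laden illustration: `Line`, `savings_base`, and the per-line layout are hypothetical, and an unresolved counterfactual is represented as `None`.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

CHARGEBACK_FLOOR = 0.80  # chargeback-grade confidence floor


@dataclass
class Line:
    coper: float
    counterfactual: Optional[float]  # None means it could not be resolved
    actual: float


def savings_base(lines: List[Line]) -> Tuple[float, List[Line], List[Line]]:
    """Return (billable savings, excluded lines, held lines)."""
    billable = 0.0
    excluded: List[Line] = []
    held: List[Line] = []
    for line in lines:
        if line.coper < CHARGEBACK_FLOOR:
            excluded.append(line)   # below the floor: never billed
        elif line.counterfactual is None:
            held.append(line)       # honest unknown: held, not defaulted
        else:
            billable += line.counterfactual - line.actual
    return billable, excluded, held
```

Returning the excluded and held lines alongside the total keeps the reason for each omission available to show on the receipt.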
Degraded attribution never inflates a saving
If Venturi's gateway ever falls back to its fail-open path on the AI hot path, the attributions captured during the fallback are marked degraded, and degraded spend is excluded from both the frozen counterfactual and the post-change actual. The receipt reports the share of degraded spend it had to exclude. If that share crosses a configurable threshold (10% by default), the receipt is marked provisional and is not billable until the window is re-observed under normal capture.
A degradation episode can't masquerade as a saving
A success fee billed partly on spend the gateway fell back on would be the most disputable number Venturi could produce. Because Venturi excludes degraded attribution outright, a realized-savings figure can never be an artifact of a fallback episode that happened to overlap your post-change window.
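A minimal sketch of the exclusion and the provisional flag, assuming each spend record carries a degraded marker. The names are hypothetical, and treating "crosses" as strictly greater than the threshold is an assumption.

```python
from dataclasses import dataclass
from typing import List

DEFAULT_DEGRADED_THRESHOLD = 0.10  # the 10% default stated in the text


@dataclass
class SpendRecord:
    amount: float
    degraded: bool  # True if captured while the gateway was on its fallback path


def summarize_window(
    records: List[SpendRecord], threshold: float = DEFAULT_DEGRADED_THRESHOLD
) -> dict:
    """Drop degraded spend and flag the window as provisional if too much was dropped."""
    total = sum(r.amount for r in records)
    degraded = sum(r.amount for r in records if r.degraded)
    share = degraded / total if total else 0.0
    return {
        "usable_spend": total - degraded,
        "degraded_share": share,
        # Assumption: "crosses" means strictly greater than the threshold.
        "provisional": share > threshold,
    }
```

The same summary would be applied to both the baseline window and the post-change window, so a fallback episode cannot skew either side of the comparison.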
Equivalence claims are workload-scoped, dated, and confidence-bearing
The equivalence behind a billable swap isn't "it's on a list." Every equivalence assertion that enters the verified-savings base carries the specific task type and workload scope it's asserted for, an assertion date and source version, a strength/confidence, and — for empirical shadow evaluations — the sample and the metric deltas. The receipt embeds the exact assertion it relied on.
If a swap's equivalence assertion is out of scope for your workload's task type, or has gone stale beyond a configurable freshness window, the recommendation is automatically downgraded to advisory-only and removed from the billing base until it's re-verified. When you dispute a success fee, the receipt has something workload-specific to point to — not a generic claim of "verified."
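The scope and freshness checks can be sketched as below. The field names are hypothetical stand-ins for the attributes listed above, exact task-type matching is an assumption, and the freshness window is passed in as a parameter because no default is stated.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class EquivalenceAssertion:
    task_type: str        # the task type the assertion is scoped to
    asserted_on: date     # assertion date
    source_version: str   # source version of the assertion
    confidence: float     # strength/confidence of the assertion


def billing_eligible(
    assertion: EquivalenceAssertion,
    workload_task_type: str,
    today: date,
    freshness_window: timedelta,
) -> bool:
    """True only if the assertion is in scope for the workload and not stale."""
    in_scope = assertion.task_type == workload_task_type
    fresh = today - assertion.asserted_on <= freshness_window
    return in_scope and fresh
```

When this returns `False`, the recommendation would be treated as advisory-only until the assertion is re-verified.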
Energy and carbon, realized the same way
Where a swap carries an energy or carbon differential, it is computed only over the chargeback-grade (coper ≥ 0.80) workload it actually applies to, and labeled with whether the evidence is shadow or production. When the swap is applied, the realized carbon delta is reconciled into the same savings-realization receipt using your actual post-change energy profile — never the projected one. Where a model isn't catalogued for energy, the carbon delta is shown as unknown rather than silently treated as zero, so the energy-aware leg holds up under procurement challenge.
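The "unknown, not zero" rule maps naturally onto an optional return value. In this sketch the catalog layout, the per-thousand-requests unit, and the function name are all hypothetical.

```python
from typing import Mapping, Optional


def realized_carbon_delta(
    catalog: Mapping[str, float],   # hypothetical: emissions per 1,000 requests
    old_model: str,
    new_model: str,
    volume_thousands: float,        # actual post-change volume, in thousands
) -> Optional[float]:
    """Return the realized delta, or None when either model is uncatalogued."""
    old = catalog.get(old_model)
    new = catalog.get(new_model)
    if old is None or new is None:
        return None  # unknown is reported as unknown, never as 0.0
    return (old - new) * volume_thousands
```

Returning `None` forces callers to handle the unknown case explicitly instead of summing a silent zero into a total.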
Dispute a saving the same way you dispute an attribution
A savings-realization receipt — or the success-fee line it backs — is disputable through the same workflow as an attribution. The dispute doesn't mutate the receipt; it resolves to one of three outcomes:
- correct the counterfactual or normalization and re-bill, with an append-only audit trail and a re-verification that you're never double-charged;
- accept the figure as billed; or
- close the dispute as a duplicate.
The savings number is finance-grade, not a CEO-dashboard claim
The success-fee model bills on verified realized savings against a frozen counterfactual — and every step is addressable, exportable, and contestable. The first contested invoice resolves through a structured dispute path, not a manual escalation, because the number was audit-grade from the start.
For the confidence floors, retention, and capture guarantees these receipts inherit from attribution, see Cost attribution & chargeback and Trust & security.
Budget governance
Govern AI spend with per-team and per-project budgets:
- Advisory by default. Budgets alert when consumption crosses configured thresholds, without blocking anything.
- Hard-stop by opt-in. Where you choose, a budget can enforce a hard stop — an explicit, deliberate setting, never on by default.
- Threshold alerts fire at 75% / 100% / 150% of plan, delivered to email, Slack, Microsoft Teams, or a signed webhook.
Budget configuration, breach handling, forecasting, and alert routing are covered in depth in Budgets & alerts.
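As an illustration of the 75% / 100% / 150% alert levels, the sketch below finds which thresholds were newly crossed between two spend readings. It is not Venturi's alerting code: the function name is hypothetical, and firing each alert once on the upward crossing is an assumption.

```python
from typing import List

THRESHOLDS = (0.75, 1.00, 1.50)  # fractions of plan, from the list above


def crossed_thresholds(
    previous_spend: float, current_spend: float, plan: float
) -> List[float]:
    """Thresholds passed between the previous and the current spend reading."""
    return [t for t in THRESHOLDS if previous_spend < t * plan <= current_spend]


# With a plan of 1000, moving from 700 to 800 crosses only the 75% level.
print(crossed_thresholds(700.0, 800.0, 1000.0))  # [0.75]
```

Comparing against the previous reading keeps an alert from firing again on every later reading once a level has been passed.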
Governance you can demonstrate
For any single attribution or recommendation, Venturi can show a CISO, a FinOps lead, an auditor, or a regulator exactly what the AI concluded, with what confidence, on what evidence, with which model version, who could override it, and where the human-control boundary sits — reachable in a couple of clicks from the result. The governance console surfaces model drift and the current state of every deployed model, and disputes and overrides are first-class, audited workflows.
Venturi's published scope and non-goals match what the system actually enforces, so the boundary you're shown is the boundary that holds.
Related capabilities
- Cost attribution & chargeback — the attribution that makes optimization possible.
- Budgets & alerts — budgets, thresholds, forecasting, and alert routing in full.
- Adoption intelligence — reallocate enablement budget toward teams with traction.
- Trust & security — fail-open, read-only, and no content capture.