Data privacy & retention

Venturi is built around data minimization: it collects the metadata it needs to attribute AI consumption and nothing more. This page describes exactly what Venturi does and does not process, how it classifies data, how long it retains it, and how erasure works — by crypto-shredding the encryption keys rather than scrubbing immutable stores.

What Venturi does not collect

No prompt or completion content is ever stored

Content inspection is disabled by default. Venturi never captures request or response bodies; the canonical InvocationEvent schema has no field for message content. The attribution graph is derived state built from invocation metadata — identity, service, project, cost, tokens, timing — not from prompts or completions. This is a frozen decision.

In addition:

Provider API keys are never stored in full. Any API key appears only as a truncated, non-reversible prefix.
Provider admin keys never reach the control plane in plaintext. Where you supply a provider admin key for usage/cost ingestion, it is KMS-encrypted inside your trust boundary, and a log-guard rejects any line that would emit it.
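
The two key-handling rules above can be sketched as follows. This is a minimal illustration, not Venturi's implementation: the prefix length, the function `truncate_api_key`, and the class `SecretLogGuard` are hypothetical names chosen for the example, and the guard is modeled as a standard-library `logging.Filter`.

```python
import logging

PREFIX_LEN = 6  # assumption: the source does not state the prefix length


def truncate_api_key(api_key: str, keep: int = PREFIX_LEN) -> str:
    """Keep only a short prefix; the discarded tail cannot be rebuilt from it."""
    return api_key[:keep] + "..."


class SecretLogGuard(logging.Filter):
    """Reject any log record whose rendered message contains a registered secret."""

    def __init__(self, secrets: list[str]) -> None:
        super().__init__()
        self._secrets = [s for s in secrets if s]

    def filter(self, record: logging.LogRecord) -> bool:
        message = record.getMessage()
        # Returning False drops the record before any handler can emit it.
        return not any(secret in message for secret in self._secrets)
```

Attaching the guard with `logger.addFilter(SecretLogGuard([admin_key]))` would drop offending records at the logger; a production guard would more likely match key patterns than hold the secret itself in memory.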

A field-level PII classification of the InvocationEvent schema is maintained and verified in CI, so that the personal-data footprint is auditable and minimized by design (GDPR Art. 5(1)(c) data minimization).

Cohort-only adoption intelligence

Venturi's adoption-intelligence reporting is aggregate and cohort-level by construction — it does not profile, score, or infer the state of any individual worker.

Cohort-only, with a minimum cohort size of 5.
Sub-cohort suppression and roll-up — cohorts below 5 are suppressed and rolled into the parent unit; they are never displayed.
Anti-differencing — reconstruction by differencing across overlapping cohorts is blocked.
Individual-level views are off by default. They are not a UI toggle: enabling them per tenant requires written legal sign-off plus a data-processing-agreement amendment, and they are hard-disabled in the EU regardless of any sign-off.
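
As a rough sketch of the suppression and roll-up rule, assuming each unit reports a count of its direct members and names at most one parent: the function `roll_up` and the two-dictionary data shape are assumptions made for this example, not Venturi's data model.

```python
from typing import Optional

MIN_COHORT_SIZE = 5  # minimum cohort size stated in the text


def roll_up(counts: dict[str, int], parent: dict[str, Optional[str]]) -> dict[str, int]:
    """Return displayable cohorts; undersized ones are merged into their parent.

    `parent` must list every unit (roots map to None); `counts` holds
    direct-member counts and defaults to 0 for units it omits.
    """
    def depth(unit: str) -> int:
        d = 0
        while parent[unit] is not None:
            unit = parent[unit]
            d += 1
        return d

    merged = {unit: counts.get(unit, 0) for unit in parent}
    # Visit the deepest units first so a child is resolved before its parent.
    for unit in sorted(parent, key=depth, reverse=True):
        if merged[unit] >= MIN_COHORT_SIZE:
            continue
        size = merged.pop(unit)  # suppressed: never displayed on its own
        up = parent[unit]
        if up is not None:
            merged[up] += size   # rolled into the parent unit
    return merged
```

With `parent = {"eng": None, "platform": "eng", "ml": "eng", "ml-research": "ml"}` and `counts = {"eng": 2, "platform": 7, "ml": 3, "ml-research": 3}`, the three members of `ml-research` roll into `ml`, and the undersized root `eng` is dropped from display. The sketch covers suppression and roll-up only; it makes no attempt at anti-differencing, which needs a check across every published aggregate.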

A documented non-doing

Venturi performs no emotion recognition and no behavioral- or affective-state inference about individual workers. Adoption signals are aggregate usage-pattern metrics over cohorts of 5 or more, never affective-state inference. This is enforced as a frozen product invariant and is the foundation of Venturi's non-high-risk posture under the EU AI Act (see Compliance).

Data classification

Venturi classifies every data class it handles into a four-tier scheme, which drives the encryption, access, retention, and egress rules for each class.

Data class	Tier	Handling
Customer transactional data (request/response bodies, if ever enabled)	Restricted	Disabled by default; if enabled, in-VPC only, AES-256 with your KMS key, least-privilege access, never egresses your VPC.
Provider admin keys	Restricted	KMS-encrypted in your boundary; control plane never sees plaintext; never egresses; rotated by you.
Invocation metadata (`InvocationEvent` fields)	Confidential	In-VPC, AES-256 with your KMS key, tenant-isolated; API key stored only as a truncated prefix.
`AttributionRecords`	Confidential	Per-tenant store and key; cross-tenant access rejected; operational retention then archive.
Audit log	Confidential	Append-only, Object-Locked; PII pseudonymized; compliance-period retention.
Aggregation contributions (post-anonymization)	Internal	Anonymized (cohort ≥ 5) before egress; region-pinned; non-personal once anonymized.
Model artifacts	Internal	Hash- and signature-verified at load; not customer personal data.

Retention

Operational retention defaults to 13 months

The operational-retention default is 13 months, chosen to cover a full 12-month audit cycle plus a one-month reconciliation margin. Retention is configurable per data class within a bounded range of 30 days to 5 years.

Per-class lifecycle:

Billing data — 90 days in standard storage, then cold archive.
Attribution data — operational-retention default (13 months), then archive.
Audit log — retained under Object Lock for its compliance period (see below).
Override records — retained on an append-only topic so that manual overrides persist across graph rebuilds.
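
The retention bounds and the 13-month default could be enforced along these lines. This is a sketch under stated assumptions: the helper names are hypothetical, overrides are expressed in days, and "5 years" is counted as 5 x 365 days, which the source does not specify.

```python
import calendar
from datetime import date

MIN_RETENTION_DAYS = 30          # lower bound stated in the text
MAX_RETENTION_DAYS = 5 * 365     # assumption: "5 years" counted as 5 x 365 days
DEFAULT_RETENTION_MONTHS = 13    # operational-retention default stated in the text


def validate_retention_days(days: int) -> int:
    """Accept a per-class override only if it falls inside the bounded range."""
    if not MIN_RETENTION_DAYS <= days <= MAX_RETENTION_DAYS:
        raise ValueError(
            f"retention must be {MIN_RETENTION_DAYS}-{MAX_RETENTION_DAYS} days, got {days}"
        )
    return days


def add_months(start: date, months: int) -> date:
    """Calendar-month addition, clamping the day to the target month's length."""
    total = start.year * 12 + (start.month - 1) + months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def default_archive_date(recorded_on: date) -> date:
    """Date on which a record leaves operational storage under the default."""
    return add_months(recorded_on, DEFAULT_RETENTION_MONTHS)
```

Calendar-month arithmetic is used for the default so that "13 months" does not depend on an approximate day count; a record written on 31 January 2024 would reach its archive date on 28 February 2025.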

The audit trail

Audit-relevant events are written to a separate, append-only audit log, distinct from operational logs. Audited events include every dispute action, every model promotion, every threshold change, every aggregation enrolment or opt-out, every break-glass support access, and every administrative action against a tenant.

The audit log is shipped to a per-tenant audit bucket with Object Lock in COMPLIANCE mode for a 5-year retention window — no deletion or modification, even by a root principal, for the duration of the window.
Writers are granted a write-only policy; reads go through a separate, read-only retrieval role.
An immutable policy-event writer records every policy decision and override, providing the tamper-evident control history that the SOC 2 program is built on.
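
For illustration, the write parameters for a COMPLIANCE-mode audit object might be assembled as below. The sketch assumes an S3-compatible store, which the source does not name; `retain_until` and `audit_put_params` are hypothetical helpers. The keys `ObjectLockMode` and `ObjectLockRetainUntilDate` are the parameter names used by boto3's `put_object`.

```python
from datetime import datetime, timezone

RETENTION_YEARS = 5  # audit retention window stated in the text


def retain_until(written_at: datetime, years: int = RETENTION_YEARS) -> datetime:
    """Return the timestamp until which the object must stay locked."""
    try:
        return written_at.replace(year=written_at.year + years)
    except ValueError:
        # 29 February with a non-leap target year: fall back to 28 February.
        return written_at.replace(year=written_at.year + years, day=28)


def audit_put_params(bucket: str, key: str, body: bytes, written_at: datetime) -> dict:
    """Build keyword arguments for an Object-Locked write (no network call here)."""
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        "ObjectLockMode": "COMPLIANCE",
        "ObjectLockRetainUntilDate": retain_until(written_at),
    }
```

The resulting dictionary could be passed as `boto3.client("s3").put_object(**params)` against a bucket that has Object Lock enabled. The function only builds parameters, so the retention arithmetic can be checked without credentials.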

Audit PII is pseudonymized

So that retaining the audit trail does not itself become a personal-data liability, PII in the audit log is pseudonymized: actor and subject references are opaque, non-reversible identifiers rather than raw personal data. After a subject's per-subject key is crypto-shredded (below), the corresponding audit entries are non-personal. The audit trail itself is retained under a named lawful basis: the establishment, exercise, or defense of legal claims, and Venturi's legitimate interest in a tamper-evident security record (GDPR Art. 6(1)(f), Art. 17(3) exemptions).
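
One common way to produce opaque, non-reversible identifiers is a keyed hash. The sketch below assumes HMAC-SHA256 keyed with the per-subject key; the source does not say which construction Venturi uses, and `pseudonymize` is a hypothetical name.

```python
import hashlib
import hmac


def pseudonymize(identifier: str, subject_key: bytes) -> str:
    """Derive a stable, opaque reference for an actor or subject.

    Without `subject_key` the mapping cannot be recomputed, so destroying
    the key severs the link between the reference and the person.
    """
    digest = hmac.new(subject_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()
```

The same identifier and key always yield the same reference, so audit entries for one subject stay correlated with each other while the key exists, and the reference reveals nothing about the raw identifier.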

Crypto-shred erasure

Erasure is implemented by destroying encryption keys, not by scrubbing immutable stores.

Erasure crypto-shreds the per-subject key (and, on full offboarding, the per-tenant key). Once the key is destroyed, the data encrypted under it is permanently unrecoverable — satisfying erasure without a record-by-record physical scrub.
The erasure SLA is 30 days. Venturi commits to complete, verified erasure of in-scope data within 30 days of a valid request or contract termination, and produces a deletion certificate as evidence.
Key destruction is the named exception to deterministic reconstruction. Venturi's attribution graph is normally rebuildable by replaying the event stream — but replayed events whose key has been crypto-shredded cannot be re-materialized into personal data. This is the explicit, permitted exception that lets an append-only architecture honor a right-to-erasure request.
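
The interaction between key destruction and replay can be sketched with a toy key store. Everything here is illustrative: `SubjectKeyStore`, `encrypt_event`, and `replay` are hypothetical names, and the SHAKE-256 keystream is a standard-library stand-in so the example runs without dependencies. It is not the AES-256 with KMS keys that the document describes and should not be used to protect real data.

```python
import hashlib
import secrets
from typing import Optional


def _keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    # Stand-in cipher for this sketch only (XOR with a SHAKE-256 keystream).
    stream = hashlib.shake_256(key + nonce).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


class SubjectKeyStore:
    """Holds one random key per subject; shredding removes it for good."""

    def __init__(self) -> None:
        self._keys: dict[str, bytes] = {}

    def key_for(self, subject: str) -> bytes:
        return self._keys.setdefault(subject, secrets.token_bytes(32))

    def get(self, subject: str) -> Optional[bytes]:
        return self._keys.get(subject)

    def shred(self, subject: str) -> None:
        self._keys.pop(subject, None)


def encrypt_event(store: SubjectKeyStore, subject: str, payload: bytes) -> dict:
    nonce = secrets.token_bytes(16)
    ciphertext = _keystream_xor(store.key_for(subject), nonce, payload)
    return {"subject": subject, "nonce": nonce, "ciphertext": ciphertext}


def replay(events: list[dict], store: SubjectKeyStore) -> list[bytes]:
    """Rebuild state from the append-only stream, skipping shredded subjects."""
    rebuilt = []
    for event in events:
        key = store.get(event["subject"])
        if key is None:
            continue  # key destroyed: the ciphertext stays in the log, unreadable
        rebuilt.append(_keystream_xor(key, event["nonce"], event["ciphertext"]))
    return rebuilt
```

The event list is never modified: after `store.shred("alice")`, a replay over the same events returns only the other subjects' payloads, which is the sense in which an append-only stream can still honor erasure.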

The carve-out is precise: operational and attribution data are crypto-shred-erased within 30 days; the immutable audit log and override topic are retained for their compliance period (pseudonymized, named lawful basis) and deleted on expiry; anonymized aggregation contributions are non-personal and not individually reversible. See Data-subject rights for the full request mechanics.

Offboarding & data return

On contract termination, Venturi runs a defined sequence that returns your data before it destroys any keys, so a departing customer never loses access to its own attribution record.

Final full export before deletion. You receive a final full export in the standard tenant-export format within 5 business days of the termination effective date, covering your AttributionRecords, your customer-readable audit trail, and the reconciliation and billing artifacts you need to operate afterward.
Post-termination retrieval window. A retrieval window — default 30 days, contract-configurable — runs after the export is delivered and before key destruction, so you can re-retrieve or validate the export on your own schedule.
Certified crypto-shred deletion. After the window closes, per-tenant (and per-subject) keys are crypto-shredded under the 30-day SLA, and a deletion certificate is issued.

The offboarding runbook fixes the export-then-delete order — export and retrieval window first, key destruction second, never the reverse — names the owner, and enumerates the evidence artifacts (export manifest and checksums, retrieval-window record, deletion certificate) written to your audit trail.
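
The export-then-delete ordering can be expressed as a small guard object. This is a sketch with hypothetical names (`Offboarding`, `add_business_days`); it assumes business days are Monday to Friday with no holiday calendar, and that keys may be destroyed on or after the date the window closes.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

EXPORT_BUSINESS_DAYS = 5       # export deadline stated in the text
DEFAULT_RETRIEVAL_DAYS = 30    # default retrieval window stated in the text


def add_business_days(start: date, days: int) -> date:
    """Advance by weekdays only; public holidays are ignored in this sketch."""
    current = start
    while days > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 ... Friday=4
            days -= 1
    return current


@dataclass
class Offboarding:
    termination_date: date
    retrieval_days: int = DEFAULT_RETRIEVAL_DAYS
    export_delivered_on: Optional[date] = None
    keys_destroyed_on: Optional[date] = None

    def export_due(self) -> date:
        return add_business_days(self.termination_date, EXPORT_BUSINESS_DAYS)

    def deliver_export(self, on: date) -> None:
        self.export_delivered_on = on

    def window_closes(self) -> date:
        if self.export_delivered_on is None:
            raise RuntimeError("retrieval window starts only after export delivery")
        return self.export_delivered_on + timedelta(days=self.retrieval_days)

    def destroy_keys(self, on: date) -> None:
        # Export and retrieval window first, key destruction second.
        if on < self.window_closes():
            raise RuntimeError("retrieval window is still open")
        self.keys_destroyed_on = on
```

Calling `destroy_keys` before an export has been delivered, or before the window has closed, raises instead of proceeding, so the reverse order cannot happen by accident.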

Data-subject rights — access, erasure, and portability request mechanics.
Tenant isolation — per-tenant stores and keys, and the cross-tenant aggregation model.
Residency & subprocessors — where data lives and region pinning.
Compliance — GDPR, CCPA, and EU AI Act posture.