
Disaster recovery

Venturi's disaster-recovery posture protects your attribution records, event history, audit log, configuration, and the serving plane that exposes them. It rests on a single pair of recovery objectives (RPO ≤ 15 minutes, RTO ≤ 1 hour) applied identically to every store, and on an event-sourcing design that makes most of the system a derived artifact you can rebuild from a durable source of truth.

Disaster recovery in one page

  • RPO ≤ 15 minutes: the maximum window of recent data that could be lost in a recovery event.
  • RTO ≤ 1 hour: the maximum time to restore the serving plane.
  • One pair, applied identically to every store, so there is no per-component variation to track.
  • Your AI traffic keeps flowing throughout. The fail-open gateway forwards live requests even while Venturi is recovering.

Recovery objectives

Objective | Target | Scope
Recovery point objective (RPO) | ≤ 15 minutes | Maximum data loss across tenant attribution stores and event logs
Recovery time objective (RTO) | ≤ 1 hour | Time to restore the serving plane
Gateway behavior during recovery | Fail-open | Customer AI traffic is forwarded unmodified throughout

This RPO/RTO pair is propagated identically everywhere, so you reason about one durability commitment, not a matrix of per-store numbers. Your order form is the authority for any customer-specific binding commitment; the targets here are the platform's standard objectives.
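As a worked illustration of how the pair is read against a single recovery run, the sketch below compares two measured intervals with the targets above. The function name and the three timestamps are hypothetical inputs an operator might read off a run; they are not Venturi API fields.

```python
from datetime import datetime, timedelta

# Standard platform objectives stated on this page.
RPO = timedelta(minutes=15)
RTO = timedelta(hours=1)


def meets_objectives(last_durable_event: datetime,
                     failure_at: datetime,
                     serving_restored_at: datetime) -> dict:
    """Compare one recovery run against the RPO/RTO pair.

    All three timestamps are illustrative inputs, not product fields.
    """
    # RPO is measured backwards from the failure: how much recent data is gone.
    data_loss_window = failure_at - last_durable_event
    # RTO is measured forwards from the failure: how long until reads work again.
    time_to_restore = serving_restored_at - failure_at
    return {
        "data_loss_window": data_loss_window,
        "time_to_restore": time_to_restore,
        "rpo_met": data_loss_window <= RPO,
        "rto_met": time_to_restore <= RTO,
    }


failure = datetime(2024, 1, 1, 12, 0, 0)
result = meets_objectives(
    last_durable_event=failure - timedelta(minutes=4),
    failure_at=failure,
    serving_restored_at=failure + timedelta(minutes=42),
)
print(result["rpo_met"], result["rto_met"])  # True True
```

The same two comparisons apply to every store, which is the practical meaning of having one pair rather than a per-store matrix.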

Recovery never touches your AI traffic

A disaster-recovery event degrades your visibility into attribution for a bounded window — it does not stop your production inference. Because the decision-time gateway is fail-open with a hard 50 ms budget, live AI requests are forwarded unmodified even while Venturi is restoring. The RTO governs how long until you can read Venturi again, never how long your AI is down. See SLAs & SLOs.
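The fail-open behavior can be pictured with a minimal sketch. It assumes the gateway makes a best-effort attribution call, waits at most 50 ms for it, and forwards the request regardless of the outcome. `gateway`, `forward_upstream`, and the two backends are hypothetical names for illustration, not Venturi's implementation.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

BUDGET_SECONDS = 0.050  # the 50 ms decision-time budget described above

_pool = ThreadPoolExecutor(max_workers=4)


def forward_upstream(request: dict) -> dict:
    """Stand-in for the call to the customer's model provider (hypothetical)."""
    return {"status": 200, "echo": request}


def gateway(request: dict, record_attribution) -> dict:
    """Fail-open: attribution is best effort, forwarding is unconditional."""
    future = _pool.submit(record_attribution, request)
    try:
        future.result(timeout=BUDGET_SECONDS)
    except FutureTimeout:
        pass  # budget exceeded: stop waiting instead of blocking the request
    except Exception:
        pass  # attribution backend down or recovering: same outcome
    return forward_upstream(request)  # the request itself is never modified


def slow_backend(request: dict) -> None:
    time.sleep(0.2)  # simulates a control plane that is mid-restore


def broken_backend(request: dict) -> None:
    raise ConnectionError("attribution store unavailable")


req = {"model": "example-model", "prompt": "hello"}
slow = gateway(req, slow_backend)
broken = gateway(req, broken_backend)
print(slow["status"], broken["status"])  # 200 200
```

In both failure cases the caller still receives the upstream response. What is lost during the window is the attribution record, which matches the visibility-only degradation described above.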

The recovery design: event sourcing

The reason the RPO can be tight is architectural. Venturi treats the durable, replicated invocation log as the source of truth, and the attribution graph and the materialized index as derived projections of it. That gives every store a recovery story matched to its role.

Store | Role | Recovery
Invocation log | Durable, ordered, replicated source of truth | Replication tolerates node loss; tiered storage retains history for disaster recovery
Attribution graph | Resolved identity → organization → budget structure | Restored from snapshot; can be fully rebuilt by replaying the invocation log
Materialized index | Fast read surface for dashboards and the API | Architecturally rebuildable; reconstructed from the graph in seconds, no separate backup required
Object storage | Audit log, billing reports, export artifacts | High intrinsic durability with versioning enabled

Because the graph and index are derivable, a catastrophic loss of a serving store is recovered by replay, not by hoping a single backup is intact — the durable log lets Venturi reconstruct attribution deterministically up to the recovery point.
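Recovery by replay can be sketched as a fold over the ordered log. The event fields, the graph layout, and the `rebuild_graph` name below are assumptions chosen to mirror the identity → organization → budget structure; the real schema is not documented here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Invocation:
    """One entry in the invocation log (hypothetical, simplified schema)."""
    offset: int          # position in the ordered log
    identity: str
    organization: str
    budget: str
    tokens: int


def rebuild_graph(log, recovery_point: int) -> dict:
    """Replay events in log order, up to and including recovery_point."""
    graph = {"org_of": {}, "budget_of": {}, "spend": {}}
    for event in sorted(log, key=lambda e: e.offset):
        if event.offset > recovery_point:
            break  # nothing past the recovery point is applied
        graph["org_of"][event.identity] = event.organization
        graph["budget_of"][event.organization] = event.budget
        graph["spend"][event.budget] = graph["spend"].get(event.budget, 0) + event.tokens
    return graph


log = [
    Invocation(0, "alice", "acme", "research", 120),
    Invocation(1, "bob", "acme", "research", 80),
    Invocation(2, "alice", "acme", "research", 50),
]

at_one = rebuild_graph(log, recovery_point=1)
print(at_one["spend"])  # {'research': 200}

# Determinism: the same log and recovery point give the same projection,
# regardless of the order in which the events are handed to the replay.
assert rebuild_graph(list(reversed(log)), 1) == at_one
```

Because the projection depends only on the log contents and the recovery point, a lost graph can be reproduced rather than restored, and a snapshot becomes a shortcut that saves replay time rather than the only copy.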

Backup and restore

  • Automated backups cover every persistent store: the event log, the attribution graph, configuration, audit log, and export metadata.
  • Restores are validated before the service is marked healthy. A restore verifies both data integrity and tenant isolation — a recovered serving plane is never returned to service until cross-tenant boundaries are confirmed intact. Tenant isolation fails closed during recovery exactly as it does in normal operation.
  • The audit log is preserved across recovery. Backup and restore operations are themselves recorded, so the immutable audit trail spans recovery events. See Audit logs.
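The validation gate can be illustrated with a small sketch that checks integrity and tenant isolation together and fails closed on any doubt. The partition layout (tenant id mapped to rows of `(owner, payload)`), the checksum scheme, and the function names are assumptions made for the example, not Venturi's restore tooling.

```python
import hashlib


def checksum(rows) -> str:
    """Order-independent digest of one tenant partition."""
    digest = hashlib.sha256()
    for owner, payload in sorted(rows):
        digest.update(f"{owner}\x00{payload}\n".encode("utf-8"))
    return digest.hexdigest()


def validate_restore(restored: dict, expected: dict) -> bool:
    """Return True only if every check passes; any error counts as failure."""
    try:
        if set(restored) != set(expected):
            return False  # a tenant partition is missing or unexpected
        for tenant, rows in restored.items():
            if any(owner != tenant for owner, _ in rows):
                return False  # isolation: a foreign row sits in this partition
            if checksum(rows) != expected[tenant]:
                return False  # integrity: contents differ from the backup manifest
        return True
    except Exception:
        return False  # fail closed: an unreadable restore is not a healthy one


good = {"t1": [("t1", "evt-a")], "t2": [("t2", "evt-b")]}
expected = {tenant: checksum(rows) for tenant, rows in good.items()}

leaky = {"t1": [("t1", "evt-a"), ("t2", "evt-b")], "t2": []}
corrupt = {"t1": [("t1", "evt-X")], "t2": [("t2", "evt-b")]}
malformed = {"t1": None, "t2": []}

print(validate_restore(good, expected))       # True
print(validate_restore(leaky, expected))      # False
print(validate_restore(corrupt, expected))    # False
print(validate_restore(malformed, expected))  # False
```

The design choice worth noting is the default: the serving plane is marked healthy only on an explicit `True`, so an exception during validation has the same effect as a failed check.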

Crypto-shred erasure survives recovery

A subject erased by crypto-shred stays erased after a restore: the per-subject key is destroyed, so the subject's content is cryptographically unrecoverable even from backups. Recovery restores the system to a consistent state — it does not resurrect data that was lawfully erased.
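The following sketch shows why a restore cannot undo a crypto-shred. It assumes per-subject keys live in a key store that is separate from the backed-up data and that destroyed keys are never restored. The cipher is a toy HMAC-based keystream built from the standard library purely for illustration; a real system would use a vetted authenticated cipher. `SubjectVault` and its methods are hypothetical names.

```python
import hashlib
import hmac
import secrets


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream (illustration only, not production cryptography)."""
    out = b""
    counter = 0
    while len(out) < length:
        block = nonce + counter.to_bytes(8, "big")
        out += hmac.new(key, block, hashlib.sha256).digest()
        counter += 1
    return out[:length]


def _xor(data: bytes, stream: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(data, stream))


class SubjectVault:
    def __init__(self) -> None:
        self.keys = {}     # per-subject keys: assumed NOT part of data backups
        self.records = {}  # ciphertext: this is what backups contain

    def write(self, subject: str, plaintext: bytes) -> None:
        key = self.keys.setdefault(subject, secrets.token_bytes(32))
        nonce = secrets.token_bytes(16)
        stream = _keystream(key, nonce, len(plaintext))
        self.records[subject] = (nonce, _xor(plaintext, stream))

    def read(self, subject: str) -> bytes:
        key = self.keys[subject]  # raises KeyError once the key is shredded
        nonce, ciphertext = self.records[subject]
        return _xor(ciphertext, _keystream(key, nonce, len(ciphertext)))

    def crypto_shred(self, subject: str) -> None:
        self.keys.pop(subject, None)  # destroying the key is the erasure


vault = SubjectVault()
vault.write("subject-1", b"prompt text")
vault.write("subject-2", b"other text")

backup = dict(vault.records)     # backup taken before the erasure request
vault.crypto_shred("subject-1")
vault.records = dict(backup)     # disaster-recovery restore of the data

try:
    vault.read("subject-1")
    erased_stays_erased = False
except KeyError:
    erased_stays_erased = True

print(erased_stays_erased, vault.read("subject-2"))  # True b'other text'
```

The restored backup still holds the ciphertext for `subject-1`, but without the key it is only opaque bytes, while other subjects remain readable.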

Rehearsal and evidence

Recovery is tested on a schedule appropriate to the deployment. The procedure is documented and rehearsed, not assumed. Each rehearsal produces evidence you can review:

  • Timestamps for the recovery run.
  • The components restored.
  • Validation results — integrity and tenant-isolation checks.
  • Any corrective actions taken.

This rehearsal evidence is part of Venturi's availability and recovery posture and is available through the Trust Center for security and procurement review.
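As an illustration of what a rehearsal record could look like, the sketch below gathers the four evidence items listed above into one structure. The class and field names are hypothetical; the actual format published through the Trust Center may differ.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class RehearsalEvidence:
    """One recovery rehearsal (hypothetical record shape)."""
    started_at: datetime
    finished_at: datetime
    components_restored: list
    validation: dict                      # check name -> passed?
    corrective_actions: list = field(default_factory=list)

    def to_json(self) -> str:
        doc = asdict(self)
        doc["started_at"] = self.started_at.isoformat()
        doc["finished_at"] = self.finished_at.isoformat()
        # Derived fields make the record easy to compare against the RTO.
        doc["duration_seconds"] = (self.finished_at - self.started_at).total_seconds()
        doc["passed"] = all(self.validation.values())
        return json.dumps(doc, indent=2)


evidence = RehearsalEvidence(
    started_at=datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc),
    finished_at=datetime(2024, 1, 1, 9, 38, tzinfo=timezone.utc),
    components_restored=["invocation log", "attribution graph", "materialized index"],
    validation={"integrity": True, "tenant_isolation": True},
    corrective_actions=["example: updated runbook step ordering"],
)
print(evidence.to_json())
```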