PII Vaulting Architecture: How We Achieve 99.9% Recall

Alex Sterling

When we set out to build Autark's PII vaulting system, we had one non-negotiable target: push false negatives as close to zero as possible at production scale. A single missed SSN or API key could mean regulatory action, eroded customer trust, or worse.

The Detection Pipeline

Our detection engine runs a multi-pass architecture:

  1. Pattern matching — Fast regex-based detection for structured PII (SSNs, credit cards, phone numbers)
  2. NER classification — A fine-tuned named entity recognition model for unstructured PII (names, addresses, medical terms)
  3. Contextual analysis — Semantic understanding of surrounding context to catch edge cases
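
To make the first pass concrete, here is a minimal sketch of regex-based detection for structured PII. The patterns, the `Span` type, and the `detect_structured` function are illustrative assumptions, not Autark's actual rules; real patterns would need locale handling and far more validation. The Luhn checksum is included as one common way to cut false positives on card-like digit runs.

```python
import re
from typing import NamedTuple


class Span(NamedTuple):
    start: int
    end: int
    label: str
    text: str


# Illustrative US-centric patterns (assumptions, not a production rule set).
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "CREDIT_CARD": re.compile(r"\b\d(?:[ -]?\d){12,15}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}


def luhn_ok(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def detect_structured(text: str) -> list[Span]:
    """Run every pattern over the text and return matches in document order."""
    spans = []
    for label, pattern in PATTERNS.items():
        for m in pattern.finditer(text):
            # Drop card-like digit runs that fail the checksum.
            if label == "CREDIT_CARD" and not luhn_ok(re.sub(r"\D", "", m.group())):
                continue
            spans.append(Span(m.start(), m.end(), label, m.group()))
    return sorted(spans)


if __name__ == "__main__":
    sample = "SSN 123-45-6789, card 4111 1111 1111 1111, call 555-123-4567."
    for span in detect_structured(sample):
        print(span.label, repr(span.text))
```

Regexes alone over-match and under-match in practice, which is why the later passes exist.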

The three passes run in parallel on dedicated hardware. Total processing time is under 12 ms for a 200K-token payload.
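
One way to picture the parallel fan-out is the sketch below. It is a simplified illustration, not the production design: `ner_pass` and `context_pass` are toy stand-ins for the real model and contextual analysis, and a thread pool stands in for dedicated hardware. The merge step, which unions overlapping spans so that no character flagged by any pass is dropped, is also an assumption about how results could be combined.

```python
import re
from concurrent.futures import ThreadPoolExecutor

Span = tuple[int, int, str]  # (start, end, label)


def pattern_pass(text: str) -> list[Span]:
    return [(m.start(), m.end(), "SSN")
            for m in re.finditer(r"\b\d{3}-\d{2}-\d{4}\b", text)]


def ner_pass(text: str) -> list[Span]:
    # Toy stand-in for an NER model: a fixed name list.
    names = ["Jane Doe"]
    return [(m.start(), m.end(), "PERSON")
            for name in names for m in re.finditer(re.escape(name), text)]


def context_pass(text: str) -> list[Span]:
    # Toy stand-in for contextual analysis: flag the word after "password is".
    return [(m.start(1), m.end(1), "SECRET")
            for m in re.finditer(r"password is (\w+)", text)]


def merge(spans: list[Span]) -> list[Span]:
    """Union overlapping spans, keeping the label of the earliest one."""
    merged: list[Span] = []
    for start, end, label in sorted(spans):
        if merged and start < merged[-1][1]:
            prev_start, prev_end, prev_label = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end), prev_label)
        else:
            merged.append((start, end, label))
    return merged


def detect(text: str) -> list[Span]:
    passes = (pattern_pass, ner_pass, context_pass)
    # Threads only illustrate the structure; CPU-bound passes would more
    # likely run in separate processes or services.
    with ThreadPoolExecutor(max_workers=len(passes)) as pool:
        results = list(pool.map(lambda p: p(text), passes))
    return merge([span for result in results for span in result])


if __name__ == "__main__":
    text = "Jane Doe's SSN is 123-45-6789 and her password is hunter2."
    for start, end, label in detect(text):
        print(label, repr(text[start:end]))
```

Taking the union rather than the intersection of pass outputs favors recall over precision, which matches the priority stated at the top of this post.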

The Vault

Detected entities are replaced with deterministic vault tokens ([VAULT_ID_xxx]) and stored in an encrypted local vault within your VPC. The vault supports:

  • AES-256 encryption at rest
  • Automatic TTL-based expiry
  • Full audit logging for compliance
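
The sketch below shows how deterministic tokens, TTL expiry, and audit logging could fit together. Everything here is hypothetical: the `Vault` class, its method names, and the HMAC-based token derivation are assumptions chosen for illustration, and encryption at rest is deliberately left out because the Python standard library has no AES implementation.

```python
import hashlib
import hmac
import time
from typing import Callable, Optional


class Vault:
    """In-memory stand-in for an encrypted vault (illustrative only)."""

    def __init__(self, key: bytes, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._key = key
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}   # token -> (value, expires_at)
        self.audit_log = []  # (event, token) tuples

    def token_for(self, value: str) -> str:
        # A keyed hash gives the same token for the same value without
        # exposing the value itself. The derivation is an assumption.
        digest = hmac.new(self._key, value.encode("utf-8"), hashlib.sha256).hexdigest()
        return f"[VAULT_ID_{digest[:12]}]"

    def put(self, value: str) -> str:
        token = self.token_for(value)
        # A real vault would encrypt `value` (e.g. AES-256) before storing it.
        self._entries[token] = (value, self._clock() + self._ttl)
        self.audit_log.append(("put", token))
        return token

    def get(self, token: str) -> Optional[str]:
        entry = self._entries.get(token)
        if entry is None:
            self.audit_log.append(("miss", token))
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[token]  # lazy TTL expiry on read
            self.audit_log.append(("expired", token))
            return None
        self.audit_log.append(("get", token))
        return value


if __name__ == "__main__":
    vault = Vault(key=b"demo-key", ttl_seconds=60)
    token = vault.put("123-45-6789")
    print(token, vault.get(token))
```

Determinism means a value that appears twice in a prompt maps to the same token both times, so the LLM can still tell that two mentions refer to the same entity.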

When the LLM response arrives, Autark replaces each vault token with its original value, achieving 100% rehydration accuracy.
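
Rehydration can be sketched as a single regex substitution over the response. This is a minimal illustration that assumes tokens follow the `[VAULT_ID_xxx]` shape with an alphanumeric ID and that the vault behaves like a token-to-value mapping; the `rehydrate` function is a hypothetical name.

```python
import re

# Assumed token shape: [VAULT_ID_<alphanumeric id>]
TOKEN_RE = re.compile(r"\[VAULT_ID_[0-9A-Za-z]+\]")


def rehydrate(text: str, vault: dict[str, str]) -> str:
    """Replace every known vault token with its original value.

    Unknown tokens are left in place so a missing entry is visible
    in the output instead of being silently dropped.
    """
    return TOKEN_RE.sub(lambda m: vault.get(m.group(), m.group()), text)


if __name__ == "__main__":
    vault = {"[VAULT_ID_a1b2]": "Jane Doe"}
    print(rehydrate("Dear [VAULT_ID_a1b2], your request is approved.", vault))
```

Because the substitution is an exact lookup, accuracy here depends on the LLM reproducing tokens verbatim in its response.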

Benchmarks

Tested on 5 million tokens of real corporate communications:

| Metric | Score |
|--------|-------|
| Recall | 99.9% |
| Precision | 99.7% |
| Rehydration | 100% |
| P99 Latency | 11.2ms |
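
For readers who want to reproduce this kind of measurement on their own data, recall and precision can be computed from labeled spans as sketched below. The exact-match criterion and the `score` function are assumptions; this post does not specify whether the numbers above are counted per entity or per token. The spans in the example are toy values, not benchmark data.

```python
Span = tuple[int, int]  # (start, end) character offsets


def score(gold: set[Span], predicted: set[Span]) -> tuple[float, float]:
    """Return (precision, recall) using exact span matching."""
    true_positives = len(gold & predicted)
    # Convention chosen here: an empty denominator scores as 1.0.
    precision = true_positives / len(predicted) if predicted else 1.0
    recall = true_positives / len(gold) if gold else 1.0
    return precision, recall


if __name__ == "__main__":
    # Toy example: two of three gold spans found, plus one false positive.
    gold = {(0, 8), (18, 29), (50, 57)}
    predicted = {(0, 8), (18, 29), (60, 64)}
    precision, recall = score(gold, predicted)
    print(f"precision={precision:.3f} recall={recall:.3f}")
```

A missed span lowers recall and a spurious span lowers precision, so a system that prioritizes recall will usually accept some loss in precision.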