PII Vaulting Architecture: How We Achieve 99.9% Recall
When we set out to build Autark's PII vaulting system, we had one non-negotiable requirement: drive false negatives as close to zero as possible at production scale. A single missed SSN or API key could mean regulatory action, eroded customer trust, or worse.
The Detection Pipeline
Our detection engine runs a multi-pass architecture:
- Pattern matching — Fast regex-based detection for structured PII (SSNs, credit cards, phone numbers)
- NER classification — A fine-tuned named entity recognition model for unstructured PII (names, addresses, medical terms)
- Contextual analysis — Semantic understanding of surrounding context to catch edge cases
The three passes run in parallel on dedicated hardware. Total processing time is under 12 ms for a 200K-token payload.
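The multi-pass structure described above can be sketched roughly as follows. This is a minimal illustration, not Autark's implementation: the `Span` type, the two regexes, and the function names are all hypothetical, and the NER and contextual passes are stubs that return nothing. Threads are used here only to show the fan-out and merge shape.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, NamedTuple, Sequence


class Span(NamedTuple):
    """A detected entity: character offsets plus a label (hypothetical type)."""
    start: int
    end: int
    label: str


# Illustrative patterns only; real patterns would be stricter and validated
# (for example, a Luhn check on card numbers).
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "CREDIT_CARD": re.compile(r"\b(?:\d{4}[- ]?){3}\d{4}\b"),
}


def pattern_pass(text: str) -> List[Span]:
    """Pass 1: regex detection of structured PII."""
    return [
        Span(m.start(), m.end(), label)
        for label, rx in PATTERNS.items()
        for m in rx.finditer(text)
    ]


def ner_pass(text: str) -> List[Span]:
    """Pass 2: stub standing in for the fine-tuned NER model."""
    return []


def contextual_pass(text: str) -> List[Span]:
    """Pass 3: stub standing in for contextual analysis."""
    return []


def detect(
    text: str,
    passes: Sequence[Callable[[str], List[Span]]] = (
        pattern_pass,
        ner_pass,
        contextual_pass,
    ),
) -> List[Span]:
    """Run every pass concurrently and merge the spans, sorted by offset."""
    with ThreadPoolExecutor(max_workers=len(passes)) as pool:
        results = list(pool.map(lambda p: p(text), passes))
    # A set removes spans reported identically by more than one pass.
    return sorted({span for result in results for span in result})
```

Calling `detect("SSN 123-45-6789")` returns a list of `Span` values whose offsets can be used to slice the matched text out of the payload.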
The Vault
Detected entities are replaced with deterministic vault tokens ([VAULT_ID_xxx]) and stored in an encrypted local vault within your VPC. The vault supports:
- AES-256 encryption at rest
- Automatic TTL-based expiry
- Full audit logging for compliance
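To make the vault behavior concrete, here is a small in-memory sketch of deterministic tokens, TTL expiry, and audit logging. Several things are assumptions made for illustration: the `Vault` class and its methods are hypothetical, the token is derived with HMAC-SHA256 (the post does not say how tokens are derived), and encryption at rest is left out entirely to keep the example standard-library only.

```python
import hashlib
import hmac
import time
from typing import Dict, List, Optional, Tuple


class Vault:
    """In-memory stand-in for the vault. Encryption at rest is omitted here."""

    def __init__(self, secret: bytes, ttl_seconds: float) -> None:
        self._secret = secret
        self._ttl = ttl_seconds
        # token -> (original value, expiry timestamp)
        self._entries: Dict[str, Tuple[str, float]] = {}
        # (action, token, timestamp) records; values are never logged
        self.audit_log: List[Tuple[str, str, float]] = []

    def tokenize(self, value: str, now: Optional[float] = None) -> str:
        """Store a value and return its token; equal values yield equal tokens."""
        now = time.time() if now is None else now
        # Assumption: a keyed hash makes the token deterministic per secret.
        digest = hmac.new(
            self._secret, value.encode("utf-8"), hashlib.sha256
        ).hexdigest()
        token = f"[VAULT_ID_{digest[:16]}]"
        self._entries[token] = (value, now + self._ttl)
        self.audit_log.append(("store", token, now))
        return token

    def lookup(self, token: str, now: Optional[float] = None) -> Optional[str]:
        """Return the original value, or None if unknown or past its TTL."""
        now = time.time() if now is None else now
        entry = self._entries.get(token)
        if entry is None:
            return None
        value, expires_at = entry
        if now >= expires_at:
            del self._entries[token]
            self.audit_log.append(("expire", token, now))
            return None
        self.audit_log.append(("read", token, now))
        return value
```

Because the token depends only on the secret and the value, the same SSN appearing twice in a payload maps to the same token, so the LLM still sees that both mentions refer to one entity.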
When the LLM response arrives, Autark rehydrates it by replacing each vault token with the original data. In our benchmarks, rehydration accuracy was 100%.
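Rehydration amounts to a token-for-value substitution over the response text. The sketch below assumes tokens have the form `[VAULT_ID_<alphanumerics>]` and that the vault can be read as a plain mapping. Both are simplifications, and the `rehydrate` function is a hypothetical name.

```python
import re
from typing import Mapping

# Assumed token shape, based on the [VAULT_ID_xxx] format mentioned above.
TOKEN_RE = re.compile(r"\[VAULT_ID_[0-9A-Za-z]+\]")


def rehydrate(response: str, vault: Mapping[str, str]) -> str:
    """Replace each known vault token in the response with its original value."""

    def restore(match: "re.Match[str]") -> str:
        token = match.group(0)
        # Unknown or expired tokens are left in place rather than guessed at.
        return vault.get(token, token)

    return TOKEN_RE.sub(restore, response)
```

Leaving unresolved tokens untouched is a deliberate choice in this sketch: a visible placeholder is easier to detect and audit than a silently dropped value.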
Benchmarks
Tested on 5 million tokens of real corporate communications:
| Metric | Score |
|--------|-------|
| Recall | 99.9% |
| Precision | 99.7% |
| Rehydration | 100% |
| P99 Latency | 11.2ms |
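For readers less familiar with the metrics, recall and precision follow the standard definitions: recall is the share of true PII entities that were detected, and precision is the share of detections that were truly PII. The helper below shows the arithmetic. The function names are hypothetical, and any counts passed in are the caller's own, not figures from the benchmark above.

```python
def recall(true_positives: int, false_negatives: int) -> float:
    """Fraction of real PII entities that the detector caught."""
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("recall is undefined when there are no real entities")
    return true_positives / total


def precision(true_positives: int, false_positives: int) -> float:
    """Fraction of flagged entities that were actually PII."""
    total = true_positives + false_positives
    if total == 0:
        raise ValueError("precision is undefined when nothing was flagged")
    return true_positives / total
```

For a vaulting system, a false negative leaks data to the LLM while a false positive only vaults something harmless, which is why recall is the headline number.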