The Short Version
SafeRedact uses AI to detect sensitive information in your documents. Here's exactly what that means for your data:
Original files stay local
Original PDF files stay in your browser
Text sent to AI
Extracted text is analyzed by Claude AI
No file storage
We don't store your documents anywhere
Redaction is local
Clean file created in your browser
Our Hybrid Architecture
We use a hybrid approach that balances AI accuracy with privacy protection. Here's the step-by-step process:
Text Extraction (Your Browser)
When you upload a document, your browser extracts the text using PDF.js (for PDFs) or Tesseract.js OCR (for scanned documents). The original file stays in your browser memory.
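As a rough sketch of this step, the function below turns extracted text fragments into ID-tagged items and keeps each fragment's page position in a separate lookup. The `ExtractedFragment` shape is loosely modeled on the `str` and `transform` fields of a PDF.js text item. The name `indexFragments`, the `t1`-style IDs, and the lookup layout are assumptions made for illustration, not SafeRedact's actual code.

```typescript
// Hypothetical input shape, loosely modeled on a PDF.js text item:
// `str` is the text; transform[4] and transform[5] are the x and y offsets.
interface ExtractedFragment {
  str: string;
  transform: number[];
  width: number;
  height: number;
}

interface TextItem {
  id: string;
  text: string;
}

interface LocalPosition {
  page: number;
  x: number;
  y: number;
  width: number;
  height: number;
}

interface ExtractionIndex {
  textItems: TextItem[];                 // text strings, one per fragment
  positions: Map<string, LocalPosition>; // lookup from item ID to page region
}

// Assigns sequential IDs and splits each fragment into text and position.
function indexFragments(pages: ExtractedFragment[][]): ExtractionIndex {
  const textItems: TextItem[] = [];
  const positions = new Map<string, LocalPosition>();
  pages.forEach((fragments, pageIndex) => {
    for (const fragment of fragments) {
      const text = fragment.str.trim();
      if (text.length === 0) continue; // skip whitespace-only fragments
      const id = `t${textItems.length + 1}`;
      textItems.push({ id, text });
      positions.set(id, {
        page: pageIndex + 1,
        x: fragment.transform[4] ?? 0,
        y: fragment.transform[5] ?? 0,
        width: fragment.width,
        height: fragment.height,
      });
    }
  });
  return { textItems, positions };
}
```

Keeping text and geometry in separate structures makes it easy to see, and to audit, which of the two a later step actually uses.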
AI Detection (Cloud)
The extracted text (with position coordinates) is sent to our API, which uses Claude AI to classify which items are sensitive: SSNs, names, addresses, phone numbers, etc. This is what enables accurate, context-aware detection.
Review & Redact (Your Browser)
The AI returns classification results. You review the detections, adjust as needed, and click Redact. The clean output file is generated entirely in your browser—a new PDF without the sensitive data.
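The review step can be pictured as a small merge: start from the AI detections, drop the ones you reject, add any you mark by hand, and resolve the result to page regions. The sketch below assumes a response entry of the form `{ id, category }` and a position lookup keyed by item ID. Neither shape is documented here, so treat both, along with the name `resolveRedactions`, as hypothetical.

```typescript
// Hypothetical response entry: a text item ID plus the assigned PII category.
interface Detection {
  id: string;
  category: string; // e.g. "SSN" or "NAME"; the label set is assumed
}

interface Box {
  page: number;
  x: number;
  y: number;
  width: number;
  height: number;
}

// Returns the boxes to redact: every detection the user has not rejected,
// plus any items the user added by hand, resolved against known positions.
function resolveRedactions(
  detections: Detection[],
  positions: Map<string, Box>,
  rejected: Set<string>,
  addedByUser: string[] = [],
): Box[] {
  const ids = new Set<string>();
  for (const detection of detections) {
    if (!rejected.has(detection.id)) ids.add(detection.id);
  }
  for (const id of addedByUser) ids.add(id);

  const boxes: Box[] = [];
  ids.forEach((id) => {
    const box = positions.get(id);
    if (box !== undefined) boxes.push(box); // ignore IDs with no known position
  });
  return boxes;
}
```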
What IS Sent to the Cloud
Extracted text strings only
Source documents (PDF, DOCX, XLSX, etc.) are parsed entirely in your browser. What leaves your device is the extracted text content — the strings the classifier needs to identify PII — over TLS 1.3. The original file binary never leaves the browser; nothing is uploaded.
// What's actually transmitted:
POST /api/detect (TLS 1.3)
Content-Type: application/json

{
  "textItems": [
    { "id": "t1", "text": "Patient John Smith, DOB 12/4/1972" },
    { "id": "t2", "text": "Re: Account 4521-9907" }
  ]
}
Only the extracted text strings are transmitted. The PDF, DOCX, or other source file binary stays in the browser.
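One way to enforce a payload like this in client code is to copy an explicit whitelist of fields before serializing, so that nothing else attached to the in-memory objects can reach the wire. The sketch below is illustrative only. `buildDetectBody` is a hypothetical helper, and the extra fields in the usage note are made up to show what gets dropped.

```typescript
interface TextItem {
  id: string;
  text: string;
}

// Builds the JSON body for the detection request, copying only `id` and `text`
// so that any extra properties on the in-memory objects are not serialized.
function buildDetectBody(items: ReadonlyArray<TextItem>): string {
  const textItems = items.map((item) => ({ id: item.id, text: item.text }));
  return JSON.stringify({ textItems });
}
```

For example, an item held in memory as `{ id: "t1", text: "...", fileName: "chart.pdf" }` serializes with only its `id` and `text` keys. The resulting string would be passed as the body of a `POST /api/detect` request with `Content-Type: application/json`.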
The text is processed by Anthropic's Claude API, which does not store or train on API inputs (zero-retention headers). Data is processed in memory and immediately discarded. See Anthropic's Privacy Policy for details.
Transport Security
TLS 1.3 + Browser Isolation
The strongest claim is what doesn't leave your browser
SafeRedact's transport security rests on two facts: (1) your source document file never leaves your browser, and (2) the extracted text that does leave travels over TLS 1.3 — the same transport encryption used by banks and government services. The structural protection — keeping the document binary out of our infrastructure entirely — eliminates whole categories of breach exposure that no amount of in-flight encryption could address.
Earlier versions of SafeRedact added a custom application-layer AES-256-GCM encryption step on top of TLS, claiming defense-in-depth. On review, that layer transmitted the symmetric key in the same request body as the ciphertext, so it provided no security beyond what TLS already offers. We removed it in May 2026 and updated this whitepaper to describe what the system actually does. Transparency on architecture changes is itself a security commitment.
What protects your data
- Source file binary never leaves the browser
- Only extracted text strings are transmitted
- TLS 1.3 confidentiality & integrity in transit
- Zero-retention LLM API (no training, no logs)
- Origin-locked CORS on every endpoint
- Nothing persists server-side — nothing to breach
Technical details
- Transport: TLS 1.3 (HTTPS)
- Source files: Browser-only, never uploaded
- LLM provider: Anthropic Claude
- LLM retention: Zero (no-train header)
- Server-side storage of content: None
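To make "origin-locked CORS" concrete, here is a minimal sketch of the check an endpoint might run before answering a browser request. The header names are the standard CORS response headers. The function name `corsHeadersFor` and the example origin are assumptions, not SafeRedact's actual configuration.

```typescript
// Returns CORS response headers for a request, or an empty object when the
// Origin header does not exactly match one of the allowed origins.
function corsHeadersFor(
  requestOrigin: string | undefined,
  allowedOrigins: ReadonlyArray<string>,
): Record<string, string> {
  if (requestOrigin === undefined || allowedOrigins.indexOf(requestOrigin) === -1) {
    // Without Access-Control-Allow-Origin, the browser refuses to expose
    // the response to the calling page.
    return {};
  }
  return {
    "Access-Control-Allow-Origin": requestOrigin, // echo the exact origin, never "*"
    "Access-Control-Allow-Methods": "POST, OPTIONS",
    "Access-Control-Allow-Headers": "Content-Type",
    Vary: "Origin", // stop caches from reusing the response across origins
  };
}
```

Note that CORS restricts which web pages can read responses in a browser. It is not authentication and does not stop non-browser clients, so it complements the other protections listed above without replacing them.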
This architecture lets SafeRedact support data subject access requests in regulated industries (finance, healthcare, legal, government) while keeping the source file itself inside your browser. The extracted text still passes through our API and Anthropic's, so include that text flow in your own data-handling review.
What Is NOT Sent
- Original PDF binary: stays in your browser; only extracted text is sent
- Formatting, layout, fonts, embedded objects
- Metadata: author, creation date, filename, etc.
Important: We do not store documents on SafeRedact servers. There is no retention period because there is nothing retained. Processing is ephemeral—when you close the tab, it's gone.
How We Compare to Other Tools
| Tool | File Upload | Detection | Retention |
|---|---|---|---|
| SafeRedact | Text only | AI (Claude) | None |
| Adobe Acrobat Pro | Local | Regex patterns | N/A |
| Smallpdf | Full file | Manual only | 1 hour |
| iLovePDF | Full file | Manual only | 2 hours |
| Redactable | Full file | AI | Account storage |
Visual: Where Your Document Goes
[Diagram. Legend: "Stored & processed" versus "Not stored · Discarded".]
SafeRedact advantage
AI-powered detection accuracy without uploading your actual documents. Other AI redaction tools require full file uploads and store documents in your account.
Tradeoff
Desktop apps like Adobe are fully local, but use basic pattern matching that misses context-dependent PII like names and addresses.
Compliance Considerations
Important Disclaimer
SafeRedact is not certified for HIPAA, GLBA, FERPA, or other industry-specific regulations. While we minimize data exposure, extracted text is processed via a third-party AI API. Organizations with strict compliance requirements should evaluate whether this meets their policies.
What this means for HIPAA
- Text from documents is sent to Anthropic's API
- Anthropic does not store API inputs by default
- SafeRedact does not have a BAA in place
- Consult your compliance officer before use with PHI
What this means for GDPR
- Text processing involves a US-based data processor (Anthropic)
- No persistent storage of personal data
- Processing is ephemeral (no retention)
- Review your DPA requirements
Frequently Asked Questions
Why not process everything locally?
We tried. Browser-based regex detection catches obvious patterns (SSNs, credit cards) but consistently misses context-dependent PII like names and addresses. AI provides dramatically better accuracy, which is the whole point of a redaction tool—you need to catch everything.
Is my data used to train AI models?
No. We use Anthropic's API, which does not use API inputs for model training by default. See Anthropic's Privacy Policy.
Why is this better than Smallpdf or iLovePDF?
Those tools upload your entire PDF to their servers and store it for 1-2 hours. We only send extracted text, and we don't store anything. Plus, they don't have AI detection—you have to manually find and mark every sensitive item.
Can I use SafeRedact offline?
Text extraction and redaction work offline, but AI detection requires an internet connection. Without it, the tool falls back to basic regex patterns (less accurate but still functional for SSNs, credit cards, etc.).
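For a sense of what a regex fallback can and cannot do, the sketch below matches US SSN-shaped strings and card-number-shaped digit runs, using a Luhn checksum to discard runs that only look like card numbers. The patterns and the names `detectWithPatterns` and `passesLuhn` are illustrative assumptions, not the rules SafeRedact ships.

```typescript
interface PatternMatch {
  category: "SSN" | "CREDIT_CARD";
  text: string;
  index: number;
}

// Luhn checksum: doubles every second digit from the right and checks
// that the total is a multiple of 10.
function passesLuhn(digits: string): boolean {
  let sum = 0;
  let doubleNext = false;
  for (let i = digits.length - 1; i >= 0; i--) {
    let d = digits.charCodeAt(i) - 48;
    if (doubleNext) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
    doubleNext = !doubleNext;
  }
  return digits.length > 0 && sum % 10 === 0;
}

function detectWithPatterns(text: string): PatternMatch[] {
  const matches: PatternMatch[] = [];
  const ssn = /\b\d{3}-\d{2}-\d{4}\b/g;
  // 13 to 19 digits, optionally separated by single spaces or hyphens.
  const card = /\b\d(?:[ -]?\d){12,18}\b/g;
  let m: RegExpExecArray | null;
  while ((m = ssn.exec(text)) !== null) {
    matches.push({ category: "SSN", text: m[0], index: m.index });
  }
  while ((m = card.exec(text)) !== null) {
    const digits = m[0].replace(/[ -]/g, "");
    if (passesLuhn(digits)) {
      matches.push({ category: "CREDIT_CARD", text: m[0], index: m.index });
    }
  }
  return matches.sort((a, b) => a.index - b.index);
}
```

Run on a sentence such as "Patient John Smith, SSN 123-45-6789", this finds the SSN but not the name, which is the gap that context-aware detection is meant to close.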
Is the redaction permanent?
Yes. SafeRedact renders your document as a new image with the redaction boxes burned in. The sensitive content is not hidden under a removable layer: the text and pixels behind each box are absent from the exported file, so they cannot be recovered by copy/paste, Photoshop, or any other method.
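The idea of "burning in" a redaction can be shown on a raw pixel buffer: once the pixels inside a box are overwritten, the original values no longer exist anywhere in the image data. This is a minimal sketch over RGBA data in the same layout as a canvas `ImageData.data` array. The function name `burnRedactions` and the box format are assumptions, and SafeRedact's actual rendering pipeline may differ.

```typescript
interface PixelBox {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Overwrites every pixel inside each box with opaque black, in place.
// `pixels` holds RGBA values in row-major order, 4 bytes per pixel.
function burnRedactions(
  pixels: Uint8ClampedArray,
  width: number,
  height: number,
  boxes: PixelBox[],
): void {
  for (const box of boxes) {
    // Clamp the box to the image bounds and round outward to whole pixels.
    const x0 = Math.max(0, Math.floor(box.x));
    const y0 = Math.max(0, Math.floor(box.y));
    const x1 = Math.min(width, Math.ceil(box.x + box.width));
    const y1 = Math.min(height, Math.ceil(box.y + box.height));
    for (let y = y0; y < y1; y++) {
      for (let x = x0; x < x1; x++) {
        const i = (y * width + x) * 4;
        pixels[i] = 0;       // red
        pixels[i + 1] = 0;   // green
        pixels[i + 2] = 0;   // blue
        pixels[i + 3] = 255; // alpha: fully opaque
      }
    }
  }
}
```

Because the output is built from the overwritten pixels and carries no text layer, there is nothing underneath the box for a viewer or editor to reveal.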
Start redacting in seconds
AI-powered detection with no document storage. Upload a PDF, review the detections, and download the redacted file. Free tier available.