How SafeRedact Protects Your Documents

The Short Version

SafeRedact uses AI to detect sensitive information in your documents. Here's exactly what that means for your data:

Original files stay local

Original PDF files stay in your browser

Text sent to AI

Extracted text is analyzed by Claude AI

No file storage

We don't store your documents anywhere

Redaction is local

Clean file created in your browser

Our Hybrid Architecture

We use a hybrid approach that balances AI accuracy with privacy protection. Here's the step-by-step process:

1. Text Extraction (Your Browser)

When you upload a document, your browser extracts the text using PDF.js (for PDFs) or Tesseract.js OCR (for scanned documents). The original file stays in your browser memory.

Runs locally in your browser
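As a rough illustration of this step, the sketch below turns PDF.js-style text content items into a flat list with stable ids. It assumes the item shape returned by PDF.js `getTextContent()` (a `str` plus a six-element `transform` whose last two entries are the x and y position). The `ExtractedItem` type and the `toExtractedItems` function are hypothetical names used for illustration, not SafeRedact's actual code.

```typescript
// Subset of the fields PDF.js exposes on each text content item.
interface PdfTextItem {
  str: string;
  transform: number[]; // [a, b, c, d, x, y]
  width: number;
  height: number;
}

// Hypothetical shape for what the browser keeps after extraction.
interface ExtractedItem {
  id: string;
  text: string;
  page: number;
  x: number;
  y: number;
  width: number;
  height: number;
}

// Convert one page of PDF.js items into id-tagged items, skipping blank runs.
function toExtractedItems(
  page: number,
  items: PdfTextItem[],
  startIndex: number = 1
): ExtractedItem[] {
  const out: ExtractedItem[] = [];
  let next = startIndex;
  for (const item of items) {
    const text = item.str.trim();
    if (text.length === 0) continue; // whitespace-only runs carry no PII
    out.push({
      id: `t${next}`,
      text,
      page,
      x: item.transform[4],
      y: item.transform[5],
      width: item.width,
      height: item.height,
    });
    next += 1;
  }
  return out;
}
```

Keeping the position data next to each id means the browser can later map a detection back to a place on the page without asking the server for anything.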

2. AI Detection (Cloud)

The extracted text (with position coordinates) is sent to our API, which uses Claude AI to classify which items are sensitive: SSNs, names, addresses, phone numbers, etc. This is what enables accurate, context-aware detection.

Text sent to AI (not your original files)

3. Review & Redact (Your Browser)

The AI returns classification results. You review the detections, adjust as needed, and click Redact. The clean output file is generated entirely in your browser—a new PDF without the sensitive data.

Runs locally in your browser
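The review step can be pictured as merging the AI's detections with your own adjustments before anything is redacted. The sketch below is an assumption about how that merge might look. The `AiDetection` shape, the `category` field, and the `resolveRedactions` function are hypothetical and are not taken from the SafeRedact API.

```typescript
// Hypothetical classification result: which text item, and why it was flagged.
interface AiDetection {
  id: string;
  category: string; // e.g. "name", "ssn", "address"
}

// Combine AI detections with the reviewer's changes.
// rejectedIds: detections the reviewer dismissed.
// addedIds: items the reviewer marked manually.
function resolveRedactions(
  detections: AiDetection[],
  rejectedIds: string[],
  addedIds: string[]
): string[] {
  const rejected = new Set(rejectedIds);
  const selected = new Set<string>();
  for (const d of detections) {
    if (!rejected.has(d.id)) selected.add(d.id);
  }
  for (const id of addedIds) {
    selected.add(id); // manual additions always win
  }
  return Array.from(selected); // insertion order, duplicates removed
}
```

Only the ids that survive this merge would be handed to the local redaction step, so the reviewer's decisions never need to leave the browser.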

What IS Sent to the Cloud

Extracted text strings only

Source documents (PDF, DOCX, XLSX, etc.) are parsed entirely in your browser. What leaves your device is the extracted text content (the strings the classifier needs to identify PII), sent over TLS 1.3. The original file binary never leaves the browser; the file itself is never uploaded.

// What's actually transmitted:

POST /api/detect  (TLS 1.3)
Content-Type: application/json

{
  "textItems": [
    { "id": "t1", "text": "Patient John Smith, DOB 12/4/1972" },
    { "id": "t2", "text": "Re: Account 4521-9907" }
  ]
}

Only the extracted text strings are transmitted. The PDF, DOCX, or other source file binary stays in the browser.
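A minimal sketch of how a client might assemble that request, mirroring the example payload above. The `buildDetectRequest` helper and the `DetectRequest` type are hypothetical. The endpoint path and the `textItems` field come from the example. The helper copies only `id` and `text`, so any other fields held in browser memory cannot slip into the body by accident.

```typescript
interface DetectTextItem {
  id: string;
  text: string;
}

// Plain description of the HTTP request, independent of any fetch implementation.
interface DetectRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildDetectRequest(
  items: Array<{ id: string; text: string }>
): DetectRequest {
  // Copy only id and text; drop every other property on the input objects.
  const textItems: DetectTextItem[] = items.map(({ id, text }) => ({ id, text }));
  return {
    url: "/api/detect",
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ textItems }),
  };
}
```

In a browser, the result could be passed to `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`. Transport encryption comes from the page being served over HTTPS, not from anything in this helper.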

The text is processed by Anthropic's Claude API, which does not store or train on API inputs (zero-retention headers). Data is processed in memory and immediately discarded. See Anthropic's Privacy Policy for details.

Transport Security

TLS 1.3 + Browser Isolation

The strongest claim is what doesn't leave your browser

SafeRedact's transport security rests on two facts: (1) your source document file never leaves your browser, and (2) the extracted text that does leave travels over TLS 1.3 — the same transport encryption used by banks and government services. The structural protection — keeping the document binary out of our infrastructure entirely — eliminates whole categories of breach exposure that no amount of in-flight encryption could address.

Earlier versions of SafeRedact added a custom application-layer AES-256-GCM encryption step on top of TLS, claiming defense-in-depth. On review, that layer transmitted the symmetric key in the same request body as the ciphertext, so it provided no security beyond what TLS already offers. We removed it in May 2026 and updated this whitepaper to describe what the system actually does. Transparency on architecture changes is itself a security commitment.

What protects your data

• Source file binary never leaves the browser
• Only extracted text strings are transmitted
• TLS 1.3 confidentiality & integrity in transit
• Zero-retention LLM API (no training, no logs)
• Origin-locked CORS on every endpoint
• Nothing persists server-side — nothing to breach
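To illustrate what "origin-locked CORS" means in practice, here is a minimal server-side sketch. It assumes a single allow-listed origin; `https://saferedact.example` is a placeholder, and `corsHeadersFor` is a hypothetical helper, not SafeRedact's actual middleware.

```typescript
// Placeholder allow-list; the real origin is not stated in this document.
const ALLOWED_ORIGINS = new Set<string>(["https://saferedact.example"]);

// Return CORS response headers for an allow-listed origin, or null to refuse.
function corsHeadersFor(
  requestOrigin: string | undefined
): Record<string, string> | null {
  if (requestOrigin === undefined || !ALLOWED_ORIGINS.has(requestOrigin)) {
    return null; // no CORS headers: browsers will block the cross-origin read
  }
  return {
    "Access-Control-Allow-Origin": requestOrigin, // echo the exact origin, never "*"
    "Access-Control-Allow-Methods": "POST",
    "Access-Control-Allow-Headers": "Content-Type",
    "Vary": "Origin", // keep caches from reusing one origin's response for another
  };
}
```

CORS is enforced by browsers, so a check like this limits which web pages can call the endpoint from a user's browser. It does not authenticate non-browser clients.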

Technical details

• Transport: TLS 1.3 (HTTPS)
• Source files: Browser-only, never uploaded
• LLM provider: Anthropic Claude
• LLM retention: Zero (no-train header)
• Server-side storage of content: None

This architecture is what enables SafeRedact to handle data subject access requests for regulated industries (finance, healthcare, legal, government) without the source files themselves ever leaving your own browser. The extracted text is still processed by Anthropic, as described above.

What Is NOT Sent

Original PDF files

Original PDF binary stays in your browser

Scanned PDFs

Original PDF stays local; only extracted text is sent

Document structure

Formatting, layout, fonts, embedded objects

Metadata

Author, creation date, filename, etc.

Important: We do not store documents on SafeRedact servers. There is no retention period because there is nothing retained. Processing is ephemeral—when you close the tab, it's gone.

How We Compare to Other Tools

Tool	What Is Uploaded	Detection	Retention
SafeRedact	Text only	AI (Claude)	None
Adobe Acrobat Pro	Nothing (local)	Regex patterns	N/A
Smallpdf	Full file	Manual only	1 hour
iLovePDF	Full file	Manual only	2 hours
Redactable	Full file	AI	Account storage

Visual: Where Your Document Goes

Most Tools

Your Computer → Full file → Their Server (stored and processed)

SafeRedact

Your Browser → Text only → AI (Anthropic) (not stored, discarded)

SafeRedact advantage

AI-powered detection accuracy without uploading your actual documents. Other AI redaction tools require full file uploads and store documents in your account.

Tradeoff

Desktop apps like Adobe Acrobat Pro run fully locally, but they rely on basic pattern matching that misses context-dependent PII such as names and addresses.

Compliance Considerations

Important Disclaimer

SafeRedact is not certified for HIPAA, GLBA, FERPA, or other industry-specific regulations. While we minimize data exposure, extracted text is processed via a third-party AI API. Organizations with strict compliance requirements should evaluate whether this meets their policies.

What this means for HIPAA

→ Text from documents is sent to Anthropic's API
→ Anthropic does not store API inputs by default
→ SafeRedact does not have a BAA in place
→ Consult your compliance officer before use with PHI

What this means for GDPR

→ Text processing involves a US-based data processor (Anthropic)
→ No persistent storage of personal data
→ Processing is ephemeral (no retention)
→ Review your DPA requirements

Frequently Asked Questions

Why not process everything locally?

We tried. Browser-based regex detection catches obvious patterns (SSNs, credit cards) but consistently misses context-dependent PII like names and addresses. AI provides dramatically better accuracy, which is the whole point of a redaction tool—you need to catch everything.

Is my data used to train AI models?

No. We use Anthropic's API, which does not use API inputs for model training by default. See Anthropic's Privacy Policy.

Why is this better than Smallpdf or iLovePDF?

Those tools upload your entire PDF to their servers and store it for 1-2 hours. We only send extracted text, and we don't store anything. Plus, they don't have AI detection—you have to manually find and mark every sensitive item.

Can I use SafeRedact offline?

Text extraction and redaction work offline, but AI detection requires an internet connection. Without it, the tool falls back to basic regex patterns (less accurate but still functional for SSNs, credit cards, etc.).
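A minimal sketch of what such a regex fallback could look like, assuming US-style SSNs (`NNN-NN-NNNN`) and 13 to 19 digit card numbers validated with the Luhn checksum. The patterns, the `RegexHit` type, and the function names are illustrative assumptions, not the rules SafeRedact actually ships.

```typescript
interface RegexHit {
  kind: "ssn" | "credit_card";
  match: string;
  index: number;
}

// Luhn checksum: double every second digit from the right, sum, check mod 10.
function passesLuhn(digits: string): boolean {
  let sum = 0;
  let doubleNext = false;
  for (let i = digits.length - 1; i >= 0; i--) {
    let d = digits.charCodeAt(i) - 48; // "0" is char code 48
    if (doubleNext) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
    doubleNext = !doubleNext;
  }
  return sum % 10 === 0;
}

function detectWithRegex(text: string): RegexHit[] {
  const hits: RegexHit[] = [];
  let m: RegExpExecArray | null;

  // SSNs written with hyphens, e.g. 123-45-6789.
  const ssnPattern = /\b\d{3}-\d{2}-\d{4}\b/g;
  while ((m = ssnPattern.exec(text)) !== null) {
    hits.push({ kind: "ssn", match: m[0], index: m.index });
  }

  // 13 to 19 digits, optionally separated by single spaces or hyphens.
  const cardPattern = /\b\d(?:[ -]?\d){12,18}\b/g;
  while ((m = cardPattern.exec(text)) !== null) {
    const digitsOnly = m[0].replace(/[ -]/g, "");
    if (passesLuhn(digitsOnly)) {
      hits.push({ kind: "credit_card", match: m[0], index: m.index });
    }
  }
  return hits;
}
```

Patterns like these work for rigidly formatted identifiers but cannot recognize a name or street address, which is the gap the AI step is meant to close.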

Is the redaction permanent?

Yes. SafeRedact renders your document as a new image with redaction boxes burned in. The sensitive content is never included in the output file—it's not hidden or covered, it simply doesn't exist in the exported document. The redacted content cannot be recovered by copy/paste, Photoshop, or any other method.
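To show why burned-in redaction is not recoverable, the sketch below overwrites a rectangle in a raw RGBA pixel buffer, the same layout a canvas `ImageData.data` array uses. It is an illustration of the idea under that assumption, not SafeRedact's actual renderer. `PixelBox` and `burnInBox` are hypothetical names.

```typescript
interface PixelBox {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Overwrite every pixel inside the box with opaque black, clamped to the image.
function burnInBox(
  pixels: Uint8ClampedArray, // RGBA, row-major, 4 bytes per pixel
  imageWidth: number,
  imageHeight: number,
  box: PixelBox
): void {
  const x0 = Math.max(0, Math.floor(box.x));
  const y0 = Math.max(0, Math.floor(box.y));
  const x1 = Math.min(imageWidth, Math.ceil(box.x + box.width));
  const y1 = Math.min(imageHeight, Math.ceil(box.y + box.height));
  for (let y = y0; y < y1; y++) {
    for (let x = x0; x < x1; x++) {
      const i = (y * imageWidth + x) * 4;
      pixels[i] = 0;       // red
      pixels[i + 1] = 0;   // green
      pixels[i + 2] = 0;   // blue
      pixels[i + 3] = 255; // alpha: fully opaque
    }
  }
}
```

Once the original pixel values are overwritten and no text layer is written to the output, there is nothing underneath the box to uncover.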