How to Remove Metadata from PDF: Complete Guide 2026
Every PDF you share contains hidden data - your name, your software, your timestamps, sometimes even your GPS coordinates. Here's what's actually in there and how to remove it.
What Is PDF Metadata (and Why Should You Care)?
PDF metadata is invisible information embedded in every PDF file. When you create a document in Word, Google Docs, Adobe Acrobat, or any other tool, metadata is automatically saved alongside your visible content. This includes your name, the software you used, when you created and last edited the file, and sometimes much more.
Most people never think about it. But every time you email a contract, share a proposal, upload a report to a client portal, or post a PDF on your website, that hidden data goes with it. Anyone with a basic PDF reader can view it.
Under privacy regulations like GDPR, CCPA, and 20+ US state privacy laws, sharing documents with exposed personal data - including metadata - can create real compliance risk. It's not hypothetical: improperly sanitized documents have caused data leaks in court filings, government agencies, and corporate disclosures.
What Hidden Data Do PDFs Actually Contain?
There are two layers of hidden data in a PDF. Most tools only scan the first layer. Here's what's really in there:
Standard Metadata (the basics)
Author - your full name, often pulled from your OS user account or software license. This is the most common leak.
Creator & Producer - the software used to create and convert the PDF (e.g., "Adobe Acrobat Pro 23.8.20470" or "Microsoft Word 16.0"). Reveals your toolchain and version numbers.
Creation Date & Modification Date - exact timestamps showing when the document was first created and last edited. Can reveal work patterns and timelines.
Title, Subject, Keywords - often auto-populated from the document content. May expose internal project names or classification labels.
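To make the list above concrete, here is a minimal Python sketch that looks for these standard fields in the raw bytes of a PDF. It assumes the Info dictionary is stored unencrypted, outside a compressed object stream, and uses plain literal strings; real files often break those assumptions, so treat it as a quick illustration rather than a full parser. The function name `read_info_fields` and the sample author are hypothetical.

```python
import re

# Standard Info-dictionary keys covered in the list above.
INFO_KEYS = ("Author", "Creator", "Producer", "CreationDate",
             "ModDate", "Title", "Subject", "Keywords")


def read_info_fields(pdf_bytes: bytes) -> dict[str, list[str]]:
    """Return every literal-string value found for each standard key.

    Heuristic only: nested unescaped parentheses, hex strings, encrypted
    files, and compressed object streams are not handled. "/Title" can
    also match bookmark titles, not just the document title.
    """
    found = {}
    for key in INFO_KEYS:
        # Matches e.g. "/Author (Jane Doe)"; backslash escapes are kept as-is.
        pattern = rb"/" + key.encode() + rb"\s*\(((?:\\.|[^\\()])*)\)"
        values = re.findall(pattern, pdf_bytes)
        if values:
            found[key] = [v.decode("latin-1") for v in values]
    return found


sample = (b"1 0 obj\n<< /Author (Jane Doe) /Producer (Microsoft Word 16.0) "
          b"/CreationDate (D:20260101120000Z) >>\nendobj")
fields = read_info_fields(sample)
print(fields)
```

Returning every occurrence rather than only the first is deliberate: if a file has been saved incrementally, an older value can still be present alongside the current one.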
Deep Metadata (what most tools miss)
XMP Metadata Streams - an XML-based metadata block embedded in the PDF that can contain 15+ additional fields including document IDs, instance IDs, authoring positions, and cross-references to other files. Even if you clear the standard metadata, XMP often retains the original values.
Embedded Fonts - the font names and types used in the PDF can hint at your operating system and installed software. A document with "Calibri" and "Segoe UI" suggests Windows; "SF Pro" suggests macOS.
Hidden Links & Tracking URLs - PDFs can contain clickable URLs that include tracking parameters, shortened links (bit.ly, tinyurl), or analytics beacons that notify the sender when you open the file.
GPS Coordinates from Embedded Images - if your PDF contains photos (like a scanned document or an embedded screenshot taken on a phone), those images may contain EXIF data with exact GPS coordinates. This is a critical privacy risk that most people never consider.
Embedded JavaScript - PDFs can contain executable JavaScript that runs when the file is opened. This can be used for tracking, data exfiltration, or malicious payloads.
Unflattened Redactions - this is one of the most dangerous hidden-data issues in PDFs. A redaction that hasn't been properly "flattened" looks blacked out visually, but the original text underneath can still be selected, copied, or extracted. Courtrooms and government agencies have leaked sensitive information this way.
Document History (Incremental Saves) - PDFs support incremental saves, so previous versions of the document may be recoverable from the file. Each incremental save appends new data without removing the old content. A PDF with 5 incremental saves may contain 4 earlier versions that can be forensically extracted.
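Several of the deep-metadata items above leave recognizable markers in the raw file. The following Python sketch counts a few of them. It is a rough heuristic, not how any particular tool works: it assumes the relevant objects are not inside compressed streams, and the function name `deep_scan`, the report keys, and the list of tracking hints are all illustrative choices.

```python
import re

# Substrings treated as tracking hints; an illustrative list, not exhaustive.
TRACKING_HINTS = ("utm_", "bit.ly", "tinyurl")


def deep_scan(pdf_bytes: bytes) -> dict[str, object]:
    """Heuristic scan of raw PDF bytes (compressed streams are not decoded)."""
    xmp = re.findall(rb"<\?xpacket begin=.*?<\?xpacket end=.*?\?>",
                     pdf_bytes, re.DOTALL)
    urls = [u.decode("latin-1")
            for u in re.findall(rb"/URI\s*\((.*?)\)", pdf_bytes)]
    return {
        "xmp_packets": len(xmp),
        # JavaScript actions are declared with the action type /S /JavaScript.
        "javascript_actions": len(re.findall(rb"/S\s*/JavaScript\b", pdf_bytes)),
        # Redaction annotations that were marked but never applied.
        # Black rectangles drawn as ordinary shapes are NOT detected here.
        "redact_annotations": len(re.findall(rb"/Subtype\s*/Redact\b", pdf_bytes)),
        "urls": urls,
        "tracking_urls": [u for u in urls
                          if any(hint in u for hint in TRACKING_HINTS)],
        # Each incremental save appends another %%EOF marker. Linearized
        # files can contain two markers without any incremental save.
        "eof_markers": pdf_bytes.count(b"%%EOF"),
    }


sample = (
    b"%PDF-1.7\n"
    b"1 0 obj << /Type /Metadata >> stream\n"
    b"<?xpacket begin='' id='W5M0MpCehiHzreSzNTczkc9d'?><x:xmpmeta/>"
    b"<?xpacket end='w'?>\nendstream endobj\n"
    b"2 0 obj << /S /JavaScript /JS (app.alert(1)) >> endobj\n"
    b"3 0 obj << /S /URI /URI (https://bit.ly/abc) >> endobj\n"
    b"4 0 obj << /Type /Annot /Subtype /Redact >> endobj\n"
    b"%%EOF\n5 0 obj << >> endobj\n%%EOF\n"
)
report = deep_scan(sample)
print(report)
```

A count above one for `eof_markers` is only a prompt to look closer, not proof that earlier versions are recoverable.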
Strip Hidden Data from Your PDFs Now
Drop a PDF, see every hidden field, strip it in one click. 100% in-browser, no upload. Free, no signup.
Try PDF Metadata Stripper →
5 Ways to Remove PDF Metadata (Compared)
1. TrustScan PDF Metadata Stripper (Free, Client-Side)
Drop your PDF into TrustScan's PDF Metadata Stripper. It instantly scans both standard and deep metadata - XMP streams, fonts, links, GPS data, JavaScript, redactions, and document history. You see exactly what's hidden, each field tagged by risk level (critical, high, medium, low). Strip everything in one click and download the clean file.
Key advantage: 100% client-side. Your file never leaves your browser. No upload, no server processing, no account. Supports batch processing with ZIP download. This is the only free tool that combines standard metadata stripping with deep scanning for XMP, EXIF, JavaScript, redactions, and document history.
2. Adobe Acrobat Pro (Paid)
Adobe Acrobat Pro has "Remove Hidden Information" and "Sanitize Document" features under the Protection panel. They remove metadata, hidden layers, attached files, and scripts, and can flatten redactions. It's the most thorough desktop solution.
Downside: Requires an Acrobat Pro subscription ($23/month). Not available in the free Acrobat Reader. The sanitize workflow is buried in menus and easy to miss.
3. ExifTool (Free, Command Line)
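ExifTool is a powerful command-line utility that can read and remove metadata from PDFs and many other file formats. Running exiftool -all= document.pdf clears most metadata fields from view, but ExifTool writes PDF changes as an incremental update, so the original values stay in the file and can be recovered. To remove them permanently, rewrite the file afterwards with a tool such as qpdf (for example, qpdf --linearize in.pdf out.pdf).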
Downside: Command-line only - not accessible for non-technical users. Doesn't detect unflattened redactions, embedded JavaScript, or document history. No visual preview of what's being removed.
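For anyone scripting the ExifTool approach, a small Python wrapper might look like the sketch below. One caveat shapes it: ExifTool writes PDF edits as an incremental update, so the old values remain in the file until it is rewritten. The sketch therefore follows ExifTool with a qpdf rewrite. It assumes both `exiftool` and `qpdf` are installed and on the PATH; the function names and file names are hypothetical.

```python
import subprocess


def build_strip_commands(src: str, dst: str) -> list[list[str]]:
    """Return the two commands: clear metadata, then rewrite the file."""
    return [
        # -overwrite_original edits src in place and keeps no backup copy.
        ["exiftool", "-all=", "-overwrite_original", src],
        # Rewriting with qpdf drops objects that are no longer referenced,
        # which is intended to discard the values ExifTool left behind.
        ["qpdf", "--linearize", src, dst],
    ]


def strip_pdf(src: str, dst: str) -> None:
    """Run both commands; raises CalledProcessError if either one fails."""
    for command in build_strip_commands(src, dst):
        subprocess.run(command, check=True)


commands = build_strip_commands("document.pdf", "document-clean.pdf")
print(commands)
```

Because the first command modifies the source file in place, it is safer to run `strip_pdf` on a copy. This still does nothing about unflattened redactions or embedded JavaScript.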
4. Print to New PDF (Free, Any OS)
Open the PDF, use File → Print → "Save as PDF" or "Microsoft Print to PDF." This creates a fresh PDF with only the rendered visual content, stripping most metadata.
Downside: Loses bookmarks, hyperlinks, form fields, and accessibility tags. May not strip XMP streams or embedded fonts completely. May not remove GPS data from embedded images, because some PDF printers re-embed the original image data as-is. Not practical for batch processing.
5. Online Upload Tools (Free, Privacy Risk)
Tools like PDF24, PDFYeah, and GroupDocs let you upload a PDF and download a cleaned version. They work, but your file is sent to and processed on their servers.
Downside: Your file is uploaded to a third-party server - which defeats the purpose of removing metadata for privacy. Most only strip standard metadata and don't scan for XMP, GPS, JavaScript, redactions, or document history. Some have usage limits or require accounts.
Quick Comparison
| Method | Cost | Client-Side | Deep Scan | Batch |
|---|---|---|---|---|
| TrustScan | Free | ✓ | ✓ | ✓ |
| Adobe Acrobat Pro | $23/mo | ✓ | ✓ | ✓ |
| ExifTool | Free | ✓ | Partial | ✓ |
| Print to PDF | Free | ✓ | ✗ | ✗ |
| Online Upload Tools | Free* | ✗ | ✗ | Limited |

* Some online tools have usage limits or require accounts.
Real-World Risks of Unstripped PDF Metadata
Court filing leaks - Lawyers have accidentally exposed confidential client names, internal notes, and privileged communications through PDF metadata in court filings. Unflattened redactions are a recurring issue in legal discovery.
Government document exposure - Government agencies have published "redacted" reports where classified information was recoverable by simply copy-pasting from behind the black bars. This happens because the redaction overlay was never flattened into the document.
Corporate identity leaks - Sending a proposal or contract with your full name, company software stack, and edit history in the metadata gives the recipient information you never intended to share. Competitors can learn what tools you use, when you work, and how many revisions you went through.
Compliance violations - Under GDPR, personal data includes any information that can identify a person - including author names in document metadata. Sharing PDFs with exposed metadata to the wrong audience can constitute a data protection violation, especially in regulated industries like healthcare and finance.
PDF Metadata + Privacy Law = Real Fines
Sharing documents with exposed author names and timestamps can violate GDPR, CCPA, and 20+ other privacy laws. Check which apply to you.
Check My Compliance →
Best Practices for PDF Metadata Hygiene
1. Strip before sharing - make it a habit to run every PDF through a metadata stripper before sending it externally. Add it to your document review workflow.
2. Check redactions are flattened - if you redact content, always verify the redaction is flattened (permanently applied), not just an overlay annotation. TrustScan's deep scan detects this automatically.
3. Watch embedded images - photos from phones contain GPS, camera model, and timestamp data. Strip image EXIF data before embedding in PDFs, or use TrustScan to detect it after the fact.
4. Audit your templates - if your organization uses PDF templates, check them for baked-in metadata. A template with a former employee's name in the Author field will propagate that data to every document generated from it.
5. Use client-side tools - if you're removing metadata for privacy reasons, uploading the file to a server-based tool creates a new privacy risk. Choose tools that process entirely in your browser.
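For practice 3, stripping EXIF from a photo before it goes into a PDF can be done locally with a few lines of Python. The sketch below drops the APP1 "Exif" segment from a JPEG, which is where GPS coordinates, camera model, and capture time are stored. It assumes a well-formed baseline JPEG with ordinary length-prefixed segments before the scan data; the function name `strip_jpeg_exif` is hypothetical, and other metadata segments such as XMP or IPTC are left untouched.

```python
import struct

SOI = b"\xff\xd8"    # start-of-image marker
APP1 = 0xE1          # segment type that carries EXIF
SOS = 0xDA           # start-of-scan: compressed image data follows


def strip_jpeg_exif(data: bytes) -> bytes:
    """Return the JPEG with any APP1 Exif segment removed."""
    if data[:2] != SOI:
        raise ValueError("not a JPEG file")
    out = bytearray(SOI)
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("unexpected byte where a marker should be")
        marker = data[pos + 1]
        if marker == SOS:
            # Copy the scan data and everything after it unchanged.
            out += data[pos:]
            return bytes(out)
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        segment = data[pos:pos + 2 + length]
        is_exif = marker == APP1 and segment[4:10] == b"Exif\x00\x00"
        if not is_exif:
            out += segment
        pos += 2 + length
    raise ValueError("no start-of-scan marker found")


# Tiny synthetic JPEG: JFIF header, an Exif segment, then scan data.
app0 = b"\xff\xe0" + struct.pack(">H", 7) + b"JFIF\x00"
app1 = b"\xff\xe1" + struct.pack(">H", 12) + b"Exif\x00\x00" + b"GPS!"
scan = b"\xff\xda\x00\x02\x12\x34\xff\xd9"
cleaned = strip_jpeg_exif(SOI + app0 + app1 + scan)
print(len(SOI + app0 + app1 + scan), "->", len(cleaned))
```

The image data itself is never decoded or re-encoded, so there is no quality loss, and the file never leaves your machine.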
Cybersecurity professionals building free privacy tools for the 2026 compliance landscape.