Engineering · May 6, 2026 · 5 min read

How to remove metadata from a PDF (for privacy)

Author, software, GPS, edit history — every PDF leaks more than you think. The reliable ways to strip metadata before sharing, in any tool you already have.

By Dawid Sibinski

Every PDF you share carries metadata you probably didn't mean to send. Author name from the OS user account. Producer software version. Creation and modification timestamps. Sometimes XMP fields with custom data, GPS from the phone that scanned the page, or revision history from the editor. Before you send a PDF outside your team — to a journalist, a regulator, a public records request, a court — you almost always want to strip it.

What's actually in there

PDF metadata lives in two places. The document information dictionary holds the classic fields: Title, Author, Subject, Keywords, Creator, Producer, CreationDate, ModDate. The XMP packet (modern, XML-based) can hold the same fields plus arbitrary custom schemas — anything from camera EXIF to internal tracking IDs. A complete metadata-removal tool has to clear both.
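To make the two locations concrete, here is a minimal sketch that scans a file's raw bytes for each one. The function name pdf_metadata_containers is hypothetical, and the check is only a heuristic: it looks for the /Info key that the trailer uses to point at the information dictionary and for the `<?xpacket begin` marker that opens an XMP packet, so an encrypted file or an unusually stored XMP stream can fool it.

```shell
# Hypothetical helper: report which metadata containers a PDF appears to carry.
# Heuristic byte scan only; treat "not found" as a hint, not a guarantee.
pdf_metadata_containers() {
    # The trailer (or cross-reference stream dictionary) references the
    # document information dictionary through an /Info key.
    if grep -a -q -F '/Info' "$1"; then
        echo "info-dictionary: present"
    else
        echo "info-dictionary: not found"
    fi
    # XMP packets are normally stored uncompressed and open with <?xpacket.
    if grep -a -q -F '<?xpacket begin' "$1"; then
        echo "xmp-packet: present"
    else
        echo "xmp-packet: not found"
    fi
}
```

Calling `pdf_metadata_containers input.pdf` before and after a strip gives a quick sense of whether both containers were touched; exiftool remains the authoritative check.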

Beyond metadata, there are other leak surfaces worth knowing about: a PDF that was saved incrementally keeps earlier revisions of its objects in the file, so text you deleted can still be recovered; embedded fonts can carry the original file path; and image XObjects can carry full EXIF, including GPS. Stripping only the document-level metadata leaves all of those in place.
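One rough way to spot retained revisions is to count the %%EOF markers in the file, since each incremental save appends a new section that ends with one. The sketch below assumes that heuristic; pdf_revision_markers is a hypothetical name, and the count is not proof either way, because linearized files legitimately contain two markers and an embedded PDF attachment adds its own.

```shell
# Hypothetical helper: count end-of-file markers in a PDF.
# A count above 1 suggests incremental saves (or a linearized file).
pdf_revision_markers() {
    # grep -c prints the number of matching lines; "|| true" keeps the
    # function's exit status at 0 even when the count is 0.
    grep -a -c -F '%%EOF' "$1" || true
}
```

If `pdf_revision_markers input.pdf` prints more than 1, a full re-write with qpdf is worth doing before sharing, because it collapses the file into a single revision.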

The four reliable methods

Use Adobe Acrobat's 'Sanitize Document' (Pro only) for a one-click, belt-and-braces removal of everything: metadata, hidden text, embedded files, comments. This is the gold standard if you have it.
Use exiftool from the command line for batch jobs. One command per file or per folder, runs on Linux/Mac/Windows, deterministic.
Use qpdf to re-write the PDF from scratch after an exiftool strip, so the old metadata objects are actually dropped from the file. Useful when you also want to linearize or compress the file.
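Print to PDF as a last resort. The 'Save as PDF' or 'Microsoft Print to PDF' workflow drops most metadata and any document history, at the cost of re-rendering every page: links, bookmarks, form fields and accessibility tags are lost, and depending on the source application the text may come out as a flat image.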

exiftool: the most useful one to know

exiftool is a Perl script with no dependencies beyond Perl itself, and it is available from most package managers (Homebrew and apt among them), with a standalone executable for Windows. It reads and writes metadata across hundreds of file formats, including PDF.

# Strip every metadata field from a single PDF
exiftool -all= -overwrite_original input.pdf

# Strip metadata from every PDF in a folder
exiftool -all= -overwrite_original -ext pdf .

# Inspect what's there before you strip
exiftool input.pdf

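The -all= flag clears every metadata tag exiftool can write. -overwrite_original avoids leaving a .pdf_original backup file, which is what you want for a real privacy pass: those backups are how metadata leaks back into a workflow. One caveat: exiftool edits a PDF by appending an incremental update, and its own documentation warns that the deleted tags stay recoverable until the file is re-written, for example with qpdf --linearize.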

qpdf: when you also want to clean up the file

qpdf is a structural PDF tool. It re-writes the file from scratch and leaves out objects that nothing references, which is what makes an exiftool strip permanent:

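# Strip metadata first (exiftool appends an incremental update)
exiftool -all= -overwrite_original input.pdf

# Then re-write the file so the old metadata objects are dropped
qpdf --linearize --object-streams=disable \
  --remove-unreferenced-resources=yes \
  input.pdf output.pdf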

The combination of qpdf re-writing and exiftool stripping is the most thorough non-Acrobat approach. It removes orphaned objects (which sometimes contain old metadata or embedded files), clears the metadata streams, and produces a smaller file.
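For repeated use, the two tools can be wrapped so the original file is never modified. The sketch below is one way to do that, not a canonical recipe: strip_pdf and its two arguments are hypothetical names, and it runs exiftool before qpdf on the assumption, taken from ExifTool's own PDF documentation, that exiftool's edits stay recoverable until the file is re-written.

```shell
# Hypothetical wrapper: strip_pdf INPUT OUTPUT
# Works on a private copy so INPUT is left untouched.
strip_pdf() {
    src=$1
    dst=$2
    workdir=$(mktemp -d) || return 1
    cp "$src" "$workdir/work.pdf" || { rm -rf "$workdir"; return 1; }

    # Step 1: clear every tag exiftool can write (appends an incremental update).
    if exiftool -all= -overwrite_original "$workdir/work.pdf" >/dev/null; then
        # Step 2: re-write from scratch so the superseded objects are dropped.
        # qpdf exits with 3 when it succeeded but printed warnings.
        qpdf --linearize "$workdir/work.pdf" "$dst"
        status=$?
    else
        status=1
    fi

    rm -rf "$workdir"
    return "$status"
}
```

Usage would look like `strip_pdf report.pdf report-clean.pdf`, followed by an exiftool inspection of the output. Working on a copy is a deliberate choice: if either step fails, the source file is still exactly what you started with.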

What none of these catch

Metadata is not the same thing as redaction. If your PDF contains a black rectangle drawn over text, the text is still in the file — exiftool will not remove it. If your PDF was scanned and contains an OCR layer, the OCR'd text is also in the file. Real redaction requires either Acrobat's redaction tool or rasterizing the affected pages and re-OCR'ing them after stripping.

Visible-but-recoverable text under black boxes — needs redaction, not metadata removal.
Embedded fonts with the original file path — qpdf re-write handles this, exiftool alone does not.
Images inside the PDF carrying their own EXIF and GPS — strip the images separately, or rasterize and re-OCR the page.
Comments and annotations — Acrobat's Sanitize handles these; exiftool does not touch them.
Form-field history — same: Acrobat or a re-render handles it.
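A quick way to see whether a file still carries some of these is to look for the dictionary keys that introduce them. The sketch below is a rough heuristic under stated assumptions: pdf_leftover_markers is a hypothetical name, the key list is illustrative rather than complete, and the scan only sees uncompressed dictionaries, so it is best run on a copy normalized with `qpdf --qdf --object-streams=disable input.pdf inspect.pdf`.

```shell
# Hypothetical helper: count lines that mention keys for annotations,
# embedded files, forms, and scripts. Run it on a QDF-normalized copy,
# because compressed object streams hide these keys from a byte scan.
pdf_leftover_markers() {
    for key in /Annots /EmbeddedFiles /AcroForm /JavaScript; do
        # "|| true" keeps going when grep finds nothing (count is 0).
        count=$(grep -a -c -F -- "$key" "$1" || true)
        echo "$key $count"
    done
}
```

A non-zero count next to /Annots or /EmbeddedFiles means the file still has comments or attachments to deal with. It says nothing about text hidden under black boxes, which only real redaction fixes.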

Verifying the result

Always re-inspect after stripping. The fastest check is to run exiftool again and confirm the only fields that come back are file-system attributes (File Name, File Size, and so on) and structural properties read from the PDF itself (typically PDF Version, Page Count, Linearized). If anything else is still there, the strip didn't work.
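That check is easy to automate by filtering exiftool's output against an allow-list of expected fields. The sketch below assumes exiftool's default "Tag Name : value" output; leftover_fields is a hypothetical name, and the allow-list is an assumption you may need to extend for your platform or exiftool version.

```shell
# Hypothetical filter: reads `exiftool FILE` output on stdin and prints
# every field that is NOT on the allow-list of file-system and
# structural fields. No output means nothing unexpected is left.
leftover_fields() {
    allowed='ExifTool Version Number|File Name|Directory|File Size'
    allowed="$allowed|File Modification Date/Time|File Access Date/Time"
    allowed="$allowed|File Inode Change Date/Time|File Creation Date/Time"
    allowed="$allowed|File Permissions|File Type|File Type Extension"
    allowed="$allowed|MIME Type|PDF Version|Linearized|Page Count"
    # "|| true" so an empty result is not reported as a failure.
    grep -v -E "^($allowed) *:" || true
}
```

Run it as `exiftool output.pdf | leftover_fields`. Anything it prints, such as an Author or Producer line, is a field the strip missed.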

For the inverse problem — pulling metadata out of a PDF to read what's there before you publish — see the companion post on extracting PDF metadata, or just drop a file into ExtractFox.

Frequently asked questions

What metadata does a PDF contain?

A PDF can carry author name, software used to create it (producer and creator fields), creation and modification timestamps, XMP metadata with keywords or descriptions, and sometimes embedded thumbnails or revision history. Scanned PDFs may also contain GPS coordinates from the scanning device.

How do I remove metadata from a PDF on a Mac?

Open the PDF in Preview and export it as PDF (File → Export as PDF). Preview's export strips most metadata. For more thorough removal, use ExifTool (exiftool -all= -overwrite_original document.pdf) or re-render the file with Ghostscript (gs -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -sOutputFile=clean.pdf input.pdf), then re-inspect the result.

How do I remove metadata from a PDF on Windows?

Use Adobe Acrobat's Document Properties → Description → Clear, or print to PDF via Microsoft Print to PDF (File → Print → Microsoft Print to PDF). ExifTool for Windows works the same as on Mac: exiftool -all= -overwrite_original document.pdf.

Does removing PDF metadata affect the visible content?

No. Stripping metadata only removes hidden fields like author, software, and timestamps. All visible text, images, and formatting are untouched.

How do I check what metadata a PDF contains before removing it?

Run exiftool document.pdf to see every metadata field. On Mac, Tools → Show Inspector in Preview shows a summary. Adobe Acrobat → File → Properties → Description shows the main document properties.