Tutorial · April 14, 2026 · 4 min read

Extract embedded files and attachments from a PDF

How to extract embedded files and attachments from a PDF with Acrobat, pdfdetach, qpdf, and Python. Works for Excel sheets, source data, and supporting documents.

By Dawid Sibinski · Updated June 8, 2026

To extract embedded files from a PDF, first check Acrobat's Attachments pane. For batch work, use pdfdetach -saveall from Poppler. The attachments are stored inside the PDF file itself, so copying visible text or printing to PDF will not recover them.

PDFs have a feature most readers hide: embedded file attachments. A research paper can ship with the dataset attached, an annual report with the source spreadsheet, an invoice with a packing slip. The attachments are in the file but invisible until you go looking.

Acrobat / Adobe Reader

Open the PDF and click the paperclip icon in the left rail (if it's hidden, go to View → Show/Hide → Navigation Panes → Attachments). Each attachment is listed with its name and size. To export one, right-click it and choose Save Attachment.

Preview on macOS

Preview doesn't show embedded attachments. The file is still there — you just can't see it. Either open in Acrobat, or extract via the command-line tools below.
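When a viewer shows nothing, a quick byte scan can hint at whether a PDF carries attachments before you install any tools. The sketch below is a heuristic under stated assumptions, not a parser: it looks for the /EmbeddedFiles key (the name tree for document-level attachments) and the /FileAttachment annotation subtype in the raw bytes. It can miss attachments whose catalog sits in a compressed object stream, and a match does not guarantee a non-empty list. The function names are illustrative.

```python
from pathlib import Path

# Byte markers that usually accompany attachments in a PDF:
#   /EmbeddedFiles  - name tree key for document-level attachments
#   /FileAttachment - annotation subtype for files pinned to a page
MARKERS = (b"/EmbeddedFiles", b"/FileAttachment")


def attachment_markers(data: bytes) -> list[str]:
    """Return the attachment-related markers found in raw PDF bytes."""
    return [m.decode("ascii") for m in MARKERS if m in data]


def probably_has_attachments(pdf_path: str) -> bool:
    """Heuristic only: markers inside compressed object streams are missed."""
    return bool(attachment_markers(Path(pdf_path).read_bytes()))
```

Treat a negative result as "unknown" rather than "no attachments", and confirm with pdfdetach -list or Acrobat.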

Command line: pdfdetach

pdfdetach ships with Poppler's command-line utilities (brew install poppler on macOS, apt install poppler-utils on Debian or Ubuntu):

pdfdetach -list report.pdf                # list attachments
pdfdetach -saveall -o out/ report.pdf     # extract every attachment to out/

This is the right tool for batch work. pdfdetach takes one PDF per invocation, so to process a folder of PDFs, wrap the -saveall call in a loop and dump every attachment in one pass.
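A batch run over a folder might look like the sketch below, which calls pdfdetach once per PDF. It assumes pdfdetach is on your PATH. The one-subfolder-per-PDF layout and the function names detach_command and detach_folder are choices made for this example, not part of Poppler.

```python
import subprocess
from pathlib import Path


def detach_command(pdf: Path, out_root: Path) -> list[str]:
    """Build the pdfdetach call that saves every attachment of one PDF."""
    target = out_root / pdf.stem  # one subfolder per PDF avoids name clashes
    return ["pdfdetach", "-saveall", "-o", str(target), str(pdf)]


def detach_folder(folder: str, out_root: str) -> None:
    """Run pdfdetach once per PDF found directly inside `folder`."""
    for pdf in sorted(Path(folder).glob("*.pdf")):
        cmd = detach_command(pdf, Path(out_root))
        # Create the target directory up front rather than relying on the tool.
        Path(cmd[3]).mkdir(parents=True, exist_ok=True)
        subprocess.run(cmd, check=False)  # keep going if one PDF fails


if __name__ == "__main__":
    detach_folder("pdfs", "out")  # hypothetical folder names
```

Using check=False keeps the loop moving when a single file is damaged. Switch to check=True if you would rather stop at the first failure.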

Python: pypdf or pikepdf

pikepdf is the cleaner API for attachments:

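import pikepdf

with pikepdf.open("report.pdf") as pdf:
    # pdf.attachments maps each attachment name to its file specification
    for name, spec in pdf.attachments.items():
        with open(name, "wb") as f:
            # get_file() returns the embedded file; read_bytes() gives its contents
            f.write(spec.get_file().read_bytes())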
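Passing the attachment name straight to open() trusts whatever the PDF's author put there, including path separators and duplicate names. The sketch below shows one way to reduce names to unique bare filenames, combined with the pypdf route that the heading mentions. It assumes a recent pypdf release in which PdfReader.attachments maps each name to a list of byte strings. The helpers safe_name and extract_with_pypdf are illustrative, not part of either library.

```python
from pathlib import Path


def safe_name(raw: str, taken: set[str]) -> str:
    """Reduce an attachment name to a unique bare filename."""
    # Drop any directory part, treating backslashes as separators too.
    base = Path(raw.replace("\\", "/")).name
    if base in ("", ".", ".."):
        base = "attachment"
    stem, suffix = Path(base).stem, Path(base).suffix
    candidate, n = base, 1
    while candidate in taken:  # add _1, _2, ... until the name is free
        candidate = f"{stem}_{n}{suffix}"
        n += 1
    taken.add(candidate)
    return candidate


def extract_with_pypdf(pdf_path: str, out_dir: str) -> list[Path]:
    """Write every attachment in pdf_path to out_dir; return the paths."""
    from pypdf import PdfReader  # third-party, imported only when needed

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    taken: set[str] = set()
    written: list[Path] = []
    # Each name maps to a list because several attachments may share a name.
    for name, blobs in PdfReader(pdf_path).attachments.items():
        for blob in blobs:
            target = out / safe_name(name, taken)
            target.write_bytes(blob)
            written.append(target)
    return written
```

The same safe_name helper can wrap the name in the pikepdf loop if you prefer that library.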

Don't confuse attachments with these

Embedded images: image objects rendered on the page, not separate files. See the post on extracting images from a PDF.
Embedded fonts: these exist for rendering, not extraction. Dedicated font tools can strip or copy them; pdfdetach cannot.
Form attachments: files attached through a PDF form are sometimes accessible via the Attachments panel and sometimes only via the form's submit-data interface.

When the data you want is in the PDF body, not the attachment

If the PDF doesn't have attachments and the data you need is in the visible content (tables, forms, text), the attachment route is a dead end. ExtractFox's PDF data extractor handles the visible content — pair it with pdfdetach if you also need the bundled files.