How to extract images from a Word document
The fastest way to pull every image out of a .docx file at original resolution — using nothing more than the file extension trick that works in any unzip tool.
A .docx file is secretly a ZIP archive. Inside it, every image you've inserted into the document lives in a media folder at original resolution. You don't need any special software to get them out.
The .zip rename trick
- Make a copy of report.docx so you don't accidentally corrupt the original.
- Rename the copy to report.zip.
- Unzip it.
- Open the word/media/ folder. Every embedded image is there as image1.png, image2.jpeg, etc., at original quality.
Works on Windows, macOS, and Linux without any extra tools. Same trick works on .pptx (ppt/media/) and .xlsx (xl/media/).
From inside Word
Right-click any image → Save as Picture. Right format options, sometimes downsampled depending on Word's image compression settings. Fine for one or two; tedious for many.
Word's File → Save As → Web Page also dumps every image to a folder, but applies its own compression. The .zip rename trick gives you the original bytes; this gives you a re-encoded version.
Programmatic: python-docx
from docx import Document doc = Document("report.docx") for i, rel in enumerate(doc.part.related_parts.values()): if "image" in rel.content_type: with open(f"img_{i}.{rel.content_type.split('/')[-1]}", "wb") as f: f.write(rel.blob)
Useful when you're processing hundreds of documents and want to drop the images straight into a database or DAM.
The .doc (legacy) edge case
Old .doc files (Word 97–2003 binary format) aren't ZIP archives. The .zip trick fails. Either open in Word and save-as .docx first, or use Apache Tika or LibreOffice's headless mode to convert. Once it's .docx, the standard methods work.