Extract code from a screenshot, video, or PDF

Q: How do I extract code from a screenshot?

Upload the screenshot here, pick a mode (clean code, with line numbers, all blocks, etc.), and click Extract. Output is ready to paste into your editor with indentation preserved.

Q: Does it detect the language?

Yes — every mode returns a detected language so you can save the file with the right extension. Recognized languages include Python, JavaScript, TypeScript, Go, Rust, Java, C++, Ruby, PHP, SQL, shell, and dozens more.
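
As a rough illustration of the "right extension" step, the sketch below maps a detected-language label to a file extension. The mapping table, the fallback to .txt, and the function names are assumptions made for this example, not part of the tool.

```python
from pathlib import Path

# Hypothetical mapping from a detected-language label to a file extension.
EXTENSIONS = {
    "python": ".py", "javascript": ".js", "typescript": ".ts", "go": ".go",
    "rust": ".rs", "java": ".java", "c++": ".cpp", "ruby": ".rb",
    "php": ".php", "sql": ".sql", "shell": ".sh",
}


def filename_for(language: str, stem: str = "snippet") -> str:
    """Return a file name whose extension matches the detected language."""
    # Unknown labels fall back to .txt (an assumption for this sketch).
    ext = EXTENSIONS.get(language.strip().lower(), ".txt")
    return stem + ext


def save_snippet(code: str, language: str, directory: str = ".") -> Path:
    """Write the extracted code to disk under a language-appropriate name."""
    path = Path(directory) / filename_for(language)
    path.write_text(code, encoding="utf-8")
    return path
```

For example, filename_for("Python") gives "snippet.py", and an unrecognized label falls back to "snippet.txt".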

Q: Can I extract code from a video tutorial?

Take a screenshot of the frame with the code, then upload it. For longer tutorials with code that scrolls, screenshot each state and use the All code blocks mode to consolidate.
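
If you consolidate scrolled captures yourself, the main chore is dropping the lines that repeat between one screenshot and the next. Here is a minimal sketch of that idea, assuming each capture has already been extracted into a list of lines; merge_scrolled is a hypothetical helper, not something the tool provides.

```python
def merge_scrolled(blocks):
    """Merge code captured from successive scroll positions.

    Each block is a list of lines. Where the end of the merged text
    repeats at the start of the next block, the repeat is dropped.
    """
    merged = []
    for block in blocks:
        overlap = 0
        # Find the longest suffix of `merged` that is also a prefix of `block`.
        for size in range(min(len(merged), len(block)), 0, -1):
            if merged[-size:] == block[:size]:
                overlap = size
                break
        merged.extend(block[overlap:])
    return merged


first = ["def f(x):", "    y = x + 1", "    z = y * 2"]
second = ["    y = x + 1", "    z = y * 2", "    return z"]
combined = merge_scrolled([first, second])
```

The match is purely textual, so a capture that ends with the same line the next one happens to start with (a lone closing brace, say) would be merged as if it were an overlap. Capturing a few lines of overlap between screenshots makes the match less ambiguous.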

Q: What about REPL or Jupyter notebook screenshots?

Pick the Code and its output mode. Each input cell and its corresponding output come back separately so you can re-run the code without the output mixed in.
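
To show what "separately" buys you, the sketch below does the same split by hand on a plain-text Python REPL transcript. It assumes the standard >>> and ... prompts and is only an illustration of the idea, not a description of how the extractor works internally.

```python
def split_repl(transcript: str):
    """Split a Python REPL transcript into (source_lines, output_lines)."""
    source, output = [], []
    for line in transcript.splitlines():
        if line.startswith(">>> ") or line.startswith("... "):
            # Drop the four-character prompt, keep the code's own indentation.
            source.append(line[4:])
        elif line.rstrip() in (">>>", "..."):
            # A bare prompt is an empty source line (it ends a block).
            source.append("")
        else:
            output.append(line)
    return source, output


transcript = (
    ">>> total = 0\n"
    ">>> for i in range(3):\n"
    "...     total += i\n"
    "...\n"
    ">>> total\n"
    "3\n"
)
source, output = split_repl(transcript)
```

Joining source with newlines gives code you can re-run, while output holds only what the interpreter printed.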

Q: Will indentation be exact?

Yes. Both the choice of tabs or spaces and the indentation depth are preserved, so code that runs in the original screenshot will run when you paste the extraction.
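
For Python, where indentation is part of the syntax, a quick sanity check after pasting is to see whether the snippet still parses. This is a minimal sketch using the standard-library ast module; check_python_snippet and its report fields are names invented for this example.

```python
import ast


def check_python_snippet(code: str) -> dict:
    """Report whether the snippet parses and which indent characters it uses."""
    # Leading whitespace of every non-blank line.
    indents = [
        line[: len(line) - len(line.lstrip())]
        for line in code.splitlines()
        if line.strip()
    ]
    report = {
        "uses_tabs": any("\t" in indent for indent in indents),
        "uses_spaces": any(" " in indent for indent in indents),
        "parses": True,
        "error": None,
    }
    try:
        ast.parse(code)
    except SyntaxError as exc:  # IndentationError and TabError are subclasses
        report["parses"] = False
        report["error"] = f"line {exc.lineno}: {exc.msg}"
    return report


good = "def f(n):\n    if n:\n        return 1\n    return 0\n"
bad = "def f(n):\nreturn n\n"
good_report = check_python_snippet(good)
bad_report = check_python_snippet(bad)
```

A snippet that parses is not guaranteed to behave like the original, but a parse failure or a mix of tabs and spaces is a cheap signal that the extraction is worth a second look.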

Q: How is this different from generic OCR like Tesseract?

Tesseract treats code as text and routinely loses indentation, confuses similar characters (l/1/i, O/0), and breaks on monospaced fonts with thin strokes. The multimodal model reads code semantically — it knows what valid syntax looks like and produces output that compiles.

Q: Can it recover code from a photo of a whiteboard or projector screen?

Yes. Whiteboard and projector photos with moderate glare and angle work — indentation and brackets are preserved. Very faint marker or extreme distance may need a closer photo.

Q: Does it handle syntax-highlighted code with colored backgrounds?

Yes. Dark themes, light themes, and rainbow syntax highlighting are all fine — the model reads token structure, not font color.

Drop a screenshot, video frame, slide, or PDF that contains code — get clean source back with indentation preserved, language detected, and ready to paste straight into your editor. Skips the OCR mess of Tesseract on monospaced fonts and the manual cleanup that follows.

Accepts a PDF or image up to 20 MB. Files are processed in-flight and never stored on our servers.

Why this matters

Generic OCR mangles code: spaces and tabs blur, brackets land on the wrong line, and single-character tokens (i, l, 1, |) get confused. ExtractFox uses a multimodal model that reads source code semantically. It knows what a function declaration looks like, where indentation matters, and what to do with a string that contains escape characters.

When a conference slide puts 15 lines of Rust on a dark background, Tesseract reads 0 as O and collapses four-space indents into one, so the pasted code doesn't compile and you spend 10 minutes fixing it. Paywalled tutorials and copy-protected docs block select-all, so the only option is a screenshot and a prayer that OCR gets the closing braces right.

How it works

Step 1
Upload the source
A screenshot, slide image, video frame, conference talk PDF, or a screenshot of a copy-protected web page. You can upload multiple files at once.
Step 2
Pick a mode
Code only (clean), code with line numbers preserved, code with comments, or all code blocks from a multi-snippet image.
Step 3
Copy or download
Pick Clean code and the result is the code by itself — copy it straight into your editor or download it as .txt. The multi-block modes come back as a table (one row per block) you can take as Excel, CSV, or JSON.
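
If you take a multi-block result as JSON, a few lines of Python are enough to regroup the rows. The sketch below assumes the export is a list of objects with "language" and "code" keys, mirroring the field names in the sample output; the actual export schema may differ, so adjust the keys to match your file.

```python
import json


def load_blocks(json_text: str):
    """Parse a multi-block export into (language, code) pairs.

    Assumes a list of rows with "language" and "code" keys (hypothetical schema).
    """
    rows = json.loads(json_text)
    return [(row.get("language", "unknown"), row["code"]) for row in rows]


def group_by_language(blocks):
    """Collect code blocks under their detected language, keeping row order."""
    grouped = {}
    for language, code in blocks:
        grouped.setdefault(language, []).append(code)
    return grouped


export = (
    '[{"language": "python", "code": "print(1)"},'
    ' {"language": "sql", "code": "SELECT 1;"},'
    ' {"language": "python", "code": "print(2)"}]'
)
grouped = group_by_language(load_blocks(export))
```

From here each group can be written to its own file, one per language or one per block, depending on how the snippets relate to each other.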

Common use cases

Tutorial recovery — paste working code from a paywalled article screenshot

Slide decks — extract every code example from a conference talk PDF

Error debugging — pull the snippet from a stack-trace screenshot with line numbers intact

Notebook migration — separate input cells from output in Jupyter screenshots

Sample output

Example: a screenshot of a Python function from a tutorial

language	python
code

    def fibonacci(n):
        if n <= 1:
            return n
        a, b = 0, 1
        for _ in range(n - 1):
            a, b = b, a + b
        return b

    # Print the first 10 Fibonacci numbers
    for i in range(10):
        print(fibonacci(i))

Related extractors

Extract text from any image

Image to text converter for photos, screenshots, and scans — plain text output, not structured fields. Handwriting and glare handled. Need tables or JSON? Use image data extraction instead.

Extract text from any PDF

Free PDF-to-text converter: pull clean text from any PDF — digital, scanned, or image-based. One tool for both, no quality compromise. No signup.

Extract data from images

Image data extraction: pull structured fields from photos and screenshots — receipts, tables, IDs, charts — into Excel or JSON. Not plain OCR. Free, no signup.

From the blog

Tutorial · April 2, 2026

How to extract code from a video tutorial

Extract source code from a programming tutorial video with clean screenshots, a code-aware extractor, or an FFmpeg frame pipeline for longer recordings.