Extract links from a PDF

Drop in a PDF and get every link out: clickable annotations, footnote URLs, and visible plain-text URLs. You get back a flat list with the page each link came from, ready to copy into a spreadsheet or feed into a checker.

Max 20 MB per file · PDF, PNG, JPG, WEBP, HEIC
Pro: drop up to 25 files at once for bulk extraction

Why this matters

Most PDF tools either dump the whole text (leaving you to regex for URLs) or read only the clickable annotations (and miss the plain-text ones). ExtractFox reads the PDF the way a person does, so visible URLs and clickable links both come back, deduped, with the page number where each one appeared.
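To make the two sources concrete, here is a minimal Python sketch of merging clickable links with plain-text URLs and deduping them per page. It is an illustration only, not ExtractFox's implementation: the `pages` input shape, the `collect_links` name, the simplified URL pattern, and the choice to dedupe per (URL, page) are all assumptions.

```python
import re

# Simplified URL pattern for illustration; real-world URL matching needs more care.
URL_RE = re.compile(r"https?://[^\s<>()\[\]]+")


def collect_links(pages):
    """Merge clickable and plain-text links, deduplicated per (url, page).

    `pages` is a hypothetical input: a list of dicts, one per page, with
    "text" (the visible page text) and "annotations" (clickable link targets).
    """
    seen = set()
    links = []
    for page_number, page in enumerate(pages, start=1):
        # Clickable annotations go first, so a URL that is both clickable and
        # visible on the same page is reported once, as "clickable".
        candidates = [(url, "clickable") for url in page["annotations"]]
        candidates += [
            (match.rstrip(".,;"), "plaintext")  # drop trailing punctuation
            for match in URL_RE.findall(page["text"])
        ]
        for url, kind in candidates:
            key = (url, page_number)
            if key in seen:
                continue
            seen.add(key)
            links.append({"url": url, "page_number": page_number, "type": kind})
    return links
```

A text-only tool would see just the regex matches, and an annotation-only tool would see just the first list; combining them is what lets both kinds of link show up in one result.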

How it works

  1. Upload the PDF

    Digital, scanned, or image-based PDFs all work. Up to 20 MB.

  2. Pick what to pull

    All links (default), only external URLs, only email addresses, only links inside footnotes, or links per page.

  3. Export

    Copy the list, or download as CSV / JSON. Each row carries the URL, the link text (if any), and the page number.
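If you would rather post-process an export yourself, the modes and formats above can be approximated in a few lines of Python. This is a rough sketch under stated assumptions: the column names follow the sample output on this page, but the mode names, the reading of "external" as any http(s) URL, and the `{"links": [...]}` JSON wrapper are hypothetical. The footnote mode is left out because it needs layout information the rows do not carry.

```python
import csv
import io
import json

# Column names taken from the sample output; the order is an assumption.
FIELDS = ["url", "link_text", "page_number", "type"]


def filter_links(rows, mode="all"):
    """Return the rows matching a mode; the mode names here are illustrative."""
    if mode == "all":
        return list(rows)
    if mode == "external":
        return [r for r in rows if r["url"].startswith(("http://", "https://"))]
    if mode == "emails":
        return [r for r in rows if r["url"].startswith("mailto:")]
    raise ValueError(f"unknown mode: {mode}")


def to_csv(rows):
    """Serialize rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_json(rows):
    """Serialize rows to JSON under a top-level "links" key (assumed shape)."""
    return json.dumps({"links": list(rows)}, indent=2)
```

For example, `to_csv(filter_links(rows, "emails"))` would produce a CSV containing only the `mailto:` rows.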

Sample output

Example: links from a 4-page research paper PDF

links
| url | link_text | page_number | type |
| --- | --- | --- | --- |
| https://arxiv.org/abs/2401.12345 | arxiv.org/abs/2401.12345 | 1 | plaintext |
| https://github.com/acme/research | github.com/acme/research | 1 | clickable |
| mailto:author@university.edu | author@university.edu | 1 | clickable |
| https://doi.org/10.1145/3456789.3456790 | [14] | 4 | clickable |
| https://example.org/dataset | example.org/dataset | 4 | plaintext |
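As one way to feed rows like these into a link checker, the Python sketch below reads a CSV export and groups the checkable URLs by page. It assumes the CSV header uses the same column names as the sample; `SAMPLE_CSV` and `urls_by_page` are illustrative names, not part of the product.

```python
import csv
import io
from collections import defaultdict

# The sample rows from this page, written as CSV with the same column names.
SAMPLE_CSV = """url,link_text,page_number,type
https://arxiv.org/abs/2401.12345,arxiv.org/abs/2401.12345,1,plaintext
https://github.com/acme/research,github.com/acme/research,1,clickable
mailto:author@university.edu,author@university.edu,1,clickable
https://doi.org/10.1145/3456789.3456790,[14],4,clickable
https://example.org/dataset,example.org/dataset,4,plaintext
"""


def urls_by_page(csv_text):
    """Group http(s) URLs by page number, skipping mailto: rows."""
    pages = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["url"].startswith(("http://", "https://")):
            pages[int(row["page_number"])].append(row["url"])
    return dict(pages)


if __name__ == "__main__":
    for page, urls in sorted(urls_by_page(SAMPLE_CSV).items()):
        print(page, urls)
```

Keeping the page number next to each URL means a broken link reported by the checker can be traced straight back to where it appears in the PDF.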

Frequently asked questions

How do I extract links from a PDF?

Upload the PDF here, pick a mode (all links, external only, emails, footnote links, or grouped by page), and click Extract. Download as CSV or JSON.

Does it find plain-text URLs, or only clickable ones?

Both. Most PDF link tools only read the clickable annotations baked into the file. ExtractFox reads the visible text too, so URLs that someone typed in but never made clickable still come through.

Will it work on a scanned PDF?

Yes. The model reads the page visually, so URLs in scanned documents are extracted the same way as URLs in digital PDFs. OCR happens automatically.

Can I get just the email addresses?

Yes. Pick the "Email addresses only" mode. It returns every email address, from both mailto: links and plain text, with the surrounding context.
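For a sense of what combining the two email sources involves, here is a small Python sketch. It is not how ExtractFox does it: `collect_emails`, its inputs, and the deliberately simple address pattern are assumptions, and it leaves out the surrounding context that the real mode returns.

```python
import re
from urllib.parse import unquote, urlsplit

# Deliberately simple address pattern; it will not cover every valid address.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def collect_emails(link_targets, page_text):
    """Return unique addresses from mailto: targets and visible text, in order."""
    found = []
    for target in link_targets:
        parts = urlsplit(target)
        if parts.scheme == "mailto":
            # parts.path holds the address; any ?subject=... ends up in parts.query.
            found.append(unquote(parts.path))
    found.extend(EMAIL_RE.findall(page_text))

    unique = []
    seen = set()
    for address in found:
        key = address.lower()  # treat addresses as case-insensitive when deduping
        if key not in seen:
            seen.add(key)
            unique.append(address)
    return unique
```

Deduping on the lowercased address keeps one entry when the same address appears both as a mailto: link and as typed text.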

How is this different from the PDF text extractor?

The text extractor returns the document's full text. This one returns just the URLs (and their page numbers and link text), already deduped and structured.

What's the largest PDF this handles?

20 MB per upload. For larger PDFs, split the file first or use the API, which supports streaming larger files.

Compared to alternatives