Extract citations from a PDF or text

Pull every citation out of a PDF or pasted text — full bibliographic entries from the references section, in-text markers (Smith 2023; [14]) mapped to their source, and any DOIs or URLs. Export as BibTeX, RIS, CSL JSON, or a flat spreadsheet.

Upload limits: 20 MB per file. Accepted formats: PDF, PNG, JPG, WEBP, HEIC.
Pro: drop up to 25 files at once for bulk extraction.

Why this matters

Bibliography management tools (Zotero, Mendeley, EndNote) extract references from PDFs by pattern-matching on font and layout, so they break on legal briefs, government reports, and any other document that doesn't follow APA/MLA conventions. ExtractFox reads the page semantically, so it picks up citations regardless of style and reliably maps in-text markers to their full references.

How it works

  1. Upload the PDF or paste text

    Research papers, legal briefs, reports, or pasted text. Multi-column layouts and footnoted citations both work.

  2. Pick a mode

    All citations, in-text markers only, references section only, or grouped by section. Then choose an export format.

  3. Export

    BibTeX (.bib), RIS, CSL JSON, or CSV. Drop straight into Zotero, Mendeley, EndNote, or a spreadsheet.
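To make the export step concrete, here is a minimal sketch of how one CSL JSON item could be rendered as an RIS record. It is an illustration only, not ExtractFox's exporter: the function name `csl_to_ris` and the type mapping `RIS_TYPES` are assumptions, and only a handful of RIS tags are covered.

```python
# Illustrative sketch only; csl_to_ris and RIS_TYPES are hypothetical names,
# not part of ExtractFox.
RIS_TYPES = {
    "paper-conference": "CPAPER",
    "article-journal": "JOUR",
    "book": "BOOK",
}


def csl_to_ris(item: dict) -> str:
    """Render one CSL JSON item as an RIS record (a small subset of tags)."""
    lines = [f"TY  - {RIS_TYPES.get(item.get('type', ''), 'GEN')}"]

    # One AU line per author, written as "Family, Given".
    for author in item.get("author", []):
        name = author["family"]
        if author.get("given"):
            name += f", {author['given']}"
        lines.append(f"AU  - {name}")

    if item.get("title"):
        lines.append(f"TI  - {item['title']}")

    # CSL stores dates as nested lists: {"date-parts": [[year, month, day]]}.
    date_parts = item.get("issued", {}).get("date-parts", [])
    if date_parts and date_parts[0]:
        lines.append(f"PY  - {date_parts[0][0]}")

    # Optional fields are written only when present.
    for tag, key in (("T2", "container-title"), ("VL", "volume"), ("DO", "DOI")):
        if item.get(key):
            lines.append(f"{tag}  - {item[key]}")

    lines.append("ER  - ")  # every RIS record ends with an ER tag
    return "\n".join(lines)


record = {
    "id": "devlin2019bert",
    "type": "paper-conference",
    "title": "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding",
    "author": [{"family": "Devlin", "given": "Jacob"}],
    "issued": {"date-parts": [[2019]]},
    "container-title": "NAACL-HLT",
    "DOI": "10.18653/v1/N19-1423",
}
print(csl_to_ris(record))
```

Missing fields are skipped rather than written as empty tags, because a reference manager importing the file would otherwise store blank values.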

Sample output

Example: two of the references from a research paper PDF, exported as CSL JSON

| id | type | title | author | issued | container-title | volume | DOI |
| --- | --- | --- | --- | --- | --- | --- | --- |
| vaswani2017attention | paper-conference | Attention Is All You Need | [{"family":"Vaswani","given":"Ashish"},{"family":"Shazeer","given":"Noam"},{"family":"Parmar","given":"Niki"}] | {"date-parts":[[2017]]} | Advances in Neural Information Processing Systems | 30 | |
| devlin2019bert | paper-conference | BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | [{"family":"Devlin","given":"Jacob"}] | {"date-parts":[[2019]]} | NAACL-HLT | | 10.18653/v1/N19-1423 |
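For readers who want to post-process the CSL JSON themselves, the following sketch turns one item into a BibTeX entry. It is an assumption-laden illustration, not the tool's own BibTeX export: `csl_to_bibtex` and `BIBTEX_TYPES` are hypothetical names, and the input simply copies the first sample row as shown.

```python
# Illustrative sketch only; csl_to_bibtex and BIBTEX_TYPES are hypothetical
# names, not part of ExtractFox.
BIBTEX_TYPES = {
    "paper-conference": "inproceedings",
    "article-journal": "article",
    "book": "book",
}


def csl_to_bibtex(item: dict) -> str:
    """Convert one CSL JSON item into a BibTeX entry string."""
    entry_type = BIBTEX_TYPES.get(item.get("type", ""), "misc")
    fields = {}

    # BibTeX joins authors with " and ", each written "Family, Given".
    authors = [
        f"{a['family']}, {a['given']}" if a.get("given") else a["family"]
        for a in item.get("author", [])
    ]
    if authors:
        fields["author"] = " and ".join(authors)

    if item.get("title"):
        fields["title"] = item["title"]

    # CSL stores the year as the first element of the first date-part.
    date_parts = item.get("issued", {}).get("date-parts", [])
    if date_parts and date_parts[0]:
        fields["year"] = str(date_parts[0][0])

    # The container maps to booktitle for proceedings and journal otherwise.
    if item.get("container-title"):
        key = "booktitle" if entry_type == "inproceedings" else "journal"
        fields[key] = item["container-title"]

    if item.get("volume"):
        fields["volume"] = str(item["volume"])
    if item.get("DOI"):
        fields["doi"] = item["DOI"]

    body = ",\n".join(f"  {k} = {{{v}}}" for k, v in fields.items())
    return f"@{entry_type}{{{item['id']},\n{body}\n}}"


vaswani = {
    "id": "vaswani2017attention",
    "type": "paper-conference",
    "title": "Attention Is All You Need",
    "author": [
        {"family": "Vaswani", "given": "Ashish"},
        {"family": "Shazeer", "given": "Noam"},
        {"family": "Parmar", "given": "Niki"},
    ],
    "issued": {"date-parts": [[2017]]},
    "container-title": "Advances in Neural Information Processing Systems",
    "volume": 30,
}
print(csl_to_bibtex(vaswani))
```

The sketch does not escape special LaTeX characters or protect capitalization in titles, which a production exporter would need to handle.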

Frequently asked questions

How do I extract citations from a PDF?

Upload the PDF, pick a mode (for example all citations, in-text markers mapped, or legal citations), choose an export format (BibTeX, RIS, CSL JSON, or CSV), and export. The result drops straight into Zotero, Mendeley, EndNote, or any reference manager that accepts BibTeX or RIS.

How is this different from Zotero's PDF metadata extraction?

Zotero's extractor reads metadata stored in the PDF and pattern-matches the references section. It works well on standard APA/MLA papers and breaks on most legal briefs, government reports, and theses. ExtractFox reads the page semantically — it works regardless of citation style.

Will it map in-text markers to their references?

Yes — pick the In-text markers mapped mode. Each marker comes back with the sentence it appeared in, the page, and the full reference it points to.
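As a rough picture of what a mapped marker might look like once exported, the sketch below models each marker with the fields named above (marker text, sentence, page, and the reference it points to) and groups markers by reference. The class name, field names, and sample sentences are all invented for illustration; the actual export schema may differ.

```python
from collections import defaultdict
from dataclasses import dataclass


# Hypothetical record shape; field names are assumptions, not the real schema.
@dataclass
class MappedMarker:
    marker: str        # the in-text marker as it appeared, e.g. "[14]"
    sentence: str      # the sentence containing the marker
    page: int          # page the marker appeared on
    reference_id: str  # id of the full reference the marker points to


def group_by_reference(markers: list[MappedMarker]) -> dict[str, list[MappedMarker]]:
    """Group markers by the reference they cite, preserving input order."""
    grouped: dict[str, list[MappedMarker]] = defaultdict(list)
    for m in markers:
        grouped[m.reference_id].append(m)
    return dict(grouped)


# Made-up sample data for demonstration.
markers = [
    MappedMarker("[14]", "Transformers replaced recurrence [14].", 2, "vaswani2017attention"),
    MappedMarker("(Devlin 2019)", "Pre-training helps (Devlin 2019).", 3, "devlin2019bert"),
    MappedMarker("[14]", "We follow the setup in [14].", 5, "vaswani2017attention"),
]

for ref_id, cited in group_by_reference(markers).items():
    pages = ", ".join(str(m.page) for m in cited)
    print(f"{ref_id}: cited {len(cited)} time(s) on page(s) {pages}")
```

Grouping by reference is one way to answer questions such as which sources are cited most often or where in the document a given source is used.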

Can I get just the legal citations?

Yes. The Legal citations mode pulls only case, statute, regulation, and treaty citations and returns them in Bluebook-style fields with reporter, volume, page, year, and court.
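To show how those fields fit together, here is a small sketch that models a case citation with reporter, volume, page, year, and court, then formats it in a common Bluebook-like pattern. The class, the `case_name` field, and the formatter are assumptions made for illustration, and the formatting rule is simplified (real Bluebook rules have many more cases).

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical record shape; case_name is an added assumption, the other
# fields follow the list above.
@dataclass
class LegalCitation:
    case_name: str
    volume: int
    reporter: str
    page: int
    year: int
    court: Optional[str] = None  # omitted when the reporter implies the court


def format_case(c: LegalCitation) -> str:
    """Format as 'Name, Volume Reporter Page (Court Year)'."""
    parenthetical = f"{c.court} {c.year}" if c.court else str(c.year)
    return f"{c.case_name}, {c.volume} {c.reporter} {c.page} ({parenthetical})"


# A real Supreme Court citation; U.S. Reports implies the court, so it is omitted.
brown = LegalCitation("Brown v. Board of Education", 347, "U.S.", 483, 1954)
print(format_case(brown))

# A made-up appellate citation, used only to show the court abbreviation slot.
example = LegalCitation("Doe v. Example Corp.", 123, "F.3d", 456, 1997, court="9th Cir.")
print(format_case(example))
```

Keeping the fields separate, rather than storing only the formatted string, lets a spreadsheet or database sort and filter by reporter, year, or court.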

What about footnoted citations?

Handled the same way. Citations in footnotes are tagged with the page they appeared on; the full bibliographic version still ends up in the bibliography list.

What citation styles does it recognize?

APA, MLA, Chicago, IEEE, Vancouver, Bluebook, ACS, AMA, and most journal-specific variants. ExtractFox detects the style and parses fields accordingly.

Compared to alternatives