Extract data from filled forms

Drop a filled form — handwritten, printed, or mixed — and pull every field's label and value as structured rows. Works on intake forms, applications, surveys, and most government paperwork.


PDF or image · up to 20 MB

Processed in-flight — never stored on our servers.


Why this matters

Form data extraction has long been one of the hardest parts of document automation: traditional OCR struggles with handwriting, and template-based parsers break the moment a form's layout changes. A vision LLM reads the form the way a person does, locating each field, reading its value (printed or handwritten), and pairing the two, with no per-form configuration.

Two examples show the gap. Clinical sites receive faxed intake forms where checkboxes are half-filled and medication names are abbreviated in handwriting, and template OCR maps 'Metformin 500mg' to the allergies field because the layout shifted by one row. HR teams receive 200 copies of the same job application, each with a salary expectation handwritten in the margin, where no parser was trained to look.

How it works

Step 1
Upload the filled form
Photo, scan, or PDF. Handwritten and printed entries both work.
Step 2
Get label-value pairs
Each field returns as { label, value }. Checkboxes return as { label, checked }. Signatures are flagged but not transcribed.
Step 3
Export to your CRM, ERP, or Excel
JSON for system intake, .xlsx for a quick review sheet.
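
The two field shapes from Step 2 and the two exports from Step 3 can be sketched in Python as follows. This is a minimal illustration, not ExtractFox's own code: the type names `TextField` and `CheckboxField` and the helpers `to_json` and `to_csv` are hypothetical, and CSV stands in for .xlsx because writing .xlsx needs a third-party library.

```python
import csv
import io
import json
from typing import TypedDict, Union


class TextField(TypedDict):
    """A regular field: { label, value }. (Hypothetical type name.)"""
    label: str
    value: str


class CheckboxField(TypedDict):
    """A checkbox field: { label, checked }. (Hypothetical type name.)"""
    label: str
    checked: bool


Field = Union[TextField, CheckboxField]


def to_json(fields: list[Field]) -> str:
    """Serialize the fields as JSON for system intake."""
    return json.dumps(fields, ensure_ascii=False, indent=2)


def to_csv(fields: list[Field]) -> str:
    """Flatten the fields into label/value/checked rows for a review sheet."""
    buffer = io.StringIO()
    # restval fills the column a given field shape does not have.
    writer = csv.DictWriter(
        buffer, fieldnames=["label", "value", "checked"], restval=""
    )
    writer.writeheader()
    for field in fields:
        writer.writerow(field)
    return buffer.getvalue()


fields: list[Field] = [
    {"label": "Full name", "value": "Jane Doe"},
    {"label": "Pregnant", "checked": False},
]
json_text = to_json(fields)
csv_text = to_csv(fields)
```

Keeping one row per field, with an empty cell for whichever of value or checked does not apply, mirrors the label / value / checked layout of the sample output.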

Common use cases

Patient intake forms — demographics, history, consent

Loan and credit applications — applicant fields, income, collateral

Insurance claim forms — claimant, incident, coverage

Survey responses — multiple-choice and free-text answers

Government forms — visa, tax, permit applications

School registration forms — student, parent, emergency contact

Inspection reports — checklist items with pass/fail and notes

Field service reports — technician checklist items with pass/fail and handwritten notes

Permit applications — zoning and building forms with mixed printed and handwritten sections

Sample output

Example: handwritten patient intake form

form_name

New Patient Intake — Family Medicine

fields

label	value	checked
Full name	Jane Doe	—
Date of birth	1990-04-12	—
Phone	+1 415-555-0182	—
Reason for visit	Persistent cough, 2 weeks	—
Allergies	Penicillin	—
Currently taking medication	—	true
Pregnant	—	false
Signature present	yes	—
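
As a rough sketch of how this sample might be consumed in code, the snippet below writes the table out as a Python dictionary and splits text fields from checkboxes. The exact JSON layout (a `form_name` string plus a `fields` list) is an assumption based on the table above, so check it against a real response before relying on it.

```python
from datetime import date

# Assumed JSON shape for the sample above; the real response layout may differ.
sample = {
    "form_name": "New Patient Intake — Family Medicine",
    "fields": [
        {"label": "Full name", "value": "Jane Doe"},
        {"label": "Date of birth", "value": "1990-04-12"},
        {"label": "Phone", "value": "+1 415-555-0182"},
        {"label": "Reason for visit", "value": "Persistent cough, 2 weeks"},
        {"label": "Allergies", "value": "Penicillin"},
        {"label": "Currently taking medication", "checked": True},
        {"label": "Pregnant", "checked": False},
        {"label": "Signature present", "value": "yes"},
    ],
}

# Text fields carry "value"; checkbox fields carry "checked".
values = {f["label"]: f["value"] for f in sample["fields"] if "value" in f}
checked = [f["label"] for f in sample["fields"] if f.get("checked") is True]

# ISO-formatted dates parse directly with the standard library.
date_of_birth = date.fromisoformat(values["Date of birth"])
```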

Frequently asked questions

How do I extract data from a filled form?

Upload the form (photo, scan, or PDF), and ExtractFox returns each field as a label-value pair. Checkboxes come back as booleans, and dates come back in ISO format (for example, 1990-04-12). Download the result as JSON or Excel.

Does it read handwriting on forms?

Yes. The vision model reliably reads hand-printed (block-letter) text in English and most Latin-script languages. Cursive and faded handwriting are harder, so review the output before relying on it for critical fields.

What about checkboxes, radio buttons, and signatures?

Checkboxes and radio buttons return as { label, checked: true|false }. Signatures are flagged as present but not transcribed (signature verification is out of scope).

Can I extract fields from a stack of the same form?

Yes. On the paid plan, define the schema once (or describe it in plain English) and POST every form to the API. Each response uses the same field names, which makes batch loading into a database straightforward.
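
Because every response shares the same field names, the loading step can be a simple pivot from label-value pairs to table columns. The sketch below assumes each response has already been fetched and parsed into a dictionary with a `fields` list. That shape, the `applicants` table, the `LABEL_TO_COLUMN` mapping, and the sample rows are all made up for illustration. It uses an in-memory SQLite database and makes no API calls.

```python
import sqlite3

# Hypothetical parsed responses: one dict per form, same labels in each.
responses = [
    {"fields": [
        {"label": "Full name", "value": "Jane Doe"},
        {"label": "Phone", "value": "+1 415-555-0182"},
    ]},
    {"fields": [
        {"label": "Full name", "value": "John Roe"},
        {"label": "Phone", "value": "+1 415-555-0199"},
    ]},
]

# Map form labels to database columns (illustrative names).
LABEL_TO_COLUMN = {"Full name": "full_name", "Phone": "phone"}


def to_row(response: dict) -> tuple:
    """Pivot one response's label-value pairs into a row in column order."""
    by_label = {f["label"]: f.get("value") for f in response["fields"]}
    # A label missing from a response becomes NULL rather than an error.
    return tuple(by_label.get(label) for label in LABEL_TO_COLUMN)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE applicants (full_name TEXT, phone TEXT)")
conn.executemany(
    "INSERT INTO applicants (full_name, phone) VALUES (?, ?)",
    [to_row(r) for r in responses],
)
conn.commit()

rows = conn.execute(
    "SELECT full_name, phone FROM applicants ORDER BY rowid"
).fetchall()
```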

How accurate is handwritten form extraction?

On clearly written, single-language forms, accuracy on hand-printed text is typically 90% or higher. Heavily cursive or low-contrast forms drop into the 70–80% range and benefit from human review.

Will it work for government forms like W-9, I-9, or passport applications?

Yes. These forms are well structured and extract reliably. For high-stakes regulated workflows (KYC, immigration), pair extraction with human review.

Can it handle multi-page forms where fields continue on page 2?

Yes. Upload the full multi-page PDF and every field across all pages returns as label-value pairs with a page_number. Fields that span pages are captured as a single entry.

What if the same label appears twice with different values (for example, two phone numbers)?

Duplicate labels return as separate rows with the same label text and different values, in document order. Disambiguate in your import by position or page number.
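
One way to disambiguate by position is to number repeated labels in document order during import. The sketch below is an illustration under assumptions: the helper name `disambiguate`, the added `key` entry, and the second phone number are made up, and `page_number` is included only to show that it can travel with each row.

```python
from collections import Counter


def disambiguate(fields: list[dict]) -> list[dict]:
    """Add a unique 'key' to each field, numbering repeated labels in order."""
    totals = Counter(f["label"] for f in fields)
    seen: Counter = Counter()
    result = []
    for field in fields:
        label = field["label"]
        seen[label] += 1
        # Only labels that occur more than once get a position suffix.
        key = f"{label} ({seen[label]})" if totals[label] > 1 else label
        result.append({**field, "key": key})
    return result


# Illustrative rows in document order; the values are sample data.
fields = [
    {"label": "Full name", "value": "Jane Doe", "page_number": 1},
    {"label": "Phone", "value": "+1 415-555-0182", "page_number": 1},
    {"label": "Phone", "value": "+1 415-555-0147", "page_number": 2},
]
keyed = disambiguate(fields)
```

Numbering only the labels that actually repeat keeps unique labels unchanged, so downstream column names stay stable for forms without duplicates.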