Extract data from filled forms
Drop a filled form — handwritten, printed, or mixed — and pull every field's label and value as structured rows. Works on intake forms, applications, surveys, and most government paperwork.
Why this matters
Form data extraction has long been one of the hardest parts of document automation: traditional OCR struggles with handwriting, and template-based parsers break the moment a form's layout changes. A vision LLM reads the form the way a person would: it locates each field, reads the value (printed or handwritten), and pairs the two, with no per-form configuration.
How it works
- Step 1: Upload the filled form
Photo, scan, or PDF. Handwritten and printed entries both work.
- Step 2: Get label-value pairs
Each field returns as { label, value }. Checkboxes return as { label, checked }. Signatures are flagged but not transcribed.
- Step 3: Export to your CRM, ERP, or Excel
JSON for system intake, .xlsx for a quick review sheet.
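
The two result shapes from Step 2 can be normalized into uniform rows before export. The following is a minimal sketch, assuming the JSON arrives as a list of objects using the `label`, `value`, and `checked` keys named above; the function names `to_rows` and `to_csv` are hypothetical, and CSV stands in for .xlsx here so the example needs only the standard library.

```python
import csv
import io
import json


def to_rows(fields):
    """Normalize { label, value } and { label, checked } entries into
    uniform rows, leaving the column that does not apply empty."""
    return [
        {
            "label": field["label"],
            "value": field.get("value", ""),
            "checked": field.get("checked", ""),
        }
        for field in fields
    ]


def to_csv(rows):
    """Render the rows as CSV text for a quick review sheet."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["label", "value", "checked"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Assumed response body, not a documented payload.
payload = json.loads(
    '[{"label": "Full name", "value": "Jane Doe"},'
    ' {"label": "Pregnant", "checked": false}]'
)
rows = to_rows(payload)
review_sheet = to_csv(rows)
```

Keeping one row shape for both text fields and checkboxes means the same export code can serve the JSON and spreadsheet paths.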
Sample output
Example: handwritten patient intake form
form_name: New Patient Intake - Family Medicine

| label | value | checked |
|---|---|---|
| Full name | Jane Doe | — |
| Date of birth | 1990-04-12 | — |
| Phone | +1 415-555-0182 | — |
| Reason for visit | Persistent cough, 2 weeks | — |
| Allergies | Penicillin | — |
| Currently taking medication | — | true |
| Pregnant | — | false |
| Signature present | yes | — |
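
For system intake, rows like the ones above are often easier to handle as one flat record per form. This is a rough sketch, assuming each row is a dict with `label`, `value`, and `checked` keys and that the empty cells in the table arrive as `None`; the helpers `slug` and `flatten` are hypothetical names, not part of ExtractFox.

```python
import re


def slug(label):
    """Turn a field label into a snake_case key, e.g. 'Date of birth' -> 'date_of_birth'."""
    return re.sub(r"[^a-z0-9]+", "_", label.lower()).strip("_")


def flatten(rows):
    """Collapse label/value/checked rows into a single dict keyed by slugged label."""
    record = {}
    for row in rows:
        # A checkbox row carries a boolean in `checked`; every other row uses `value`.
        if row.get("checked") is not None:
            record[slug(row["label"])] = row["checked"]
        else:
            record[slug(row["label"])] = row.get("value")
    return record


# A subset of the sample rows, written by hand in the assumed shape.
rows = [
    {"label": "Full name", "value": "Jane Doe", "checked": None},
    {"label": "Date of birth", "value": "1990-04-12", "checked": None},
    {"label": "Allergies", "value": "Penicillin", "checked": None},
    {"label": "Currently taking medication", "value": None, "checked": True},
    {"label": "Pregnant", "value": None, "checked": False},
]
record = flatten(rows)
```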
Frequently asked questions
How do I extract data from a filled form?
Upload the form (photo, scan, or PDF), and ExtractFox returns each field as a label-value pair. Checkboxes come back as booleans, and dates come back in ISO 8601 format (YYYY-MM-DD). Download the result as JSON or Excel.
Does it read handwriting on forms?
Yes. The vision model reliably reads hand-printed (block-letter) text in English and most Latin-script languages. Cursive and faded handwriting are harder, so review the output before relying on it for critical fields.
What about checkboxes, radio buttons, and signatures?
Checkboxes and radio buttons return as { label, checked: true|false }. Signatures are flagged as present but not transcribed (signature verification is out of scope).
Can I extract fields from a stack of the same form?
Yes. On the paid plan, define the schema once (or describe it in plain English) and POST every form to the API. Each response uses the same field names, which makes batch loading into a database straightforward.
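
Because every response shares the same field names, loading a batch can be one insert per field. Here is a small sketch using SQLite from the Python standard library; the table name `form_fields`, the `form-001` style ids, and the sample values are made up for illustration, and the API call itself is left out because its endpoint is not described here.

```python
import sqlite3


def load_forms(conn, forms):
    """Insert one row per (form, field). `forms` maps a form id to its list of fields."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS form_fields ("
        "form_id TEXT, label TEXT, value TEXT, checked INTEGER)"
    )
    rows = [
        (form_id, field["label"], field.get("value"), field.get("checked"))
        for form_id, fields in forms.items()
        for field in fields
    ]
    # Missing keys become NULL; booleans are stored as 0 or 1.
    conn.executemany("INSERT INTO form_fields VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
forms = {
    "form-001": [
        {"label": "Full name", "value": "Jane Doe"},
        {"label": "Pregnant", "checked": False},
    ],
    "form-002": [
        {"label": "Full name", "value": "John Roe"},
        {"label": "Pregnant", "checked": False},
    ],
}
inserted = load_forms(conn, forms)
names = [
    row[0]
    for row in conn.execute(
        "SELECT value FROM form_fields WHERE label = 'Full name' ORDER BY form_id"
    )
]
```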
How accurate is handwritten form extraction?
On clearly written, single-language forms, accuracy on hand-printed text is typically 90% or higher. Heavily cursive or low-contrast forms drop into the 70–80% range and benefit from human review.
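
One way to bring human review into the loop is to run simple format checks on critical fields and queue any form that fails. This sketch assumes you choose the critical labels and rules yourself; `CRITICAL_CHECKS` and `needs_review` are hypothetical names, and the checks rely only on the returned values, since no confidence score is described here.

```python
from datetime import date


def is_iso_date(text):
    """Return True if `text` parses as an ISO calendar date such as 1990-04-12."""
    try:
        date.fromisoformat(text)
        return True
    except (TypeError, ValueError):
        return False


def is_non_empty(text):
    """Return True for a string with at least one non-space character."""
    return isinstance(text, str) and text.strip() != ""


# Hypothetical rules: which labels are critical and how to sanity-check them.
CRITICAL_CHECKS = {
    "Date of birth": is_iso_date,
    "Full name": is_non_empty,
}


def needs_review(fields):
    """Return the critical labels whose values are missing or fail their check."""
    by_label = {field["label"]: field.get("value") for field in fields}
    return [
        label
        for label, check in CRITICAL_CHECKS.items()
        if not check(by_label.get(label))
    ]


clean_form = [
    {"label": "Full name", "value": "Jane Doe"},
    {"label": "Date of birth", "value": "1990-04-12"},
]
doubtful_form = [
    {"label": "Full name", "value": "Jane Doe"},
    {"label": "Date of birth", "value": "12/04/1990"},
]
clean_flags = needs_review(clean_form)
doubtful_flags = needs_review(doubtful_form)
```

A form with an empty flag list can go straight to intake, while anything else goes to a reviewer along with the flagged labels.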
Will it work for government forms like W-9, I-9, or passport applications?
Yes. These forms are well structured and extract reliably. For high-stakes regulated workflows (KYC, immigration), pair extraction with human review.