Workflow · May 1, 2026 · 6 min read

How to extract data from a website to Excel automatically

Power Query, Make/Zapier, and a no-code AI route — three ways to set up an automated pipeline from a website to an Excel file that updates on its own.

By Dawid Sibinski

Manual web-to-Excel is fine once. Manual web-to-Excel every Monday is a different problem. The right automation depends on whether the source has a stable HTML table, an API, or neither.

1. Power Query (built into Excel)

If the page has an HTML table or a JSON endpoint, Power Query is the right tool and it's already installed. Data → Get Data → From Web → paste the URL. Power Query inspects the page, lists every table it can find, and lets you preview before loading.

Once loaded, the query lives in the workbook. Right-click the table → Refresh to re-pull. Set the query to refresh whenever the file opens (Query Properties → "Refresh data when opening the file"), or wire it to a Power Automate flow for hourly or daily refreshes.

Limits: Power Query doesn't run JavaScript, so single-page apps that render data client-side are invisible to it. Authentication is also limited to the built-in options (anonymous, basic, API key, organizational account), so sites that require complex login flows won't work.
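
A quick way to tell which case you're in: fetch the raw HTML and search for a value you can see in the browser. If it's missing from the raw response, Power Query won't see it either. A minimal sketch in TypeScript (Node 18+, run as an ES module; the URL and marker string are placeholders):

    // Does the data exist in the static HTML, or is it rendered client-side?
    const url = "https://example.com/prices"; // your target page (placeholder)
    const marker = "Price";                   // a cell value visible in the browser

    const html = await (await fetch(url)).text();
    console.log(
      html.includes(marker)
        ? "Static HTML: Power Query should find it."
        : "Rendered by JavaScript: use option 2 or 3 instead.",
    );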

2. Make, Zapier, or n8n + Google Sheets / Excel

For sites that don't play nice with Power Query, the common pattern is: a scraping service (Apify, ScrapingBee, Browserless) feeds rows into Make/Zapier, which writes them to a Google Sheet or Excel Online. Total monthly cost is usually $20–$60 depending on volume.

This works well for ecommerce price tracking, job board monitoring, and listing aggregation — anything where a structured page changes regularly and you need updates in a sheet.

3. AI-based extraction on a schedule

The newer pattern: a scheduled job hits the page, screenshots it (or grabs the rendered HTML), runs it through a multimodal extraction API, and appends rows to a sheet. It wins when the page layout changes frequently or when you're extracting from many pages with slightly different structures.

ExtractFox's website extractor covers the extraction step today, manually; for full automation, call the API from a Vercel cron or a GitHub Action and write rows to Excel via the Microsoft Graph API. It's a few dozen lines of code.
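
Here's what that job can look like. This is a minimal sketch, not a drop-in script: the extraction endpoint, its request shape, the field names, the table name, and the token handling are all placeholders. The Graph rows/add endpoint itself is real; it appends rows to a named Excel table.

    // Scheduled job: fetch the page, extract rows via an extraction API,
    // then append them to an Excel table through Microsoft Graph.
    const PAGE_URL = "https://example.com/listings";       // page to watch (placeholder)
    const EXTRACT_URL = "https://api.example.com/extract"; // extraction API (placeholder)
    const FILE_ID = process.env.EXCEL_FILE_ID!;            // OneDrive file ID of the workbook
    const TOKEN = process.env.GRAPH_TOKEN!;                // acquired via MSAL or similar

    async function run(): Promise<void> {
      // 1. Grab the rendered HTML (or a screenshot, if your extractor prefers one).
      const html = await (await fetch(PAGE_URL)).text();

      // 2. Send it to the extraction API with the fields you want back as rows.
      const { rows } = (await (
        await fetch(EXTRACT_URL, {
          method: "POST",
          headers: { "Content-Type": "application/json" },
          body: JSON.stringify({ html, fields: ["name", "price", "updated"] }),
        })
      ).json()) as { rows: string[][] };

      // 3. Append the rows to a table named "Prices" in the workbook.
      await fetch(
        `https://graph.microsoft.com/v1.0/me/drive/items/${FILE_ID}/workbook/tables/Prices/rows/add`,
        {
          method: "POST",
          headers: { "Content-Type": "application/json", Authorization: `Bearer ${TOKEN}` },
          body: JSON.stringify({ values: rows }),
        },
      );
    }

    run().catch((err) => { console.error(err); process.exit(1); });

Trigger it with a GitHub Actions schedule cron or a Vercel cron job; either is a few lines of config on top of the script.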

Choosing

  • Stable HTML table → Power Query. Free, no extra services.
  • Dynamic JS-heavy site, recurring schedule → Apify + Make → Google Sheets.
  • Variable layouts, multiple sources → AI extraction on a cron.

What to watch for

Whatever route you pick: respect robots.txt, throttle requests, and check the site's terms of service. Automated scraping at scale draws attention even when the site doesn't block you on day one.
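
Throttling can be as simple as a fixed pause between sequential requests. A sketch (the one-second delay is an arbitrary starting point, not a rule):

    // Polite crawl: sequential requests with a pause, never a parallel burst.
    const sleep = (ms: number) => new Promise<void>((r) => setTimeout(r, ms));

    async function fetchPolitely(urls: string[]): Promise<string[]> {
      const pages: string[] = [];
      for (const url of urls) {
        pages.push(await (await fetch(url)).text());
        await sleep(1000); // throttle: at most one request per second
      }
      return pages;
    }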
