How to extract zip and postal codes from addresses (Excel, Sheets, Python)
Three reliable patterns for pulling postal codes out of free-text addresses — Excel formulas for clean US data, regex for international, libpostal when nothing else works.
Pulling a US zip code out of a clean address is a one-line formula. Pulling postal codes out of international addresses where the format varies by country is a different problem altogether.
Excel: US zip codes (clean addresses)
If your addresses end in a 5-digit zip (or zip+4):
=RIGHT(A1, 5) // last 5 chars =RIGHT(A1, 10) // last 10 chars (catches zip+4) =TEXTAFTER(A1, " ", -1) // last whitespace-delimited token (Excel 365)
TEXTAFTER with -1 is the cleanest in modern Excel — it returns whatever follows the last space, which is almost always the postal code in US format.
Excel: regex for messier data
Excel 365 added REGEXEXTRACT in 2024:
=REGEXEXTRACT(A1, "\b\d{5}(?:-\d{4})?\b")
Catches "94103" or "94103-1234" anywhere in the cell. For UK postcodes:
=REGEXEXTRACT(A1, "\b[A-Z]{1,2}\d{1,2}[A-Z]?\s*\d[A-Z]{2}\b")
Google Sheets
REGEXEXTRACT has been in Sheets for years with the same syntax. For US zip:
=REGEXEXTRACT(A1, "\d{5}(-\d{4})?")
Python: regex by country
import re US_ZIP = re.compile(r"\b\d{5}(?:-\d{4})?\b") UK_PC = re.compile(r"\b[A-Z]{1,2}\d{1,2}[A-Z]?\s*\d[A-Z]{2}\b", re.I) CA_PC = re.compile(r"\b[A-Z]\d[A-Z]\s*\d[A-Z]\d\b", re.I) def extract_postal(address: str, country: str = "US"): p = {"US": US_ZIP, "UK": UK_PC, "CA": CA_PC}[country] m = p.search(address) return m.group(0) if m else None
Country-specific regex is fast and reliable when you know the country up front. Falls apart when you don't.
Multi-country: libpostal
If country isn't known and addresses are international:
from postal.parser import parse_address parse_address("221B Baker Street, London NW1 6XE, UK") # [('221b', 'house_number'), ('baker street', 'road'), # ('london', 'city'), ('nw1 6xe', 'postcode'), ('uk', 'country')]
libpostal handles 200+ countries' postal formats. Trained on OpenStreetMap. Heavyweight install (compiles C) but worth it for any serious international address work.
Common pitfalls
- Excel strips leading zeros — "03062" becomes "3062" if the column is formatted as Number. Format as Text first.
- 9-digit zips written without hyphens look like phone-extension prefixes. The regex \d{5}(?:-?\d{4})? handles both forms.
- Some countries have no postal codes (Ireland used to, mostly). Your extractor should return null, not crash.