This tutorial shows how to scrape Pagesjaunes emails into a reviewable CSV with the Pagesjaunes Emails Scraper for UScraper. You will prepare input URLs, import the workflow, set the export path, validate rows, and handle common blanks or security-challenge results before using the data for contact research.
Scope
Prerequisites for scraping Pagesjaunes emails
You need UScraper installed, the free Pagesjaunes Emails Scraper template, a small list of reviewed URLs, and a local folder for the CSV export. The input list can include Pagesjaunes detail pages, the business website URLs linked from those profiles, and related Facebook pages. Start with five to ten companies so the first run is easy to compare against the browser.
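
A quick pre-check of the input list can catch typos before the first run. The sketch below is one possible way to do it in Python, outside UScraper: it labels each URL by source type and drops blanks, duplicates, and entries that are not http(s) URLs. The function names `classify_url` and `prepare_urls` are hypothetical helpers, not part of the template.

```python
from urllib.parse import urlparse


def classify_url(url):
    """Label a URL as 'pagesjaunes', 'facebook', 'website', or 'invalid'."""
    parsed = urlparse(url.strip())
    host = (parsed.hostname or "").lower()
    if parsed.scheme not in ("http", "https") or not host:
        return "invalid"
    if host == "pagesjaunes.fr" or host.endswith(".pagesjaunes.fr"):
        return "pagesjaunes"
    if host == "facebook.com" or host.endswith(".facebook.com"):
        return "facebook"
    # Anything else is assumed to be a business website.
    return "website"


def prepare_urls(raw_urls):
    """Drop blanks, invalid entries, and exact duplicates, keeping input order."""
    seen = set()
    cleaned = []
    for raw in raw_urls:
        url = raw.strip()
        if not url or url in seen or classify_url(url) == "invalid":
            continue
        seen.add(url)
        cleaned.append(url)
    return cleaned
```

Running `prepare_urls` on the spreadsheet column before pasting it into the workflow keeps the first batch small and traceable; the labels from `classify_url` make it easy to confirm that each company has the sources you expect.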
This is not a "scrape every company in France" workflow. It is a controlled URL loop for contact research, directory audits, local prospect list cleanup, and validation of websites or social pages already connected to a Pagesjaunes profile.
Before a commercial run, review the current Pagesjaunes legal notices and terms, the Pagesjaunes developer portal, and CNIL guidance on reuse of online directory data. This article is a technical runbook, not legal advice.
Email export is not consent. Keep your purpose, lawful basis, retention period, opt-out handling, and suppression process documented before importing any row into a CRM or outreach tool.
Workflow
How the Pagesjaunes email scraper works
The template is built around a known URL list, not open-ended pagination. The bundled project notes that some Pagesjaunes detail pages may show a 403 or security challenge, so the workflow treats the Pagesjaunes detail page, official website, and Facebook page as complementary sources for the same business.
| Block | What it does | What to verify |
|---|---|---|
| Navigate | Opens each URL in navigate.urls[] | Replace the sample restaurant, website, and Facebook URLs. |
| Wait for Page Load and Sleep | Gives dynamic pages time to render | Increase waits if rows are partial. |
| Inject JavaScript | Scrolls to the bottom of the page | Confirm lazy-loaded footer, social, or contact sections appear. |
| Wait for Element | Waits for body to exist | Keep this simple unless you customize per site. |
| Structured Export | Writes the configured columns to CSV | Check filename, save folder, headers, and append mode. |
| Loop Continue | Advances to the next URL | Stop if repeated rows or challenge pages dominate. |
Import the template
Open the related template page, download the JSON workflow, and import it into UScraper. Keep the block groups visible so load, interaction, extraction, and loop behavior stay easy to audit.
Replace the URL list
Paste your approved Pagesjaunes detail URLs, business websites, and Facebook URLs into the Navigate block. Keep a source spreadsheet beside the workflow so every exported row can be traced later.
Confirm the export path
Structured Export writes pagesjaunes_emails_scraper.csv with headers enabled and file mode set to append. Change the save location before each client, region, or campaign run.
Run a small validation batch
Start with five to ten URLs. Compare the exported names, emails, phone numbers, domains, and source URLs against the pages that UScraper opened.
Clean before outreach
Deduplicate companies, verify email ownership, remove generic or stale contacts, and review the error_message column before any sales or marketing use.
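
The cleaning step can be sketched as a small post-processing script. This is a minimal example, not the template's own logic: it assumes the `emails` cell separates multiple addresses with commas, semicolons, or spaces, and that `siret`, `siren`, and `domain` are filled where the page exposes them. The list of generic mailbox names and the helper names are illustrative assumptions.

```python
import re

# Illustrative list of shared mailbox names; adjust to your own policy.
GENERIC_LOCAL_PARTS = {"contact", "info", "hello", "admin", "noreply", "no-reply"}


def split_emails(cell):
    """Split an emails cell on commas, semicolons, or whitespace; lowercase and dedupe."""
    found = []
    for part in re.split(r"[,;\s]+", cell or ""):
        email = part.strip().lower()
        if "@" in email and email not in found:
            found.append(email)
    return found


def is_generic(email):
    """True when the part before '@' is a shared mailbox name."""
    return email.split("@", 1)[0].lower() in GENERIC_LOCAL_PARTS


def company_key(row):
    """Prefer siret, then siren, then domain as the deduplication key."""
    for field in ("siret", "siren", "domain"):
        value = (row.get(field) or "").strip().lower()
        if value:
            return (field, value)
    return None


def merge_by_company(rows):
    """Return (companies, review): one merged row per key, plus rows needing a manual look."""
    merged, review = {}, []
    for row in rows:
        key = company_key(row)
        # Rows with an error message or no usable key go to manual review.
        if (row.get("error_message") or "").strip() or key is None:
            review.append(row)
            continue
        entry = merged.setdefault(key, {"row": dict(row), "emails": []})
        for email in split_emails(row.get("emails")):
            if email not in entry["emails"]:
                entry["emails"].append(email)
    companies = [dict(e["row"], emails=";".join(e["emails"])) for e in merged.values()]
    return companies, review
```

Merging rather than dropping duplicates matters here because the same business can appear three times (detail page, website, Facebook page), each contributing different emails. Verifying ownership and freshness of an address still needs a human check.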
Export shape and JSON workflow sample
There is no separate CSV sample in the bundle, so the JSON workflow is the authoritative sample of the extraction design. The export is intentionally wide: contact fields, business context, social links, source URLs, and an error field all travel together.
| CSV column group | Example columns | Why it matters |
|---|---|---|
| Business context | titre_du_business, categorie, temps_d_ouverture, adresse | Keeps each contact tied to a visible business profile. |
| Contact fields | emails, phones, numero_de_telephone, site_du_business | Captures email, phone, and website candidates from reachable pages. |
| Identifiers | siret, siren, domain | Helps deduplicate companies and match records later. |
| Social links | facebook, linkedin, instagram, youtube, tiktok | Preserves public social pages when they are found. |
| Audit fields | start_url, current_url, referrer_url, error_message | Explains where the row came from and flags challenge pages. |

    {
      "project": {
        "name": "Pagesjaunes Emails Scraper",
        "description": "Known URL loop over Pagesjaunes detail, business website, and Facebook URLs."
      },
      "blocks": [
        {
          "title": "Navigate",
          "config": {
            "urls": [
              "https://www.pagesjaunes.fr/pros/detail?code_etablissement=02424622",
              "http://www.lamazonial.fr",
              "https://www.facebook.com/lamazonial"
            ]
          }
        },
        {
          "title": "Structured Export",
          "config": {
            "fileName": "pagesjaunes_emails_scraper.csv",
            "includeHeaders": true,
            "fileMode": "append",
            "columns": [
              "titre_du_business",
              "url_du_detail_business",
              "numero_de_telephone",
              "adresse",
              "site_du_business",
              "emails",
              "phones",
              "facebook",
              "linkedin",
              "instagram",
              "error_message"
            ]
          }
        },
        {
          "title": "Loop Continue"
        }
      ]
    }

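
If you prefer to prepare the workflow file before importing it, the URL list and export file name can be patched with a short script. This sketch assumes the downloaded template uses the same keys as the sample above (`blocks`, `title`, `config.urls`, `config.fileName`); the real file may contain more fields or a different layout, so inspect it first. `patch_workflow` is a hypothetical helper name.

```python
import json


def patch_workflow(workflow, urls, file_name):
    """Return a copy of the workflow with new Navigate URLs and export file name."""
    # Deep copy via a JSON round-trip so the original dict is left untouched.
    patched = json.loads(json.dumps(workflow))
    for block in patched.get("blocks", []):
        title = block.get("title")
        if title == "Navigate":
            block.setdefault("config", {})["urls"] = list(urls)
        elif title == "Structured Export":
            block.setdefault("config", {})["fileName"] = file_name
    return patched


def patch_file(source_path, target_path, urls, file_name):
    """Read a workflow JSON file, patch it, and write the result to a new file."""
    with open(source_path, encoding="utf-8") as handle:
        workflow = json.load(handle)
    with open(target_path, "w", encoding="utf-8") as handle:
        json.dump(patch_workflow(workflow, urls, file_name), handle, indent=2, ensure_ascii=False)
```

Writing to a new file per client or campaign keeps the original template intact. Because the export runs in append mode, giving each run its own file name also helps avoid mixing rows from different runs.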
Validation
Validate and clean the first Pagesjaunes CSV export
Open the CSV immediately after the first run. A good validation batch has recognizable business names, the same source URLs you supplied, realistic phone formats, emails from the expected domain or public page, and blank cells only where the live page does not publish that field.
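
Part of this comparison can be automated. The sketch below is a rough first pass, assuming the export contains the `start_url`, `titre_du_business`, and `emails` columns and is UTF-8 encoded; it flags rows for review rather than deciding that they are wrong. `check_rows` and `check_csv` are hypothetical helper names.

```python
import csv
import re

# Deliberately loose pattern: something@something.tld with no spaces.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def check_rows(rows, supplied_urls):
    """Return a list of (row_number, issue) tuples for rows that need a manual look."""
    supplied = {url.strip() for url in supplied_urls}
    issues = []
    for number, row in enumerate(rows, start=1):
        start_url = (row.get("start_url") or "").strip()
        if start_url not in supplied:
            issues.append((number, "start_url not in supplied list"))
        if not (row.get("titre_du_business") or "").strip():
            issues.append((number, "missing business name"))
        for email in re.split(r"[,;\s]+", row.get("emails") or ""):
            if email and not EMAIL_RE.match(email):
                issues.append((number, f"suspicious email: {email}"))
    return issues


def check_csv(path, supplied_urls):
    """Run check_rows over a CSV export; utf-8-sig tolerates a leading BOM."""
    with open(path, newline="", encoding="utf-8-sig") as handle:
        return check_rows(csv.DictReader(handle), supplied_urls)
```

A blank business name is not always an error, since a website or Facebook row may legitimately lack one, so treat the output as a review queue and still compare a few rows against the live pages by hand.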
| Symptom | Likely cause | Fix |
|---|---|---|
| Headers exist but no rows | Page body did not load or the run stopped early | Re-run visibly, increase waits, and test one URL. |
| emails is blank | No email is visible, or the website/social page was unavailable | Check the business website manually before treating blank as an error. |
| Phones include unrelated numbers | Page text contains IDs, years, or tracking numbers | Review the row, then tighten the JavaScript filter for your source type. |
| error_message mentions a challenge | Pagesjaunes or a linked site returned a security page | Pause, reduce volume, and avoid bypassing access controls. |
| Social links look noisy | Tracking, photo, or login links were collected | Add domain/path filters and rerun a small batch. |
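
The phone and social-link fixes in the table can also be applied after export instead of inside the workflow's JavaScript. The following is a sketch in Python under two assumptions: valid numbers follow the French national format (ten digits starting with 0, or a +33/0033 prefix followed by nine digits), and the blocked path segments listed are only examples of noisy links. Helper names are hypothetical.

```python
import re
from urllib.parse import urlparse

FR_PHONE_RE = re.compile(r"^(?:\+33|0033|0)[1-9]\d{8}$")

SOCIAL_HOSTS = {
    "facebook": "facebook.com",
    "linkedin": "linkedin.com",
    "instagram": "instagram.com",
    "youtube": "youtube.com",
    "tiktok": "tiktok.com",
}

# Illustrative first path segments for share, login, photo, and tracking links.
BLOCKED_SEGMENTS = {"sharer", "sharer.php", "share", "login", "login.php",
                    "photo", "photo.php", "photos", "tr"}


def normalize_fr_phone(candidate):
    """Return the number as +33XXXXXXXXX, or None if it does not look French."""
    compact = re.sub(r"[\s.\-()]", "", candidate)
    if not FR_PHONE_RE.match(compact):
        return None
    # The last nine digits are the subscriber number without the leading 0.
    return "+33" + compact[-9:]


def is_profile_link(url, network):
    """True when the URL is on the network's domain and is not a blocked path."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    base = SOCIAL_HOSTS[network]
    if host != base and not host.endswith("." + base):
        return False
    first_segment = parsed.path.strip("/").split("/")[0].lower()
    return bool(first_segment) and first_segment not in BLOCKED_SEGMENTS
```

With this filter, years such as 2024 and 14-digit SIRET values are rejected because they do not match the ten-digit pattern. Numbers written as "+33 (0)1 ..." are also rejected by this simple pattern, so extend it if your sources use that notation.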
Pagesjaunes scraping RGPD checks
For RGPD-sensitive contact work, treat scraping as the first step in a governed data process, not the final permission decision. Limit the fields you collect, avoid private or login-only areas, keep request volume proportionate, and separate technical success from outreach permission. If the contacts will be used for prospecting, review CNIL guidance, your lawful basis, your opt-out workflow, and your retention schedule before importing rows into any campaign system.
This workflow fits best when an analyst needs a reviewable local CSV, direct control over the URL list, and a single run that can cover Pagesjaunes pages, business websites, and social pages.
For adjacent workflows, browse the UScraper template library or the full UScraper blog for more directory, link scraper, and contact export tutorials.
FAQ
Frequently asked questions
Pagesjaunes business listings may be publicly visible, but reuse of professional directory data can be restricted by Pagesjaunes terms, the GDPR (RGPD in French), French data protection law, database rights, anti-spam rules, and the purpose of your campaign. Review the current policies, collect proportionately, avoid bypassing access controls, and get legal review before outreach.

