Google Flights Scraper API: Extract Live Flight Data

A working guide to building a Google Flights scraper with the ScrapingBee Google Flights Scraper API. Google Flights renders prices, routes, and schedules with JavaScript and sits behind aggressive anti-bot protection, so a plain HTTP request returns almost nothing usable. This repo shows how a managed Google Flights API takes over proxy rotation, headless rendering, and structured extraction, and returns clean data from a single call.

curl "https://app.scrapingbee.com/api/v1/?api_key=YOUR_API_KEY&url=https%3A%2F%2Fwww.google.com%2Ftravel%2Fflights&render_js=true&premium_proxy=true"

You need a ScrapingBee API key. The free tier gives you 1,000 credits with no card required: scrapingbee.com.

What the Google Flights API does

Google Flights is a price-comparison surface that aggregates fares from hundreds of airlines and online travel agencies. The data is valuable for fare monitoring, route research, competitive pricing, and travel dashboards, but Google does not publish an official public API for it.

A Google Flights scraper API closes that gap. Instead of running your own headless browser fleet and proxy pool, you send one HTTP request and the service does the hard parts:

Renders the JavaScript that loads fares and itineraries.
Rotates residential proxies so requests are not blocked.
Handles the anti-bot challenges Google serves to automated traffic.
Returns the rendered HTML, a screenshot, or structured JSON you define.

This guide uses the ScrapingBee HTML API. See the full parameter reference in the ScrapingBee documentation.

Why scraping Google Flights is hard

A direct request against https://www.google.com/travel/flights fails for three reasons:

The page is JavaScript-rendered. The initial HTML is a shell. Fares, durations, and stops load after the browser executes the page scripts. requests plus BeautifulSoup sees the shell, not the data.
Google blocks automated traffic. Datacenter IPs get rate-limited or served consent and challenge pages quickly.
The markup changes. Class names are obfuscated and rotate, so any selector you hardcode can break without notice.

Naive approach that does not work

import requests
from bs4 import BeautifulSoup

url = "https://www.google.com/travel/flights?q=Flights%20to%20London%20from%20New%20York"
response = requests.get(url)
soup = BeautifulSoup(response.text, "html.parser")

# Returns the page shell only. No fares, durations, or airlines are present,
# because they are injected by JavaScript that requests never executes.
print(soup.get_text()[:500])

This is exactly the wall a managed Google Flights scraper removes.

Production approach: ScrapingBee HTML API

The HTML API endpoint is:

https://app.scrapingbee.com/api/v1/

Pass the Google Flights URL as the url parameter, keep render_js=true so the fares load, and add premium_proxy=true to route the request through residential proxies. Google can serve a consent page to fresh sessions, so the Python examples in this guide also set the Google CONSENT cookie to skip it.

cURL

curl "https://app.scrapingbee.com/api/v1/?api_key=YOUR_API_KEY&url=https%3A%2F%2Fwww.google.com%2Ftravel%2Fflights&render_js=true&premium_proxy=true"

The url value must be URL-encoded when you call the API directly. The official Python and Node SDKs encode it for you.
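
For raw HTTP calls, the encoding step can be sketched with the Python standard library. The helper name build_api_url is hypothetical; the endpoint and parameter names are the ones used in this guide.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://app.scrapingbee.com/api/v1/"


def build_api_url(api_key: str, target_url: str) -> str:
    """Return an HTML API request URL with every query value percent-encoded."""
    query = urlencode(
        {
            "api_key": api_key,
            "url": target_url,  # urlencode escapes ":" and "/" in the target URL
            "render_js": "true",
            "premium_proxy": "true",
        }
    )
    return f"{API_ENDPOINT}?{query}"


print(build_api_url("YOUR_API_KEY", "https://www.google.com/travel/flights"))
```

The printed URL should match the cURL command above, which is a quick way to check the encoding before sending a request.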

Python

Install the official SDK:

pip install scrapingbee

from scrapingbee import ScrapingBeeClient

client = ScrapingBeeClient(api_key="YOUR_API_KEY")

response = client.get(
    "https://www.google.com/travel/flights?q=Flights to London from New York",
    params={
        "render_js": "true",
        "premium_proxy": "true",
    },
    cookies={"CONSENT": "YES+"},
)

print(response.content)

Node.js

Install the official SDK:

npm install scrapingbee

const { ScrapingBeeClient } = require('scrapingbee');

const client = new ScrapingBeeClient('YOUR_API_KEY');

async function getFlights(url) {
    const response = await client.htmlApi({
        url: url,
        params: {
            render_js: true,
            premium_proxy: true,
        },
    });

    const decoder = new TextDecoder();
    return decoder.decode(response.data);
}

getFlights('https://www.google.com/travel/flights?q=Flights to London from New York')
    .then((html) => console.log(html))
    .catch((err) => console.error(err.message));

Structured extraction without parsing HTML

Rather than parse rotating, obfuscated markup yourself, ask the API to return structured JSON. ScrapingBee supports CSS and XPath rules through extract_rules, and natural-language rules through ai_extract_rules. The AI extraction adds 5 credits on top of the base request. The full syntax is in the data extraction documentation.

from scrapingbee import ScrapingBeeClient

client = ScrapingBeeClient(api_key="YOUR_API_KEY")

response = client.get(
    "https://www.google.com/travel/flights?q=Flights to London from New York",
    params={
        "render_js": "true",
        "premium_proxy": "true",
        "ai_extract_rules": {
            "flights": {
                "description": "every flight result listed on the page",
                "type": "list",
                "output": {
                    "airline": "name of the airline",
                    "price": "ticket price in dollars",
                    "departure_time": "departure time",
                    "arrival_time": "arrival time",
                    "duration": "total trip duration",
                    "stops": "number of stops",
                },
            },
        },
    },
    cookies={"CONSENT": "YES+"},
)

print(response.json())

The description, type, and nested output keys follow the documented ai_extract_rules schema. type accepts string, list, number, boolean, and item.
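
Once the JSON comes back, a small post-processing step is usually needed because AI-extracted prices arrive as free text. The sketch below assumes the response mirrors the rules above (a "flights" list of objects with a "price" string); that shape, the helper names, and the sample values are illustrative assumptions, not documented output.

```python
import re
from typing import Optional


def parse_price(raw: object) -> Optional[float]:
    """Pull the first number out of a price string such as "$1,204"."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", str(raw))
    if match is None:
        return None
    return float(match.group().replace(",", ""))


def cheapest_flight(payload: dict) -> Optional[dict]:
    """Return the lowest-priced entry under "flights", or None if none parse."""
    best = None
    best_price = None
    for flight in payload.get("flights") or []:
        if not isinstance(flight, dict):
            continue  # skip anything that is not a flight object
        price = parse_price(flight.get("price"))
        if price is not None and (best_price is None or price < best_price):
            best, best_price = flight, price
    return best


# Hypothetical payload shaped like the ai_extract_rules above (made-up values).
sample = {
    "flights": [
        {"airline": "Example Air", "price": "$1,204", "stops": "1"},
        {"airline": "Sample Airways", "price": "$689", "stops": "0"},
        {"airline": "Demo Jet", "price": None, "stops": "2"},
    ]
}
print(cheapest_flight(sample))
```

Entries whose price cannot be parsed are skipped rather than treated as zero, so a missing fare never wins the comparison.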

Interacting with the page before capture

Google Flights often needs a wait or a scroll before all fares load. The js_scenario parameter scripts the headless browser. The browser DSL is documented under the JavaScript scenario reference. A scenario runs for up to 40 seconds total.

# Reuses the client created in the Python section above.
response = client.get(
    "https://www.google.com/travel/flights?q=Flights to London from New York",
    params={
        "render_js": "true",
        "premium_proxy": "true",
        "js_scenario": {
            "instructions": [
                {"wait": 3000},
                {"scroll_y": 1000},
                {"wait": 1000},
            ],
        },
    },
    cookies={"CONSENT": "YES+"},
)
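
Because of the 40-second limit, it can help to check a scenario before sending it. The sketch below sums only the fixed wait steps, so it is a lower bound on runtime rather than an exact measure; build_scenario and MAX_SCENARIO_MS are hypothetical names introduced for illustration.

```python
MAX_SCENARIO_MS = 40_000  # the 40-second scenario limit described above


def build_scenario(instructions: list) -> dict:
    """Wrap instructions in a js_scenario dict, rejecting over-budget waits."""
    # Only {"wait": ms} steps have a known duration; scrolls and clicks are ignored.
    total_wait_ms = sum(step.get("wait", 0) for step in instructions)
    if total_wait_ms > MAX_SCENARIO_MS:
        raise ValueError(
            f"fixed waits total {total_wait_ms} ms, over the {MAX_SCENARIO_MS} ms limit"
        )
    return {"instructions": instructions}


scenario = build_scenario([{"wait": 3000}, {"scroll_y": 1000}, {"wait": 1000}])
print(scenario)
```

The returned dict can be passed as the js_scenario value in the params shown above.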

Capturing a screenshot

To save a visual record of a fare board, request a full-page screenshot:

# Reuses the client created in the Python section above.
response = client.get(
    "https://www.google.com/travel/flights?q=Flights to London from New York",
    params={
        "render_js": "true",
        "premium_proxy": "true",
        "screenshot_full_page": "true",
    },
    cookies={"CONSENT": "YES+"},
)

with open("flights.png", "wb") as f:
    f.write(response.content)
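
Writing response.content straight to disk will also save an error body as flights.png if the request failed. A minimal guard, assuming a successful screenshot response is raw PNG bytes, is to check the standard 8-byte PNG signature first; save_png is a hypothetical helper name.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of every PNG file


def save_png(data: bytes, path: str) -> bool:
    """Write data to path only if it starts with the PNG signature."""
    if not data.startswith(PNG_SIGNATURE):
        return False  # likely an error message, not an image
    with open(path, "wb") as f:
        f.write(data)
    return True
```

In the example above, this would replace the open/write pair with save_png(response.content, "flights.png"), and a False return signals that the response should be inspected instead.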

Key parameters

Every option below maps to a documented HTML API parameter. See the ScrapingBee documentation for the canonical spec.

Parameter	Type	Default	Description
`api_key`	string	required	Your ScrapingBee API key
`url`	string	required	Target Google Flights URL (URL-encode for raw cURL)
`render_js`	bool	true	Execute page JavaScript with a headless browser
`premium_proxy`	bool	false	Use residential proxies for harder targets
`stealth_proxy`	bool	false	Use the stealth tier for the hardest anti-bot sites
`country_code`	string	""	ISO 3166-1 country, requires `premium_proxy=true`
`js_scenario`	JSON	{}	Script clicks, waits, and scrolls before capture
`extract_rules`	JSON	""	CSS or XPath extraction rules
`ai_extract_rules`	JSON	""	Natural-language extraction, adds 5 credits
`wait`	int (ms)	0	Fixed wait before capture
`wait_for`	string	""	CSS or XPath selector to wait for
`screenshot_full_page`	bool	false	Capture a full-page screenshot
`json_response`	bool	false	Wrap the response in a JSON envelope
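
The table's one stated dependency (country_code requires premium_proxy=true) is easy to enforce before a request goes out. The helper below is a hypothetical sketch that encodes only that rule from the table; it does not attempt to cover every interaction between parameters.

```python
def build_params(
    premium_proxy: bool = True,
    country_code: str = "",
    wait_for: str = "",
) -> dict:
    """Assemble HTML API params, enforcing the country_code rule from the table."""
    if country_code and not premium_proxy:
        raise ValueError("country_code requires premium_proxy=true")
    params = {"render_js": "true"}
    if premium_proxy:
        params["premium_proxy"] = "true"
    if country_code:
        params["country_code"] = country_code
    if wait_for:
        params["wait_for"] = wait_for
    return params


print(build_params(country_code="us"))
```

The resulting dict has the same shape as the params argument used in the Python examples in this guide.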

Credit cost

ScrapingBee bills successful requests. A request that fails with HTTP 500 is not charged.
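
Since an HTTP 500 is not charged, retrying it costs time but no credits. The sketch below is one hedged way to do that; fetch_with_retry is a hypothetical helper, and it only assumes that the callable returns an object with a status_code attribute, as the Python SDK response does.

```python
import time
from typing import Callable


def fetch_with_retry(
    fetch: Callable[[], object],
    attempts: int = 3,
    backoff_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
):
    """Call fetch() until it returns a non-500 status or attempts run out."""
    response = None
    for attempt in range(attempts):
        response = fetch()
        if response.status_code != 500:
            return response
        if attempt < attempts - 1:
            sleep(backoff_s * (attempt + 1))  # linear backoff between tries
    return response  # still a 500 after the final attempt
```

A call would look like fetch_with_retry(lambda: client.get(url, params=params)), with client, url, and params defined as in the Python section.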

Scraping a Google URL through the HTML API is billed at a flat rate, and toggling JS rendering does not change it:

Classic or Premium proxy: 20 credits per request.
Stealth proxy: 75 credits per request.
ai_extract_rules or ai_query: adds 5 credits.

So a Google Flights request with AI extraction on the Premium proxy costs 25 credits. Current rate card and plan tiers: ScrapingBee pricing.
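
The arithmetic above can be turned into a quick budget estimate. The credit figures are the ones listed in this section; the function names and the workload (50 routes checked 4 times a day) are made-up illustrations.

```python
BASE_CREDITS = {"classic": 20, "premium": 20, "stealth": 75}
AI_EXTRACTION_CREDITS = 5


def request_cost(proxy: str = "premium", ai_extraction: bool = False) -> int:
    """Credits for one Google request, using the rates listed above."""
    cost = BASE_CREDITS[proxy]
    if ai_extraction:
        cost += AI_EXTRACTION_CREDITS
    return cost


def monthly_credits(
    routes: int,
    checks_per_day: int,
    days: int = 30,
    proxy: str = "premium",
    ai_extraction: bool = False,
) -> int:
    """Total credits for polling every route on a fixed schedule."""
    return routes * checks_per_day * days * request_cost(proxy, ai_extraction)


print(request_cost("premium", ai_extraction=True))      # 25
print(monthly_credits(50, 4, ai_extraction=True))       # 150000
```

Rates can change, so the constants should be checked against the current pricing page before relying on the totals.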

Common use cases

Fare monitoring. Track prices on priority routes on a schedule and alert on drops.
Competitive pricing. Compare your published fares against the Google Flights aggregate.
Route research. Pull airlines, durations, and stop counts for a market study.
Travel dashboards. Feed structured fare data into an internal tool or a customer-facing product.
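
For the fare-monitoring case, the alerting step reduces to comparing two price snapshots. The sketch below assumes prices have already been extracted into dicts keyed by route; the function name, the threshold, and the sample fares are hypothetical.

```python
def price_drops(previous: dict, current: dict, min_drop_pct: float = 5.0) -> list:
    """Return (route, old, new, drop_pct) for routes that fell by min_drop_pct or more."""
    drops = []
    for route, old_price in previous.items():
        new_price = current.get(route)
        if new_price is None or old_price <= 0:
            continue  # route missing from the new snapshot, or unusable baseline
        drop_pct = (old_price - new_price) * 100 / old_price
        if drop_pct >= min_drop_pct:
            drops.append((route, old_price, new_price, round(drop_pct, 1)))
    return drops


# Made-up snapshots from two scheduled runs.
yesterday = {"JFK-LHR": 600.0, "SFO-NRT": 900.0}
today = {"JFK-LHR": 540.0, "SFO-NRT": 890.0}
print(price_drops(yesterday, today))  # [('JFK-LHR', 600.0, 540.0, 10.0)]
```

A percentage threshold keeps small fluctuations on expensive routes from triggering alerts; an absolute threshold would be an equally reasonable design.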

Best practices

Keep render_js=true so fares load before capture.
Set the Google CONSENT cookie to skip the consent interstitial.
Add premium_proxy=true for residential IPs, or stealth_proxy=true for the hardest blocks.
Prefer ai_extract_rules over hardcoded selectors, since Google Flights markup changes often.
Add a short js_scenario wait or scroll when results load slowly.
Scrape public Google Flights pages only. ScrapingBee's terms prohibit scraping behind a login.

FAQ

Is there an official Google Flights API? Google does not publish a public Google Flights API for fare data. A scraping API like ScrapingBee renders the public page and returns the data, which is the practical way to collect it programmatically.

Why not just use requests and BeautifulSoup? Google Flights loads fares with JavaScript and blocks datacenter IPs. A plain request returns an empty shell. You need JavaScript rendering and rotating proxies, which the API handles.

How do I get structured JSON instead of HTML? Use ai_extract_rules for natural-language extraction or extract_rules for CSS and XPath rules. Both return JSON. See the data extraction documentation.

Is scraping Google Flights legal? Public flight data is generally collectible for research, monitoring, and analysis, but local regulations and Google's terms apply. ScrapingBee's terms prohibit scraping content behind a login.

License

MIT
