Price APIs use headless browsers (Playwright/Puppeteer) to render product pages, CSS selectors to locate price elements, and parsing logic to convert raw text into structured JSON. The hard part is handling anti-bot measures, edge cases, and ongoing selector maintenance.
When you call a price API, a multi-step pipeline executes behind the scenes:
1. URL validation: the API checks that your URL belongs to a supported retailer domain and is a valid product page URL (not a search results page, category page, or homepage).
2. Browser rendering: a headless browser (typically Chromium via Playwright or Puppeteer) navigates to the product URL. The browser executes JavaScript, loads dynamic content, and renders the page as a real browser would. This is necessary because most e-commerce sites load prices dynamically via JavaScript — a simple HTTP GET request would return an empty price placeholder.
3. Resource blocking: to speed up page loads, the browser blocks unnecessary resources like images, fonts, CSS stylesheets, and tracking scripts. Only the HTML structure and JavaScript needed to render pricing data are loaded.
4. Element selection: CSS selectors target the price element on the page. Each retailer has different HTML structure, so each requires retailer-specific selectors. Amazon's price might be in a `.a-price .a-offscreen` element while Walmart's is in a `[data-testid="price"]` element.
5. Parsing: the raw text content (e.g., "$29.99" or "EUR 24,99") is converted to a numeric value and currency code. This handles locale-specific formatting — commas as decimal separators in European pricing, yen with no decimal places, etc.
6. Response: the structured data is returned as JSON with price, currency, stock status, and metadata.
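The response in step 6 can be sketched as plain JSON. The source names the fields at a high level (price, currency, stock status, metadata); the exact key names and the timestamp field below are illustrative assumptions, not a documented schema.

```python
import json

# A hypothetical response body -- key names are illustrative,
# not a documented schema.
raw = """{
    "price": 29.99,
    "currency": "USD",
    "in_stock": true,
    "retailer": "amazon",
    "scraped_at": "2024-01-15T10:32:00Z"
}"""

data = json.loads(raw)
print(data["price"], data["currency"])  # 29.99 USD
```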
Modern e-commerce sites are JavaScript applications. When you load an Amazon product page, the initial HTML contains layout scaffolding but no price data. The price is injected by JavaScript after the page loads — often through API calls that the browser makes to Amazon's internal services.
A simple HTTP request with a library like `requests` or `fetch` only gets the initial HTML. No JavaScript executes, so no price appears. Headless browsers solve this by running a full Chromium instance that executes JavaScript, waits for dynamic content to render, and then gives you the fully-rendered DOM.
The trade-off is resource usage. Each headless browser instance consumes 200-500MB of RAM. Running multiple concurrent instances requires significant server resources. This is one reason price APIs are not free — the infrastructure costs are real.
Some simpler retail sites do include prices in their initial HTML (server-side rendered). For those, a direct HTTP request suffices and is much faster. Price APIs typically use the fastest approach available for each retailer.
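One way to pick the fastest approach per retailer is a cheap heuristic on the initial HTML: if a price-like token is already present, the page is probably server-side rendered and a plain GET suffices; otherwise, fall back to a headless browser. This is a rough sketch — the regex and the sample HTML snippets are illustrative assumptions.

```python
import re

# Illustrative heuristic: a currency symbol followed by digits in the
# raw HTML suggests the price was server-side rendered.
PRICE_PATTERN = re.compile(r"[$€£¥]\s?\d[\d.,]*")

def needs_browser(initial_html: str) -> bool:
    # True when no price-like token is present in the static HTML,
    # meaning a headless browser must render the page.
    return PRICE_PATTERN.search(initial_html) is None

ssr_html = '<span class="price">$29.99</span>'      # server-rendered page
spa_html = '<div id="root"></div><script src="app.js"></script>'  # JS app shell

print(needs_browser(ssr_html))  # False
print(needs_browser(spa_html))  # True
```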
Finding the price on a product page requires knowing where to look. Each retailer uses different HTML structure, class names, and data attributes. A scraper for Amazon needs different selectors than a scraper for Walmart.
Selectors are the most fragile part of the pipeline. When a retailer redesigns their product page or changes their CSS class names, selectors break. Amazon changes their price element structure multiple times per year. A professional price API has automated monitoring that detects selector failures quickly and engineering resources to update them.
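A retailer-specific selector registry might look like the sketch below. The Amazon and Walmart selectors come from the examples above; the idea of keeping an ordered fallback list (so a scraper can try an older selector when the primary one breaks) is an assumption about how a production API would hedge against redesigns, and the second selector in each list is hypothetical.

```python
# Ordered selector lists per retailer: the scraper tries each selector
# in turn until one matches. Fallback entries are illustrative.
SELECTORS = {
    "amazon.com": [".a-price .a-offscreen", "#priceblock_ourprice"],
    "walmart.com": ['[data-testid="price"]', '[itemprop="price"]'],
}

def selectors_for(domain: str) -> list:
    try:
        return SELECTORS[domain]
    except KeyError:
        raise ValueError(f"unsupported retailer: {domain}")

print(selectors_for("amazon.com")[0])  # .a-price .a-offscreen
```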
Once the raw price text is extracted, it needs to be parsed into a number. This is harder than it sounds:
- "$29.99" is 29.99 USD - "29,99 EUR" is 29.99 EUR (comma as decimal separator) - "2,499" is 2499.00 (comma as thousands separator in US format) - "2.499,00" is 2499.00 (European format with dot as thousands separator) - "\u00a529,800" is 29800 JPY (yen, no decimal places)
A good price parser handles all of these formats by using the retailer's locale and currency conventions rather than making assumptions about number formatting.
```python
# Simplified price parsing logic
import re
from decimal import Decimal

def parse_price(text: str, locale: str = "en-US") -> Decimal:
    # Remove currency symbols and whitespace
    cleaned = re.sub(r"[^\d.,]", "", text.strip())
    if locale in ("de-DE", "fr-FR", "it-IT"):
        # European: 1.234,56 -> 1234.56
        cleaned = cleaned.replace(".", "").replace(",", ".")
    else:
        # US/UK: 1,234.56 -> 1234.56
        cleaned = cleaned.replace(",", "")
    return Decimal(cleaned)
```

Retailers do not want bots visiting their pages. They deploy several countermeasures:
IP rate limiting: too many requests from the same IP address trigger blocks. Price APIs use pools of rotating proxy IPs to distribute requests across many addresses.
CAPTCHAs: retailers serve CAPTCHAs when they suspect automated access. Some APIs integrate CAPTCHA solving services. Others rely on proxy quality and browser fingerprinting to avoid triggering CAPTCHAs in the first place.
Browser fingerprinting: sites check browser properties like screen size, installed fonts, WebGL rendering, and plugin lists to identify headless browsers. Modern scraping tools randomize these properties to appear as regular browsers.
JavaScript challenges: some sites run JavaScript that tests for browser automation (checking for Playwright/Puppeteer-specific properties). Stealth plugins patch these detectable properties.
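The first countermeasure above, IP rate limiting, is usually answered with proxy rotation. A minimal round-robin sketch follows; real pools also track per-proxy health, ban status, and retailer-specific cooldowns, and the addresses below are placeholders.

```python
import itertools

# Placeholder proxy addresses -- a real pool would hold hundreds of
# residential or datacenter IPs.
PROXIES = [
    "http://10.0.0.1:8080",
    "http://10.0.0.2:8080",
    "http://10.0.0.3:8080",
]

rotation = itertools.cycle(PROXIES)

def next_proxy() -> str:
    # Each call hands out the next IP, spreading requests evenly.
    return next(rotation)

first, second, third, fourth = (next_proxy() for _ in range(4))
print(fourth)  # back to http://10.0.0.1:8080 after one full cycle
```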
This is an arms race. Retailers improve their detection, scraping tools improve their evasion. Price APIs invest continuously in staying ahead of detection methods — that ongoing investment is part of what you pay for.
A production price API processes thousands of concurrent requests. The architecture typically includes:
A browser pool: pre-started Chromium instances that are reused across requests. Starting a new browser per request is too slow. The pool manages allocation, recycling, and crash recovery.
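A browser pool can be modeled with a blocking queue: checkout blocks when every instance is busy, and instances return to the pool after use. This sketch uses stand-in objects instead of real Chromium instances; a production pool would also health-check and recycle crashed browsers, as described above.

```python
import queue
from contextlib import contextmanager

class FakeBrowser:
    """Stand-in for a pre-started Chromium instance."""
    def __init__(self, browser_id: int):
        self.browser_id = browser_id

POOL_SIZE = 3
pool = queue.Queue()
for i in range(POOL_SIZE):
    pool.put(FakeBrowser(i))

@contextmanager
def acquire(timeout: float = 5.0):
    # Blocks (up to `timeout`) when all browsers are checked out --
    # this is the backpressure the article describes.
    browser = pool.get(timeout=timeout)
    try:
        yield browser
    finally:
        pool.put(browser)  # recycle the instance for the next request

with acquire() as browser:
    print(f"rendering with browser {browser.browser_id}")
print(pool.qsize())  # 3 -- the browser was returned after use
```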
A queue system: incoming API requests are queued when all browser instances are busy. This prevents overloading the server and provides backpressure to callers via rate limiting.
Retailer-specific scrapers: each retailer has its own scraper module with custom selectors, parsing logic, and edge case handling. A URL router maps the incoming URL to the correct scraper.
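The URL router can be as simple as a hostname lookup. In this sketch the hostname of the incoming URL selects a scraper; the scraper module names are hypothetical.

```python
from urllib.parse import urlparse

# Hostname -> scraper module name. The module names are hypothetical.
SCRAPERS = {
    "www.amazon.com": "amazon_scraper",
    "www.walmart.com": "walmart_scraper",
}

def route(url: str) -> str:
    host = urlparse(url).netloc.lower()
    scraper = SCRAPERS.get(host)
    if scraper is None:
        # Unsupported retailers are rejected up front (step 1 of the pipeline).
        raise ValueError(f"unsupported retailer: {host}")
    return scraper

print(route("https://www.amazon.com/dp/B08N5WRWNW"))  # amazon_scraper
```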
Monitoring and alerting: automated checks detect when a scraper's success rate drops (indicating broken selectors), when response times spike, or when a retailer starts blocking requests more aggressively.
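Detecting a drop in success rate can be done with a sliding window over recent scrape outcomes. The window size and the 80% alert threshold below are illustrative choices, not values from the article.

```python
from collections import deque

class ScraperMonitor:
    """Sliding-window success-rate check for one retailer's scraper."""
    def __init__(self, window: int = 100, threshold: float = 0.8):
        self.results = deque(maxlen=window)  # True = successful scrape
        self.threshold = threshold

    def record(self, success: bool) -> None:
        self.results.append(success)

    def healthy(self) -> bool:
        if not self.results:
            return True  # no data yet, nothing to alert on
        rate = sum(self.results) / len(self.results)
        return rate >= self.threshold

monitor = ScraperMonitor(window=10)
for ok in [True] * 6 + [False] * 4:  # 60% success over the window
    monitor.record(ok)
print(monitor.healthy())  # False -- below the 80% threshold, page alerting
```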
The entire pipeline from API call to JSON response typically takes 3-8 seconds, with most of that time spent waiting for the browser to render the product page.
Sign up in 30 seconds. No credit card required. One credit per successful API call.