Getting Started · 3 min read · Updated Mar 22, 2026

Price API Rate Limiting Explained

TL;DR

Rate limits protect APIs from abuse and ensure fair usage. Handle 429 responses with exponential backoff. Read X-RateLimit headers to stay under limits proactively. Queue your requests rather than firing them all at once.

What Rate Limiting Is and Why It Exists

Rate limiting restricts how many API requests you can make in a given time window. Price APIs typically enforce two types of limits:

Per-second limits (burst limits): restrict how many requests you can make per second. A typical limit is 5-10 requests per second. This prevents a single client from overwhelming the API's browser pool and degrading service for everyone.

Monthly limits (quota limits): restrict your total requests per billing period. This is tied to your plan — free tiers have lower quotas, paid plans have higher ones. When you hit your monthly quota, requests are rejected until the next billing cycle.

Some APIs also impose per-minute or per-hour limits as intermediate rate tiers.

Rate limiting exists for practical infrastructure reasons. Each price API request spins up a headless browser tab that consumes CPU and memory. Without rate limiting, a single client making 1,000 concurrent requests would starve the system of resources and make the API unusable for everyone else.

Reading Rate Limit Headers

Well-designed APIs include rate limit information in response headers so you can monitor your usage proactively:

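`X-RateLimit-Limit`: the maximum number of requests allowed in the current window.

`X-RateLimit-Remaining`: how many requests you have left in the current window.

`X-RateLimit-Reset`: when the limit resets, expressed either as a Unix timestamp or as a number of seconds from now, depending on the API.

These `X-RateLimit-*` names are a widely used convention rather than a formal standard, so confirm the exact header names and the format of the reset value in your provider's documentation.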

By checking these headers after each response, you can throttle your requests before hitting the limit. This is smoother than waiting for a 429 error and then backing off.

python
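import time

import requests

def fetch_price(url: str, api_key: str, retries_left: int = 3) -> dict:
    response = requests.get(
        "https://api.pricefetch.dev/v1/price",
        params={"url": url},
        headers={"X-API-Key": api_key},
        timeout=30,
    )

    if response.status_code == 429:
        # Rate limited: give up once the retry budget is spent
        if retries_left <= 0:
            response.raise_for_status()
        # Assumes Retry-After is in seconds (it can also be an HTTP date)
        retry_after = int(response.headers.get("Retry-After", 5))
        time.sleep(retry_after)
        return fetch_price(url, api_key, retries_left - 1)

    # Surface other HTTP errors instead of parsing an error body
    response.raise_for_status()

    # Throttle proactively so the next call does not hit the limit.
    # Assumes X-RateLimit-Reset is a Unix timestamp, not a delay in seconds.
    remaining = response.headers.get("X-RateLimit-Remaining")
    if remaining is not None and int(remaining) <= 1:
        reset_at = int(response.headers.get("X-RateLimit-Reset", 0))
        time.sleep(max(reset_at - time.time(), 1))

    return response.json()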
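
Because `X-RateLimit-Reset` may hold either a Unix timestamp or a number of seconds, it can help to normalize it before sleeping. The following is a minimal sketch, not part of any API client: the helper name `seconds_until_reset` and the `EPOCH_THRESHOLD` cutoff are assumptions made for illustration, and the heuristic (large values are timestamps, small values are delays) should be replaced by whatever your provider documents.

```python
import time

# Assumption: values this large are epoch timestamps, smaller ones are
# relative delays. 1_000_000_000 corresponds to a date in 2001.
EPOCH_THRESHOLD = 1_000_000_000


def seconds_until_reset(header_value, now=None):
    """Return a non-negative wait in seconds from an X-RateLimit-Reset value."""
    if header_value is None:
        return 0.0
    try:
        value = float(header_value)
    except ValueError:
        return 0.0  # Unparseable header: do not wait on it
    if now is None:
        now = time.time()
    if value >= EPOCH_THRESHOLD:
        return max(value - now, 0.0)  # Absolute timestamp
    return max(value, 0.0)  # Relative delay in seconds
```

The `now` parameter exists only to make the function easy to test; in normal use it defaults to the current time.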

Handling 429 Errors with Exponential Backoff

When you exceed the rate limit, the API returns a 429 (Too Many Requests) status code. The correct response is to wait and retry — not to immediately retry, which would make the problem worse.

Exponential backoff is the standard pattern: wait 1 second after the first 429, 2 seconds after the second, 4 seconds after the third, and so on. Add a small random jitter to prevent multiple clients from retrying at the exact same moment (the "thundering herd" problem).

Set a maximum retry count (3-5 retries is reasonable) and a maximum backoff duration (30-60 seconds). If you are still getting 429s after max retries, your request volume is fundamentally too high for your rate limit — you need to either reduce volume or upgrade your plan.

Never retry in a tight loop without delays. This is the fastest way to get your API key temporarily suspended.

python
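import random
import time

import requests

def fetch_with_backoff(url: str, api_key: str, max_retries: int = 4) -> dict:
    # One initial attempt plus up to max_retries retries
    for attempt in range(max_retries + 1):
        response = requests.get(
            "https://api.pricefetch.dev/v1/price",
            params={"url": url},
            headers={"X-API-Key": api_key},
            timeout=30,
        )

        if response.status_code != 429:
            response.raise_for_status()  # Surface non-429 HTTP errors
            return response.json()

        if attempt == max_retries:
            break  # Retry budget spent: do not sleep before giving up

        # Exponential backoff with jitter: 1s, 2s, 4s, ... capped at 30s
        base_delay = min(2 ** attempt, 30)
        jitter = random.uniform(0, base_delay * 0.5)
        time.sleep(base_delay + jitter)

    raise RuntimeError("Rate limit exceeded after max retries")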

Best Practices for Staying Under Limits

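Queue your requests. Instead of firing hundreds of requests simultaneously, use a queue with a concurrency limit at or below your per-second rate limit. Primitives like `asyncio.Semaphore` in Python's standard library, or libraries like `p-queue` in Node.js, make this straightforward. Keep in mind that a concurrency cap bounds the number of requests in flight, not the number sent per second, so when responses come back quickly you still need to space your requests.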
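
As a rough illustration of the queueing idea, here is a minimal sketch using `asyncio.Semaphore`. The coroutine `fetch_one` is a hypothetical stand-in for the real API call (it only sleeps briefly), and the cap of 5 is an assumed value, not a documented limit.

```python
import asyncio


async def fetch_one(url: str) -> dict:
    """Hypothetical stand-in for the real API call."""
    await asyncio.sleep(0.01)  # Simulates network latency
    return {"url": url, "price": None}


async def fetch_all(urls, max_concurrent=5, fetch=fetch_one):
    """Fetch every URL with at most max_concurrent requests in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(url):
        async with semaphore:  # Waits here while the cap is reached
            return await fetch(url)

    # gather() returns results in the same order as the input URLs
    return await asyncio.gather(*(guarded(u) for u in urls))


if __name__ == "__main__":
    urls = [f"https://example.com/product/{i}" for i in range(20)]
    results = asyncio.run(fetch_all(urls))
    print(len(results))
```

In real use you would pass your own coroutine as `fetch`, for example one built on an async HTTP client.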

Space your requests. If your rate limit is 5 requests per second, leave at least 200 ms between the start of one request and the next. This is more reliable than sending 5 at once and hoping they are all counted within the same one-second window.
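
One way to enforce that spacing is a small pacer that you call before every request. This is a sketch under assumptions: the class name `Pacer` is made up for illustration, and the `clock` and `sleep` parameters exist only so the behavior can be tested without real waiting.

```python
import time


class Pacer:
    """Enforces a minimum interval between the starts of consecutive calls."""

    def __init__(self, requests_per_second, clock=time.monotonic, sleep=time.sleep):
        self.interval = 1.0 / requests_per_second
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = None  # No request sent yet

    def wait(self):
        """Block until the next request may start."""
        now = self.clock()
        if self.next_allowed is not None and now < self.next_allowed:
            self.sleep(self.next_allowed - now)
            now = self.clock()
        # The following request may start one interval after this one
        self.next_allowed = now + self.interval
```

Create one `Pacer(requests_per_second=5)` and call its `wait()` method immediately before each API call. The first call returns at once, and later calls sleep only as long as needed to keep the 200 ms gap.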

Batch wisely. If you need to check 1,000 product prices, do not fire all 1,000 at once. Process them in batches that respect your rate limit, with delays between batches.
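
A batch loop along those lines might look like the sketch below. The function names, the batch size of 5, and the one-second pause are illustrative assumptions, and `fetch` stands for whatever function performs a single API call.

```python
import time


def batched(items, size):
    """Yield consecutive slices of items, each with at most size elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def process_in_batches(urls, fetch, batch_size=5, pause_seconds=1.0, sleep=time.sleep):
    """Call fetch(url) for every URL, pausing between batches."""
    results = []
    batches = list(batched(urls, batch_size))
    for index, batch in enumerate(batches):
        results.extend(fetch(url) for url in batch)
        if index < len(batches) - 1:
            sleep(pause_seconds)  # No pause needed after the final batch
    return results
```

Requests within a batch run one after another here, so the pause is a conservative upper bound on the request rate rather than a tuned value.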

Cache when appropriate. If you are checking the same product URL multiple times in an hour, cache the first result locally. Prices rarely change minute-to-minute, so a 15-60 minute cache can dramatically reduce your API usage without meaningful accuracy loss.
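
A time-based cache can be sketched in a few lines. The class name `TTLCache` and the 900-second default (15 minutes, the low end of the range above) are assumptions for illustration; `fetch` is any function that takes a URL and returns the price data.

```python
import time


class TTLCache:
    """Caches values per key for a fixed number of seconds."""

    def __init__(self, ttl_seconds=900, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get_or_fetch(self, key, fetch):
        """Return a fresh cached value, calling fetch(key) only on a miss."""
        entry = self._entries.get(key)
        now = self.clock()
        if entry is not None and entry[0] > now:
            return entry[1]  # Still fresh: no API call made
        value = fetch(key)
        self._entries[key] = (now + self.ttl, value)
        return value
```

This sketch never evicts expired entries, which is fine for a bounded set of product URLs but would need pruning for an unbounded one.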

Monitor your usage. Track how many credits you have consumed and how close you are to your limits. The API dashboard typically shows this, but also track it programmatically by reading response headers.
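
Programmatic tracking can be as simple as recording the headers from each response. The sketch below assumes the `X-RateLimit-Limit` and `X-RateLimit-Remaining` headers described earlier; the class name `UsageTracker` and the 10% warning threshold are illustrative choices. Whether those headers describe the per-second window or the monthly quota depends on the API.

```python
class UsageTracker:
    """Tracks the latest rate limit headers seen on responses."""

    def __init__(self, warn_fraction=0.1):
        self.warn_fraction = warn_fraction
        self.limit = None
        self.remaining = None

    def record(self, headers):
        """Update counters from a mapping of response headers."""
        limit = headers.get("X-RateLimit-Limit")
        remaining = headers.get("X-RateLimit-Remaining")
        if limit is not None and remaining is not None:
            self.limit = int(limit)
            self.remaining = int(remaining)

    def is_low(self):
        """True when remaining requests are at or below the warning fraction."""
        if not self.limit or self.remaining is None:
            return False
        return self.remaining <= self.limit * self.warn_fraction
```

Call `record(response.headers)` after each request and check `is_low()` to decide when to slow down or raise an alert.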
