Scraping Paginated Websites: Complete Guide
Pagination is a web design pattern that splits large sets of content across multiple pages. When scraping, handling pagination means automatically navigating through all pages to collect the complete dataset.
Types of Pagination
1. URL-Based Pagination
The simplest type: page numbers or offsets appear directly in the URL.

```
https://example.com/products?page=1
https://example.com/products?page=2
https://example.com/products?offset=0&limit=20
https://example.com/products?offset=20&limit=20
```

```python
import requests
from bs4 import BeautifulSoup

all_products = []
for page in range(1, 50):
    response = requests.get(f"https://example.com/products?page={page}")
    soup = BeautifulSoup(response.text, "lxml")
    products = soup.select(".product-card")
    if not products:
        break  # no more pages
    all_products.extend(products)
```
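The offset/limit URLs above follow the same pattern, except the loop advances by the page size and stops on a short batch. A minimal sketch, where `fetch_batch` is a hypothetical stand-in for the request-and-parse step:

```python
# Offset/limit pagination: advance the offset by the page size each
# iteration and stop when a short (or empty) batch comes back.
def scrape_offsets(fetch_batch, limit=20):
    items = []
    offset = 0
    while True:
        batch = fetch_batch(offset, limit)  # e.g. GET ...?offset={offset}&limit={limit}
        items.extend(batch)
        if len(batch) < limit:  # a short batch means this was the last page
            break
        offset += limit
    return items

# Simulated backend with 45 items: two full pages of 20, then a short page of 5.
DATA = list(range(45))
result = scrape_offsets(lambda off, lim: DATA[off:off + lim])
print(len(result))  # 45
```

Stopping on a short batch (rather than only an empty one) saves one wasted request per run, but only works if the API reliably fills every page except the last.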
2. Next-Button Pagination
Follow "Next" links until there are none. Note that the `href` may be relative, so resolve it against the current page's URL:

```python
import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

url = "https://example.com/products"
while url:
    response = requests.get(url)
    soup = BeautifulSoup(response.text, "lxml")
    # Extract data...
    next_link = soup.select_one("a.next-page")
    # urljoin handles both absolute and relative hrefs
    url = urljoin(url, next_link["href"]) if next_link else None
```
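How relative "Next" hrefs resolve is worth seeing in isolation; `urllib.parse.urljoin` handles path-absolute, query-only, and fully absolute links the same way a browser would:

```python
from urllib.parse import urljoin

current = "https://example.com/products?page=2"

# A path-absolute href resolves against the site root:
print(urljoin(current, "/products?page=3"))  # https://example.com/products?page=3

# A query-only href keeps the current path and swaps the query string:
print(urljoin(current, "?page=3"))           # https://example.com/products?page=3

# An already-absolute href passes through unchanged:
print(urljoin(current, "https://example.com/products?page=3"))
```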
3. Infinite Scroll
Content loads as you scroll down, so it requires JavaScript rendering or API interception.

```python
# Usually these sites use an API endpoint.
# Check the Network tab in DevTools for the XHR/fetch calls.
import requests

page = 1
while True:
    response = requests.get(f"https://api.example.com/products?page={page}")
    data = response.json()
    if not data["results"]:
        break
    process(data["results"])
    page += 1
```
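Not every infinite-scroll API takes a page number; many chain pages with a cursor (a "next token" in each response). A minimal sketch of that variant, where `fetch_page` and the `next_cursor`/`results` field names are assumptions about a hypothetical API:

```python
def scrape_cursor(fetch_page):
    """Follow a cursor-style API until a response carries no next cursor."""
    items, cursor = [], None
    while True:
        payload = fetch_page(cursor)  # e.g. GET /products?cursor={cursor}
        items.extend(payload["results"])
        cursor = payload.get("next_cursor")
        if cursor is None:
            break
    return items

# Simulated API: three pages chained together by cursors.
PAGES = {
    None: {"results": ["a", "b"], "next_cursor": "c1"},
    "c1": {"results": ["c"], "next_cursor": "c2"},
    "c2": {"results": ["d"]},  # no next_cursor: last page
}
print(scrape_cursor(PAGES.__getitem__))  # ['a', 'b', 'c', 'd']
```

The cursor value is opaque to the scraper; the only contract is "pass back what the last response gave you," which is exactly what the Network tab will show the page itself doing.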
4. Load More Button
Similar to infinite scroll, but triggered by a button click. The same API-interception approach works.
Common Pagination Pitfalls
- Missing last pages: always verify you've actually reached the end, not just hit a failed request
- Duplicate data: some sites return the last page repeatedly; check for duplicates
- Rate limiting: paginated scraping means many requests; add delays between them
- Changing data: if the site updates while you scrape, you might miss or duplicate items
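The duplicate and rate-limit pitfalls above can both be handled inside the paging loop. A minimal sketch, where `fetch_page`, the per-request delay, and the `"id"` field used for deduplication are all assumptions:

```python
import time

def scrape_all(fetch_page, delay=1.0, max_pages=1000):
    """Page through results, skipping duplicates and pausing between requests."""
    seen, items = set(), []
    for page in range(1, max_pages + 1):
        batch = fetch_page(page)
        if not batch:
            break
        new = [item for item in batch if item["id"] not in seen]
        if not new:  # page repeated verbatim: we are likely past the real end
            break
        seen.update(item["id"] for item in new)
        items.extend(new)
        time.sleep(delay)  # be polite: throttle between requests
    return items

# Simulated site that serves its last page over and over.
PAGES = [[{"id": 1}, {"id": 2}], [{"id": 3}], [{"id": 3}], [{"id": 3}]]
result = scrape_all(lambda p: PAGES[p - 1] if p <= len(PAGES) else [], delay=0)
print([item["id"] for item in result])  # [1, 2, 3]
```

The `max_pages` cap is a cheap safety net against the "changing data" pitfall too: even if the site keeps producing fresh-looking pages, the run terminates.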