What Is Scrapy? Python Web Crawling Framework Explained

intermediate

Scrapy is an open-source Python framework designed for web crawling and scraping at scale. It provides built-in support for following links, retrying failed requests, managing concurrency, and processing and exporting data through item pipelines and feed exports.

How Scrapy Works

Unlike BeautifulSoup (an HTML parsing library) or Playwright (a browser automation tool), Scrapy is a full framework. It manages the entire scraping workflow: scheduling requests, downloading pages, parsing responses, and storing data.

python
import scrapy


class ProductSpider(scrapy.Spider):
    name = "products"
    start_urls = ["https://example.com/products"]

    def parse(self, response):
        # Each product card on the page becomes one scraped item.
        for product in response.css(".product-card"):
            yield {
                "name": product.css(".title::text").get(),
                "price": product.css(".price::text").get(),
            }

        # Follow the pagination link, if present, and parse it with the same callback.
        next_page = response.css("a.next-page::attr(href)").get()
        if next_page:
            yield response.follow(next_page, self.parse)
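
To make the control flow of the spider above more concrete, here is a toy model of a crawl loop written in plain Python, with no Scrapy involved. It is only a sketch of the idea, not how Scrapy is implemented: `FAKE_SITE`, `parse`, and `crawl` are hypothetical names, and the in-memory dictionary stands in for real HTTP downloads. It shows how a callback's results get routed: dictionaries are collected as items, while new URLs go back into the queue unless they were already seen.

```python
from collections import deque

# Hypothetical in-memory "site": URL -> (product rows, next-page URL or None).
# Page 2 links back to page 1 on purpose, to exercise the duplicate filter.
FAKE_SITE = {
    "/products": ([{"name": "Lamp", "price": "$10"}], "/products?page=2"),
    "/products?page=2": ([{"name": "Desk", "price": "$90"}], "/products"),
}


def parse(url):
    """Stand-in for a spider callback: yields items (dict) and follow-up URLs (str)."""
    products, next_page = FAKE_SITE[url]
    for product in products:
        yield product
    if next_page:
        yield next_page


def crawl(start_url):
    """Toy engine loop: pop a URL, parse it, and route whatever the callback yields."""
    queue = deque([start_url])  # plays the role of the scheduler
    seen = {start_url}          # duplicate filter, so link cycles terminate
    items = []
    while queue:
        url = queue.popleft()
        for result in parse(url):
            if isinstance(result, dict):
                items.append(result)   # items move on to processing and storage
            elif result not in seen:
                seen.add(result)       # unseen URLs are scheduled for later
                queue.append(result)
    return items


items = crawl("/products")
print(items)
```

In a real project the framework owns this loop, which is why the spider only has to describe what to extract and which links to follow.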

Scrapy Architecture

  • Spiders: Define how to crawl and extract data from sites
  • Items: Structure your scraped data (like a schema)
  • Pipelines: Process items after extraction (clean, validate, store)
  • Middlewares: Hook into request/response processing (add proxies, headers)
  • Scheduler: Manages the queue of pending requests and decides which one to crawl next
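
As a rough illustration of the pipeline and middleware components listed above, the sketch below writes both as plain Python classes. The class names, the proxy address, and the price format are assumptions made for the example. The `spider=None` default is also a choice made here so the classes can be called directly; the exact method signature Scrapy expects may vary between releases, so check the documentation for your version.

```python
from types import SimpleNamespace


class PriceCleaningPipeline:
    """Item pipeline sketch: turn a price string such as "$1,299.00" into a float."""

    def process_item(self, item, spider=None):
        raw = item.get("price")
        if raw is not None:
            item["price"] = float(raw.strip().lstrip("$").replace(",", ""))
        return item  # returning the item hands it to the next pipeline


class ProxyHeaderMiddleware:
    """Downloader middleware sketch: set a proxy and a default header per request."""

    PROXY_URL = "http://proxy.example.com:8080"  # hypothetical proxy address

    def process_request(self, request, spider=None):
        request.meta["proxy"] = self.PROXY_URL
        request.headers.setdefault("Accept-Language", "en")
        return None  # None means "continue processing this request normally"


# Quick check without running a crawl; the stub stands in for a scrapy.Request.
item = PriceCleaningPipeline().process_item({"name": "Desk", "price": " $1,299.00 "})
request = SimpleNamespace(meta={}, headers={})
ProxyHeaderMiddleware().process_request(request)
print(item, request.meta, request.headers)
```

Components like these are switched on through the `ITEM_PIPELINES` and `DOWNLOADER_MIDDLEWARES` settings, each of which maps a class path (for example a hypothetical `myproject.pipelines.PriceCleaningPipeline`) to an integer that controls ordering.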

When to Use Scrapy

  • Crawling thousands or millions of pages
  • Following links across an entire site
  • Relying on built-in retry logic and error handling
  • Exporting to multiple formats (JSON, CSV, databases)
  • Running multiple spiders concurrently
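
Several of the points above (concurrency, retries, export formats) are controlled through settings rather than code. The dictionary below is a sketch of values one might put in a spider's `custom_settings` or in a project's `settings.py`. The setting names are standard Scrapy settings, but the numbers and output file names are illustrative assumptions, not recommendations.

```python
# Illustrative values only; tune them for the target site and your hardware.
CUSTOM_SETTINGS = {
    # Concurrency: total parallel requests, and the cap per domain.
    "CONCURRENT_REQUESTS": 32,
    "CONCURRENT_REQUESTS_PER_DOMAIN": 8,
    "DOWNLOAD_DELAY": 0.25,  # seconds between requests to the same site
    # Retries: how often to retry, and which HTTP status codes trigger one.
    "RETRY_ENABLED": True,
    "RETRY_TIMES": 3,
    "RETRY_HTTP_CODES": [429, 500, 502, 503, 504],
    # Feed exports: write the same items to two files (hypothetical names).
    "FEEDS": {
        "products.json": {"format": "json"},
        "products.csv": {"format": "csv"},
    },
}

print(sorted(CUSTOM_SETTINGS))
```

Feed exports cover file formats such as JSON and CSV; writing to a database is typically done in an item pipeline instead.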

When NOT to Use Scrapy

  • Simple one-off scripts (BeautifulSoup is faster to write)
  • JavaScript-heavy sites, unless you combine Scrapy with Playwright via scrapy-playwright
  • Learning web scraping for the first time (steeper learning curve)
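
For the JavaScript case, the scrapy-playwright plugin is wired in through settings plus a per-request flag. The sketch below reflects the setup described in that plugin's README as best I can state it; treat the exact values as something to verify against the version you install. `PLAYWRIGHT_SETTINGS` and `PLAYWRIGHT_META` are hypothetical variable names used only to hold the fragments.

```python
# Settings fragment: route HTTP(S) downloads through the plugin's handler
# and use the asyncio-based Twisted reactor that the plugin relies on.
PLAYWRIGHT_SETTINGS = {
    "DOWNLOAD_HANDLERS": {
        "http": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
        "https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
    },
    "TWISTED_REACTOR": "twisted.internet.asyncioreactor.AsyncioSelectorReactor",
}

# Per-request opt-in: pass this as `meta=` when building a scrapy.Request,
# so only the pages that need a browser pay the rendering cost.
PLAYWRIGHT_META = {"playwright": True}

print(PLAYWRIGHT_SETTINGS["TWISTED_REACTOR"])
```

Requests without the `playwright` flag are meant to be downloaded the usual way, so a single spider can mix rendered and plain pages.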
