
HTTP Requests in Web Scraping: GET, POST, Headers & More

beginner

An HTTP request is a message sent from a client (your scraper) to a web server asking for a resource. In web scraping, you send HTTP requests to fetch web pages, then parse the response HTML to extract data.

How HTTP Requests Work in Scraping

When your browser visits a page, it sends an HTTP GET request. The server responds with the HTML. Your scraper does the same thing, just without rendering the page visually.

python
import requests

# Basic GET request
response = requests.get("https://example.com/products")
print(response.status_code)  # 200
print(response.text)  # HTML content

# With headers (to look like a real browser)
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
    "Accept-Language": "en-US,en;q=0.9",
}
response = requests.get("https://example.com/products", headers=headers)
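
Once the HTML is in hand, the next step is parsing it. The following is a minimal sketch using only Python's standard-library html.parser; the LinkCollector class, the extract_links helper, and the sample markup are all made up for illustration, and sample_html stands in for response.text. Real projects often reach for a dedicated parsing library instead.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag encountered while parsing."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def extract_links(html):
    """Return all link targets found in an HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


# Stand-in for response.text; the markup is invented for this example
sample_html = """
<ul>
  <li><a href="/products/1">Widget</a></li>
  <li><a href="/products/2">Gadget</a></li>
</ul>
"""
print(extract_links(sample_html))
```

In a scraper you would pass response.text to extract_links instead of the hard-coded string.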

GET vs. POST

  • GET: Fetches a page. Parameters go in the URL. Used for most scraping.
  • POST: Sends data to the server. Used for login forms, search forms, and API endpoints.
python
# POST request (e.g., login)
data = {"username": "user", "password": "pass"}
response = requests.post("https://example.com/login", data=data)
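
To make the difference concrete, here is a small sketch that builds (but never sends) one request of each kind with the standard library, so you can see where the data ends up. The /search path and its q and page parameters are assumptions for illustration. The requests library does the equivalent encoding for you when you pass params= to a GET or data= to a POST.

```python
from urllib.parse import urlencode
from urllib.request import Request

# GET: parameters are encoded into the URL's query string
params = {"q": "laptop", "page": 2}  # hypothetical search parameters
get_request = Request("https://example.com/search?" + urlencode(params))

# POST: the same kind of key/value data travels in the request body
form = {"username": "user", "password": "pass"}
post_request = Request(
    "https://example.com/login",
    data=urlencode(form).encode("ascii"),
    method="POST",
)

# Nothing is sent over the network; we only inspect the request objects
print(get_request.get_method(), get_request.full_url)
print(post_request.get_method(), post_request.full_url, post_request.data)
```

Note that the POST URL stays clean while the form fields sit in the body, which is why login credentials are sent with POST rather than GET.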

Important Headers for Scraping

  • User-Agent: Identifies your client. Set this to mimic a real browser.
  • Accept: The content types your client will accept (for example, text/html or application/json).
  • Referer: The page you "came from". Some sites check this.
  • Cookie: Session cookies for authenticated scraping.
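
The headers listed above can be assembled in one place. This is a sketch, not a recipe: build_headers is a hypothetical helper, and the cookie name and value are invented. It returns a plain dict of the kind you would pass as headers= to requests.get.

```python
def build_headers(referer=None, cookies=None, accept="text/html"):
    """Assemble a headers dict suitable for requests.get(..., headers=...)."""
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
        "Accept": accept,
    }
    if referer:
        headers["Referer"] = referer
    if cookies:
        # Cookie header format: name=value pairs separated by "; "
        headers["Cookie"] = "; ".join(
            f"{name}={value}" for name, value in cookies.items()
        )
    return headers


headers = build_headers(
    referer="https://example.com/",
    cookies={"sessionid": "abc123"},  # made-up cookie for illustration
    accept="application/json",
)
print(headers)
```

If you are using requests.Session(), you normally do not need to set Cookie by hand, since the session tracks cookies for you.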

Sessions and Cookies

Use requests.Session() to persist cookies across multiple requests — essential for scraping behind login pages.

python
session = requests.Session()
session.post("https://example.com/login", data={"user": "me", "pass": "secret"})
# Now all requests in this session include the login cookies
response = session.get("https://example.com/dashboard")

Status Codes to Watch For

  • 200: Success
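  • 301/302: Redirect (the requests library follows these automatically by default)
  • 403: Forbidden. The server refused the request, which often means the site is blocking you
  • 429: Too many requests. You're being rate limited
  • 503: Service unavailable. The server is overloaded or is blocking bots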
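
One common way to react to 429 and 503 is to wait and try again with a growing delay. The sketch below assumes that approach. fetch_with_retry and its parameters are hypothetical names, and fetch is any callable that returns an object with a status_code attribute (for example requests.get). It ignores any Retry-After header the server may send, which a more careful version would honor.

```python
import time


def fetch_with_retry(fetch, url, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch(url), retrying with exponential backoff on 429 or 503."""
    response = None
    for attempt in range(max_attempts):
        response = fetch(url)
        if response.status_code not in (429, 503):
            # 200, 403, and everything else is returned to the caller as-is
            return response
        if attempt < max_attempts - 1:
            # Delay doubles each time: base_delay, 2x, 4x, ...
            sleep(base_delay * 2 ** attempt)
    return response  # still rate limited or unavailable after the last attempt


class FakeResponse:
    """Stand-in for a real response so the example runs offline."""

    def __init__(self, status_code):
        self.status_code = status_code


# Simulate a server that rate limits twice, then succeeds
statuses = iter([429, 503, 200])
delays = []
result = fetch_with_retry(
    lambda url: FakeResponse(next(statuses)),
    "https://example.com/products",
    sleep=delays.append,  # record delays instead of actually sleeping
)
print(result.status_code, delays)
```

With real traffic you would call fetch_with_retry(requests.get, url) and leave sleep at its default. A 403 is returned immediately because retrying a blocked request rarely helps.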

