Your headers looked just like Chrome's. The first 200 test requests worked fine. But when you tried to scrape more, you got a 403 Forbidden error. The site's anti-bot system still caught you. Why? Your Python requests library has a different TLS fingerprint than Chrome.
Your headers said you were Chrome, but the underlying HTTPS connection gave you away as a Python script. Sending too many requests triggered this check, and the mismatch got you blocked.
Headers are more than just data. They're like fingerprints. Anti-bot systems use them to spot scrapers. Get one thing wrong, and you'll be blocked and waste time.
This guide shows you which headers matter, how anti-bot systems work, and why headers alone are often not enough. We'll show you header setups that work for different sites and when you might need to use other tools.
For those new to the library, install it with pip install requests to follow along.
Key Takeaways
- Headers are Fingerprints: Anti-bot systems check more than just your User-Agent. They look at header order, letter casing, and something called a TLS fingerprint.
- TLS Fingerprinting is Key: The requests library has a TLS fingerprint that's different from a real browser, which gets you blocked when you send too many requests.
- Volume Triggers Detection: Scraping a few pages might work fine. But when you scrape a lot, sites run deeper checks and block you.
- requests Has Its Limits: requests is good for simple sites, but it's not enough for tougher sites like e-commerce or job boards.
- Production Scraping is Complex: To scrape tough sites, you need to handle header rotation, TLS fingerprints, proxies, and sessions.
- Managed Solutions Simplify Scraping: Tools like ScrapFly take care of all the hard parts for you, so you can just get the data you need.
What Are Headers in Python Requests?
In HTTP, headers are key-value pairs that are sent with every request and response to tell the server what to do. They are a key part of how clients and servers talk to each other. For instance, headers can tell the server about the type of device sending the request, or whether the client wants a JSON response.
Each request starts a conversation between the client (like a browser or your script) and the server, with headers acting as instructions. The most common headers include:
- Content-Type: Shows the media type (e.g., application/json), helping the server understand the format of data you're sending.
- Authorization: Used for sending login details or API tokens to access protected pages.
- User-Agent: Identifies your client application, which helps servers tell real users apart from bots.
- Accept: Tells the server what content types (e.g., JSON, XML) your client can handle, so the server can send back a format you can understand.
- Cookie: Sends stored cookies to keep you logged in or remember your session.
- Cache-Control: Controls caching behavior, like how long to store a copy of a page.
Headers can be easily managed using Python’s requests library. This lets you get headers from a response or set custom headers to customize each request.
Example: Getting Headers with Python Requests
In Python, you can get headers from a response using response.headers.
import requests
response = requests.get('https://httpbin.dev')
print(response.headers)
{
"Access-Control-Allow-Credentials": "true",
"Access-Control-Allow-Origin": "*",
"Content-Security-Policy": "frame-ancestors 'self' *.httpbin.dev; font-src 'self' *.httpbin.dev; default-src 'self' *.httpbin.dev; img-src 'self' *.httpbin.dev https://cdn.scrapfly.io; media-src 'self' *.httpbin.dev; script-src 'self' 'unsafe-inline' 'unsafe-eval' *.httpbin.dev; style-src 'self' 'unsafe-inline' *.httpbin.dev https://unpkg.com; frame-src 'self' *.httpbin.dev; worker-src 'self' *.httpbin.dev; connect-src 'self' *.httpbin.dev",
"Content-Type": "text/html; charset=utf-8",
"Date": "Fri, 25 Oct 2024 14:14:02 GMT",
"Permissions-Policy": "fullscreen=(self), autoplay=*, geolocation=(), camera=()",
"Referrer-Policy": "strict-origin-when-cross-origin",
"Strict-Transport-Security": "max-age=31536000; includeSubDomains; preload",
"X-Content-Type-Options": "nosniff",
"X-Xss-Protection": "1; mode=block",
"Transfer-Encoding": "chunked"
}
The output shows the headers the server sends back, including the media type (Content-Type), security policies (Content-Security-Policy), and allowed origins (Access-Control-Allow-Origin).
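Since response.headers behaves like a dictionary with case-insensitive keys, you can also read individual values directly:
import requests

response = requests.get('https://httpbin.dev')

# Header lookups are case-insensitive
print(response.headers.get('Content-Type'))   # "text/html; charset=utf-8"
print(response.headers.get('content-type'))   # same value, any casing works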
Example: Setting Custom Headers
Custom headers, like adding a User-Agent for device emulation, can make requests appear more authentic:
import requests

headers = {'User-Agent': 'my-app/0.0.1'}
response = requests.get('https://httpbin.dev/headers', headers=headers)
print(response.json())
{
"headers": {
"Accept": ["*/*"],
"Accept-Encoding": ["gzip, deflate"],
"Host": ["httpbin.dev"],
"User-Agent": ["my-app/0.0.1"],
"X-Forwarded-For": ["45.242.24.152"],
"X-Forwarded-Host": ["httpbin.dev"],
"X-Forwarded-Port": ["443"],
"X-Forwarded-Proto": ["https"],
"X-Forwarded-Server": ["traefik-2kvlz"],
"X-Real-Ip": ["45.242.24.152"]
}}
This setup helps your request appear more like it came from a browser, which can prevent you from being blocked.
Why Headers Fail on Real Sites
Even if you copy a browser's headers perfectly, your requests can still get blocked. This is because modern anti-bot systems check for other patterns to tell if you're a real user or a scraper.
The Header Fingerprinting Problem
Anti-bot systems create a "fingerprint" of each request by looking at a mix of things. One of the most common is TLS fingerprinting, which identifies the tool used to send the request.
The Python requests library has a unique TLS fingerprint that is different from any major web browser. When you send a request with headers that say "I'm Chrome," but your TLS handshake says "I'm Python," the mismatch gives you away. For more on this, see our guide to TLS fingerprinting.
How Cloudflare, Datadome, and PerimeterX Detect Scrapers
Big anti-bot services like Cloudflare, Datadome, and PerimeterX use a few tricks to find and block bots:
- Header Order: They expect headers to be in the same order as a real browser.
- Missing Browser Headers: They check for headers like Sec-Fetch-*, which browsers send but scraping scripts often don't.
- TLS Fingerprint Mismatch: As mentioned, the TLS signature is checked against the User-Agent header.
- Behavior: They track how fast you send requests and how you handle cookies to decide if you're a human.
The Volume Threshold Problem
When you're just testing with a few requests, your scraper might work just fine. Most sites don't run deep checks on every single request. However, once you send a lot of requests, you cross a line that triggers more checks.
At that point, your "perfect" headers are checked more closely, the TLS fingerprint is reviewed, and the mismatch gets you blocked. To scrape a lot of pages, you need to match both the headers and the TLS fingerprint, which means using better tools.
For example, a requests call and a real Chrome request might both work in testing. But only the Chrome request will work when you scrape at a large scale because its TLS fingerprint matches its headers.
This is where a service like ScrapFly becomes very helpful. It uses real browsers with matching TLS signatures, allowing you to scrape as much as you want without worrying about being detected.
Are Headers Case-Sensitive?
A common question is whether header names are case-sensitive.
According to the HTTP rules, header names are not case-sensitive. This means Content-Type, content-type, and CONTENT-TYPE are all the same. However, it's a good idea to stick to the standard capitalization (like Content-Type).
Why Case Sensitivity Matters for Bot Detection
When servers inspect requests, small details like unusual letter casing in header names can give you away. Real browsers use consistent casing, and while requests handles this for you, some anti-bot systems may still treat odd casing as a clue in their bot detection. It's one of many small "tells" that can get you flagged as a bot.
In practice, the requests library handles header-name casing for you. Header values, however, can be case-sensitive (an API token in an Authorization header, for example, must be sent exactly as issued), so make sure those are correct.
Example of Case-Insensitive Headers
In requests, you can set headers in any case, and it will work correctly:
import requests
# Setting 'content-type' in lowercase
headers = {'content-type': 'application/json'}
response = requests.post('https://httpbin.dev/api', headers=headers)
print(response.request.headers)
{
"Content-Type": "application/json",
"User-Agent": "python-requests/2.28.1",
"Accept-Encoding": "gzip, deflate",
"Accept": "*/*",
"Connection": "keep-alive"}
As shown above, requests automatically converted content-type to the standard Content-Type. This demonstrates that Python’s requests library will normalize header names for you, maintaining compatibility with web servers regardless of the case used in the original code.
Does Header Order Matter?
In most standard API interactions, the order of the headers you send with Python requests does not affect functionality, because the HTTP specification does not require headers to appear in any particular order. However, when dealing with advanced anti-bot and anti-scraping systems, header order can play an unexpectedly significant role in determining whether a request is accepted or blocked.
Why Header Order Matters for Bot Detection
Anti-bot systems like Cloudflare, DataDome, and PerimeterX check the exact order of your headers. Browsers send headers in a consistent order. Tools like requests often use a different order.
This difference is a big red flag. Trying to set the header order yourself is a pain and breaks easily. Anti-bot companies always change their rules, so you would need to figure out the new header order every time your scraper breaks.
This annoying work is why many developers use a service like ScrapFly, which automatically handles header order for you.
Example: Browser Headers vs. Python Requests Headers
A browser might send headers in this order:
{
"User-Agent": "Mozilla/5.0...",
"Accept": "text/html,application/xhtml+xml,application/xml;q=0.9",
"Accept-Language": "en-US,en;q=0.5",
"Accept-Encoding": "gzip, deflate, br",
"Referer": "https://example.com",
"Connection": "keep-alive"
}
With Python’s requests library, headers might look slightly different:
import requests
headers = {
"Accept": "application/json",
"User-Agent": "my-scraper/1.0",
"Connection": "keep-alive",
"Referer": "https://httpbin.dev"
}
response = requests.get("https://httpbin.dev/headers", headers=headers)
print(response.json()) # This output may vary based on server handling
{
"headers": {
"Accept": "application/json",
"User-Agent": "my-scraper/1.0",
"Connection": "keep-alive",
"Referer": "https://httpbin.dev"
}
}
This slight difference in header ordering can hint to anti-bot systems that the request might be automated, especially if combined with other signals, such as the User-Agent format or missing headers.
By analyzing this order, advanced detection systems can identify patterns often associated with automated scripts or bots. When a request does not match the usual order, the server may assume it’s coming from a bot, potentially resulting in blocked requests or captcha challenges.
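If you want to experiment with header order yourself, one rough approach is to replace a requests Session's default headers with your own ordered set, so only those headers are sent, in the order you define them. This is only a sketch: requests does not guarantee the on-the-wire order, and anti-bot rules shift constantly, so treat it as best-effort rather than a dependable fix.
import requests
from collections import OrderedDict

session = requests.Session()
# Replace the default header set entirely so only these headers are sent,
# roughly in the order a browser would send them
session.headers = OrderedDict([
    ("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) ..."),
    ("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9"),
    ("Accept-Language", "en-US,en;q=0.5"),
    ("Accept-Encoding", "gzip, deflate, br"),
    ("Referer", "https://example.com"),
    ("Connection", "keep-alive"),
])

response = session.get("https://httpbin.dev/headers")
print(response.json())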
Standard Headers in Python Requests
To make your requests look like they come from a browser, it's helpful to know which headers are standard.
Key Standard Headers
- User-Agent: Identifies your browser and OS.
- Accept: Tells the server what content types you can handle.
- Accept-Language: Your preferred language (e.g., en-US).
- Accept-Encoding: Compression methods you accept (e.g., gzip).
- Referer: The URL of the page you came from.
- Connection: Usually set to keep-alive.
Verifying Browser Headers
To make sure your headers look real:
- Browser Developer Tools: Use the Network tab in your browser's developer tools to see the headers for any request. You can copy these for your scraper.
- Proxy Tools: Tools like Fiddler let you see and edit HTTP headers.
Example: Mimicking Headers in Python
import requests
headers = {
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) ...',
'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,...',
'Accept-Language': 'en-US,en;q=0.5',
'Accept-Encoding': 'gzip, deflate, br',
'Referer': 'https://httpbin.dev',
'Connection': 'keep-alive'
}
response = requests.get('https://httpbin.dev', headers=headers)
print(response.status_code)
Using headers like these makes your request look more like it came from a real user.
Importance of the User-Agent String
The User-Agent is very important. It tells the server what browser, OS, and device you are using so it can send back content that works for you.
You can learn more in our guide, How to Effectively Use User Agents for Web Scraping, which covers what the User-Agent header is, how to use it in web scraping, and how to generate and rotate user agents to avoid blocking.
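As a quick illustration of the rotation idea, a minimal sketch might pick a random User-Agent from a small pool for each request (the strings below are placeholder examples, not guaranteed to be current):
import random
import requests

# A small pool of example User-Agent strings; in real use keep a larger, up-to-date list
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
]

headers = {"User-Agent": random.choice(USER_AGENTS)}
response = requests.get("https://httpbin.dev/headers", headers=headers)
print(response.json())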
Where Headers Actually Work
Even though anti-bot systems are getting smarter, requests and a simple set of headers can still work in some cases: on sites without strong bot detection, or when you don't send too many requests.
Scenarios Where Requests + Headers Succeed
- Simple News or Blog Sites: Many simple sites don't have strong anti-bot tools. A basic User-Agent and Accept header is often enough.
- Internal APIs: Private APIs often just need a simple Authorization header with a key and don't use advanced bot detection (see the example below).
- Small Personal Projects: If you're just scraping a few pages for fun, you probably won't get noticed. It's okay if you get blocked sometimes.
These simple cases work because they don't trigger the deep checks that happen on big, protected sites. But as soon as you try to scrape e-commerce sites or job boards, headers are not enough.
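For the internal-API case, the request usually just needs the token in an Authorization header. A minimal sketch, using a hypothetical endpoint and placeholder token:
import requests

# Hypothetical endpoint and token, shown for illustration only
API_URL = "https://httpbin.dev/headers"
API_TOKEN = "YOUR_API_TOKEN"

headers = {
    "Authorization": f"Bearer {API_TOKEN}",
    "Accept": "application/json",
}
response = requests.get(API_URL, headers=headers)
print(response.status_code)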
Headers for POST Requests
For POST requests, headers are very important because they tell the server about the data you are sending.
Key Headers for POST Requests
- Content-Type: Indicates the data format, such as application/json for JSON data, application/x-www-form-urlencoded for form submissions, or multipart/form-data for files. Setting this correctly ensures the server parses your data as expected.
- User-Agent: Identifies the client application, which helps with API access and rate limit policies.
- Authorization: Needed for secure endpoints to authenticate requests, often using tokens or credentials.
- Accept: Specifies the desired response format (e.g., application/json), aiding in consistent data handling and error processing.
Example Usage of Headers for POST Requests
To send JSON data, set the Content-Type to application/json:
import requests
headers = {
'Content-Type': 'application/json',
'User-Agent': 'my-app/0.0.1'
}
data = '{"key": "value"}'
response = requests.post('https://httpbin.dev/api', headers=headers, json=data)
print(response.status_code)
print(response.json())
This helps the server process your data correctly.
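The other Content-Type values mentioned above work the same way; in fact, requests sets them for you when you pass data= with a dict (form-encoded) or files= (multipart). A rough sketch against the same example endpoint:
import requests

# Form submission: requests sends application/x-www-form-urlencoded automatically
form_response = requests.post('https://httpbin.dev/api', data={'key': 'value'})

# File upload: requests builds the multipart/form-data body and boundary for you
files = {'report': ('report.txt', b'example file contents')}
file_response = requests.post('https://httpbin.dev/api', files=files)

print(form_response.status_code, file_response.status_code)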
Browser-Specific Headers
Some sites check for headers that only real browsers send. Adding these can make your scraper look more human.
Common Browser-Specific Headers
- DNT (Do Not Track): Tells the server you don't want to be tracked (1).
- Sec-Fetch-Site: Shows where the request is coming from (same-origin, cross-site, none).
- Sec-Fetch-Mode: Shows the type of request (navigate for loading a page).
- Sec-Fetch-Dest: Shows what kind of content is expected (document, image).
Example of Browser-Specific Headers in Python:
import requests
headers = {
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.81 Safari/537.36',
'DNT': '1', # Respects user preference for tracking
'Sec-Fetch-Site': 'none', # Represents a top-level navigation action
'Sec-Fetch-Mode': 'navigate', # Simulates a full-page load
'Sec-Fetch-Dest': 'document' # Indicates the content type expected
}
response = requests.get('https://httpbin.dev', headers=headers)
print(response.status_code)
200
print(response.headers)
{
"Access-Control-Allow-Credentials": "true",
"Access-Control-Allow-Origin": "*",
"Content-Security-Policy": "frame-ancestors 'self' *.httpbin.dev; font-src 'self' *.httpbin.dev; default-src 'self' *.httpbin.dev; img-src 'self' *.httpbin.dev https://cdn.scrapfly.io; media-src 'self' *.httpbin.dev; script-src 'self' 'unsafe-inline' 'unsafe-eval' *.httpbin.dev; style-src 'self' 'unsafe-inline' *.httpbin.dev https://unpkg.com; frame-src 'self' *.httpbin.dev; worker-src 'self' *.httpbin.dev; connect-src 'self' *.httpbin.dev",
"Content-Type": "text/html; charset=utf-8",
"Date": "Sun, 27 Oct 2024 11:48:47 GMT",
"Permissions-Policy": "fullscreen=(self), autoplay=*, geolocation=(), camera=()",
"Referrer-Policy": "strict-origin-when-cross-origin",
"Strict-Transport-Security": "max-age=31536000; includeSubDomains; preload",
"X-Content-Type-Options": "nosniff",
"X-Xss-Protection": "1; mode=block",
"Transfer-Encoding": "chunked"
}
By including these headers, you can make your request appear closer to those typically sent by browsers, reducing the likelihood of being flagged as a bot or encountering access restrictions.
Why Use Browser-Specific Headers?
- Avoid Bot Detection: They make your requests look like real user traffic.
- Better Compatibility: Some sites give different content to bots.
- More Successful Requests: They can reduce your chances of being blocked.
Production Scraping Complexity
The truth is that for production web scraping, managing headers is just one part of the problem. To scrape tough sites at a large scale, you need to build and manage a lot of complex code, which is more than just rotating headers.
What You'd Need to Build Yourself
- Header Rotation: You need a large list of real browser headers. You have to rotate them in a smart way, making sure a Chrome User-Agent is sent with other Chrome headers. See our guide to User-Agent rotation.
- TLS Fingerprint Management: The requests library's TLS fingerprint is a big giveaway. You'd need to use other tools like httpx or curl_cffi to copy a browser's TLS signature (see the sketch after this list). This makes things more complex and can be slower.
- Proxy Coordination: Your headers should be matched with your proxies. For example, a German language header should be sent from a German proxy. Matching them is key to not getting blocked. Learn more in our guide to proxy rotation.
- Session Management: You need to manage cookies to keep a session for logins or rotate them when scraping multiple pages at once.
- Monitoring and Adapting: Anti-bot rules change all the time. You would need to build a system to track how often you get blocked, then figure out the new rules and change your code.
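For the TLS part specifically, one common do-it-yourself route is curl_cffi's browser impersonation. A minimal sketch, assuming a recent curl_cffi version where the "chrome" impersonation target is available:
from curl_cffi import requests as curl_requests

# Impersonate Chrome's TLS fingerprint so the handshake matches Chrome-like headers
response = curl_requests.get(
    "https://httpbin.dev/headers",
    impersonate="chrome",  # exact target names depend on your curl_cffi version
)
print(response.status_code)
print(response.json())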
Building and managing all this is a lot of work. This is why many teams use a service to handle the hard parts. With ScrapFly, our team handles all of this so you can focus on getting data, not on getting blocked.
Power-up with ScrapFly
The last section showed how hard it is to scrape at a large scale. The endless cycle of building, fixing, and re-building is why many teams switch to ScrapFly.
ScrapFly provides web scraping, screenshot, and extraction APIs for data collection at scale.
- Anti-bot protection bypass - scrape web pages without blocking!
- Rotating residential proxies - prevent IP address and geographic blocks.
- JavaScript rendering - scrape dynamic web pages through cloud browsers.
- Full browser automation - control browsers to scroll, input and click on objects.
- Format conversion - scrape as HTML, JSON, Text, or Markdown.
- Python and Typescript SDKs, as well as Scrapy and no-code tool integrations.
DIY Approach:
# DIY: header rotation, proxy management, TLS fingerprinting, failure handling...
def get_headers():
# code to get a random, real header
return headers
def get_proxy():
# code to get a rotating proxy
return proxy
def make_request_with_curl_cffi(url):
# complex setup with curl_cffi to copy a browser
...
try:
# manage headers, proxies, and TLS
make_request_with_curl_cffi("https://example.com")
except:
# handle blocks, retries, etc.
...
ScrapFly API:
from scrapfly import ScrapflyClient, ScrapeConfig
scrapfly = ScrapflyClient(key="YOUR_API_KEY")
result = scrapfly.scrape(ScrapeConfig(
url="https://example.com",
asp=True, # turn on anti-scraping protection
))
It's much simpler and more powerful.
FAQs
Why do my headers work in testing but fail in production?
This is usually because you're sending too many requests. In testing, your low number of requests doesn't trigger extra anti-bot checks. In production, more requests trigger TLS fingerprinting. Even with perfect headers, a different TLS fingerprint will get you blocked. ScrapFly solves this by using real browser TLS profiles.
How often do I need to update my header patterns?
For sites with strong protection like Cloudflare or Datadome, you may need to update your headers every few weeks or even days. These systems change their rules all the time. This is a lot of work to maintain, and ScrapFly handles it for you.
When should I use browser automation instead of requests?
Use browser automation tools like Playwright or Selenium when you need to load JavaScript, handle clicks, or when you're blocked by TLS fingerprinting. While these tools solve the TLS problem, they are slower and use more computer resources. ScrapFly offers both simple requests and full browser automation.
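If you do switch to browser automation, a minimal Playwright sketch looks roughly like this (assuming playwright is installed along with its browser binaries via playwright install):
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    # A real browser brings a matching TLS fingerprint and executes JavaScript
    browser = p.chromium.launch(headless=True)
    page = browser.new_page()
    page.goto("https://httpbin.dev")
    html = page.content()
    browser.close()

print(html[:200])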
Summary
Headers are a basic part of web scraping, but they are also the main way anti-bot systems find and block scrapers. As we've seen, just copying a browser's headers is not enough. Things like TLS mismatches and wrong header order make scraping with requests a pain.
For simple sites, requests and a few headers can work. But for tough sites, you need a lot more. This means rotating headers and proxies, managing TLS fingerprints, and always updating your code.
You have two choices:
- Option A: Build and manage all this complex stuff yourself.
- Option B: Use ScrapFly and just focus on getting the data you need.
The choice depends on how you want to spend your time.