杰瑞科技汇

How Do You Get Headers in Python?

Getting HTTP headers in Python is a common task, and there are several ways to do it depending on your needs. The most popular option is the third-party requests library, but Python's built-in modules can do it too.

Here’s a breakdown from the most common method to more advanced scenarios.


The Most Common Method: Using the requests Library

The requests library is the de facto standard for making HTTP requests in Python. It's powerful, user-friendly, and the recommended approach for most use cases.

Step 1: Install requests

If you don't have it installed, open your terminal or command prompt and run:

pip install requests

Step 2: Get Headers from a Response

When you make a request to a URL, the server sends back a response, which requests wraps in a Response object. This object contains the headers, status code, and the content (like HTML or JSON).

Here's how to get the headers sent by the server in its response:

import requests
# The URL you want to inspect
url = 'https://www.google.com'
try:
    # Make a GET request to the URL (the timeout keeps the script from hanging indefinitely)
    response = requests.get(url, timeout=10)
    # The response object has a .headers attribute, which is a dictionary-like object
    server_headers = response.headers
    print("--- Server Response Headers ---")
    # You can print the entire headers object
    print(server_headers)
    print("\n--- Accessing a Specific Header ---")
    # You can access a specific header by its key (case-insensitive)
    # Common headers include 'Content-Type', 'Server', 'Content-Length'
    content_type = server_headers.get('Content-Type')
    server_name = server_headers.get('Server')
    print(f"Content-Type: {content_type}")
    print(f"Server: {server_name}")
    # If a header doesn't exist, .get() returns None, which is safe
    print(f"X-Custom-Header: {server_headers.get('X-Custom-Header')}")
except requests.exceptions.RequestException as e:
    print(f"An error occurred: {e}")

Example Output (formatted for readability; actual values will vary):

--- Server Response Headers ---
{
    'Date': 'Sat, 27 Sep 2025 10:30:00 GMT',
    'Expires': '-1',
    'Cache-Control': 'private, max-age=0',
    'Content-Type': 'text/html; charset=ISO-8859-1',
    'Content-Security-Policy-Report-Only': "object-src 'none';base-uri 'self';script-src 'none'",
    'Server': 'gws',
    'X-XSS-Protection': '0',
    'X-Frame-Options': 'SAMEORIGIN',
    'Set-Cookie': '...',  # Cookies are also part of the headers; requests joins repeated values into one string
    'Alt-Svc': 'h3=":443"; ma=2592000,h3-Q050=":443"; ma=2592000,h3-Q046=":443"; ma=2592000,h3-Q043=":443"; ma=2592000'
}
--- Accessing a Specific Header ---
Content-Type: text/html; charset=ISO-8859-1
Server: gws
X-Custom-Header: None
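
The case-insensitive lookup used in the example above comes from the type of response.headers, which in requests is a requests.structures.CaseInsensitiveDict. Here is a minimal offline sketch of that behavior; the header values are made up for illustration and no request is sent:

```python
from requests.structures import CaseInsensitiveDict

# Hypothetical header values, standing in for a real response.headers
headers = CaseInsensitiveDict({
    "Content-Type": "text/html; charset=UTF-8",
    "Server": "example-server",
})

# Lookups ignore the case of the header name
content_type = headers.get("content-type")
has_server = "SERVER" in headers

# .get() accepts a default for headers that are absent
missing = headers.get("X-Custom-Header", "not set")

# Iteration keeps the original spelling of each name
names = list(headers)

print(content_type, has_server, missing, names)
```

Because the names keep their original spelling when you iterate, printing the whole object still shows headers the way the server sent them.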

Getting the Headers You Sent (Request Headers)

Sometimes you want to see the headers that your Python script is sending to the server. One way is to pass a headers dictionary to your requests call and send it to a service that echoes back whatever headers it receives.

import requests
# Define custom headers you want to send
custom_headers = {
    'User-Agent': 'MyCoolWebScraper/1.0',
    'Accept-Language': 'en-US,en;q=0.9',
    'Accept-Encoding': 'gzip, deflate, br',  # 'br' responses can only be decoded if a Brotli package is installed
    'X-My-Custom-Header': 'This is a custom value'
}
url = 'https://httpbin.org/headers' # This great site echoes back the headers it receives
try:
    # Pass the headers dictionary to the get() method (with a timeout so the call cannot hang)
    response = requests.get(url, headers=custom_headers, timeout=10)
    # The response from httpbin.org will be a JSON object
    # containing the headers we sent.
    print("--- Headers Sent by Our Request ---")
    print(response.json())
except requests.exceptions.RequestException as e:
    print(f"An error occurred: {e}")

Example Output (formatted for readability):

--- Headers Sent by Our Request ---
{
  "headers": {
    "Accept": "*/*",
    "Accept-Encoding": "gzip, deflate, br",
    "Accept-Language": "en-US,en;q=0.9",
    "Host": "httpbin.org",
    "User-Agent": "MyCoolWebScraper/1.0",
    "X-My-Custom-Header": "This is a custom value",
    "X-Amzn-Trace-Id": "Root=..."
  }
}

Notice how User-Agent and X-My-Custom-Header are present, showing that our custom headers were successfully sent.
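
An echo service is not the only way to see outgoing headers. As a rough sketch, requests can also build a request without sending it, which lets you inspect the merged headers locally. The URL below is only used to construct the request and is never contacted:

```python
import requests

# Illustrative custom headers, as in the example above
custom_headers = {
    "User-Agent": "MyCoolWebScraper/1.0",
    "X-My-Custom-Header": "This is a custom value",
}

with requests.Session() as session:
    request = requests.Request(
        "GET", "https://httpbin.org/headers", headers=custom_headers
    )
    # prepare_request() merges the session's default headers with ours;
    # nothing is sent over the network at this point
    prepared = session.prepare_request(request)

outgoing = dict(prepared.headers)
for name, value in outgoing.items():
    print(f"{name}: {value}")
```

After a request has actually been sent, the same information is available as response.request.headers. Headers added at a lower level, such as Host, will not appear in either place.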


Using Python's Built-in urllib Module

If you can't or don't want to use external libraries, Python's standard library has urllib which can also get headers. It's a bit more verbose.

from urllib.request import Request, urlopen
from urllib.error import URLError, HTTPError
url = 'https://www.python.org'
try:
    # Create a Request object. You can add headers here.
    request = Request(url, headers={'User-Agent': 'Mozilla/5.0'})
    # Open the request and get the response (the timeout is in seconds)
    with urlopen(request, timeout=10) as response:
        # The response object has a .getheader() method and a .headers attribute
        server_headers = response.headers
        print("--- Server Response Headers (using urllib) ---")
        # You can print all headers
        print(server_headers)
        print("\n--- Accessing a Specific Header ---")
        # Get a specific header
        content_type = response.getheader('Content-Type')
        print(f"Content-Type: {content_type}")
        # Or access it like a dictionary
        server_name = server_headers.get('Server')
        print(f"Server: {server_name}")
except HTTPError as e:
    print(f"Server couldn't fulfill the request. Error code: {e.code}")
except URLError as e:
    print(f"Failed to reach the server. Reason: {e.reason}")
except Exception as e:
    print(f"An error occurred: {e}")
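
If you only need the headers, urllib can also send a HEAD request, and the headers object offers get_all() for headers that appear more than once. The sketch below is one way to try this without touching the internet: it starts a throwaway local server (DemoHandler is a hypothetical name invented for this example) and queries it.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import ProxyHandler, Request, build_opener


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical local handler that answers HEAD requests."""

    def do_HEAD(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        # Two headers with the same name
        self.send_header("Set-Cookie", "a=1")
        self.send_header("Set-Cookie", "b=2")
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


# Port 0 asks the operating system for any free port
server = HTTPServer(("127.0.0.1", 0), DemoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_address[1]}/"

# An empty ProxyHandler ignores proxy settings so the request stays local
opener = build_opener(ProxyHandler({}))

try:
    # method="HEAD" asks for the headers without a response body
    with opener.open(Request(url, method="HEAD"), timeout=5) as response:
        content_type = response.headers.get("content-type")
        cookies = response.headers.get_all("Set-Cookie")
        body = response.read()
finally:
    server.shutdown()
    server.server_close()

print(content_type)
print(cookies)
print(len(body))
```

Against a real site you would skip the local server and pass the site's URL to Request() instead.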

Getting Headers from an Existing http.client Response

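For lower-level networking you can use http.client directly; it is the standard-library module that urllib is built on. Third-party clients such as aiohttp and httpx define their own response classes rather than returning http.client objects, but they expose headers in a similar way.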

The principle is the same: the response object will have a headers attribute.

import http.client
# For HTTPS (the timeout is in seconds)
conn = http.client.HTTPSConnection("www.python.org", timeout=10)
try:
    conn.request("GET", "/")
    # Get the response object
    response = conn.getresponse()
    print(f"Status: {response.status} {response.reason}")
    print("\n--- Server Response Headers (from http.client) ---")
    # response.headers is an http.client.HTTPMessage object;
    # getheaders() returns its contents as a list of (name, value) tuples
    for header, value in response.getheaders():
        print(f"{header}: {value}")
    print("\n--- Accessing a Specific Header ---")
    print(f"Server: {response.getheader('Server')}")
finally:
    # It's important to close the connection
    conn.close()
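
Sending your own headers with http.client works by passing a headers dictionary to request(). The following sketch checks this against a small local echo server rather than a real site; EchoHeadersHandler is a hypothetical helper written only for this example.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHeadersHandler(BaseHTTPRequestHandler):
    """Hypothetical local handler that echoes request headers as JSON."""

    def do_GET(self):
        payload = json.dumps(dict(self.headers.items())).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


# Port 0 asks the operating system for any free port
server = HTTPServer(("127.0.0.1", 0), EchoHeadersHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
host, port = server.server_address

conn = http.client.HTTPConnection(host, port, timeout=5)
try:
    # Custom headers go in the headers argument of request()
    conn.request("GET", "/", headers={
        "User-Agent": "MyCoolWebScraper/1.0",
        "X-My-Custom-Header": "This is a custom value",
    })
    response = conn.getresponse()
    response_type = response.getheader("Content-Type")
    echoed = json.loads(response.read())
finally:
    conn.close()
    server.shutdown()
    server.server_close()

print(response_type)
print(echoed)
```

The echoed dictionary also contains a Host header, which http.client adds on its own.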

Summary

| Method | When to Use | How to Get Headers | How to Send Headers |
|---|---|---|---|
| requests | Recommended for almost all use cases. Simple and powerful. | response.headers | requests.get(url, headers={...}) |
| urllib | When you can't use external libraries. It's built-in. | response.headers or response.getheader() | Pass a headers dict to the Request() object. |
| http.client | For lower-level control or when integrating with other network libraries. | response.getheaders() or response.getheader() | Pass a headers dict to the request() method. |

For most Python users, the requests library is the best choice. It simplifies the process and handles details such as case-insensitive header lookup for you.
