
How do you set a proxy with Python's urlopen?

Using a proxy with Python's urllib.request.urlopen is a common task, and the standard library provides a clean, built-in way to do it.


Here’s a comprehensive guide covering the basic method, advanced configurations, and important considerations.

The Core Concept: ProxyHandler

The key to using a proxy with urllib is the ProxyHandler class. You create a handler with your proxy URL(s) and install it into an "opener." This opener is then used to open URLs, which will automatically route your request through the specified proxy.


Basic Example: HTTP Proxy

This is the most straightforward case. Let's assume you have an HTTP proxy server.

Proxy Details:

  • Host: 123.123.123.123
  • Port: 8080

Code

import urllib.request
import urllib.error
# Define your proxy details
proxy_host = '123.123.123.123'
proxy_port = '8080'
proxy_address = f'http://{proxy_host}:{proxy_port}'
# The URL you want to visit
url_to_visit = 'http://httpbin.org/ip' # This site shows your public IP address
try:
    # 1. Create a proxy handler
    # This tells urllib to use the proxy for all HTTP and HTTPS connections.
    proxy_handler = urllib.request.ProxyHandler({
        'http': proxy_address,
        'https': proxy_address  # Many proxies handle both HTTP and HTTPS
    })
    # 2. Build an opener with the proxy handler
    opener = urllib.request.build_opener(proxy_handler)
    # 3. Install the opener (optional but recommended)
    # This makes urllib.request.urlopen() use your opener by default.
    urllib.request.install_opener(opener)
    # 4. Open the URL using the standard urlopen function
    # It will now automatically use the proxy we configured.
    print(f"Requesting {url_to_visit} via proxy {proxy_address}...")
    with urllib.request.urlopen(url_to_visit, timeout=10) as response:
        # Read and decode the response content
        response_data = response.read().decode('utf-8')
        print("\n--- Response from Server ---")
        print(response_data)
        print("---------------------------\n")
except urllib.error.URLError as e:
    print(f"Error: Failed to reach the server. Reason: {e.reason}")
    print("This could be due to an incorrect proxy, proxy being down, or network issues.")
except Exception as e:
    print(f"An unexpected error occurred: {e}")

Explanation

  1. ProxyHandler({'http': ..., 'https': ...}): We create a dictionary mapping the protocol (http, https) to the proxy's URL. If you only need a proxy for http, you can omit the https key.
  2. build_opener(proxy_handler): This function creates an "opener," which is an object capable of opening URLs. We pass our ProxyHandler to it to tell the opener how to handle proxies.
  3. install_opener(opener): This is a convenience function. It installs your custom opener so that any subsequent calls to urllib.request.urlopen() will automatically use it. If you don't want to change the global behavior, you can skip this step and use opener.open(url) directly, as sketched below.
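If you skip install_opener, here is a minimal sketch of calling the opener directly (the proxy address is a placeholder); urlopen() elsewhere in your program keeps its default, proxy-free behavior:

import urllib.request

# Placeholder proxy address - replace with your own
proxy_address = 'http://123.123.123.123:8080'
proxy_handler = urllib.request.ProxyHandler({
    'http': proxy_address,
    'https': proxy_address
})
opener = urllib.request.build_opener(proxy_handler)

# opener.open() routes through the proxy; the global urlopen() is untouched
with opener.open('http://httpbin.org/ip', timeout=10) as response:
    print(response.read().decode('utf-8'))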

Advanced Configurations

a) SOCKS Proxy

The standard urllib library does not support SOCKS proxies directly. For SOCKS, you need a third-party library like requests (with requests[socks]) or urllib3 with a SOCKS proxy handler.

However, if you must use urllib, you either have to establish the SOCKS connection and tunnel your HTTP request through it manually, which is complex, or lean on a third-party SOCKS library.
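One practical workaround, sketched below on the assumption that the third-party PySocks package is installed (pip install PySocks) and that the proxy host and port are placeholders, is to patch the socket module so that every connection urllib makes goes through the SOCKS proxy:

import socket
import urllib.request
import socks  # from the third-party PySocks package

# Placeholder SOCKS5 proxy - replace with your own host and port
socks.set_default_proxy(socks.SOCKS5, 'your_socks_proxy.com', 1080)
socket.socket = socks.socksocket  # route all new sockets through the proxy

with urllib.request.urlopen('http://httpbin.org/ip', timeout=10) as response:
    print(response.read().decode('utf-8'))

Note that this patch is process-wide: every socket created afterwards will use the proxy.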

Easier Alternative with requests:

First, install the necessary library:

pip install requests[socks]

Then, use it like this:

import requests
proxy_host = 'your_socks_proxy.com'
proxy_port = '1080'
proxies = {
    'http': f'socks5://{proxy_host}:{proxy_port}',
    'https': f'socks5://{proxy_host}:{proxy_port}'
}
try:
    response = requests.get('http://httpbin.org/ip', proxies=proxies, timeout=10)
    print(response.json())
except requests.exceptions.ProxyError as e:
    print(f"Proxy Error: {e}")

b) Authentication for Proxies

If your proxy requires a username and password, you simply add them to the proxy URL.

Proxy Details:

  • Host: secure-proxy.com
  • Port: 8080
  • Username: myuser
  • Password: mypassword

Code

import urllib.request
import urllib.error
proxy_host = 'secure-proxy.com'
proxy_port = '8080'
username = 'myuser'
password = 'mypassword'
# Format: http://username:password@host:port
proxy_address = f'http://{username}:{password}@{proxy_host}:{proxy_port}'
proxy_handler = urllib.request.ProxyHandler({
    'http': proxy_address,
    'https': proxy_address
})
opener = urllib.request.build_opener(proxy_handler)
# No need to install_opener if you use opener.open directly
url_to_visit = 'http://httpbin.org/ip'
try:
    with opener.open(url_to_visit, timeout=10) as response:
        print(response.read().decode('utf-8'))
except urllib.error.HTTPError as e:
    if e.code == 407:
        print("Error: Proxy Authentication Failed.")
    else:
        print(f"HTTP Error: {e.code} - {e.reason}")
except Exception as e:
    print(f"An error occurred: {e}")

Important Considerations & Best Practices

a) Proxy Rotation

If you are using proxies for web scraping, sending too many requests through a single IP will get you blocked. You need to rotate proxies.

A simple way to do this is to have a list of proxies and pick one at random for each request.

import random
import urllib.request
# List of working proxies
PROXY_LIST = [
    'http://123.123.123.123:8080',
    'http://124.124.124.124:3128',
    'http://125.125.125.125:8888',
]
def get_random_proxy():
    return random.choice(PROXY_LIST)
def make_request(url):
    proxy_address = get_random_proxy()
    print(f"Using proxy: {proxy_address}")
    proxy_handler = urllib.request.ProxyHandler({
        'http': proxy_address,
        'https': proxy_address
    })
    # Create a new opener for each request to handle proxy failures gracefully
    opener = urllib.request.build_opener(proxy_handler)
    try:
        with opener.open(url, timeout=10) as response:
            return response.read().decode('utf-8')
    except urllib.error.URLError as e:
        print(f"Proxy {proxy_address} failed. Reason: {e.reason}")
        # In a real scraper, you'd remove this proxy from your list and try again
        return None
# --- Usage ---
url = 'http://httpbin.org/ip'
response_data = make_request(url)
if response_data:
    print("\n--- Success! ---")
    print(response_data)
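Building on the make_request() function above, a minimal retry sketch (the attempt count of 3 is arbitrary) simply picks a new random proxy until one succeeds or the attempts run out:

def make_request_with_retries(url, max_attempts=3):
    # Each call to make_request() picks a fresh random proxy
    for _ in range(max_attempts):
        result = make_request(url)
        if result is not None:
            return result
    return None

print(make_request_with_retries('http://httpbin.org/ip'))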

b) Timeouts

Always set a timeout when opening URLs. Proxies can be slow or unresponsive, and without a timeout your script could hang indefinitely. A value between 10 and 30 seconds is common.

# Good
with urllib.request.urlopen(url, timeout=15) as response:
    ...
# Bad - can hang forever
# with urllib.request.urlopen(url) as response:
#     ...

c) HTTPS and SSL Proxies

When you specify a proxy for https, urllib sends a CONNECT request to the proxy, which opens a tunnel to the final destination server (e.g., google.com). The SSL/TLS handshake then happens end-to-end with the destination, so the proxy can see which host you are connecting to but cannot read the encrypted traffic, unless it is an intercepting (MITM, Man-in-the-Middle) proxy that presents its own certificate.

d) Error Handling

Proxies are less reliable than direct connections. You should always wrap your urlopen calls in a try...except block to catch:

  • urllib.error.URLError: For general network issues, including proxy connection failures.
  • urllib.error.HTTPError: For specific HTTP error codes, like 407 Proxy Authentication Required.
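Because HTTPError is a subclass of URLError, list the HTTPError clause first; a minimal sketch:

import urllib.request
import urllib.error

try:
    with urllib.request.urlopen('http://httpbin.org/ip', timeout=10) as response:
        print(response.read().decode('utf-8'))
except urllib.error.HTTPError as e:
    # Subclass of URLError, so it must be caught before the generic clause
    print(f"HTTP error: {e.code} {e.reason}")
except urllib.error.URLError as e:
    print(f"Network or proxy error: {e.reason}")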

Summary: urllib vs. requests

Feature        | urllib.request (Standard Library)                            | requests (3rd Party Library)
Ease of Use    | More verbose; requires manual setup of handlers and openers. | Much simpler, more intuitive API.
Proxy Support  | HTTP/HTTPS only; no built-in SOCKS support.                  | Excellent support for HTTP, HTTPS, and SOCKS (with requests[socks]).
Authentication | Built-in; just add user:pass@ to the URL.                    | Very flexible; supports different auth types (basic, digest, etc.).