How Do You Use URL Encoding in Python? - 杰瑞科技汇

This guide explains how to use URL encoding in Python, covering the standard library, the third-party requests library, and best practices.

What is URL Encoding?

URL encoding (also known as percent-encoding) is a mechanism for encoding information in a Uniform Resource Identifier (URI). Certain characters, such as spaces, &, ?, and #, have special meaning in URLs. If you want to include these characters as data, you must encode them.

For example, a space is encoded as %20, and a # is encoded as %23.
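
As a minimal sketch of the rule above, the snippet below percent-encodes a few single characters with the standard library. The sample characters are chosen only for illustration; the non-ASCII case assumes the default UTF-8 encoding.

```python
from urllib.parse import quote

# Reserved ASCII characters become a percent sign followed by two hex digits.
hash_encoded = quote("#")     # '%23'
space_encoded = quote(" ")    # '%20'

# Non-ASCII characters are first converted to bytes (UTF-8 by default),
# then every byte is percent-encoded separately.
accented = quote("é")         # '%C3%A9' (two bytes in UTF-8)

print(hash_encoded, space_encoded, accented)
```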

The Standard Library: `urllib.parse`

Python's built-in urllib.parse module is the standard and recommended way to handle URL encoding and decoding. It provides two main functions: quote() and urlencode().

A. `urllib.parse.quote()` - For encoding individual strings

Use this function to encode a single string that will be used as a component of a URL (like a path, a query parameter value, etc.).

Syntax: urllib.parse.quote(string, safe='/', encoding=None, errors=None)

string: The string to encode.
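safe: An additional string of characters that should not be encoded. The default is '/', which is useful for encoding a full path. Pass safe='' when the string is a single path segment or a query value in which a slash is data.
encoding: The encoding to use (e.g., 'utf-8'). The default of None means 'utf-8'.
errors: Specifies how to handle encoding errors (e.g., 'strict', 'ignore'). The default of None means 'strict'.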
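
The effect of each parameter is easier to see side by side. The following is a small sketch; the file path is a hypothetical value invented for the example.

```python
from urllib.parse import quote

raw = "reports/2024 summary.txt"  # hypothetical path, for illustration only

# Default safe='/' keeps slashes, which suits a full path.
as_path = quote(raw)              # 'reports/2024%20summary.txt'

# safe='' also encodes the slash, which suits a single segment or a value.
as_value = quote(raw, safe="")    # 'reports%2F2024%20summary.txt'

# encoding controls which bytes a non-ASCII character turns into.
latin1 = quote("é", encoding="latin-1")   # '%E9' (one byte in Latin-1)

# errors controls what happens when a character cannot be encoded.
replaced = quote("é", encoding="ascii", errors="replace")  # '%3F', i.e. '?'

print(as_path, as_value, latin1, replaced)
```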

Example: Encoding a Path and a Query Parameter

from urllib.parse import quote
# A path segment that contains a space
path_segment = "my documents"
encoded_path = quote(path_segment)
print(f"Path: /{encoded_path}")
# Output: Path: /my%20documents
# A query parameter value that contains special characters
query_value = "100% natural & healthy"
encoded_query_value = quote(query_value)
print(f"URL: ?search={encoded_query_value}")
# Output: URL: ?search=100%25%20natural%20%26%20healthy
# Note: % becomes %25 and & becomes %26
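
A related function, quote_plus(), may be a better fit for query values: it encodes spaces as + and, unlike quote(), does not treat / as safe by default. A short comparison, reusing the value from the example above:

```python
from urllib.parse import quote, quote_plus

value = "100% natural & healthy"

with_quote = quote(value, safe="")  # '100%25%20natural%20%26%20healthy'
with_plus = quote_plus(value)       # '100%25+natural+%26+healthy'

# A slash inside a value survives quote() by default but not quote_plus().
slash_quote = quote("a/b")          # 'a/b'
slash_plus = quote_plus("a/b")      # 'a%2Fb'

print(with_quote, with_plus, slash_quote, slash_plus)
```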

B. `urllib.parse.urlencode()` - For encoding dictionaries of query parameters

This is the most convenient function for building the query string part of a URL (the part after the ?). It takes a dictionary (or a sequence of key-value pairs) and returns a properly formatted query string.

Syntax: urllib.parse.urlencode(query, doseq=False, safe='', encoding=None, errors=None, quote_via=quote_plus)

query: A dictionary where keys are parameter names and values are parameter values.
doseq: If True, values that are sequences (like lists) are encoded as multiple key-value pairs (e.g., ?a=1&a=2). If False, the sequence is converted to a string with str() and then percent-encoded (e.g., ?a=%5B1%2C+2%5D, which is the encoded form of [1, 2]).
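
The sketch below contrasts the two doseq settings and shows how quote_via can switch the space encoding from + to %20. The parameter names a and q are placeholders.

```python
from urllib.parse import quote, urlencode

# doseq=False (the default): the list is stringified, then percent-encoded.
flat = urlencode({"a": [1, 2]})                   # 'a=%5B1%2C+2%5D'

# doseq=True: each list item becomes its own key-value pair.
expanded = urlencode({"a": [1, 2]}, doseq=True)   # 'a=1&a=2'

# quote_via=quote encodes spaces as %20 instead of +.
percent_spaces = urlencode({"q": "a b"}, quote_via=quote)  # 'q=a%20b'

print(flat, expanded, percent_spaces)
```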

Example: Encoding a Dictionary

from urllib.parse import urlencode
# A dictionary of query parameters
params = {
    'search': 'python tutorials',
    'category': 'web development',
    'page': 1
}
# Encode the dictionary into a query string
query_string = urlencode(params)
print(f"Full URL: https://example.com/search?{query_string}")
# Output: Full URL: https://example.com/search?search=python+tutorials&category=web+development&page=1
# Note: Spaces are encoded as + by default for query parameters.

Example: Handling Multiple Values for a Single Key

from urllib.parse import urlencode
# A dictionary where one key has a list of values
params = {
    'filter': ['new', 'popular'],
    'sort': 'date'
}
# doseq=True ensures each item in the list gets its own parameter
query_string = urlencode(params, doseq=True)
print(f"URL: ?{query_string}")
# Output: URL: ?filter=new&filter=popular&sort=date

Third-Party Libraries: `requests`

The popular requests library simplifies working with HTTP requests and handles URL encoding for you automatically when you pass parameters. This is often the easiest method in practice.

A. `requests.get()` with `params`

When you use the params argument in a requests function, it automatically takes a dictionary and encodes it into a query string.

Example: Automatic Encoding with requests

import requests
# The dictionary of parameters
params = {
    'q': 'what is a url?',
    'source': 'web',
    'limit': 10
}
# requests automatically encodes the 'params' dictionary
# You don't need to call urlencode() yourself
# Note: api.example.com is a placeholder host; substitute a real endpoint before running
response = requests.get('https://api.example.com/search', params=params)
print(f"Request URL: {response.url}")
# Output: Request URL: https://api.example.com/search?q=what+is+a+url%3F&source=web&limit=10
# Note: ? becomes %3F and spaces become +

Decoding URLs

You'll also need to decode URLs. The urllib.parse module has unquote() for this.

`urllib.parse.unquote()`

This function converts percent-encoded characters back to their original form.

Example: Decoding a URL

from urllib.parse import unquote
# A URL-encoded string
encoded_string = "Hello%20World%21%20This%20is%20a%20test."
decoded_string = unquote(encoded_string)
print(f"Encoded: {encoded_string}")
print(f"Decoded: {decoded_string}")
# Output:
# Encoded: Hello%20World%21%20This%20is%20a%20test.
# Decoded: Hello World! This is a test.
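
Note that unquote() leaves + signs untouched. For query strings, where + stands for a space, unquote_plus() or parse_qs() is usually the better match. A brief sketch with made-up sample strings:

```python
from urllib.parse import parse_qs, unquote, unquote_plus

form_value = "python+tutorials%21"

plain = unquote(form_value)          # 'python+tutorials!' (+ is kept)
with_plus = unquote_plus(form_value) # 'python tutorials!' (+ becomes a space)

# parse_qs splits a whole query string and decodes keys and values;
# every value is returned as a list because keys may repeat.
parsed = parse_qs("search=python+tutorials&filter=new&filter=popular")

print(plain)
print(with_plus)
print(parsed)
```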

Summary: Which One to Use?

Use Case	Recommended Function	Why?
Manually building a URL string	`urllib.parse.quote()`	Gives you fine-grained control over which characters are considered "safe".
Building a query string from a dictionary	`urllib.parse.urlencode()`	The standard, clean, and Pythonic way. Handles encoding for you.
Making HTTP requests with parameters	`requests.get(..., params=...)`	The easiest and most common way. It's convenient and less error-prone.
Decoding a URL or part of a URL	`urllib.parse.unquote()`	The standard function for reversing the encoding process.

Complete Example: Building a Full URL

Here is a complete example showing how to build a full, valid URL from parts.

from urllib.parse import urlencode, urlunparse
# 1. Define the components of the URL
base_url = "https://www.example.com/search"
search_term = "python urllib tutorial"
filters = ["beginner", "advanced"]
page_number = 2
# 2. Encode the path component (if it had special chars)
# In this case, the base_url is already clean.
# 3. Encode the query parameters
query_params = {
    'q': search_term,
    'filter': filters, # This is a list
    'page': page_number
}
encoded_query = urlencode(query_params, doseq=True)
# 4. Construct the final URL
# urlunparse takes a 6-tuple: (scheme, netloc, path, params, query, fragment)
final_url = urlunparse((
    'https',       # scheme
    'www.example.com', # netloc
    '/search',     # path
    '',            # params (rarely used)
    encoded_query, # query
    ''             # fragment
))
print("--- Manual Construction ---")
print(f"Final URL: {final_url}")
# Output: Final URL: https://www.example.com/search?q=python+urllib+tutorial&filter=beginner&filter=advanced&page=2
# 5. Easier way with requests (recommended for HTTP requests)
import requests
params = {
    'q': search_term,
    'filter': filters,
    'page': page_number
}
requests_url = "https://www.example.com/search"
# Prepare the request without sending it; requests builds and encodes the URL,
# expanding the list into repeated 'filter' parameters
prepared = requests.Request('GET', requests_url, params=params).prepare()
final_url_requests = prepared.url
print("\n--- Using requests ---")
print(f"Final URL: {final_url_requests}")
# Output: Final URL: https://www.example.com/search?q=python+urllib+tutorial&filter=beginner&filter=advanced&page=2
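
If the base URL already exists as a single string, one alternative to listing every component by hand is to split it, swap in the query, and reassemble it. This is only a sketch using the standard library; the parameter values are the same illustrative ones used above.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit

base_url = "https://www.example.com/search"

# Split the URL into (scheme, netloc, path, query, fragment).
parts = urlsplit(base_url)

# Build the query string, expanding the list into repeated keys.
query = urlencode(
    {"q": "python urllib tutorial", "filter": ["beginner", "advanced"], "page": 2},
    doseq=True,
)

# SplitResult is a named tuple, so _replace returns a copy with the new query.
final_url = urlunsplit(parts._replace(query=query))
print(final_url)
```

This avoids retyping the scheme, host, and path, and it keeps any fragment that the base URL already carries.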
