How Is URL Escaping Implemented in Python? - 杰瑞科技汇

In Python, URL escaping (also known as URL encoding or percent-encoding) is the process of converting special characters into a `%` followed by a two-digit hexadecimal code. This ensures that a URL contains only the set of characters that URLs allow.

Here’s a comprehensive guide covering the modern and recommended approach, as well as the older, legacy method.

The Modern and Recommended Approach: `urllib.parse`

For any Python 3 code, you should use the urllib.parse module. It's the standard, well-documented, and robust way to handle URL manipulation.

Escaping (URL Encoding) with `quote()`

The urllib.parse.quote() function is used to encode a string for a specific URL component. It takes a string and replaces special characters with `%xx` escapes.

Key Points:

quote(string, safe='/'): The safe parameter is crucial. It specifies characters that should not be encoded. The default is '/', which suits paths.
It encodes spaces into %20 by default. If you want spaces to become + signs (common for query parameters), use quote_plus().
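The effect of the safe parameter can be seen with a small sketch. The input string `sample` is a hypothetical value chosen only for illustration.

```python
from urllib.parse import quote, quote_plus

sample = "a b/c"  # hypothetical input, not from the article

default_safe = quote(sample)        # '/' is left alone because safe defaults to '/'
no_safe = quote(sample, safe="")    # nothing is exempt, so '/' becomes %2F
plus_form = quote_plus(sample)      # space becomes '+', and '/' becomes %2F

print(default_safe)  # a%20b/c
print(no_safe)       # a%20b%2Fc
print(plus_form)     # a+b%2Fc
```

Passing safe="" is the usual choice when the string is a single path segment that must not be split on slashes.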

Example: Encoding a Path Component

Let's say you have a filename with spaces and special characters.

from urllib.parse import quote
# A string that is not URL-safe
path_segment = "my folder/file name.html?param=value"
# Encode it for use in a URL path
# The 'safe=' parameter tells quote() not to encode slashes '/'
encoded_path = quote(path_segment, safe='/')
print(f"Original: {path_segment}")
print(f"Encoded:  {encoded_path}")
# Output:
# Original: my folder/file name.html?param=value
# Encoded:  my%20folder/file%20name.html%3Fparam%3Dvalue

Notice how the space becomes `%20`, `?` becomes `%3F`, and `=` becomes `%3D`. The `/` was preserved because we included it in the `safe` string.

Encoding Query Parameters with `quote_plus()`

For query strings (the part after `?`), it's standard practice to encode spaces as `+` signs. The quote_plus() function does this automatically.

from urllib.parse import quote_plus
query_string = "search term with spaces & symbols"
# Encode it for use in a URL query string
encoded_query = quote_plus(query_string)
print(f"Original: {query_string}")
print(f"Encoded:  {encoded_query}")
# Output:
# Original: search term with spaces & symbols
# Encoded:  search+term+with+spaces+%26+symbols

Here, spaces become `+` and & becomes %26.
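One detail worth checking is what happens to a literal plus sign in the input. The sketch below uses a hypothetical string `expr` and shows that both functions escape a real `+` as `%2B`, so it cannot be confused with an encoded space.

```python
from urllib.parse import quote, quote_plus

expr = "1+1 = 2"  # hypothetical input containing a literal '+'

with_quote = quote(expr)            # spaces -> %20, '+' -> %2B, '=' -> %3D
with_quote_plus = quote_plus(expr)  # spaces -> '+', '+' -> %2B, '=' -> %3D

print(with_quote)       # 1%2B1%20%3D%202
print(with_quote_plus)  # 1%2B1+%3D+2
```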

The Reverse: Unescaping (URL Decoding)

To decode an escaped URL string back to its original form, you use unquote() or unquote_plus().

Decoding with `unquote()`

This function replaces `%xx` escapes with their corresponding characters. It does not convert `+` signs to spaces; use unquote_plus() for that.

from urllib.parse import unquote
encoded_url = "https://example.com/my%20folder/search+term.html%3Fid=123"
# Decode the URL
decoded_url = unquote(encoded_url)
print(f"Encoded:   {encoded_url}")
print(f"Decoded:   {decoded_url}")
# Output:
# Encoded:   https://example.com/my%20folder/search+term.html%3Fid=123
# Decoded:   https://example.com/my folder/search+term.html?id=123
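The difference between the two decoders shows up only when the input contains a `+`. A minimal side-by-side sketch, using a hypothetical `encoded` string:

```python
from urllib.parse import unquote, unquote_plus

encoded = "search+term%20here"  # hypothetical input mixing '+' and %20

plain = unquote(encoded)            # only %xx escapes are decoded
plus_aware = unquote_plus(encoded)  # '+' is also turned into a space

print(plain)       # search+term here
print(plus_aware)  # search term here
```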

Decoding with `unquote_plus()`

This is specifically for query strings. It does the same as unquote() but also converts `+` signs to spaces, which matches how query parameters are normally encoded.

from urllib.parse import unquote_plus
encoded_query = "query=hello+world&filter=new%26used"
# Decode the query string
decoded_query = unquote_plus(encoded_query)
print(f"Encoded:   {encoded_query}")
print(f"Decoded:   {decoded_query}")
# Output:
# Encoded:   query=hello+world&filter=new%26used
# Decoded:   query=hello world&filter=new&used
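Note that decoding the whole query string at once makes the escaped `%26` indistinguishable from the `&` separators. If the goal is to recover individual parameters, one option (not used in the example above) is urllib.parse.parse_qs(), which splits on `&` first and decodes each piece afterwards. A minimal sketch:

```python
from urllib.parse import parse_qs

encoded_query = "query=hello+world&filter=new%26used"

# parse_qs splits into key/value pairs first, then decodes each one,
# so the escaped '&' inside the 'filter' value stays part of that value.
params = parse_qs(encoded_query)

print(params)  # {'query': ['hello world'], 'filter': ['new&used']}
```

Each value is a list because a key may appear more than once in a query string.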

Complete Example: Building a Safe URL

Let's combine these concepts to build a complete, safe URL from its parts.

from urllib.parse import quote_plus
base_url = "https://www.example.com/search"
search_term = "python urllib & encoding"
page_number = 2
# 1. Encode the search term for the query string
# Use quote_plus because it's a query parameter value
encoded_term = quote_plus(search_term)
# 2. Construct the full query string
# Note: We manually join the parameters with '&'
query_string = f"q={encoded_term}&page={page_number}"
# 3. Append the query string to the base URL after a '?'
# (urljoin() would treat the query string as a relative path and replace 'search')
final_url = f"{base_url}?{query_string}"
print(f"Base URL: {base_url}")
print(f"Query String: q={search_term}&page={page_number}")
print(f"Final Safe URL: {final_url}")
# Output:
# Base URL: https://www.example.com/search
# Query String: q=python urllib & encoding&page=2
# Final Safe URL: https://www.example.com/search?q=python+urllib+%26+encoding&page=2
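As an alternative to joining parameters by hand, urllib.parse.urlencode() can build the query string from a dictionary. This is a sketch of the same URL under that approach; the `params` dictionary is an assumption introduced here for illustration.

```python
from urllib.parse import urlencode

base_url = "https://www.example.com/search"
# Hypothetical parameter dictionary mirroring the example above
params = {"q": "python urllib & encoding", "page": 2}

# urlencode applies quote_plus to each key and value and joins pairs with '&'
query_string = urlencode(params)
final_url = f"{base_url}?{query_string}"

print(final_url)
# https://www.example.com/search?q=python+urllib+%26+encoding&page=2
```

This avoids forgetting to escape one of the values when the number of parameters grows.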

The Legacy Approach: `urllib` (Python 2)

If you are maintaining very old Python 2 code, you might see this. This is not recommended for new Python 3 code.

In Python 2, urllib and urllib2 were separate modules. The escaping functions were in urllib.

# Python 2 Example (DO NOT USE IN PYTHON 3)
import urllib
# URL encoding
encoded = urllib.quote("hello world")
print(encoded)  # Output: hello%20world
# URL decoding with plus-to-space conversion
decoded = urllib.unquote_plus("hello+world")
print(decoded)  # Output: hello world

Why avoid it in Python 3?

In Python 3, urllib was refactored. The old functions were moved to urllib.parse.quote, urllib.parse.unquote, etc., for better organization.
The top-level names urllib.quote and urllib.quote_plus no longer exist in Python 3, so code that calls them raises an AttributeError until it is updated to the urllib.parse versions.
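For code that still has to run on both interpreters, a common pattern is a guarded import. This is only a sketch of that pattern, not something the functions themselves require:

```python
try:
    # Python 3 location
    from urllib.parse import quote, unquote_plus
except ImportError:
    # Python 2 fallback: the same names lived directly in urllib
    from urllib import quote, unquote_plus

encoded = quote("hello world")
decoded = unquote_plus("hello+world")

print(encoded)  # hello%20world
print(decoded)  # hello world
```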

Summary: Which Function to Use?

Task	Recommended Function (Python 3)	When to Use
Encode a path	`urllib.parse.quote(string, safe='/')`	For URL paths, directories, filenames. Preserves `/`.
Encode a query value	`urllib.parse.quote_plus(string)`	For query parameters. Converts spaces to `+`.
Decode a URL	`urllib.parse.unquote(string)`	General-purpose decoding. Converts `%xx` escapes to characters; leaves `+` unchanged.
Decode a query string	`urllib.parse.unquote_plus(string)`	Decoding query parameters. Also converts `+` to a space.
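A quick way to confirm that an encoder and decoder are paired correctly is a round-trip check. The sketch below uses a hypothetical string `original` and verifies that each decoder restores what its matching encoder produced.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

original = "name=J&J co/ltd 100%"  # hypothetical input with several special characters

path_form = quote(original, safe="")  # pairs with unquote()
query_form = quote_plus(original)     # pairs with unquote_plus()

path_round_trip = unquote(path_form)
query_round_trip = unquote_plus(query_form)

print(path_round_trip == original)   # True
print(query_round_trip == original)  # True
```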
