How to Convert Between Python Strings and Bytes - 杰瑞科技汇

Converting between str and bytes is a fundamental concept in Python, especially when dealing with network programming, file I/O, or low-level data.

Let's break down the difference between str (string) and bytes in Python.

The Core Difference: Text vs. Bytes

str (String): Represents text. It's a sequence of Unicode characters. Unicode is a universal standard that assigns a unique number (a code point) to every character from every language (e.g., A, 你, 😂). A str object is abstract and doesn't care about how the characters are actually stored in memory or on disk.
bytes (Bytes): Represents raw binary data. It's a sequence of 8-bit values (integers from 0 to 255). This is the form in which data is actually stored in memory, sent over a network, or written to a file. Bytes have no inherent meaning until something interprets them.

Analogy: Think of str as a book written in a language you understand. The words and sentences have meaning. Think of bytes as the physical ink and paper the book is printed on. The ink itself is just a pattern; it only becomes meaningful when you know how to read it (i.e., which encoding to use).
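To make the distinction concrete, here is a small sketch that looks at the same one-character text both ways. The variable names are illustrative and not part of the examples that follow.

```python
# Illustrative names; a sketch, not a definitive reference.
text = "你"                    # one Unicode character (code point U+4F60)
raw = text.encode("utf-8")    # the same character as UTF-8 bytes

code_points = [ord(ch) for ch in text]  # iterating a str yields characters
byte_values = list(raw)                 # iterating bytes yields integers 0..255

print(len(text), code_points)   # 1 [20320]
print(len(raw), byte_values)    # 3 [228, 189, 160]
print(raw[0])                   # 228 (indexing bytes gives an int)
```

The string has length 1 while its UTF-8 form has length 3, which is why len() on a str should not be used to estimate storage or transfer size.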

Converting Between `str` and `bytes`

The process of converting between them is called encoding and decoding.

Encoding: Converting a str to bytes. You specify an encoding (like UTF-8, ASCII) to define how each Unicode character should be represented as a sequence of bytes.
Decoding: Converting bytes to a str. You must use the same encoding that was used to create the bytes to correctly interpret them back into text.
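As a rough illustration of why the encoding has to match, the sketch below encodes one character with three different encodings and then decodes the UTF-8 result with the wrong one. The character chosen and the variable names are assumptions made for the example.

```python
text = "é"  # U+00E9

utf8 = text.encode("utf-8")        # b'\xc3\xa9'  (2 bytes)
latin1 = text.encode("latin-1")    # b'\xe9'      (1 byte)
utf16 = text.encode("utf-16-le")   # b'\xe9\x00'  (2 bytes)

# A round trip with the matching encoding restores the text.
assert utf8.decode("utf-8") == text

# A mismatched decode does not always raise; it can silently
# produce the wrong characters (often called "mojibake").
wrong = utf8.decode("latin-1")
print(repr(wrong))  # 'Ã©'
```

The same text produces three different byte sequences, so the bytes alone do not tell you which encoding was used; that information has to travel with the data or be agreed on in advance.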

Encoding: `str` -> `bytes`

You use the .encode() method on a string.

# Our original string
my_string = "Hello, 世界! 👋"
# Encode the string into bytes using UTF-8 (the most common encoding)
my_bytes = my_string.encode('utf-8')
print(f"Original string: {my_string}")
print(f"Type of original: {type(my_string)}")
print("-" * 20)
print(f"Encoded bytes: {my_bytes}")
print(f"Type of encoded: {type(my_bytes)}")
# You can see the raw bytes. Note that non-ASCII characters take up more than one byte.
# 'H' -> b'H'
# ' ' -> b' '
# '世' -> b'\xe4\xb8\x96' (3 bytes)

Output:

Original string: Hello, 世界! 👋
Type of original: <class 'str'>
--------------------
Encoded bytes: b'Hello, \xe4\xb8\x96\xe7\x95\x8c! \xf0\x9f\x91\x8b'
Type of encoded: <class 'bytes'>

Decoding: `bytes` -> `str`

You use the .decode() method on a bytes object.

# We have the bytes from the previous example
my_bytes = b'Hello, \xe4\xb8\x96\xe7\x95\x8c! \xf0\x9f\x91\x8b'
# Decode the bytes back into a string using UTF-8
recovered_string = my_bytes.decode('utf-8')
print(f"Original bytes: {my_bytes}")
print(f"Type of original: {type(my_bytes)}")
print("-" * 20)
print(f"Decoded string: {recovered_string}")
print(f"Type of decoded: {type(recovered_string)}")

Output:

Original bytes: b'Hello, \xe4\xb8\x96\xe7\x95\x8c! \xf0\x9f\x91\x8b'
Type of original: <class 'bytes'>
--------------------
Decoded string: Hello, 世界! 👋
Type of decoded: <class 'str'>

What if you use the wrong encoding? This is a very common source of errors.

# Try to decode UTF-8 bytes using ASCII
# ASCII only covers characters from 0-127. The byte \xe4 is outside this range.
try:
    my_bytes.decode('ascii')
except UnicodeDecodeError as e:
    print(f"Error: {e}")

Output:

Error: 'ascii' codec can't decode byte 0xe4 in position 7: ordinal not in range(128)
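When raising an exception is not the behavior you want, both .encode() and .decode() accept an errors argument. The following is a minimal sketch of a few standard handlers; the sample data is made up for illustration, and whether lossy handling is acceptable depends on your application.

```python
data = "café".encode("utf-8")  # b'caf\xc3\xa9'

# Decoding: handle bytes the codec cannot interpret.
replaced = data.decode("ascii", errors="replace")          # each bad byte becomes U+FFFD
ignored = data.decode("ascii", errors="ignore")            # bad bytes are dropped
escaped = data.decode("ascii", errors="backslashreplace")  # bad bytes shown as \xNN

print(ignored)   # caf
print(escaped)   # caf\xc3\xa9

# Encoding: handle characters the target encoding cannot represent.
lossy = "café".encode("ascii", errors="replace")
print(lossy)     # b'caf?'
```

Note that "replace" and "ignore" lose information, so they are better suited to logging or display than to data you need to round-trip.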

Why is this important? (Common Use Cases)

You can't just mix str and bytes in operations. Python will raise a TypeError.

# This will FAIL!
# my_string + my_bytes  # TypeError: can only concatenate str (not "bytes") to str
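The fix is to convert one side so both operands have the same type. A short sketch, using made-up values:

```python
my_string = "Hello, "
my_bytes = "世界".encode("utf-8")

# Mixing the two types raises TypeError.
try:
    my_string + my_bytes
except TypeError as exc:
    print(f"TypeError: {exc}")

# Option 1: decode the bytes and work with text.
as_text = my_string + my_bytes.decode("utf-8")
# Option 2: encode the string and work with bytes.
as_bytes = my_string.encode("utf-8") + my_bytes

print(as_text)   # Hello, 世界
print(as_bytes)  # b'Hello, \xe4\xb8\x96\xe7\x95\x8c'
```

A common convention is to decode bytes as soon as they enter the program, work with str internally, and encode again only at the point of output.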

Here’s where you need to be careful:

Reading from and Writing to Files

When you open a file, you must specify whether you're working with text ('r', 'w') or binary ('rb', 'wb').

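Text Mode (Default): Python automatically handles encoding/decoding for you. If you omit the encoding argument, Python falls back to a platform-dependent default (typically the locale's preferred encoding), which can differ between machines, so it's best to be explicit.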

# Writing a string to a file in text mode
with open('my_file.txt', 'w', encoding='utf-8') as f:
    f.write("Hello, 世界!")
# Reading from a file in text mode
with open('my_file.txt', 'r', encoding='utf-8') as f:
    content = f.read()  # content is a 'str'
    print(f"Read from file: {content} (type: {type(content)})")
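To show what can go wrong when the encoding used for reading differs from the one used for writing, here is a sketch that writes a file as UTF-8 and reads it back twice. The temporary directory and the file name demo.txt are hypothetical choices for the example.

```python
import tempfile
from pathlib import Path

text = "Hello, 世界!"

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "demo.txt"  # hypothetical file name

    with open(path, "w", encoding="utf-8") as f:
        f.write(text)

    # Same encoding as the writer: the text comes back intact.
    with open(path, "r", encoding="utf-8") as f:
        same = f.read()

    # Different encoding: no exception here, but the text is garbled.
    with open(path, "r", encoding="latin-1") as f:
        garbled = f.read()

print(same == text)     # True
print(garbled == text)  # False
print(len(garbled))     # 14 (one character per byte on disk)
```

Because latin-1 maps every byte to some character, the mismatch passes silently, which is the main reason to state the encoding on both the write and the read.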

Binary Mode: You work directly with bytes.

# Writing a string to a file in binary mode (you must encode it first)
my_string = "Hello, 世界!"
with open('my_file_bytes.bin', 'wb') as f:
    f.write(my_string.encode('utf-8'))
# Reading from a file in binary mode (you must decode it)
with open('my_file_bytes.bin', 'rb') as f:
    data = f.read()  # data is 'bytes'
    print(f"Read from file: {data} (type: {type(data)})")
    # Decode it to get the string back
    decoded_content = data.decode('utf-8')
    print(f"Decoded content: {decoded_content} (type: {type(decoded_content)})")

Network Communication

When you send data over a network (e.g., via a socket), it must be sent as a stream of bytes. Any string you send must first be encoded.

# Minimal TCP server (conceptual): accept() blocks until a client connects
import socket
# Create a listening socket; the with-blocks close the sockets automatically
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server_socket:
    server_socket.bind(('127.0.0.1', 65432))
    server_socket.listen()
    conn, addr = server_socket.accept()
    with conn:
        # Receive up to 1024 bytes; recv() always returns bytes
        data_from_client = conn.recv(1024)
        print(f"Received bytes: {data_from_client}")
        # Decode the bytes to get the string
        message_from_client = data_from_client.decode('utf-8')
        print(f"Decoded message: {message_from_client}")
        # Send a response (must encode the string first)
        response = "Message received!"
        conn.sendall(response.encode('utf-8'))
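One caveat with decoding network data: a single recv() call may return only part of a multi-byte character, in which case decoding that chunk on its own fails. The sketch below simulates this with hand-split chunks instead of a real socket and uses the standard library's incremental decoder; the split point is an assumption chosen to trigger the problem.

```python
import codecs

payload = "世界".encode("utf-8")      # 6 bytes, 3 per character
chunks = [payload[:2], payload[2:]]  # simulate recv() splitting a character

# Decoding the first chunk alone fails: the character is incomplete.
try:
    chunks[0].decode("utf-8")
except UnicodeDecodeError as exc:
    print(f"Naive decode failed: {exc.reason}")

# An incremental decoder buffers the incomplete tail until more bytes arrive.
decoder = codecs.getincrementaldecoder("utf-8")()
parts = [decoder.decode(chunk) for chunk in chunks]
parts.append(decoder.decode(b"", final=True))  # flush; raises if bytes are left over

message = "".join(parts)
print(message)  # 世界
```

An alternative is to collect all chunks into one bytes object and decode once the full message has arrived, which requires the protocol to define where a message ends.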

Summary Table

Feature	`str` (String)	`bytes` (Bytes)
Purpose	Represents text.	Represents raw binary data.
Content	Sequence of Unicode characters (e.g., 'A', '你', '😂').	Sequence of integers from 0 to 255.
Literal Syntax	`'text'` or `"text"`	`b'bytes'` or `b"bytes"`
Methods	`.find()`, `.replace()`, `.split()`, `.format()`, `.encode()`	`.hex()`, `bytes.fromhex()`, `.decode()`, plus byte-oriented versions of `.find()`, `.replace()`, `.split()`
Common Use	Storing and manipulating text, user input, file I/O in text mode.	Network communication, reading/writing binary files (images, executables), cryptography.
Conversion	`my_str.encode('utf-8')` -> `bytes`	`my_bytes.decode('utf-8')` -> `str`
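As a brief illustration of the bytes-specific helpers listed in the table, the following sketch converts between bytes, hexadecimal text, and lists of integers. The values are chosen for the example only.

```python
raw = "世".encode("utf-8")

hex_text = raw.hex()                  # hexadecimal string for the bytes
from_hex = bytes.fromhex("e4 b8 96")  # spaces between byte pairs are allowed
from_ints = bytes([72, 105])          # build bytes from integers 0..255

print(hex_text)                  # e4b896
print(from_hex == raw)           # True
print(from_ints)                 # b'Hi'
print(from_hex.decode("utf-8"))  # 世
```

Hex strings are a convenient way to log or compare binary data, since printing bytes directly mixes ASCII characters with \x escapes.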
