杰瑞科技汇

How Do You Open a Binary File in Python?

In Python, opening a file in "binary mode" is a fundamental operation for handling non-text data such as images, audio, video, executables, and network packets.

The key is to use the 'b' character in the file mode string.

The Core Concept: Text vs. Binary

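  • Text Mode (default): Python decodes bytes into str when reading and encodes str into bytes when writing. This is what you use for .txt, .csv, .json, and .py files. Unless you pass an explicit encoding, the default encoding typically depends on the platform and locale. Line endings are also translated: \r\n is read as \n, and on Windows \n is written as \r\n.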
  • Binary Mode: Python reads and writes raw bytes without any interpretation or conversion. This is essential for files that are not meant to be read as text.
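
The difference between the two modes can be seen by reading the same file both ways. The following is a minimal sketch; the scratch file name demo.txt and the sample text are assumptions made for illustration only.

```python
import os
import tempfile

# Hypothetical scratch file used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# Store known bytes: "héllo" encoded as UTF-8, followed by a Windows-style line ending.
raw = "héllo".encode("utf-8") + b"\r\n"
with open(path, "wb") as f:
    f.write(raw)

# Binary mode returns the bytes exactly as they are stored on disk.
with open(path, "rb") as f:
    as_bytes = f.read()

# Text mode decodes the bytes and, by default, translates "\r\n" to "\n".
with open(path, "r", encoding="utf-8") as f:
    as_text = f.read()

print(type(as_bytes), as_bytes)      # <class 'bytes'> b'h\xc3\xa9llo\r\n'
print(type(as_text), repr(as_text))  # <class 'str'> 'héllo\n'
```

The two-byte sequence \xc3\xa9 is the UTF-8 encoding of "é"; binary mode leaves it untouched, while text mode turns it back into a single character.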

The Basic Syntax

You open a binary file by appending 'b' to the mode string.

# 'rb' for read binary
# 'wb' for write binary
# 'ab' for append binary
# 'rb+' for read and write binary

Important: When you open a file in binary mode, the file object you get back works with bytes objects, not str objects.
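
As a rough illustration of the four mode strings and of how bytes objects behave, the sketch below writes, appends to, patches, and re-reads a small file. The file name modes.bin and the byte values are hypothetical.

```python
import os
import tempfile

# Hypothetical scratch file used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "modes.bin")

with open(path, "wb") as f:    # create (or truncate) the file, then write
    f.write(b"\x00\x01\x02\x03")

with open(path, "ab") as f:    # append to the end of the file
    f.write(b"\xff")

with open(path, "rb+") as f:   # read and write without truncating
    f.seek(1)                  # move to the second byte
    f.write(b"\xaa")           # overwrite it in place

with open(path, "rb") as f:    # read everything back
    data = f.read()

print(data)        # b'\x00\xaa\x02\x03\xff'
print(data[1])     # indexing a bytes object yields an int: 170
print(data[1:3])   # slicing yields another bytes object: b'\xaa\x02'
print(list(data))  # [0, 170, 2, 3, 255]
```

Note that indexing a bytes object returns an integer in the range 0 to 255, whereas indexing a str returns a one-character str.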


Reading a Binary File ('rb')

Let's say you have an image file named my_image.png.

# The file 'my_image.png' must exist in the same directory
# for this example to work.
try:
    # 1. Open the file in read-binary mode ('rb')
    with open('my_image.png', 'rb') as f:
        # 2. Read the entire file content
        # The variable 'image_data' will be a 'bytes' object
        image_data = f.read()
    # 3. Now, 'image_data' contains the raw bytes of the image
    print("Successfully read the file.")
    print(f"Type of the data: {type(image_data)}")
    print(f"Size of the file in bytes: {len(image_data)}")
    # You can inspect the first few bytes
    print("\nFirst 16 bytes of the file:")
    print(image_data[:16])
except FileNotFoundError:
    print("Error: The file 'my_image.png' was not found.")
except IOError as e:
    print(f"An I/O error occurred: {e}")

Output (the size depends on your file):

Successfully read the file.
Type of the data: <class 'bytes'>
Size of the file in bytes: 15312

First 16 bytes of the file:
b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR'

Notice that the output is a bytes literal, b'...'. Bytes that have no printable ASCII form are shown as escapes such as \x89. This is the raw data of the PNG file, beginning with its 8-byte signature.
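
Calling f.read() with no argument loads the whole file into memory. For large files it is often preferable to read a fixed number of bytes at a time. The sketch below shows one possible approach; the helper names looks_like_png and count_bytes_in_chunks, the 4096-byte chunk size, and the scratch file are assumptions for illustration, and the loop uses the := operator, which requires Python 3.8 or later.

```python
import os
import tempfile

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # the 8 bytes that begin every PNG file

def looks_like_png(path):
    """Return True if the file starts with the 8-byte PNG signature."""
    with open(path, "rb") as f:
        return f.read(8) == PNG_SIGNATURE

def count_bytes_in_chunks(path, chunk_size=4096):
    """Read a file chunk by chunk and return its total size in bytes."""
    total = 0
    with open(path, "rb") as f:
        # f.read() returns b"" at end of file, which ends the loop.
        while chunk := f.read(chunk_size):
            total += len(chunk)
    return total

# Hypothetical scratch file: the PNG signature followed by 10000 zero bytes.
path = os.path.join(tempfile.mkdtemp(), "sample.bin")
with open(path, "wb") as f:
    f.write(PNG_SIGNATURE + bytes(10000))

print(looks_like_png(path))         # True
print(count_bytes_in_chunks(path))  # 10008
```

A signature check like this only inspects the first bytes; it does not prove that the rest of the file is a valid image.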


Writing to a Binary File ('wb')

When writing, you must provide a bytes object. If you have a string, you must first encode it into bytes.

# Data to write: the 8-byte PNG signature.
# The resulting file is NOT a complete PNG image (it has no chunks);
# it only demonstrates writing raw bytes.
png_header_bytes = b'\x89PNG\r\n\x1a\n'
try:
    # 1. Open the file in write-binary mode ('wb')
    # If the file 'output.png' already exists, it will be overwritten.
    with open('output.png', 'wb') as f:
        # 2. Write the bytes object to the file
        f.write(png_header_bytes)
    print("Successfully wrote to 'output.png'")
    # Let's verify it by reading it back
    with open('output.png', 'rb') as f:
        read_data = f.read()
        print(f"Data read back: {read_data}")
        print(f"Does it match? {read_data == png_header_bytes}")
except IOError as e:
    print(f"An I/O error occurred: {e}")

Output:

Successfully wrote to 'output.png'
Data read back: b'\x89PNG\r\n\x1a\n'
Does it match? True
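
Since write() in binary mode needs a bytes-like object, it can help to see a few common ways of producing one. This is a small sketch rather than an exhaustive list; the file name values.bin and the sample values are made up for the example.

```python
import os
import tempfile

# Hypothetical scratch file used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "values.bin")

from_literal = b"\x01\x02"                         # bytes literal
from_ints = bytes([3, 4, 255])                     # from integers in range 0..255
from_number = (1025).to_bytes(2, byteorder="big")  # 1025 == 0x0401 -> b'\x04\x01'
mutable = bytearray(b"ab")                         # mutable bytes-like object
mutable[0] = ord("z")                              # now bytearray(b'zb')

written = 0
with open(path, "wb") as f:
    for part in (from_literal, from_ints, from_number, mutable):
        written += f.write(part)  # write() returns the number of bytes written

with open(path, "rb") as f:
    content = f.read()

print(written)  # 9
print(content)  # b'\x01\x02\x03\x04\xff\x04\x01zb'
```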

Common Pitfalls and Key Differences

Pitfall 1: Forgetting to Encode a String

This is a common mistake when starting out with binary files.

# --- INCORRECT ---
text_to_write = "Hello, World!"
try:
    with open('mistake.txt', 'wb') as f:
        # This will raise a TypeError because 'wb' mode expects bytes
        f.write(text_to_write)
except TypeError as e:
    print(f"Error: {e}")
    # Error: a bytes-like object is required, not 'str'

The Solution: Encode the String

You must explicitly tell Python how to convert the string to bytes using an encoding like 'utf-8'.

# --- CORRECT ---
text_to_write = "Hello, World!"
# Encode the string to bytes using UTF-8 encoding
text_bytes = text_to_write.encode('utf-8')
try:
    with open('correct.txt', 'wb') as f:
        f.write(text_bytes)
    print("Successfully wrote the encoded string.")
except IOError as e:
    print(f"An I/O error occurred: {e}")
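
The reverse direction works the same way: bytes read in binary mode must be decoded with the same encoding to get the original string back. The sketch below assumes UTF-8 and a hypothetical scratch file named roundtrip.txt.

```python
import os
import tempfile

# Hypothetical scratch file used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "roundtrip.txt")

text = "Grüße, 世界!"
encoded = text.encode("utf-8")  # str -> bytes

with open(path, "wb") as f:
    f.write(encoded)

with open(path, "rb") as f:
    raw = f.read()              # still bytes at this point

decoded = raw.decode("utf-8")   # bytes -> str

print(len(text), len(encoded))  # 10 16
print(decoded == text)          # True
```

The character count and the byte count differ because non-ASCII characters take more than one byte in UTF-8.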

Pitfall 2: Mixing Modes

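You cannot read from a file opened in 'wb' mode, and you cannot write to a file opened in 'rb' mode; either attempt raises io.UnsupportedOperation. If you need both operations on the same file object, use 'rb+' (the file must already exist) or 'wb+' (the file is created or truncated).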
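
A short sketch of this pitfall follows. It records the exceptions raised by the mismatched calls and then shows 'rb+' allowing both operations; the file name mixed.bin is hypothetical.

```python
import io
import os
import tempfile

# Hypothetical scratch file used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "mixed.bin")

failed_operations = []

with open(path, "wb") as f:
    f.write(b"abc")
    try:
        f.read()                  # reading from a write-only file object
    except io.UnsupportedOperation:
        failed_operations.append("read in 'wb' mode")

with open(path, "rb") as f:
    try:
        f.write(b"x")             # writing to a read-only file object
    except io.UnsupportedOperation:
        failed_operations.append("write in 'rb' mode")

# 'rb+' permits both reading and writing on an existing file.
with open(path, "rb+") as f:
    first = f.read(1)             # b'a'
    f.seek(0)                     # go back to the start
    f.write(b"A")                 # overwrite the first byte
    f.seek(0)
    updated = f.read()

print(failed_operations)  # ["read in 'wb' mode", "write in 'rb' mode"]
print(first, updated)     # b'a' b'Abc'
```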


When to Use Binary Mode

Use binary mode ('rb', 'wb', etc.) when dealing with:

  • Image files (.png, .jpg, .gif); note that .svg is XML text rather than a binary format
  • Audio files (.mp3, .wav, .flac)
  • Video files (.mp4, .avi, .mov)
  • Executable files (.exe, .dll, .bin)
  • Compressed archives (.zip, .tar, .gz)
  • Network data packets
  • Custom file formats where the data is structured in bytes.
  • Any file that contains non-textual data or has a specific binary structure.
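
For the custom-format case in the list above, the standard library's struct module is one way to convert between Python values and fixed-layout bytes. The record layout below (a 4-byte magic value, a 16-bit version, and a 32-bit count, all little-endian) is entirely hypothetical and chosen only to illustrate the idea.

```python
import os
import struct
import tempfile

# Hypothetical header layout: 4-byte magic, uint16 version, uint32 count.
# "<" selects little-endian byte order with no padding.
HEADER_FORMAT = "<4sHI"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 4 + 2 + 4 = 10 bytes

# Hypothetical scratch file used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "records.bin")

# Write the header followed by three 32-bit floats.
with open(path, "wb") as f:
    f.write(struct.pack(HEADER_FORMAT, b"DEMO", 1, 3))
    f.write(struct.pack("<3f", 1.0, 2.5, -4.0))

# Read the header first, then use its count field to read the payload.
with open(path, "rb") as f:
    magic, version, count = struct.unpack(HEADER_FORMAT, f.read(HEADER_SIZE))
    values = struct.unpack(f"<{count}f", f.read(4 * count))

print(magic, version, count)  # b'DEMO' 1 3
print(values)                 # (1.0, 2.5, -4.0)
```

Opening such a file in text mode would risk decoding errors or newline translation corrupting the data, which is why binary mode is needed here.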

Summary Table

| Mode  | Description           | Data Type Handled | Example Use Case                         |
|-------|-----------------------|-------------------|------------------------------------------|
| 'r'   | Read Text (default)   | str               | Reading a .txt or .csv file.             |
| 'w'   | Write Text            | str               | Creating a new log file.                 |
| 'a'   | Append Text           | str               | Adding a new entry to a log file.        |
| 'rb'  | Read Binary           | bytes             | Reading an image, video, or executable.  |
| 'wb'  | Write Binary          | bytes             | Creating a new image or saving a model.  |
| 'ab'  | Append Binary         | bytes             | Appending data to a binary log file.     |
| 'r+'  | Read & Write Text     | str               | Modifying a text file in place.          |
| 'rb+' | Read & Write Binary   | bytes             | Modifying a binary file in place.        |