杰瑞科技汇

How Do Python's struct and socket Modules Handle Binary Data Efficiently?

Python's struct and socket modules are a fundamental pairing for network programming. Together they solve a core problem: how to send data over a network in a predictable, byte-oriented way.


Let's break down what each module does and then see how they work together.


The Problem: Why Do We Need Both?

Networks, including the internet, carry streams of bytes. They know nothing about Python data types such as integers, strings, or lists. If you pass a Python integer directly to a socket's send(), it raises a TypeError, because sockets accept only bytes-like objects.

  • Problem 1: How do you send the integer 12345? Is it 5 bytes? 2 bytes? The receiver won't know.
  • Problem 2: How do you send a string? What encoding is it (UTF-8, ASCII)? How long is it? The receiver needs to know how many bytes to read.
  • Problem 3: How do you send multiple pieces of data (e.g., an integer and a string together)? The receiver needs to know where one piece ends and the next begins.
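
To make the ambiguity concrete, here is a minimal sketch (the variable names are illustrative and not from the original article). It shows that the same integer can be laid out on the wire in several ways, and that a socket rejects anything that is not bytes-like. socket.socketpair() is used only so the example runs without a real network connection.

```python
import socket
import struct

value = 12345

# Three plausible wire representations of the same number.
as_text = str(value).encode("ascii")   # 5 bytes: the ASCII digits
as_u16 = struct.pack(">H", value)      # 2 bytes: big-endian unsigned short
as_u32 = struct.pack(">I", value)      # 4 bytes: big-endian unsigned int

print(len(as_text), len(as_u16), len(as_u32))

# A socket only accepts bytes-like objects; a bare int is rejected.
a, b = socket.socketpair()
try:
    a.send(value)
    send_error = None
except TypeError as exc:
    send_error = exc
finally:
    a.close()
    b.close()

print(type(send_error).__name__)
```

Unless both sides agree on one of these layouts in advance, the receiver cannot tell how many bytes belong to the number.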

This is where struct and socket come in.

  • struct is the packager. It converts Python data types into a well-defined, binary format (a bytes object).
  • socket is the shipper. It provides the tools (send(), recv()) to send and receive raw bytes over the network.

Analogy: Think of shipping a complex gift.

  • struct: You carefully pack your items (the data) into a standard-sized box (the bytes object) using bubble wrap and foam (the format specifiers). You write on the box "2 Books, 1 Mug" (the format string).
  • socket: You hand the packed box to the postal service (the network) to deliver it. The postal service only cares about the box, not what's inside.
  • The receiver uses struct again to unpack the box, knowing exactly how to interpret the contents based on the label on the outside.

The struct Module: Packing and Unpacking

The struct module's main functions are pack() and unpack().

Format Specifiers

These are characters in the format string that define how each value is converted. The most common ones are:

Character  Type                Size (bytes)                     Example
c          char                1                                b'x'
b          signed char         1                                -1
B          unsigned char       1                                255
h          short               2                                10
H          unsigned short      2                                10
i          int                 4                                100
I          unsigned int        4                                100
f          float               4                                3.14
d          double              8                                2.718
s          bytes               variable (set by a count: 5s)    b'hello'
p          pascal string       variable (set by a count)
q          long long           8                                123456789
Q          unsigned long long  8                                123456789

Sizes shown are the standard sizes, which apply when the format string starts with <, >, ! or =.
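
The sizes in the table can be checked with struct.calcsize(). The sketch below is illustrative only; it uses the "<" prefix so that standard sizes apply, and it also shows how the count in front of "s" behaves.

```python
import struct

# Standard sizes apply when the format starts with '<', '>', '!' or '='.
sizes = {code: struct.calcsize("<" + code) for code in "cbBhHiIfdqQ"}
for code, size in sizes.items():
    print(code, size)

# For 's' the count means "this many bytes", not "this many items".
packed = struct.pack("<5s", b"hello")
# A shorter value is padded with null bytes; a longer one is truncated.
padded = struct.pack("<8s", b"hello")
truncated = struct.pack("<3s", b"hello")
print(packed, padded, truncated)
```

Because "s" silently pads or truncates, the sender and receiver need to agree on the count, or the length has to travel with the data.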

You can also use prefixes:

  • <: little-endian (most common on x86/x64 processors)
  • >: big-endian
  • !: network byte order (same as big-endian)
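
As a small illustration of the prefixes (a sketch, with variable names chosen here for clarity), the same value packs to different byte sequences depending on the byte order:

```python
import struct
import sys

little = struct.pack("<I", 1)    # least significant byte first
big = struct.pack(">I", 1)       # most significant byte first
network = struct.pack("!I", 1)   # network order, identical to big-endian
native = struct.pack("=I", 1)    # the machine's own byte order, standard size

print(little.hex(), big.hex(), network.hex())
print(sys.byteorder, native.hex())
```

For data that crosses the network, an explicit prefix such as ">" or "!" is the safer choice, since it does not depend on the CPU of either machine.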

struct.pack(format, v1, v2, ...)

Converts values into a bytes object according to the format string.

import struct
# Pack an integer and a float into a bytes object
# Format: '>if' -> Big-endian signed int, then big-endian float
data = struct.pack('>if', 42, 3.14159)
print(f"Packed data: {data}")
print(f"Length of packed data: {len(data)}") # Should be 4 + 4 = 8 bytes
print(f"Data in hex: {data.hex()}") # Easier to see the bytes

struct.unpack(format, data)

Unpacks a bytes object into a tuple of values.

import struct
# The data we received from the network: the bytes produced by struct.pack('>if', 42, 3.14159)
received_data = b'\x00\x00\x00\x2a\x40\x49\x0f\xd0'
# Unpack it using the same format
unpacked_tuple = struct.unpack('>if', received_data)
print(f"Unpacked tuple: {unpacked_tuple}") # (42, ~3.14159): 'f' is single precision, so the float is only approximate
original_int, original_float = unpacked_tuple
print(f"Original integer: {original_int}")
print(f"Original float: {original_float}")
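
When the same format is used repeatedly, a precompiled struct.Struct object avoids re-parsing the format string on every call. The following is a minimal sketch; RECORD is a hypothetical name introduced here, not something from the original article. It also shows unpack_from() for reading at an offset and the struct.error raised when the buffer has the wrong size.

```python
import struct

# Hypothetical record layout: big-endian int followed by a float.
RECORD = struct.Struct(">if")

# Two records packed back to back into one buffer.
buf = RECORD.pack(42, 3.14159) + RECORD.pack(7, 2.5)

# Read each record at its offset without slicing the buffer.
first = RECORD.unpack_from(buf, 0)
second = RECORD.unpack_from(buf, RECORD.size)

# Or walk over every record in the buffer.
records = list(RECORD.iter_unpack(buf))

# unpack() insists on an exact size and raises struct.error otherwise.
try:
    RECORD.unpack(buf[:5])
    short_error = None
except struct.error as exc:
    short_error = exc

print(first, second, len(records), type(short_error).__name__)
```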

The socket Module: The Network Connection

The socket module is Python's interface to the BSD socket interface. It's how you create connections, send data, and receive data.

Key Concepts:

  • Socket: An endpoint for sending or receiving data across a computer network.
  • IP Address: The address of a machine on the network (e.g., 127.0.0.1 for localhost).
  • Port: A number that identifies a specific process or service on a machine (e.g., 80 for HTTP, 443 for HTTPS). Ports range from 0 to 65535.
  • Connection: A communication link between two sockets (one on the client, one on the server).

Basic Flow:

  1. Server:

    • socket.socket(): Create a socket.
    • bind(): Attach the socket to an IP address and port.
    • listen(): Start listening for incoming connections.
    • accept(): Block and wait for a client to connect. Returns a new socket object for communication with that client.
    • recv(): Receive data from the client.
    • sendall(): Send data back to the client.
    • close(): Close the connection.
  2. Client:

    • socket.socket(): Create a socket.
    • connect(): Connect to the server's IP address and port.
    • sendall(): Send data to the server.
    • recv(): Receive data from the server.
    • close(): Close the connection.
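
The two flows above can be sketched in a single script by running the server side in a background thread. This is an illustrative sketch, not the article's own example: serve_once is a hypothetical helper, and port 0 is used so the operating system picks any free port.

```python
import socket
import threading

def serve_once(server_sock):
    """Accept one client, echo everything it sends, then close."""
    conn, _addr = server_sock.accept()
    with conn:
        while True:
            chunk = conn.recv(1024)
            if not chunk:          # empty bytes means the peer closed its side
                break
            conn.sendall(chunk)

# Server side: socket() -> bind() -> listen(); accept() runs in the thread.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
server.listen()
host, port = server.getsockname()

worker = threading.Thread(target=serve_once, args=(server,))
worker.start()

# Client side: socket() -> connect() -> sendall() -> recv() -> close().
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
    client.connect((host, port))
    client.sendall(b"ping")
    client.shutdown(socket.SHUT_WR)   # signal that we are done sending
    reply = b""
    while True:
        chunk = client.recv(1024)
        if not chunk:
            break
        reply += chunk

worker.join()
server.close()
print(reply)
```

Note that recv() is called in a loop on both sides: TCP is a byte stream, so a single recv() call may return only part of what the peer sent.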

Putting It All Together: A Complete Example

Here is a simple echo server and client. The client sends a length-prefixed message containing a packed integer, a float, and a string; the server unpacks them, prints them, and sends them back.

The Server (server.py)

import socket
import struct

def recv_exact(conn, count):
    """Read exactly count bytes; return b'' if the client closes first."""
    buf = b''
    while len(buf) < count:
        chunk = conn.recv(count - len(buf))
        if not chunk:
            return b''
        buf += chunk
    return buf

# Use a context manager to ensure the socket is closed automatically
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    # AF_INET = IPv4, SOCK_STREAM = TCP
    s.bind(('127.0.0.1', 65432))  # Bind to localhost on port 65432
    s.listen()
    print("Server listening on 127.0.0.1:65432")
    conn, addr = s.accept()
    with conn:
        print(f"Connected by {addr}")
        while True:
            # 1. Receive the size of the packed data first,
            #    sent as a 4-byte big-endian unsigned integer ('>I').
            #    recv() may return fewer bytes than requested, so read in a loop.
            size_data = recv_exact(conn, 4)
            if not size_data:
                break  # Connection closed by client
            # 2. Unpack the size
            size = struct.unpack('>I', size_data)[0]
            print(f"Expecting {size} bytes of data...")
            # 3. Receive the actual data
            data = recv_exact(conn, size)
            if not data:
                break
            # 4. Unpack the data.
            # Assumed payload layout: '>if' (big-endian int, then float) followed
            # by a UTF-8 string. The 's' format needs an explicit length, so the
            # string length is derived from the size prefix:
            # len(string) = size - 4 (int) - 4 (float) = size - 8
            number, value = struct.unpack('>if', data[:8])
            text = data[8:].decode('utf-8')
            print(f"Received: {number}, {value}, {text!r}")
            # 5. Echo the message back, size prefix included
            conn.sendall(size_data + data)
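
The length-prefix idea the server relies on can be sketched as a pair of helper functions. This is an illustration under stated assumptions, not the article's own client: the payload layout (big-endian int, float, then a UTF-8 string) and the names HEADER, FIXED, recv_exact, send_message and recv_message are all hypothetical. socket.socketpair() stands in for a real client and server so the example is self-contained.

```python
import socket
import struct

HEADER = struct.Struct(">I")   # assumed: 4-byte big-endian payload length
FIXED = struct.Struct(">if")   # assumed: int and float at the start of the payload

def recv_exact(sock, count):
    """Keep calling recv() until exactly count bytes have arrived."""
    buf = bytearray()
    while len(buf) < count:
        chunk = sock.recv(count - len(buf))
        if not chunk:
            raise ConnectionError("peer closed before the full message arrived")
        buf += chunk
    return bytes(buf)

def send_message(sock, number, value, text):
    """Send one message: length prefix, fixed fields, then the string bytes."""
    payload = FIXED.pack(number, value) + text.encode("utf-8")
    sock.sendall(HEADER.pack(len(payload)) + payload)

def recv_message(sock):
    """Read one length-prefixed message and split it back into its fields."""
    (size,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    payload = recv_exact(sock, size)
    number, value = FIXED.unpack_from(payload, 0)
    text = payload[FIXED.size:].decode("utf-8")
    return number, value, text

# Two connected sockets stand in for client and server.
left, right = socket.socketpair()
with left, right:
    send_message(left, 10, 2.5, "hello")
    send_message(left, 11, 0.5, "héllo")
    first = recv_message(right)
    second = recv_message(right)

print(first, second)
```

Because every message starts with its own length, two messages sent back to back can be separated cleanly on the receiving side, even though TCP itself preserves no message boundaries. The string length is taken in bytes after encoding, which matters for non-ASCII text such as the second message.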