How Do Python, Protobuf, and C++ Work Together?

Protocol Buffers (Protobuf) lets programs written in different languages exchange structured data through a shared schema. This article shows how to use Protobuf with Python and C++, covering the workflow, best practices, and a complete example.

What is Protocol Buffers (Protobuf)?

Protobuf is a mechanism developed by Google for serializing structured data. It fills the same role as XML or JSON, but uses a compact binary encoding driven by a schema.

Key Advantages:

  • Schema-based: You define your data structure in a .proto file.
  • Language Agnostic: You can generate code for many languages (Python, C++, Java, Go, etc.) from the same .proto file.
  • Performance: The generated code is optimized for both speed and size. The binary format is typically smaller and faster to parse than text-based formats like JSON.
  • Strictness: The schema enforces data types and structure. In C++ many mistakes are caught at compile time; in Python, assigning a value of the wrong type raises an error at runtime instead of silently producing bad data.
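
To make the size claim concrete, here is a minimal sketch that hand-encodes a two-field message in the Protobuf wire format using only the Python standard library, then compares it with the equivalent JSON. It assumes a message with a string field numbered 1 and an integer field numbered 2. The helper names (encode_varint, encode_int_field, encode_string_field) are made up for this illustration; real code would use the classes generated by protoc rather than encoding by hand.

```python
import json


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a Protobuf varint (7 bits per byte)."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        low_bits = value & 0x7F
        value >>= 7
        if value:
            out.append(low_bits | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low_bits)
            return bytes(out)


def encode_int_field(field_number: int, value: int) -> bytes:
    """Wire type 0 (varint): tag, then the value."""
    tag = (field_number << 3) | 0
    return encode_varint(tag) + encode_varint(value)


def encode_string_field(field_number: int, text: str) -> bytes:
    """Wire type 2 (length-delimited): tag, byte length, then UTF-8 bytes."""
    payload = text.encode("utf-8")
    tag = (field_number << 3) | 2
    return encode_varint(tag) + encode_varint(len(payload)) + payload


record = {"name": "Jane Doe", "id": 12345}

# Assumed numbering for this sketch: name = 1, id = 2.
binary = encode_string_field(1, record["name"]) + encode_int_field(2, record["id"])
text = json.dumps(record).encode("utf-8")

print(len(binary), "bytes in Protobuf wire format")
print(len(text), "bytes as JSON")
```

The binary form is smaller mainly because field names are replaced by small numeric tags and integers are stored in a variable-length binary encoding instead of decimal text.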

The Core Workflow (The 3-Step Process)

No matter which languages you use, the process is always the same:

  1. Define the Schema: Write a .proto file that describes your data structures.
  2. Generate Code: Use the Protobuf compiler (protoc) to generate Python and C++ classes from your .proto file.
  3. Use the Generated Code: In your Python and C++ applications, import the generated modules and use them to serialize/deserialize data.

Step 1: Define the Schema (person.proto)

This is the single source of truth for your data. Let's create a simple person.proto file.

// person.proto
syntax = "proto3"; // Use proto3 syntax
package tutorial; // A namespace to avoid name collisions
// The message definition for a Person.
message Person {
  string name = 1;
  int32 id = 2;      // Unique ID for this person
  string email = 3;
  // A nested message for a phone number.
  message PhoneNumber {
    string number = 1;
    enum PhoneType {
      MOBILE = 0;
      HOME = 1;
      WORK = 2;
    }
    PhoneType type = 2;
  }
  repeated PhoneNumber phones = 4; // 'repeated' means this is a list/array
}

Key Concepts:

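  • syntax = "proto3";: Specifies which version of the Protobuf language this file uses.
  • package tutorial;: Declares a namespace for the messages. In C++ the generated classes live in namespace tutorial. In Python the package does not change the module name, which is derived from the .proto file name (person_pb2).
  • message Person { ... }: Defines a struct-like object.
  • string name = 1;: name is a field of type string. The number 1 is its unique field number. It identifies the field in the binary format instead of the name, so it must not be changed once data has been serialized with it.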
  • repeated PhoneNumber phones = 4;: Defines a list of PhoneNumber objects.
  • enum PhoneType { ... }: Defines a set of named constants.
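
The role of field numbers can be illustrated with a small standard-library sketch. It walks the top-level fields of an encoded message and labels them using a number-to-name table. The helper names and both tables are hypothetical, and only the varint and length-delimited wire types are handled. The same bytes decode correctly with the original numbering and are mislabeled once the numbers are swapped.

```python
from __future__ import annotations


def read_varint(data: bytes, pos: int) -> tuple[int, int]:
    """Return (value, next_position) for the varint starting at pos."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: this was the last byte
            return result, pos
        shift += 7


def decode_fields(data: bytes, names: dict[int, str]) -> dict[str, object]:
    """Decode top-level varint and length-delimited fields into a dict."""
    decoded: dict[str, object] = {}
    pos = 0
    while pos < len(data):
        tag, pos = read_varint(data, pos)
        field_number, wire_type = tag >> 3, tag & 0x07
        if wire_type == 0:  # varint
            value, pos = read_varint(data, pos)
        elif wire_type == 2:  # length-delimited: strings, bytes, nested messages
            length, pos = read_varint(data, pos)
            value = data[pos:pos + length]
            pos += length
        else:
            raise ValueError(f"wire type {wire_type} is not handled in this sketch")
        decoded[names.get(field_number, f"unknown_{field_number}")] = value
    return decoded


# Hand-built bytes for: name = 1 ("Jane Doe"), id = 2 (12345), email = 3 ("j@example.com")
encoded = b"\x0a\x08Jane Doe\x10\xb9\x60\x1a\x0dj@example.com"

original_numbers = {1: "name", 2: "id", 3: "email"}
renumbered = {1: "email", 2: "id", 3: "name"}  # hypothetical bad schema change

print(decode_fields(encoded, original_numbers))
print(decode_fields(encoded, renumbered))
```

With the second table, the bytes that were written as the name come back labeled as the email. The encoded data carries only numbers, so the reader has no way to detect the swap.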

Step 2: Prerequisites & Code Generation

Before you can generate code, you need to install the necessary tools for both Python and C++.

Prerequisites

  1. Install the Protobuf Compiler (protoc):

    • macOS (using Homebrew): brew install protobuf
    • Ubuntu/Debian: sudo apt-get install protobuf-compiler
    • Windows (using vcpkg): vcpkg install protobuf
    • Pre-compiled Binaries: You can also download binaries directly from the GitHub Releases page.
  2. Install Python Protobuf Library:

    pip install protobuf
  3. Install C++ Protobuf Library (Headers and Libraries):

    • macOS (using Homebrew): brew install protobuf
    • Ubuntu/Debian: sudo apt-get install libprotobuf-dev
    • Windows (using vcpkg): vcpkg install protobuf

Generate the Code

Now, run the protoc compiler. It's best practice to create a separate directory for the generated code, e.g., python_pb and cpp_pb.

# Create directories for generated code
mkdir python_pb cpp_pb
# Generate Python code
# The --python_out flag specifies the output directory for Python files.
# It will create person_pb2.py.
protoc --python_out=python_pb person.proto
# Generate C++ code
# The --cpp_out flag specifies the output directory for C++ files.
# It will create person.pb.h and person.pb.cc.
protoc --cpp_out=cpp_pb person.proto

After running these commands, you will have:

  • python_pb/person_pb2.py
  • cpp_pb/person.pb.h (header file)
  • cpp_pb/person.pb.cc (source file)
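
If you prefer to drive code generation from a script, the commands above can be wrapped roughly as follows. This is a sketch, not part of Protobuf itself: the function names are invented here, and it assumes protoc is on PATH and that the .proto file sits in the current directory, so the output file names follow directly from the file stem.

```python
from __future__ import annotations

import shutil
import subprocess
from pathlib import Path


def build_protoc_commands(proto_file: str, python_out: str, cpp_out: str) -> list[list[str]]:
    """Return the two protoc invocations shown above, one per target language."""
    return [
        ["protoc", f"--python_out={python_out}", proto_file],
        ["protoc", f"--cpp_out={cpp_out}", proto_file],
    ]


def expected_outputs(proto_file: str, python_out: str, cpp_out: str) -> list[Path]:
    """List the files protoc should produce for a .proto in the current directory."""
    stem = Path(proto_file).stem
    return [
        Path(python_out) / f"{stem}_pb2.py",
        Path(cpp_out) / f"{stem}.pb.h",
        Path(cpp_out) / f"{stem}.pb.cc",
    ]


def generate(proto_file: str, python_out: str = "python_pb", cpp_out: str = "cpp_pb") -> None:
    """Run protoc for both languages and fail loudly if an output is missing."""
    if shutil.which("protoc") is None:
        raise RuntimeError("protoc was not found on PATH")
    for out_dir in (python_out, cpp_out):
        Path(out_dir).mkdir(parents=True, exist_ok=True)
    for command in build_protoc_commands(proto_file, python_out, cpp_out):
        subprocess.run(command, check=True)  # raises CalledProcessError on failure
    missing = [p for p in expected_outputs(proto_file, python_out, cpp_out) if not p.exists()]
    if missing:
        raise RuntimeError(f"protoc did not produce: {missing}")
```

Calling generate("person.proto") from the directory that contains the schema would then create both output directories and regenerate all three files in one step.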

Step 3: Use the Generated Code

Now let's see how to use these generated modules in Python and C++.

Example in Python

Create a file write_read_python.py:

import sys
import os
# Add the directory containing the generated module to the Python path
sys.path.append(os.path.join(os.path.dirname(__file__), 'python_pb'))
import person_pb2  # The generated module is named after the .proto file, not the package
def main():
    # --- 1. Create and populate a Person object ---
    person = person_pb2.Person()
    person.name = "Jane Doe"
    person.id = 12345
    person.email = "jane.doe@example.com"
    # Add a phone number
    phone = person.phones.add()
    phone.number = "555-1234"
    phone.type = person_pb2.Person.PhoneNumber.HOME  # enum is nested inside PhoneNumber
    # Add another phone number
    phone = person.phones.add()
    phone.number = "555-5678"
    phone.type = person_pb2.Person.PhoneNumber.WORK
    # --- 2. Serialize the object to a byte string ---
    serialized_data = person.SerializeToString()
    print(f"Serialized data (bytes): {serialized_data}")
    print(f"Serialized data (hex): {serialized_data.hex()}\n")
    # --- 3. Deserialize the byte string back into a Person object ---
    new_person = person_pb2.Person()
    new_person.ParseFromString(serialized_data)
    # --- 4. Verify the deserialized data ---
    print("Deserialized Person:")
    print(f"  Name: {new_person.name}")
    print(f"  ID: {new_person.id}")
    print(f"  Email: {new_person.email}")
    for phone_number in new_person.phones:
        print(f"  Phone: {phone_number.number} (Type: {phone_number.type})")
if __name__ == '__main__':
    main()

To run the Python script:

python write_read_python.py

Example in C++

The C++ version is more verbose. You need to compile the generated person.pb.cc alongside your own sources and link against the libprotobuf library.

First, create a CMakeLists.txt file to manage the build process easily:

cmake_minimum_required(VERSION 3.10)
project(ProtobufCPlusPlusExample)
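# Find the Protobuf package. CONFIG mode needs protobuf's own CMake package files
# (installed by vcpkg, for example). If they are missing on your system, try
# find_package(Protobuf REQUIRED) to use CMake's bundled FindProtobuf module instead.
find_package(protobuf CONFIG REQUIRED)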
# Add the executable, compiling the generated source alongside our own
add_executable(write_read_cpp write_read_cpp.cpp cpp_pb/person.pb.cc)
# Link the protobuf library
target_link_libraries(write_read_cpp PRIVATE protobuf::libprotobuf)
# Include the directory where our generated header is
target_include_directories(write_read_cpp PRIVATE ${CMAKE_CURRENT_SOURCE_DIR}/cpp_pb)

Now, create the C++ source file write_read_cpp.cpp:

#include <iostream>
#include <fstream>
#include <string>
// Include the generated header
#include "person.pb.h"
// For convenience, import the whole namespace
using namespace tutorial;
void SerializeToFile(const Person& person, const std::string& filename) {
    std::fstream output(filename, std::ios::out | std::ios::binary);
    if (!person.SerializeToOstream(&output)) {
        std::cerr << "Failed to write person." << std::endl;
    }
}
bool ParseFromFile(const std::string& filename, Person* person) {
    std::fstream input(filename, std::ios::in | std::ios::binary);
    if (!person->ParseFromIstream(&input)) {
        std::cerr << "Failed to parse person." << std::endl;
        return false;
    }
    return true;
}
int main() {
    // --- 1. Create and populate a Person object ---
    Person person;
    person.set_name("John Doe");
    person.set_id(67890);
    person.set_email("john.doe@example.com");
    // Add a phone number
    Person::PhoneNumber* phone = person.add_phones();
    phone->set_number("555-9876");
    phone->set_type(Person::PhoneNumber::HOME);  // enum is nested inside PhoneNumber
    // --- 2. Serialize the object to a file ---
    std::string filename = "person.dat";
    SerializeToFile(person, filename);
    std::cout << "Serialized data to " << filename << std::endl << std::endl;
    // --- 3. Deserialize the file back into a Person object ---
    Person new_person;
    if (!ParseFromFile(filename, &new_person)) {
        return 1; // Exit if parsing failed
    }
    // --- 4. Verify the deserialized data ---
    std::cout << "Deserialized Person:" << std::endl;
    std::cout << "  Name: " << new_person.name() << std::endl;
    std::cout << "  ID: " << new_person.id() << std::endl;
    std::cout << "  Email: " << new_person.email() << std::endl;
    for (const auto& phone_number : new_person.phones()) {
        std::cout << "  Phone: " << phone_number.number() 
                  << " (Type: " << phone_number.type() << ")" << std::endl;
    }
    // Remove the serialized data file if you wish (requires #include <cstdio>)
    // std::remove(filename.c_str());
    return 0;
}

To compile and run the C++ code:

# Create a build directory
mkdir build
cd build
# Configure with CMake
cmake ..
# Build the project
make
# Run the executable
./write_read_cpp

Summary and Key Takeaways

Feature | Python | C++
Generated Module | person_pb2.py (Python module) | person.pb.h (header) & person.pb.cc (source)
Include/Import | import person_pb2 | #include "person.pb.h"
Object Creation | person = person_pb2.Person() | Person person;
Field Assignment | person.name = "..." | person.set_name("...");
Field Access | person.name | person.name() (accessor)
Repeated Fields | person.phones.add() to add items | person.add_phones() to add items
Serialization | data = person.SerializeToString() | person.SerializeToString(&data)
Deserialization | new_person.ParseFromString(data) | new_person.ParseFromString(data)
Building | pip install protobuf | find_package(protobuf) in CMake, compile person.pb.cc, link libprotobuf

This workflow allows you to define your data model once and share it seamlessly between a high-level language like Python (for scripting, web backends, data analysis) and a high-performance language like C++ (for game engines, embedded systems, or performance-critical components). The data remains consistent and the communication is fast and efficient.
