How to Use FlatBuffers Efficiently in Python

This guide covers using FlatBuffers in Python: the basics, a step-by-step example, and best practices.

What is FlatBuffers?

FlatBuffers is a cross-platform serialization library with support for C++, C#, Go, Java, JavaScript, TypeScript, Lobster, Lua, PHP, Python, and Rust, among other languages. It was originally created at Google for game development and other performance-critical applications.

The key difference from other serialization formats (like JSON or Protocol Buffers) is its "zero-copy" nature.

  • Traditional Approach (e.g., JSON, Protobuf): You serialize data into a buffer, then you parse the buffer to create an in-memory object tree (e.g., a dictionary or a class instance). Accessing data requires navigating this tree, which can be slow and memory-intensive.
  • FlatBuffers Approach: You serialize data directly into a single, memory-mappable buffer. You then access the data directly from the buffer without any parsing or unpacking step. There is no memory overhead for a secondary object tree.
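
To make the contrast concrete, here is a minimal sketch using only the Python standard library. The fixed binary layout below is a hypothetical one chosen for illustration; it is not the FlatBuffers wire format, which adds offset tables on top of the same idea.

```python
import json
import struct

# Hypothetical layout for illustration only (NOT the FlatBuffers wire format):
# little-endian uint16 hp, int16 mana, three float32 coordinates.
LAYOUT = struct.Struct("<Hh3f")

record = {"hp": 80, "mana": 150, "pos": [1.0, 2.0, 3.0]}

# Traditional approach: the whole document is parsed into a dict
# before any single field can be read.
json_bytes = json.dumps(record).encode("utf-8")
parsed = json.loads(json_bytes)
hp_from_json = parsed["hp"]

# Offset-based approach: read individual fields straight out of the buffer.
binary = LAYOUT.pack(record["hp"], record["mana"], *record["pos"])
view = memoryview(binary)  # a view over the bytes, not a copy
(hp_from_buffer,) = struct.unpack_from("<H", view, 0)    # hp is at byte offset 0
(mana_from_buffer,) = struct.unpack_from("<h", view, 2)  # mana is at byte offset 2

print(hp_from_json, hp_from_buffer, mana_from_buffer)
```

The second half never builds a dictionary or touches the position floats; it decodes only the bytes of the fields it asks for.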

Advantages of FlatBuffers:

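  • Speed: Reading needs no parsing pass; each accessor looks up its field by offset in the buffer. The gain is largest in compiled languages. The Python runtime is written in pure Python, so every accessor call still runs interpreter code, and it is worth measuring before assuming a speedup over a C-accelerated JSON or Protobuf parser.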
  • Low Memory Footprint: No need to allocate memory for a secondary object tree. The buffer itself is the data.
  • Memory-Mappable: The buffer can be stored on disk or sent over the network and accessed directly from memory, which is perfect for game engines and embedded systems.
  • Schema Evolution: Like Protocol Buffers, you can evolve your data schema over time without breaking old readers.
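
The schema-evolution point can be illustrated with a toy offset table. This is a simplified sketch of the idea behind FlatBuffers' per-table offset tables, not the real format: the layout and the names write_record and read_field are assumptions made up for this example. A field that is missing from the buffer, either because the writer used an older schema or because the field was left unset, reads back as its default.

```python
import struct

def write_record(values):
    """Serialize int16 fields; None marks an absent field.

    Toy layout: uint16 slot count, one uint16 offset per slot
    (0 means absent), then the int16 values that are present.
    """
    header_size = 2 + 2 * len(values)
    offsets = []
    data = b""
    for value in values:
        if value is None:
            offsets.append(0)
        else:
            offsets.append(header_size + len(data))
            data += struct.pack("<h", value)
    return struct.pack(f"<H{len(values)}H", len(values), *offsets) + data

def read_field(buf, slot, default):
    """Read one field by slot number, falling back to the schema default."""
    (slot_count,) = struct.unpack_from("<H", buf, 0)
    if slot >= slot_count:
        return default  # buffer was written by an older schema
    (offset,) = struct.unpack_from("<H", buf, 2 + 2 * slot)
    if offset == 0:
        return default  # field was not set by the writer
    return struct.unpack_from("<h", buf, offset)[0]

old_buffer = write_record([80])           # old schema: only hp
new_buffer = write_record([80, None, 7])  # new schema: hp, mana (unset), level

print(read_field(old_buffer, 1, 150))  # mana from an old buffer
print(read_field(new_buffer, 2, 0))    # level from a new buffer
```

A reader built against the newer layout can still open old_buffer, which is the property the bullet above describes.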

Step-by-Step Python Example

Let's create a simple example where we serialize a list of Monster objects and then read them back.

Step 1: Define the Schema (.fbs file)

First, you need to define your data structure using the FlatBuffers schema language. Create a file named monster.fbs:

// monster.fbs
namespace mygame;
// An enum for the monster's color.
enum Color : byte { Red = 0, Green = 1, Blue = 2 }
// The equipped item. Enums are declared before the table that uses them.
// `None` is a reserved word in Python, so the zero value is named `Unarmed`.
enum Equipment : byte { Unarmed = 0, Weapon = 1, Shield = 2 }
// A vector of 3 floats. It is a struct (fixed-size, stored inline),
// which is why the generated Python code has a CreateVec3() helper.
struct Vec3 {
  x:float;
  y:float;
  z:float;
}
table Monster {
  name:string;
  pos:Vec3;
  mana:short = 150;
  hp:ushort = 100;
  color:Color = Blue;
  // The (deprecated) attribute stops flatc from generating accessors for this field.
  friendly:bool = false (deprecated);
  inventory:[ubyte];
  equipped_type:Equipment;
  path:[Vec3];
}
// The root type is the table stored at the top level of the buffer.
root_type Monster;

Step 2: Generate Python Code

Now, use the FlatBuffers compiler (flatc) to generate Python classes from your schema.

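  1. Install the FlatBuffers compiler and the Python runtime library:

    • From source: Follow the instructions on the FlatBuffers GitHub page.
    • Using a package manager (e.g., on Ubuntu/Debian, where the package that provides flatc is flatbuffers-compiler):
      sudo apt-get install flatbuffers-compiler
    • The generated code imports the flatbuffers runtime package, which is installed separately:
      pip install flatbuffers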
  2. Run the compiler: Navigate to the directory containing monster.fbs and run:

    flatc --python monster.fbs

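    This will create a new directory named mygame (after the schema's namespace) containing one Python module per type:

    • __init__.py
    • Monster.py
    • Vec3.py
    • Color.py
    • Equipment.py

    There are no separate builder files. Monster.py contains both the Monster reader class, which accesses fields directly from a buffer, and the module-level helper functions (MonsterStart, MonsterAddName, MonsterEnd, and so on) used to build one. Vec3.py, Color.py, and Equipment.py hold the Vec3 type and the enum constants.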
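
If code generation is part of a build script, the same command can be run from Python. The sketch below is one possible wrapper; the function names flatc_command and generate are hypothetical, and it assumes flatc is on PATH. The -o option sets the output directory for the generated files.

```python
import shutil
import subprocess

def flatc_command(schema, out_dir="."):
    """Build the argument list for generating Python code from a schema."""
    return ["flatc", "--python", "-o", str(out_dir), str(schema)]

def generate(schema, out_dir="."):
    """Run flatc, failing early with a clear message if it is not installed."""
    if shutil.which("flatc") is None:
        raise FileNotFoundError("flatc was not found on PATH")
    # check=True raises CalledProcessError if flatc reports a schema error.
    subprocess.run(flatc_command(schema, out_dir), check=True)
```

Calling generate("monster.fbs") from the schema's directory should be equivalent to the command shown above.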

Step 3: Write the Python Code to Serialize

Create a Python script named build_monster.py to build and serialize the data.

# build_monster.py
import flatbuffers
# Each generated file is a module named after its type.
from mygame import Monster, Vec3, Color, Equipment
# 1. Initialize a Builder
builder = flatbuffers.Builder(1024)  # Initial buffer size in bytes; it grows as needed
# 2. Create the variable-size children first: strings and vectors must be
#    finished before the table that refers to them is started.
name = builder.CreateString("MyMonster")
# A [ubyte] vector can be written in one call from a bytes object.
inventory = builder.CreateByteVector(bytes([0, 1, 2, 3, 4]))
# A vector of structs is written inline between Start...Vector and EndVector.
# The builder fills the buffer back to front, so add the elements in reverse order.
Monster.MonsterStartPathVector(builder, 2)
Vec3.CreateVec3(builder, 4.0, 5.0, 6.0)
Vec3.CreateVec3(builder, 1.0, 2.0, 3.0)
# flatbuffers >= 2.0 takes no argument here; older releases expect EndVector(2).
path = builder.EndVector()
# 3. Build the Monster table
Monster.MonsterStart(builder)
Monster.MonsterAddName(builder, name)
# A struct field is stored inline, so create it immediately before adding it.
pos = Vec3.CreateVec3(builder, 1.0, 2.0, 3.0)
Monster.MonsterAddPos(builder, pos)
Monster.MonsterAddColor(builder, Color.Color.Blue)
Monster.MonsterAddHp(builder, 80)
Monster.MonsterAddMana(builder, 150)
Monster.MonsterAddInventory(builder, inventory)
Monster.MonsterAddPath(builder, path)
Monster.MonsterAddEquippedType(builder, Equipment.Equipment.Weapon)
monster_offset = Monster.MonsterEnd(builder)
# 4. Finish the buffer; the root of the buffer is our Monster table.
builder.Finish(monster_offset)
# 5. Get the serialized data (a bytearray)
buf = builder.Output()
# You can now save this buffer to a file or send it over the network
with open('monsterdata.bin', 'wb') as f:
    f.write(buf)
print("Successfully built monsterdata.bin")

Step 4: Write the Python Code to Deserialize (Read)

Now, create a second script named read_monster.py to read the data from the binary file. Notice how simple and direct this is.

# read_monster.py
from mygame import Monster, Color, Equipment
def enum_name(enum_class, value):
    """Return the name of a generated enum constant (the generated classes hold plain ints)."""
    for name, member in vars(enum_class).items():
        if not name.startswith('_') and member == value:
            return name
    return str(value)
# 1. Read the buffer from the file
with open('monsterdata.bin', 'rb') as f:
    buf = bytearray(f.read())
# 2. Get the root object from the buffer
# No parsing pass happens here; a field is decoded only when its accessor is called.
monster = Monster.Monster.GetRootAsMonster(buf, 0)
# 3. Access the data using the generated methods
print(f"Name: {monster.Name().decode('utf-8')}")  # strings come back as bytes
print(f"HP: {monster.Hp()}")
print(f"Mana: {monster.Mana()}")
print(f"Color: {enum_name(Color.Color, monster.Color())}")  # map the int back to its name
# Accessing a struct field (Vec3); it is None if the field was not set
pos = monster.Pos()
if pos:
    print(f"Position: x={pos.X()}, y={pos.Y()}, z={pos.Z()}")
# Accessing a vector (list)
print("Inventory:")
for i in range(monster.InventoryLength()):
    print(f"  {monster.Inventory(i)}")
print("Path:")
for i in range(monster.PathLength()):
    path_vec = monster.Path(i)
    print(f"  [{path_vec.X()}, {path_vec.Y()}, {path_vec.Z()}]")
print(f"Equipped: {enum_name(Equipment.Equipment, monster.EquippedType())}")

Step 5: Run the Scripts

  1. Build the data:

    python build_monster.py

    This will create the monsterdata.bin file.

  2. Read the data:

    python read_monster.py

Expected Output:

Name: MyMonster
HP: 80
Mana: 150
Color: Blue
Position: x=1.0, y=2.0, z=3.0
Inventory:
  0
  1
  2
  3
  4
Path:
  [1.0, 2.0, 3.0]
  [4.0, 5.0, 6.0]
Equipped: Weapon

Key Concepts in the Python API

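  • flatbuffers.Builder: The main class used to construct a FlatBuffer. Data is added bottom-up: strings, vectors, and child tables are written first, and the table that refers to them is written last.
  • builder.CreateString(...): Encodes a string as UTF-8, writes it to the buffer, and returns its offset.
  • builder.CreateByteVector(...) / builder.CreateNumpyVector(...): Write a whole vector in one call, from a bytes object or a NumPy array. Other vectors are built with the generated Start...Vector helper, Prepend calls, and builder.EndVector().
  • MonsterStart(builder) / MonsterEnd(builder): The generated functions that start and end a table. Each table type gets its own pair, prefixed with the type name.
  • MonsterAdd... functions: Used to add fields to the table you are currently building (e.g., Monster.MonsterAddName(builder, name_offset)).
  • Monster.GetRootAsMonster(buffer, offset): The entry point for reading. It takes the raw byte buffer and returns a "reader" object that accesses the data in place.
  • Reader Methods: The generated reader classes have methods like .Name(), .Hp(), .Pos(), .InventoryLength(), and .Inventory(index). Each call decodes its value from the buffer on demand, so no object tree is built up front. In a hot loop, store the result in a local variable instead of calling the same accessor repeatedly.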
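
The "reverse order" rule for the Builder follows from how it fills its buffer: from the end toward the front. The toy class below is a hypothetical stand-in written for this guide, not part of the flatbuffers package; it has a fixed size and does no alignment. It shows why elements that are prepended one by one must be added last to first to read back in their natural order.

```python
class BackToFrontBuffer:
    """Toy buffer that is filled from its end toward its start."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.head = size  # index of the first used byte

    def prepend_byte(self, value):
        # Each new byte lands in front of everything written so far.
        self.head -= 1
        self.data[self.head] = value

    def output(self):
        # Only the used tail of the buffer is the serialized result.
        return bytes(self.data[self.head:])

items = [0, 1, 2, 3, 4]
buf = BackToFrontBuffer(8)
for item in reversed(items):  # last element first
    buf.prepend_byte(item)

print(list(buf.output()))
```

Iterating over items in forward order instead would leave the bytes reversed in the output.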

When to Use FlatBuffers in Python

FlatBuffers is not a replacement for all data interchange formats. It excels in specific scenarios:

  • High-Performance Systems: Game engines, real-time data processing, financial trading platforms.
  • Resource-Constrained Environments: Mobile apps, embedded systems, or servers where memory and CPU are critical.
  • When you need to access data without parsing: If you have a large binary blob and only need to read a few fields, FlatBuffers lets you do that instantly without deserializing the entire object.
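
The last point can be sketched with the standard library alone. The example below writes a file of fixed-size toy records (a hypothetical layout, not the FlatBuffers format), memory-maps it, and reads a single field from one record by offset. Only the pages that are actually touched need to be loaded by the operating system.

```python
import mmap
import os
import struct
import tempfile

RECORD = struct.Struct("<Hh")  # toy record: uint16 hp, int16 mana
COUNT = 50_000

def read_hp(buffer, index):
    """Read the hp of one record by offset, without decoding the others."""
    return struct.unpack_from("<H", buffer, index * RECORD.size)[0]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "records.bin")
    with open(path, "wb") as f:
        for i in range(COUNT):
            f.write(RECORD.pack(i % 1000, 150))
    with open(path, "rb") as f:
        # Map the file read-only; nothing is copied into a Python object here.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            hp = read_hp(mapped, 12_345)
            file_size = len(mapped)

print(hp, file_size)
```

A JSON file of the same data would have to be read and parsed in full before record 12,345 could be inspected.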

When you might prefer JSON or Protobuf:

  • Human Readability: JSON is the clear winner.
  • Web APIs: JSON is the standard for web APIs. Protobuf is also common, typically with gRPC, but needs more tooling.
  • Simplicity: For simple scripts or data that doesn't require extreme performance, JSON is much easier to work with.
  • Dynamic Schemas: If your data structure changes frequently and you don't want to recompile, JSON's flexibility is an advantage.