杰瑞科技汇

What is Python's BaseManager, and how do you use it?

The term "Python BaseManager" most commonly refers to the multiprocessing.managers.BaseManager class. It is a powerful, fundamental tool in Python's multiprocessing library, designed for inter-process communication (IPC).


Let's break it down in detail.


What is a BaseManager?

Imagine you have several Python processes running. By default, they have separate memory spaces. A variable created in one process is invisible and inaccessible to another. This is called process isolation.

BaseManager solves this problem. It allows you to create and manage shared objects that live in a special "manager process." Other processes can then access and manipulate these shared objects as if they were local.

Think of it like a shared data server:

  • Manager Process: Acts as the server. It holds the actual data (e.g., a list, a dictionary, a custom object).
  • Client Processes: Act as clients. They don't hold the data themselves. Instead, they get a "proxy" object. When they interact with this proxy (e.g., call my_proxy.append(10)), the request is sent over a network-like connection to the manager process, which then performs the operation on the real data.

Key Concepts

  1. Manager Process: A central process that owns and manages the shared objects. This process must be running before any client processes can connect to it.
  2. Shared Objects: The actual data structures (like list, dict, Namespace, or custom classes) that you want to share.
  3. Proxy Objects: Lightweight objects that live in the client processes. They are the "handles" or "references" to the shared objects in the manager process. All operations on a proxy are marshalled and sent to the manager.
  4. Registry: A dictionary-like structure in the manager that maps names (strings) to the shared objects. Client processes use these names to get a proxy for an object.

Why Use BaseManager? (Pros and Cons)

✅ Pros

  • Supports Complex Data Types: Unlike multiprocessing.Queue or multiprocessing.Pipe which are mostly for simple data passing, BaseManager can share almost any Python object, including custom classes, nested lists/dictionaries, and more.
  • Flexible Access Control: You can register types with custom creation functions and arguments, allowing for very flexible setup.
  • Decoupled Architecture: The client processes don't need to know how the data is stored or managed, only how to get a proxy. This promotes cleaner code.

❌ Cons

  • Performance Overhead: Because every operation involves a network-like communication (even if it's on the same machine), it is significantly slower than using shared memory tools like multiprocessing.Value or multiprocessing.Array.
  • Complexity: It's more complex to set up than simple queues or pipes. You have to start the manager, register types, and have clients connect to it.

When to use it: Use BaseManager when you need to share complex, mutable state between processes and the performance overhead is acceptable. It's ideal for scenarios like a master-worker pattern where workers need to update a shared results list or a central task queue.


How to Use BaseManager: A Step-by-Step Example

Let's create a classic example: a shared list that multiple worker processes can append to.

Step 1: Create the Manager and Register Types

First, we need a script that will start the manager. This manager will create a shared list and register it so other processes can access it.

# manager_process.py
import time
from multiprocessing.managers import BaseManager

# 1. The shared object must live at module level so that every call to
#    get_list() returns the SAME list. If get_list() returned a fresh []
#    each time, every client would receive its own independent list.
shared_list = []

def get_list():
    return shared_list

# 2. Create the manager
if __name__ == '__main__':
    manager = BaseManager(address=('', 50000), authkey=b'abracadabra')
    # 3. Register the name 'get_my_list' with the manager.
    # The first argument is the name clients will use to get the proxy.
    # callable is invoked in the manager process to produce the object.
    # exposed restricts which methods the proxy is allowed to call.
    manager.register('get_my_list', callable=get_list, exposed=['append', 'pop', '__len__'])
    # 4. Start the manager server in a background process
    print('Starting manager process...')
    manager.start()
    # The manager is now running in the background; keep this process
    # alive (sleeping instead of busy-waiting) until interrupted.
    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        print('Shutting down manager...')
        manager.shutdown()

Step 2: Create the Client Processes

Now, let's create the worker processes that will connect to this manager and use the shared list.

# worker_process.py
import multiprocessing
import time
from multiprocessing.managers import BaseManager

# The client side must register the same name BEFORE connecting (no callable
# is needed here); otherwise the manager object has no get_my_list() method.
class ListManager(BaseManager):
    pass

ListManager.register('get_my_list')

def worker(task_id, manager_address, authkey):
    """
    A worker function that connects to the manager and appends to a shared list.
    """
    # Connect to the existing manager
    manager = ListManager(address=manager_address, authkey=authkey)
    manager.connect()
    # Get a proxy for the shared list using the registered name
    shared_list = manager.get_my_list()
    print(f'Worker {task_id}: Connected. Current list length: {len(shared_list)}')
    # Simulate doing some work
    time.sleep(1)
    # Append to the shared list via the proxy
    shared_list.append(f'Result from worker {task_id}')
    print(f'Worker {task_id}: Appended item. New list length: {len(shared_list)}')

if __name__ == '__main__':
    # The port and authkey must match the manager's
    MANAGER_ADDRESS = ('127.0.0.1', 50000)
    AUTHKEY = b'abracadabra'
    # Create and start multiple worker processes
    processes = []
    for i in range(5):
        p = multiprocessing.Process(target=worker, args=(i, MANAGER_ADDRESS, AUTHKEY))
        processes.append(p)
        p.start()
    # Wait for all workers to finish
    for p in processes:
        p.join()
    print("All workers finished.")

Step 3: Run the Example

  1. First, start the manager process in one terminal:

    python manager_process.py

    You will see the output: Starting manager process... and it will wait.

  2. Next, run the client script in another terminal (or the same one after the manager is running):

    python worker_process.py

Expected Output (in the worker terminal; the exact interleaving may vary from run to run):

Worker 0: Connected. Current list length: 0
Worker 1: Connected. Current list length: 0
Worker 4: Connected. Current list length: 0
Worker 3: Connected. Current list length: 0
Worker 2: Connected. Current list length: 0
Worker 0: Appended item. New list length: 1
Worker 1: Appended item. New list length: 2
Worker 2: Appended item. New list length: 3
Worker 3: Appended item. New list length: 4
Worker 4: Appended item. New list length: 5
All workers finished.

Notice how each worker sees the list growing as other workers append to it. This demonstrates the shared state working correctly.


A Simpler multiprocessing.Manager

For many common use cases, Python provides a convenient shortcut: multiprocessing.Manager(). This is a pre-configured BaseManager that already has common types like list, dict, Namespace, and Queue registered.

Here's how you'd achieve the same result more simply:

# simple_manager_example.py
import multiprocessing
import time
def worker(task_id, shared_list_proxy):
    print(f'Worker {task_id}: Starting. Current length: {len(shared_list_proxy)}')
    time.sleep(1)
    shared_list_proxy.append(f'Result from worker {task_id}')
    print(f'Worker {task_id}: Finished. New length: {len(shared_list_proxy)}')
if __name__ == '__main__':
    # Create a Manager instance
    with multiprocessing.Manager() as manager:
        # Create a shared list using the manager
        shared_list = manager.list()
        # Create and start worker processes
        processes = []
        for i in range(5):
            p = multiprocessing.Process(target=worker, args=(i, shared_list))
            processes.append(p)
            p.start()
        # Wait for all workers to complete
        for p in processes:
            p.join()
        print("\n--- Final Shared List ---")
        print(list(shared_list)) # Convert the proxy to a regular list to see its contents
        print("-------------------------")

This is much cleaner and is the recommended approach unless you need to register highly custom types or behaviors. multiprocessing.Manager is just a BaseManager under the hood.

Summary

  • Purpose: multiprocessing.BaseManager is a low-level tool for building custom IPC systems; multiprocessing.Manager is a high-level convenience wrapper around BaseManager.
  • Complexity: BaseManager is high-effort (you must manage the manager process manually); Manager is low-effort (creation and shutdown are handled via a context manager, i.e. with).
  • Use case: BaseManager suits sharing non-standard objects or needing fine-grained control; Manager covers 95% of cases, especially sharing standard list, dict, etc.
  • Performance: identical for both, and slower than shared memory, since every operation is routed through the manager process.