
How does Python's MemoryCache manage memory efficiently?

Let's dive into memory caching in Python, which most commonly means functools.lru_cache: a powerful and easy-to-use decorator for adding a cache to your functions.


What is a Memory Cache?

A memory cache is a technique used to store the results of expensive function calls and return the cached result when the same inputs occur again. This is a form of memoization.

  • Goal: Speed up your application by avoiding redundant computations.
  • Trade-off: You use more memory (RAM) to store the results, but you save CPU time.

Think of it like a librarian who remembers the location of frequently requested books. The next time someone asks for the same book, the librarian doesn't have to look it up in the giant catalog; they just go to the shelf and grab it. The "book location" is the cached result.


The Main Tool: functools.lru_cache

Python's standard library provides the lru_cache decorator in the functools module. "LRU" stands for Least Recently Used.

  • How it works: It stores the results of function calls in a dictionary. When the function is called again, it first checks if the arguments are in the dictionary's keys.
    • If yes (a cache hit), it returns the stored result immediately.
    • If no (a cache miss), it executes the function, stores the result in the cache, and then returns it.
  • lru_cache eviction policy: If the cache becomes full, it discards the least recently used item to make space for a new one.
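
Conceptually, lru_cache is not far from a hand-rolled memoizing decorator. Here is a simplified sketch of the same hit/miss logic (naive_memoize and square are illustrative names; unlike the real thing, this version has no size limit, no eviction, and handles positional arguments only):

import functools

def naive_memoize(func):
    """A stripped-down sketch of memoization: no eviction, positional args only."""
    cache = {}
    @functools.wraps(func)
    def wrapper(*args):
        if args in cache:        # cache hit: skip the computation
            return cache[args]
        result = func(*args)     # cache miss: compute...
        cache[args] = result     # ...store...
        return result            # ...and return it
    return wrapper

@naive_memoize
def square(x):
    print(f"Computing {x} squared")
    return x * x

square(4)  # prints, then returns 16
square(4)  # silent: served from the cache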

How to Use lru_cache

It's incredibly simple to use. Just add the decorator directly above your function definition.


Basic Example

Let's create a function that simulates a slow, CPU-intensive operation, like calculating a factorial.

import time
import functools
# Without a cache
def slow_factorial(n):
    """Calculates the factorial of n, with a deliberate delay."""
    print(f"Calculating factorial for {n}...")
    time.sleep(1) # Simulate a slow calculation
    result = 1
    for i in range(1, n + 1):
        result *= i
    return result
# --- Let's test it ---
print("--- Without Cache ---")
start_time = time.time()
print(f"5! = {slow_factorial(5)}")
print(f"5! = {slow_factorial(5)}") # This will be slow again!
print(f"7! = {slow_factorial(7)}")
end_time = time.time()
print(f"Total time: {end_time - start_time:.2f} seconds")

Now, let's add lru_cache:

import time
import functools
# With a cache
@functools.lru_cache(maxsize=None) # maxsize=None means the cache can grow indefinitely
def cached_slow_factorial(n):
    """Calculates the factorial of n, but caches the results."""
    print(f"Calculating factorial for {n}...")
    time.sleep(1) # Simulate a slow calculation
    result = 1
    for i in range(1, n + 1):
        result *= i
    return result
# --- Let's test it ---
print("\n--- With Cache ---")
start_time = time.time()
print(f"5! = {cached_slow_factorial(5)}")
print(f"5! = {cached_slow_factorial(5)}") # This will be instantaneous!
print(f"7! = {cached_slow_factorial(7)}")
print(f"5! = {cached_slow_factorial(5)}") # Also instantaneous!
end_time = time.time()
print(f"Total time: {end_time - start_time:.2f} seconds")

Expected Output:

--- Without Cache ---
Calculating factorial for 5...
5! = 120
Calculating factorial for 5...
5! = 120
Calculating factorial for 7...
7! = 5040
Total time: 3.01 seconds
--- With Cache ---
Calculating factorial for 5...
5! = 120
5! = 120
Calculating factorial for 7...
7! = 5040
5! = 120
Total time: 2.01 seconds

Notice how the second call to cached_slow_factorial(5) was nearly instantaneous because the result was already in the cache.


Key Parameters of lru_cache

The decorator is flexible and offers a few important parameters:

  1. maxsize:

    • Purpose: Sets the maximum number of recent calls to cache.
    • Default: 128.
    • Value: If you set maxsize=None, the cache can grow without bound. This is great if you have memory to spare and want to cache everything (since Python 3.9, functools.cache is a shorthand for exactly this). For most applications, a finite number (e.g., 1024) is a good trade-off between memory and performance.
  2. typed:

    • Purpose: If True, arguments of different types will be cached separately.
    • Default: False.
    • Example: With typed=False, the calls my_func(1) and my_func(1.0) are considered the same and will use the same cache entry. With typed=True, they are treated as different calls and will have separate cache entries.
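
To see typed in action, here is a quick sketch (describe is an illustrative name):

import functools

@functools.lru_cache(maxsize=128, typed=True)
def describe(value):
    print(f"Computing for {value!r}")
    return f"{type(value).__name__}: {value}"

describe(1)    # miss: cached under the int 1
describe(1.0)  # another miss: typed=True keeps int 1 and float 1.0 apart
describe(1)    # hit
print(describe.cache_info())  # CacheInfo(hits=1, misses=2, maxsize=128, currsize=2)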

Advanced Features and Best Practices

Inspecting the Cache

The lru_cache decorator adds useful attributes to your function, which are great for debugging and monitoring.

  • cache_info(): Returns a named tuple with statistics about the cache, in this order:
    • hits: Number of cache hits.
    • misses: Number of cache misses.
    • maxsize: Maximum size of the cache.
    • currsize: Current number of items in the cache.

(There is no built-in hit-rate field; compute it as hits / (hits + misses) if you need it.)

@functools.lru_cache(maxsize=3)
def test_func(x):
    print(f"Calculating for {x}")
    return x * 2
test_func(1)
test_func(2)
test_func(3)
test_func(1) # Cache hit
test_func(4) # Cache miss; evicts 2, the least recently used (the hit above refreshed 1)
test_func(1) # Cache hit: 1 is still cached
test_func(2) # Cache miss: 2 was evicted by the call with 4
print("\nCache Info:")
print(test_func.cache_info())

Output:

Calculating for 1
Calculating for 2
Calculating for 3
Calculating for 4
Calculating for 2

Cache Info:
CacheInfo(hits=2, misses=5, maxsize=3, currsize=3)

Clearing the Cache

If the underlying data your function depends on changes, your cached results will become stale. You can clear the cache manually.

  • cache_clear(): Empties the cache.
@functools.lru_cache(maxsize=None)
def get_data_from_db(user_id):
    print(f"Querying database for user {user_id}...")
    # Simulate database lookup
    return {"name": "John Doe", "id": user_id}
print(get_data_from_db(101))
print(get_data_from_db(101)) # Cached
# Imagine the user's name changes in the database
print("\nClearing cache...")
get_data_from_db.cache_clear()
print(get_data_from_db(101)) # Will hit the database again

Output:

Querying database for user 101...
{'name': 'John Doe', 'id': 101}
{'name': 'John Doe', 'id': 101}
Clearing cache...
Querying database for user 101...
{'name': 'John Doe', 'id': 101}

Important: Caching with Mutable Arguments

A major rule of thumb: do not use lru_cache on functions that take mutable arguments like lists or dictionaries.

The cache uses the call arguments as dictionary keys. Lists and dictionaries are not hashable, so passing one raises a TypeError before your function body even runs.

# THIS WILL RAISE AN ERROR
@functools.lru_cache(maxsize=None)
def process_data(data_list):
    print("Processing data...")
    return sum(data_list)
try:
    process_data([1, 2, 3])
except TypeError as e:
    print(f"Error: {e}")

Solution: Convert the mutable argument to an immutable one. A tuple is a perfect choice.

# THIS WORKS
@functools.lru_cache(maxsize=None)
def process_data_immutable(data_tuple):
    print("Processing data...")
    return sum(data_tuple)
# Call with a tuple
process_data_immutable((1, 2, 3))
process_data_immutable((1, 2, 3)) # Cache hit

When to Use lru_cache

Great for:

  • Pure functions: Functions that always return the same output for the same input and have no side effects.
  • Expensive I/O: Network calls, database queries, reading files.
  • Recursive algorithms: Fibonacci, factorial, tree traversals. This prevents re-computing the same sub-problems over and over (see the sketch after this list).
  • Functions called repeatedly in a loop with the same arguments.
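
As a concrete illustration of the recursive case, the classic memoized Fibonacci turns an exponential-time recursion into a linear one, because each sub-problem is computed exactly once:

import functools

@functools.lru_cache(maxsize=None)
def fib(n):
    """Naive recursion is O(2^n); with the cache, each fib(k) runs only once."""
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

print(fib(100))  # 354224848179261915075, returned almost instantly
print(fib.cache_info())  # CacheInfo(hits=98, misses=101, maxsize=None, currsize=101)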

Avoid for:

  • Functions with side effects: Caching a function that prints to the console or modifies a global variable will only run the side effect on the first call (the cache miss).
  • Functions with mutable arguments: As discussed, this leads to errors or unexpected behavior.
  • Functions that should always return a new object: For example, a function that generates a unique ID or a random number (see the anti-example after this list).
  • Very fast functions: The overhead of checking the cache might be more than the cost of just running the function.
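
For instance, caching a function that is supposed to be random silently freezes its first result. This is an anti-example of what not to do (roll_die is an illustrative name):

import functools
import random

@functools.lru_cache(maxsize=None)
def roll_die():
    """BUG: the first roll is cached and returned forever after."""
    return random.randint(1, 6)

print(roll_die())  # e.g. 4
print(roll_die())  # the same number again; the cache hides the randomness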

Alternatives

While lru_cache is fantastic for in-memory caching of function results, sometimes you need more.

  • cachetools: A third-party library that offers more caching strategies beyond LRU, such as LFU (Least Frequently Used), TTL (Time-To-Live), and more. It's a great choice if lru_cache isn't flexible enough (see the sketch after this list).
  • Caching Databases (Redis, Memcached): For distributed applications or when you need persistence and shared caching across multiple servers, you would use an external caching service. Python libraries like redis-py or pymemcache handle this.
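
As a taste of cachetools, here is a minimal sketch of a time-based cache, where entries expire after a fixed TTL (get_rate and its return value are illustrative; requires pip install cachetools):

import time
from cachetools import TTLCache, cached

# Keep at most 100 entries; each expires 2 seconds after insertion.
@cached(cache=TTLCache(maxsize=100, ttl=2))
def get_rate(currency):
    print(f"Fetching rate for {currency}...")
    return 1.08  # stand-in for a real network or database call

get_rate("EUR")   # miss: fetches
get_rate("EUR")   # hit: served from the cache
time.sleep(2.5)
get_rate("EUR")   # the entry has expired, so this fetches again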