杰瑞科技汇

How do you implement a local cache ("localcache") in Python?

The term "localcache" in Python doesn't refer to a single, built-in module. It's a general concept: storing the results of expensive operations in memory (on the local machine) to speed up subsequent calls.


The best way to implement a local cache is by using Python's built-in functools.lru_cache decorator. It's powerful, easy to use, and part of the standard library.

Let's break down the concept and explore the best tools for the job.


The Concept: Why Cache Locally?

Imagine a function that performs a slow operation, like:

  • Reading a large file from disk.
  • Making a network request to an API.
  • Performing a complex mathematical calculation.

If you call this function multiple times with the same arguments, it's wasteful to re-run the entire operation every time. A local cache stores the results of previous calls. When the function is called again with the same arguments, the cache returns the pre-computed result instantly, bypassing the expensive operation.


This is a form of memoization.


The Easiest & Most Common Method: functools.lru_cache

lru_cache stands for Least Recently Used cache. It's a decorator that you can apply to any function. It automatically stores the results of the function calls in a dictionary. When the cache reaches a maximum size, it discards the least recently used item to make room for new ones.
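The eviction idea can be illustrated with a toy class (this is not how `lru_cache` is actually implemented internally, just a sketch of the policy): keep keys in recency order and drop the oldest one when the cache overflows.

```python
from collections import OrderedDict

# A toy LRU cache: an OrderedDict keeps keys in recency order and
# evicts the least recently used entry once maxsize is exceeded.
class TinyLRU:
    def __init__(self, maxsize=2):
        self.maxsize = maxsize
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)         # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)
        if len(self.data) > self.maxsize:
            self.data.popitem(last=False)  # evict least recently used
```

Accessing an entry refreshes it, so frequently used items survive while stale ones are discarded.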

How to Use lru_cache

You simply import it and add @lru_cache() above your function definition.

Example: Caching a Slow Network Request


Let's simulate a slow network request with time.sleep().

import time
from functools import lru_cache
# A "slow" function that simulates fetching data from an API
def get_user_data_from_api(user_id):
    print(f"--- Making a slow API call for user_id: {user_id} ---")
    time.sleep(2)  # Simulate 2 seconds of network delay
    # In a real scenario, you would return actual data
    return {"id": user_id, "name": f"User {user_id}"}
# --- WITHOUT CACHE ---
print("--- First call (no cache) ---")
start_time = time.time()
user1 = get_user_data_from_api(101)
print(f"Result: {user1}")
print(f"Time taken: {time.time() - start_time:.2f} seconds\n")
print("--- Second call (no cache) ---")
start_time = time.time()
user1_again = get_user_data_from_api(101) # Repeats the slow call
print(f"Result: {user1_again}")
print(f"Time taken: {time.time() - start_time:.2f} seconds\n")
# --- WITH CACHE ---
@lru_cache(maxsize=32) # maxsize is optional, but good practice
def get_user_data_cached(user_id):
    print(f"--- Making a slow API call for user_id: {user_id} (CACHED VERSION) ---")
    time.sleep(2)
    return {"id": user_id, "name": f"User {user_id}"}
print("--- First call (with cache) ---")
start_time = time.time()
cached_user1 = get_user_data_cached(101)
print(f"Result: {cached_user1}")
print(f"Time taken: {time.time() - start_time:.2f} seconds\n")
print("--- Second call (with cache) ---")
start_time = time.time()
cached_user1_again = get_user_data_cached(101) # Returns from cache instantly!
print(f"Result: {cached_user1_again}")
print(f"Time taken: {time.time() - start_time:.2f} seconds\n")
# A new call will also be slow
print("--- First call for a new user (with cache) ---")
start_time = time.time()
cached_user2 = get_user_data_cached(102)
print(f"Result: {cached_user2}")
print(f"Time taken: {time.time() - start_time:.2f} seconds\n")

Output:

--- First call (no cache) ---
--- Making a slow API call for user_id: 101 ---
Result: {'id': 101, 'name': 'User 101'}
Time taken: 2.00 seconds
--- Second call (no cache) ---
--- Making a slow API call for user_id: 101 ---
Result: {'id': 101, 'name': 'User 101'}
Time taken: 2.00 seconds
--- First call (with cache) ---
--- Making a slow API call for user_id: 101 (CACHED VERSION) ---
Result: {'id': 101, 'name': 'User 101'}
Time taken: 2.00 seconds
--- Second call (with cache) ---
Result: {'id': 101, 'name': 'User 101'}
Time taken: 0.00 seconds
--- First call for a new user (with cache) ---
--- Making a slow API call for user_id: 102 (CACHED VERSION) ---
Result: {'id': 102, 'name': 'User 102'}
Time taken: 2.00 seconds

Notice how the second call to get_user_data_cached(101) was instantaneous.

Key Features of lru_cache

  • maxsize: The maximum number of results to keep (default 128). If you set maxsize=None, the cache can grow without bound.
  • Thread-Safety: The cache bookkeeping is thread-safe, so multiple threads can safely call the cached function. Note that the wrapped function itself may still run more than once concurrently for the same arguments before the first result is stored.
  • Cache Information: You can inspect the cache's state.
    print(get_user_data_cached.cache_info())
    # Output: CacheInfo(hits=1, misses=2, maxsize=32, currsize=2)
    # hits: number of successful cache lookups
    # misses: number of calls that had to be computed
  • Cache Clearing: You can clear the cache manually.
    get_user_data_cached.cache_clear()
    print("Cache cleared.")

When lru_cache Isn't Enough: Custom Cache Classes

Sometimes you need more control. For example:

  • You want a single cache shared across several functions or modules, rather than one tied to a single decorated function.
  • You need a more complex caching strategy (e.g., Time-To-Live or TTL).
  • You need to cache calls whose arguments are not hashable (like lists or dictionaries), which lru_cache cannot accept.

In these cases, you can create your own cache class. A dictionary is the natural choice for the underlying storage.

Example: A Simple TTL Cache

This cache will store values and automatically expire them after a certain number of seconds.

import time
class TTLCache:
    def __init__(self, ttl_seconds=60):
        self.ttl = ttl_seconds
        self.cache = {} # Stores {key: (value, timestamp)}
    def get(self, key):
        if key in self.cache:
            value, timestamp = self.cache[key]
            # Check if the item has expired
            if time.time() - timestamp < self.ttl:
                return value
            else:
                # Item is expired, remove it
                del self.cache[key]
        return None # Key not found or expired
    def set(self, key, value):
        self.cache[key] = (value, time.time())
    def clear(self):
        self.cache.clear()
# --- Usage ---
my_cache = TTLCache(ttl_seconds=3) # Cache items for 3 seconds
my_cache.set("user:1", {"name": "Alice"})
print(f"Getting user:1 -> {my_cache.get('user:1')}")
time.sleep(2)
print(f"After 2s, getting user:1 -> {my_cache.get('user:1')}") # Still valid
time.sleep(2)
print(f"After 4s total, getting user:1 -> {my_cache.get('user:1')}") # Now expired
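To get the convenience of a decorator with this strategy, you can wrap the same dict-of-(value, timestamp) scheme in a hypothetical `ttl_cached` decorator (the name and design are this sketch's own, not a standard-library feature; it handles positional arguments only):

```python
import time
from functools import wraps

# A sketch of a TTL-caching decorator: results expire ttl_seconds
# after they are stored, forcing a recompute on the next call.
def ttl_cached(ttl_seconds):
    def decorator(func):
        cache = {}  # {args: (value, timestamp)}

        @wraps(func)
        def wrapper(*args):
            entry = cache.get(args)
            if entry is not None:
                value, stamp = entry
                if time.time() - stamp < ttl_seconds:
                    return value            # still fresh: serve from cache
            value = func(*args)             # missing or stale: recompute
            cache[args] = (value, time.time())
            return value
        return wrapper
    return decorator
```

Applied to a function, repeated calls within the TTL window skip the underlying computation, just like `lru_cache`, but entries age out on their own.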

When Not to Use a Local Cache

  • Stateful Data: If the underlying data changes (e.g., a database record), your cache will become stale. You need a strategy to invalidate (clear) the cache when the data changes.
  • Memory Constraints: If the function returns very large objects or is called with many unique arguments, the cache can consume a lot of RAM. Always use a maxsize with lru_cache.
  • Non-Idempotent Operations: Never cache functions that have side effects, like writing to a file or sending an email. Caching a function that should only run once (e.g., send_welcome_email(user_id)) is a bug.
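For the stale-data problem, the bluntest invalidation strategy is to clear the cache whenever the underlying data changes. A minimal sketch (the `users` dict stands in for a database; note that `cache_clear()` is coarse, as `lru_cache` offers no per-key eviction):

```python
from functools import lru_cache

users = {101: "Alice"}  # stand-in for a database table

@lru_cache(maxsize=32)
def get_user_name(user_id):
    return users[user_id]   # the "expensive" read we want to cache

def rename_user(user_id, new_name):
    users[user_id] = new_name
    get_user_name.cache_clear()  # invalidate so readers see the update
```

Without the `cache_clear()` call, `get_user_name` would keep returning the old name after the update.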

Summary: Which to Choose?

| Method | Best For | Pros | Cons |
|---|---|---|---|
| functools.lru_cache | Most use cases: caching pure functions with hashable arguments. | Built-in, no dependencies. Easy to use. Thread-safe. Built-in LRU eviction. | Limited to function-level caching. No built-in TTL (though you can simulate it). |
| Custom class (e.g., TTLCache) | Application-level caching; TTL or other custom logic; non-hashable arguments (with some work). | Maximum flexibility; can be tailored to specific needs (TTL, eviction policies). | You have to write and maintain the code. You are responsible for thread-safety if needed. |
| cachetools library | A middle ground when lru_cache is too simple but you don't want to write a full custom class. | Many built-in strategies (LRU, LFU, TTL, RR). Easy to use. | External dependency (pip install cachetools). |
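As the table notes, TTL can be simulated on top of `lru_cache` without any dependency. One common trick is to pass a hidden time-bucket argument that changes every `ttl_seconds`, forcing a cache miss once the interval rolls over (the `_bucket` parameter and function names are this sketch's own invention):

```python
import time
from functools import lru_cache

# TTL simulated via lru_cache: _bucket changes every ttl_seconds,
# so stale entries stop matching and a fresh call is made.
@lru_cache(maxsize=32)
def _fetch(user_id, _bucket):
    return {"id": user_id, "name": f"User {user_id}"}

def fetch_with_ttl(user_id, ttl_seconds=60):
    return _fetch(user_id, int(time.time() // ttl_seconds))
```

The trade-off: expired entries linger in the cache until LRU eviction removes them, so keep `maxsize` bounded.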

For 95% of local caching needs in Python, start with functools.lru_cache. It's the right tool for the job and is part of the standard library for a reason.
