杰瑞科技汇

How Does Python FileWatchdog Monitor File Changes in Real Time?

The most widely used library for file system monitoring in Python is watchdog.

This guide covers installation, the core concepts, a simple runnable example, and two more advanced ones (a live-reloader and debouncing).

What is watchdog?

watchdog is a Python library, with an accompanying shell utility, for monitoring file system events. It's cross-platform (Windows, macOS, and Linux) and relies on the native notification API of each platform where one is available (inotify on Linux, FSEvents on macOS, and ReadDirectoryChangesW on Windows).


Installation

First, you need to install the library using pip:

pip install watchdog
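
To confirm from Python itself that the install worked, a small check along these lines may help. It is only a sketch: the helper name installed_version is made up for this example, and it relies on the standard library's importlib.metadata (Python 3.8 or later).

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("watchdog")
    if found is None:
        print("watchdog is not installed; run: pip install watchdog")
    else:
        print(f"watchdog {found} is installed")
```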

Core Concepts

watchdog has a few key components you need to understand:

  1. Observer: This is the main class. It's a thread that runs in the background and looks for file system events. You typically create one Observer instance for your application.
  2. Event: An object that represents a file system event. It tells you what happened. The main events are:
    • created: A file or directory was created.
    • deleted: A file or directory was deleted.
    • modified: A file or directory was modified.
    • moved: A file or directory was moved (renamed).
  3. EventHandler: The base class, FileSystemEventHandler, that you subclass to add your own logic. You override its methods (on_created, on_deleted, on_modified, on_moved) to define what happens when each kind of event occurs.
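
To see how events reach the handler methods without touching the disk, the events can be built by hand and passed to the handler's dispatch method, which is the entry point the observer uses. This is a minimal sketch: RecordingHandler and the file names are invented for illustration, and it assumes watchdog is installed.

```python
from watchdog.events import (
    FileCreatedEvent,
    FileModifiedEvent,
    FileMovedEvent,
    FileSystemEventHandler,
)


class RecordingHandler(FileSystemEventHandler):
    """Hypothetical handler that records which callback each event reaches."""

    def __init__(self):
        super().__init__()
        self.calls = []

    def on_created(self, event):
        self.calls.append(("on_created", event.src_path))

    def on_modified(self, event):
        self.calls.append(("on_modified", event.src_path))

    def on_moved(self, event):
        # Move events carry both the old and the new path
        self.calls.append(("on_moved", event.src_path, event.dest_path))


handler = RecordingHandler()
# dispatch() routes each event to the on_* method matching its type
handler.dispatch(FileCreatedEvent("notes.txt"))
handler.dispatch(FileModifiedEvent("notes.txt"))
handler.dispatch(FileMovedEvent("notes.txt", "archive.txt"))

for call in handler.calls:
    print(call)
```

Building events by hand like this is also a convenient way to unit-test a handler without starting an Observer.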

A Simple, Practical Example

Let's create a script that watches a directory for any new .txt files and prints a message when one is created.

Step 1: Create the Python file

Create a file named watcher.py and paste the following code into it.

import time
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler
# 1. Define the event handler class
class MyEventHandler(FileSystemEventHandler):
    def on_created(self, event):
        # React only to files (not directories) whose name ends in .txt
        if not event.is_directory and event.src_path.endswith(".txt"):
            print(f"--- Detected new file: {event.src_path} ---")
# 2. Set up the observer
def start_watching(path):
    event_handler = MyEventHandler()
    observer = Observer()
    observer.schedule(event_handler, path, recursive=True)
    observer.start()
    print(f"Started watching directory: {path}")
    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        observer.stop()
        print("\nStopped watching.")
    observer.join()
if __name__ == "__main__":
    # Watch the current directory ('.')
    watch_path = '.' 
    start_watching(watch_path)

Step 2: Run the script

Open your terminal, navigate to the directory where you saved watcher.py, and run it:

python watcher.py

You will see the output: Started watching directory: .

Step 3: Test it

The script keeps running in the first terminal, so open a second terminal window in the same directory and create a new text file:

echo "Hello, watchdog!" > new_file.txt

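Immediately, you will see the output from your script in the first terminal. The path is the one the observer reports; with watch_path set to '.', it typically appears relative to the watched directory on Linux, while other platforms may report it differently: --- Detected new file: ./new_file.txt ---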
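
The path in that message comes straight from event.src_path. If the rest of a program needs absolute paths, or an extension check that ignores case, a small helper can normalize the value first. The sketch below uses only the standard library; the names normalize_event_path and is_text_file are hypothetical.

```python
import os
from pathlib import Path


def normalize_event_path(src_path):
    """Return an absolute Path for a path reported by an event.

    os.fsdecode accepts both str and bytes, so the helper does not care
    which of the two it is given.
    """
    return Path(os.fsdecode(src_path)).resolve()


def is_text_file(src_path):
    """True when the path has a .txt extension, ignoring case."""
    return normalize_event_path(src_path).suffix.lower() == ".txt"


if __name__ == "__main__":
    print(normalize_event_path("./new_file.txt"))
    print(is_text_file("./new_file.txt"))
    print(is_text_file("./image.PNG"))
```

Inside on_created, these helpers would be called with event.src_path before deciding whether to act on the file.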


Advanced Example: Building a Live-Reloader

This is a very common use case. We'll build a simple script that watches a directory and automatically re-runs a Python script whenever it's saved.

This example uses the subprocess module to execute the target script.

import sys
import time
import subprocess
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler
# The script we want to watch and run
TARGET_SCRIPT = "your_app.py"
class ScriptReloader(FileSystemEventHandler):
    def __init__(self):
        super().__init__()
        self.process = None
        # Start the script for the first time
        self.start_script()
    def start_script(self):
        # If a process is already running, terminate it
        if self.process:
            print("\nTerminating old process...")
            self.process.terminate()
            self.process.wait()
            print("Old process terminated.")
        print(f"\nStarting {TARGET_SCRIPT}...")
        # Use Popen to run the script without blocking.
        # sys.executable is the interpreter running this reloader, and the child
        # inherits this terminal's stdout/stderr so its output stays visible
        # (an unread PIPE would hide the output and could eventually block the child).
        self.process = subprocess.Popen([sys.executable, TARGET_SCRIPT])
    def on_modified(self, event):
        # Only react to changes in our target script
        if event.src_path.endswith(TARGET_SCRIPT):
            print(f"Detected change in {TARGET_SCRIPT}. Reloading...")
            self.start_script()
    def on_created(self, event):
        if event.src_path.endswith(TARGET_SCRIPT):
            print(f"Detected creation of {TARGET_SCRIPT}. Starting...")
            self.start_script()
def start_watcher(path):
    event_handler = ScriptReloader()
    observer = Observer()
    observer.schedule(event_handler, path, recursive=False) # No need to watch subdirectories
    observer.start()
    print(f"Started watching directory: {path} for changes to {TARGET_SCRIPT}")
    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        if event_handler.process:
            event_handler.process.terminate()
        observer.stop()
        print("\nStopped watching.")
    observer.join()
if __name__ == "__main__":
    # Create a dummy your_app.py only if one does not exist yet.
    # Mode "x" refuses to overwrite an existing file.
    try:
        with open(TARGET_SCRIPT, "x") as f:
            f.write("# Your application code here\n")
            f.write("import time\n")
            f.write("print('App started!')\n")
            f.write("time.sleep(5)\n") # Simulate a long-running task
            f.write("print('App finished!')\n")
    except FileExistsError:
        pass
    start_watcher('.')

How to use this advanced example:

  1. Save the code as reloader.py.
  2. Make sure you have a file named your_app.py in the same directory (the script creates one for you if it's missing).
  3. Run the reloader: python reloader.py
  4. Now, open your_app.py in an editor, make a change (e.g., change the print statement), and save it.
  5. You will see the reloader script terminate the old Python process and start a new one.
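
One caveat with the terminate-then-wait step: a child that ignores the termination request would make the reloader wait forever. A possible refinement, sketched here with the standard library only, is to wait with a timeout and fall back to kill. The function name stop_process and the 5-second default are assumptions of this example, not part of the reloader above.

```python
import subprocess
import sys


def stop_process(process, timeout=5.0):
    """Ask a child process to exit, then force it if it does not comply in time.

    Returns the child's exit code.
    """
    if process.poll() is not None:
        return process.returncode  # already exited on its own
    process.terminate()  # polite request (SIGTERM on POSIX)
    try:
        return process.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        process.kill()  # forceful stop (SIGKILL on POSIX)
        return process.wait()


if __name__ == "__main__":
    # Stand-in for your_app.py: a child that would otherwise sleep for a minute
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    exit_code = stop_process(child)
    print(f"Child stopped with exit code {exit_code}")
```

In the reloader, start_script could call such a helper in place of the bare terminate() and wait() pair.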

Handling Events More Robustly (Debouncing)

A common problem is that saving a file can generate multiple modified events in quick succession, which can cause your script to run several times unnecessarily. Collapsing such a burst into a single action, taken only once the events have stopped arriving, is called "debouncing."

Here's a simple way to handle it by adding a delay.

import threading
import time
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler
class DebouncedEventHandler(FileSystemEventHandler):
    def __init__(self, debounce_interval=2.0):
        super().__init__()
        self.debounce_interval = debounce_interval
        self.last_event_time = 0.0
        self.pending_event = None
        # on_modified runs on the observer thread, run() on the timer thread
        self._lock = threading.Lock()
    def on_modified(self, event):
        # Don't debounce directory events
        if event.is_directory:
            return
        # Remember the newest event and restart the quiet period
        with self._lock:
            self.pending_event = event
            self.last_event_time = time.monotonic()
    def _process_event(self, event):
        print(f"--- Processed event for: {event.src_path} ---")
    # Timer loop: fires once no new event has arrived for debounce_interval seconds.
    # Note that all files share one quiet period in this simple version.
    def run(self):
        while True:
            event = None
            with self._lock:
                quiet_for = time.monotonic() - self.last_event_time
                if self.pending_event and quiet_for >= self.debounce_interval:
                    event = self.pending_event
                    self.pending_event = None
            if event:
                self._process_event(event)
            time.sleep(0.1)
# --- Main execution ---
if __name__ == "__main__":
    # This example is a bit more complex as it needs a second thread for the timer
    import threading
    event_handler = DebouncedEventHandler(debounce_interval=2.0)
    observer = Observer()
    observer.schedule(event_handler, '.', recursive=False)
    # Start the debounce timer in a separate thread
    debounce_thread = threading.Thread(target=event_handler.run, daemon=True)
    observer.start()
    debounce_thread.start()
    print("Started watching with debouncing.")
    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        observer.stop()
        print("\nStopped watching.")
    observer.join()

Now, when you save a file, it might trigger multiple on_modified events, but the code will only process the last one after a 2-second pause.
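
When several files change at around the same time, it can be useful to give each path its own quiet period instead of a single shared one. The following is a sketch of that idea using only the standard library; the class PathDebouncer, its method names, and the injectable clock are all assumptions made for this example.

```python
import time


class PathDebouncer:
    """Hypothetical helper: tracks a separate quiet period for every path."""

    def __init__(self, interval=2.0, clock=time.monotonic):
        self.interval = interval
        self.clock = clock      # injectable so the logic can be tested without sleeping
        self._last_seen = {}    # path -> time of the most recent event

    def notify(self, path):
        """Record an event for path, restarting that path's quiet period."""
        self._last_seen[path] = self.clock()

    def pop_ready(self):
        """Return (and forget) the paths that have been quiet for at least interval seconds."""
        now = self.clock()
        ready = [p for p, seen in self._last_seen.items() if now - seen >= self.interval]
        for path in ready:
            del self._last_seen[path]
        return ready


if __name__ == "__main__":
    fake_now = [0.0]
    debouncer = PathDebouncer(interval=2.0, clock=lambda: fake_now[0])
    debouncer.notify("a.txt")
    fake_now[0] = 0.5
    debouncer.notify("a.txt")    # second event from the same save
    debouncer.notify("b.txt")
    fake_now[0] = 1.9
    print(debouncer.pop_ready())  # neither path has been quiet for 2 s yet
    fake_now[0] = 2.5
    print(debouncer.pop_ready())  # both paths have now been quiet for 2 s
```

A handler would call notify(event.src_path) from on_modified and poll pop_ready() from its timer loop. Since those run on different threads, the two calls would need to be guarded by a lock in real use.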
