杰瑞科技汇

Python imap fetch: How to Fetch Email Efficiently

This guide shows how to use Python's imaplib to fetch emails from an IMAP server.

What is IMAP?

IMAP (Internet Message Access Protocol) is a standard email protocol used to retrieve emails from a mail server. Unlike POP3, which downloads emails and often removes them from the server, IMAP allows you to manage emails directly on the server. You can read, flag, delete, and organize emails, and these changes will be reflected on all your devices.

Prerequisites

  1. Python 3: Ensure you have Python 3 installed.
  2. An IMAP Server: We'll use Gmail for this example because it's widely accessible. You'll need a Gmail account.
  3. Create an App Password:
    • Google has phased out logging in to IMAP from scripts with your regular account password. Use an "App Password" instead, which requires 2-Step Verification (2FA) to be enabled on the account.
      • Go to your Google Account settings: https://myaccount.google.com/
      • Navigate to Security.
      • Under "Signing in to Google," click on 2-Step Verification. (You must have this enabled to create an App Password.)
      • At the bottom, click on App passwords.
      • Generate a new password and give it a name like "Python Script". (Menu labels may differ as Google updates its settings pages.)
      • Google will give you a 16-character password. Copy this password. You will use it in your Python script instead of your regular Gmail password.
    • "Less secure app access": Older guides suggest enabling this setting at https://myaccount.google.com/lesssecureapps. Google has retired it for personal accounts, so treat it as unavailable and use an App Password.

The Python Script: A Step-by-Step Breakdown

We will use Python's built-in imaplib library along with email to parse the fetched messages.

import imaplib
import email
from email.header import decode_header
import getpass # To securely enter your password
# --- Configuration ---
# Use your IMAP server address. For Gmail, it's 'imap.gmail.com'
IMAP_SERVER = 'imap.gmail.com'
# Use your email address
EMAIL_ACCOUNT = "your_email@gmail.com"
# Use your password or App Password
PASSWORD = "your_app_password" # Or getpass.getpass("Enter your password: ")
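def decode_str(s):
    """Decode a possibly RFC 2047-encoded header value into a str."""
    if s is None:
        return ""  # The header is missing from the message
    parts = []
    # A header can hold several fragments, each with its own charset
    for value, charset in decode_header(s):
        if isinstance(value, bytes):
            try:
                value = value.decode(charset or 'utf-8')
            except (UnicodeDecodeError, LookupError):
                # Wrong or unknown charset: latin-1 can decode any byte sequence
                value = value.decode('latin-1')
        parts.append(value)
    return "".join(parts)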
def get_email_body(msg):
    """Extract the body text, preferring text/plain over text/html."""
    def decode_part(part):
        payload = part.get_payload(decode=True)
        if payload is None:
            return ""
        # Use the charset declared by the part; fall back to UTF-8
        charset = part.get_content_charset() or "utf-8"
        try:
            return payload.decode(charset, errors="replace")
        except LookupError:
            # The declared charset name is unknown to Python
            return payload.decode("utf-8", errors="replace")

    html_body = ""
    # walk() also yields the message itself when it is not multipart
    for part in msg.walk():
        if part.is_multipart():
            continue  # Containers carry no text of their own
        content_disposition = str(part.get("Content-Disposition"))
        # Skip attachments
        if "attachment" in content_disposition:
            continue
        content_type = part.get_content_type()
        if content_type == "text/plain":
            return decode_part(part)
        if content_type == "text/html" and not html_body:
            # Kept as a fallback; a library like BeautifulSoup can strip the tags
            html_body = decode_part(part)
    return html_body
def fetch_emails():
    """Connects to the IMAP server and fetches emails."""
    try:
        # 1. Establish a secure connection using SSL
        print(f"Connecting to {IMAP_SERVER}...")
        mail = imaplib.IMAP4_SSL(IMAP_SERVER)
        # 2. Login to your account
        print("Logging in...")
        mail.login(EMAIL_ACCOUNT, PASSWORD)
        print("Login successful!")
        # 3. Select the mailbox you want to check (e.g., 'INBOX')
        # It returns a tuple: (status, [message_count])
        status, messages = mail.select('INBOX')
        if status != 'OK':
            print("Could not open inbox")
            mail.logout()  # Close the session before returning early
            return
        # 4. Search for emails
        # 'UNSEEN' searches for unread emails. You can also use 'ALL', 'SEEN', 'FLAGGED', etc.
        # The search returns a tuple: (status, [byte_string_of_email_ids])
        status, data = mail.search(None, 'UNSEEN')
        if status != 'OK':
            # A non-OK status means the command failed, not that nothing matched
            print("Search failed")
            mail.logout()
            return
        # Get the list of email IDs
        email_ids = data[0].split()
        # Keep the 5 most recent unseen emails. IDs typically come back in
        # ascending order, so take the tail and reverse it to list the newest first.
        latest_email_ids = email_ids[-5:][::-1]
        print(f"Found {len(email_ids)} unseen emails. Fetching the latest {len(latest_email_ids)}...")
        # 5. Fetch the emails
        for email_id in latest_email_ids:
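            # Fetch the full raw message (headers and body) by ID.
            # Note: RFC822 marks the message as read; '(BODY.PEEK[])' returns
            # the same data without setting the \Seen flag.
            # The fetch command returns a tuple: (status, [message_data])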
            status, msg_data = mail.fetch(email_id, '(RFC822)')
            if status == 'OK':
                # Parse the raw email content
                raw_email = msg_data[0][1]
                msg = email.message_from_bytes(raw_email)
                # --- Extract Email Information ---
                subject = decode_str(msg['subject'])
                from_ = decode_str(msg['from'])
                date_ = decode_str(msg['date'])
                print("-" * 50)
                print(f"Subject: {subject}")
                print(f"From: {from_}")
                print(f"Date: {date_}")
                # Get the email body
                body = get_email_body(msg)
                if body:
                    print("\n--- Body ---")
                    # Print at most the first 500 chars; add "..." only when truncated
                    print(body[:500] + ("..." if len(body) > 500 else ""))
        # 6. Logout
        print("-" * 50)
        print("Logging out...")
        mail.logout()
        print("Done.")
    except Exception as e:
        print(f"An error occurred: {e}")
if __name__ == "__main__":
    # For security, it's better to prompt for the password
    # password = getpass.getpass(f"Enter password for {EMAIL_ACCOUNT}: ")
    # PASSWORD = password
    fetch_emails()

How to Run the Script

  1. Save the code: Save the code above into a file named fetch_emails.py.
  2. Edit Configuration:
    • Replace "your_email@gmail.com" with your actual Gmail address.
    • Replace "your_app_password" with the 16-character App Password you generated from Google.
  3. Run from the terminal:
    python fetch_emails.py

You should see output similar to this:

Connecting to imap.gmail.com...
Logging in...
Login successful!
Found 15 unseen emails. Fetching the latest 5...
--------------------------------------------------
Subject: Your Weekly Newsletter
From: newsletter@example.com
Date: Thu, 23 Oct 2025 10:00:00 -0400
--- Body ---
Welcome to this week's edition of our newsletter! Here are the top stories...
--------------------------------------------------
Subject: Project Update - Final Review
From: boss@mycompany.com
Date: Thu, 23 Oct 2025 09:30:00 -0400
--- Body ---
Team,
Please find attached the final review documents for the Q4 project...
--------------------------------------------------
Logging out...
Done.

Key Concepts Explained

imaplib.IMAP4_SSL(server)

This opens a TLS-encrypted connection to the IMAP server, on port 993 unless you pass another port. Encryption protects your login credentials and email data in transit.
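
As a minimal sketch, the connection can also be opened with an explicit TLS context and a with block, which sends LOGOUT when the block exits (supported since Python 3.5). The names make_tls_context and count_inbox are illustrative and not part of the script above.

```python
import imaplib
import ssl


def make_tls_context():
    """Return a TLS context that verifies the server certificate and hostname."""
    return ssl.create_default_context()


def count_inbox(host, user, password):
    """Log in, open INBOX read-only, and return the number of messages in it."""
    # Leaving the with block logs out, even if an exception was raised inside it.
    with imaplib.IMAP4_SSL(host, ssl_context=make_tls_context()) as mail:
        mail.login(user, password)
        status, data = mail.select("INBOX", readonly=True)
        if status != "OK":
            raise RuntimeError("could not open INBOX")
        # select() returns the message count as bytes, e.g. [b'42']
        return int(data[0])
```

Opening the mailbox with readonly=True is a cautious default for scripts that only read: fetching messages then leaves their flags unchanged.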

mail.login(email, password)

Authenticates you with the server. Never hardcode your password directly in the script if you plan to share it. Use environment variables or a secure password manager.
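
One way to keep the password out of the source file is to read it from environment variables. The sketch below assumes two variable names, IMAP_ACCOUNT and IMAP_PASSWORD, which are made up for this example.

```python
import os


def load_credentials(env=None):
    """Return (account, password) read from environment variables.

    IMAP_ACCOUNT and IMAP_PASSWORD are hypothetical names chosen for this
    example; use whatever names fit your setup.
    """
    env = os.environ if env is None else env
    account = env.get("IMAP_ACCOUNT")
    password = env.get("IMAP_PASSWORD")
    if not account or not password:
        raise RuntimeError("Set IMAP_ACCOUNT and IMAP_PASSWORD before running")
    return account, password
```

The result can be passed straight to the login call, for example mail.login(*load_credentials()).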

mail.select('mailbox')

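This chooses which mailbox (folder) to work with. 'INBOX' is always available; other folder names vary by provider (Gmail, for example, uses names such as '[Gmail]/Sent Mail'), so call mail.list() to see what the server offers.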
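
To find the exact folder names, the lines returned by mail.list() can be parsed. The helpers below are a rough sketch that assumes the common response shape (flags) "delimiter" name; parse_list_line and quote_mailbox are illustrative names, and non-ASCII folder names (which servers encode in modified UTF-7) are not decoded here.

```python
import re

# Matches lines such as: (\HasNoChildren) "/" "INBOX"
LIST_LINE = re.compile(
    rb'\((?P<flags>[^)]*)\) (?P<delim>"(?:[^"\\]|\\.)*"|NIL) (?P<name>.+)'
)


def parse_list_line(line):
    """Split one LIST response line (bytes) into (flags, mailbox_name)."""
    match = LIST_LINE.fullmatch(line)
    if match is None:
        raise ValueError(f"unrecognised LIST line: {line!r}")
    flags = match.group("flags").decode("ascii").split()
    name = match.group("name").decode("utf-8", errors="replace")
    if name.startswith('"') and name.endswith('"'):
        # Strip the surrounding quotes and undo backslash escapes
        name = re.sub(r"\\(.)", r"\1", name[1:-1])
    return flags, name


def quote_mailbox(name):
    """Wrap a mailbox name in double quotes so names with spaces stay intact."""
    escaped = name.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'
```

A typical use would be to loop over the lines from status, lines = mail.list(), pick the wanted name, and call mail.select(quote_mailbox(name)). The quoting step is there because imaplib appears to send the name as given, so a name containing a space would otherwise be split into two arguments.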

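mail.search(charset, criteria)

This command finds messages that match one or more search keys.

  • charset: The first argument. Pass None unless the criteria contain non-ASCII text.
  • criteria: Defines what to search for. Several keys can be combined, and a message must match all of them.
    • 'ALL': All emails.
    • 'UNSEEN': Unread emails.
    • 'SEEN': Read emails.
    • 'FLAGGED': Starred/flagged emails.
    • 'FROM "sender@example.com"': Emails from a specific sender.
    • 'SUBJECT "your keyword"': Emails with a specific keyword in the subject.
    • 'SINCE "23-Oct-2025"': Emails since a specific date.
  • It returns a tuple (status, data), where data[0] is a single byte string of space-separated message IDs (for example b'1 2 5'). Call .split() on it to get the individual IDs.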
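
The criteria above can be assembled programmatically. This is a small sketch with made-up helper names (imap_date, build_criteria, parse_search_result); it only builds the arguments and parses the result, so it runs without a server.

```python
import datetime

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def imap_date(day):
    """Format a date the way IMAP expects it, e.g. 23-Oct-2025.

    Month names are spelled out here instead of using strftime('%b'),
    whose output depends on the current locale.
    """
    return f"{day.day:02d}-{MONTHS[day.month - 1]}-{day.year}"


def build_criteria(unseen=False, sender=None, since=None):
    """Return a list of search arguments; the server ANDs multiple keys."""
    criteria = []
    if unseen:
        criteria.append("UNSEEN")
    if sender:
        criteria += ["FROM", f'"{sender}"']
    if since:
        criteria += ["SINCE", imap_date(since)]
    return criteria or ["ALL"]


def parse_search_result(data):
    """Turn the search payload, e.g. [b'1 2 5'], into a list of ID byte strings."""
    return data[0].split() if data and data[0] else []
```

With a connected mail object this would be used as status, data = mail.search(None, *build_criteria(unseen=True)), followed by parse_search_result(data).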

mail.fetch(email_id, '(RFC822)')

Retrieves the full content of a specific email.

  • email_id: The ID of the email you want to fetch.
  • '(RFC822)': This is the data item to fetch. RFC822 requests the entire raw message (headers and body) and marks it as read. BODY.PEEK[] returns the same data without changing flags, and BODY.PEEK[HEADER.FIELDS (FROM)] fetches only specific header fields.
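
When only a listing is needed, it is usually cheaper to request a few header fields for many messages in one FETCH command than to download each full message in a loop. The following is a sketch under that assumption; message_set and fetch_headers are illustrative names, and mail is expected to be a logged-in connection with a mailbox selected.

```python
import email


def message_set(ids):
    """Join message IDs (bytes or str) into one set such as '3,4,7'."""
    return ",".join(i.decode() if isinstance(i, bytes) else str(i) for i in ids)


def fetch_headers(mail, ids, fields=("FROM", "SUBJECT", "DATE")):
    """Fetch selected headers for all ids with a single FETCH command.

    Returns a list of email.message.Message objects, one per message.
    PEEK is used so the messages are not marked as read.
    """
    if not ids:
        return []
    item = f"(BODY.PEEK[HEADER.FIELDS ({' '.join(fields)})])"
    status, data = mail.fetch(message_set(ids), item)
    if status != "OK":
        raise RuntimeError("FETCH failed")
    # Each message arrives as an (envelope, header_bytes) tuple; the entries
    # in between are closing b')' fragments and are skipped.
    return [email.message_from_bytes(entry[1])
            for entry in data if isinstance(entry, tuple)]
```

The returned objects support the same msg['subject'] style access as full messages, so the header-decoding helper from the script can be applied to them unchanged.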

Parsing the Email with email

The raw email data is a byte string containing headers, MIME boundaries, and encoded content. The email library parses it into a structured object (email.message.Message).

  • msg['subject'], msg['from'], msg['date']: Get email headers.
  • decode_header(): Headers can contain non-ASCII text encoded per RFC 2047 (for example =?utf-8?q?...?=). This function splits a header into (value, charset) pairs, which you then decode to text.
  • msg.is_multipart(): Returns True when the message is a container of sub-parts, such as text and HTML alternatives or a body with attachments.
  • msg.walk(): Iterates through the message and all of its parts, depth first.
  • part.get_content_type(): Identifies the part (e.g., text/plain, text/html, application/pdf).
  • part.get_payload(decode=True): Returns the content of the part (the body text or file data) as bytes, after undoing the base64 or quoted-printable transfer encoding if present. Decode the bytes with the part's charset to get text.
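
As an alternative to the manual steps above, the standard library's newer API (email.policy.default) decodes headers and picks the preferred body for you. This sketch swaps in that API rather than the approach used in the script; parse_message and body_text are illustrative names, and the sample message is built locally so nothing is fetched.

```python
import email
from email import policy
from email.message import EmailMessage


def parse_message(raw_bytes):
    """Parse raw message bytes with the modern policy (returns an EmailMessage)."""
    return email.message_from_bytes(raw_bytes, policy=policy.default)


def body_text(msg):
    """Return the preferred body as str: text/plain first, then text/html."""
    part = msg.get_body(preferencelist=("plain", "html"))
    return part.get_content() if part is not None else ""


# Build a sample message locally so the example needs no server.
sample = EmailMessage()
sample["From"] = "sender@example.com"
sample["Subject"] = "Café menu"  # Non-ASCII, so it is RFC 2047-encoded on the wire
sample.set_content("Plain text body")
sample.add_alternative("<p>HTML body</p>", subtype="html")

msg = parse_message(sample.as_bytes())
print(msg["Subject"])   # The header comes back already decoded
print(body_text(msg))
```

In the fetch loop, the same parse_message call could take the raw bytes from msg_data[0][1] in place of email.message_from_bytes(raw_email).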