杰瑞科技汇

How Do You Parse Dates in Python?

Parsing dates in Python is a common task. When you know the format in advance, the datetime module from the standard library is the usual tool. For more flexible or user-friendly parsing, the third-party dateutil library is an excellent choice.

Here's a comprehensive guide covering the most common methods.


The Standard Library: datetime.strptime()

This is the most fundamental and powerful way to parse dates when you know the exact format of the string. The name strptime stands for "string parse time".

How it works: You provide a date string and a format code that tells Python how to interpret each part of the string (e.g., %Y for a 4-digit year, %m for a 2-digit month).

Common Format Codes

Code  Meaning                        Example
%Y    Year with century              2025
%y    Year without century           25
%m    Month as a zero-padded number  05
%B    Full month name                May
%b    Abbreviated month name         May
%d    Day of the month               09
%H    Hour (24-hour clock)           14
%I    Hour (12-hour clock)           02
%M    Minute                         30
%S    Second                         05
%f    Microsecond                    000123
%A    Full weekday name              Monday
%p    AM/PM designation              PM
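
As a quick illustration of how several of these codes combine, here is a minimal sketch that parses a 12-hour timestamp with microseconds. The sample string is made up for this example, and the month name assumes an English locale.

```python
from datetime import datetime

# Hypothetical sample string, chosen to exercise %B, %I, %p, and %f together
date_string = "09 May 2025 02:30:05.000123 PM"
format_code = "%d %B %Y %I:%M:%S.%f %p"

dt_object = datetime.strptime(date_string, format_code)

# %I combined with %p converts the 12-hour reading to 24-hour time
print(dt_object.hour)         # 14
print(dt_object.microsecond)  # 123
print(dt_object.isoformat())  # 2025-05-09T14:30:05.000123
```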

Example: Parsing a Standard Format

Let's parse the date string "2025-10-27".

from datetime import datetime
date_string = "2025-10-27"
format_code = "%Y-%m-%d" # Y=Year, m=Month, d=Day
# Parse the string into a datetime object
dt_object = datetime.strptime(date_string, format_code)
print(f"Original String: {date_string}")
print(f"Parsed Object: {dt_object}")
print(f"Type of object: {type(dt_object)}")
# You can now access individual components
print(f"Year: {dt_object.year}")
print(f"Month: {dt_object.month}")
print(f"Day: {dt_object.day}")

Output:

Original String: 2025-10-27
Parsed Object: 2025-10-27 00:00:00
Type of object: <class 'datetime.datetime'>
Year: 2025
Month: 10
Day: 27

Example: Parsing a Complex String

Let's parse "Monday, 27-Oct-2025 14:30:00".

from datetime import datetime
date_string = "Monday, 27-Oct-2025 14:30:00"
format_code = "%A, %d-%b-%Y %H:%M:%S"
dt_object = datetime.strptime(date_string, format_code)
print(f"Parsed Object: {dt_object}")

Output:

Parsed Object: 2025-10-27 14:30:00
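
Because strptime() raises ValueError whenever the string does not match the format exactly, one common pattern is to try a short list of candidate formats in order. The sketch below is one way to do that. The helper name parse_with_formats and the list of formats are hypothetical and would need to reflect the formats your data actually uses.

```python
from datetime import datetime

# Hypothetical list of formats the application expects, tried in order
CANDIDATE_FORMATS = ["%Y-%m-%d", "%d-%b-%Y", "%A, %d-%b-%Y %H:%M:%S"]

def parse_with_formats(date_string, formats=CANDIDATE_FORMATS):
    """Return the first successful strptime() result, or raise ValueError."""
    for format_code in formats:
        try:
            return datetime.strptime(date_string, format_code)
        except ValueError:
            continue  # this format did not match; try the next one
    raise ValueError(f"No known format matches {date_string!r}")

print(parse_with_formats("2025-10-27"))   # 2025-10-27 00:00:00
print(parse_with_formats("27-Oct-2025"))  # 2025-10-27 00:00:00
```

Keeping the list short and ordered from most to least specific keeps the behavior predictable, which is the main reason to prefer strptime() in the first place.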

The Easiest Way: dateutil.parser.parse()

Manually writing format codes for every date string can be tedious. The python-dateutil library is designed to intelligently guess the format of many common date strings.

First, you need to install it:

pip install python-dateutil

How it works: You just give it the date string, and it does its best to figure it out. This is incredibly useful for logs, user input, or data from various sources.

Example: Parsing Ambiguous Dates

By default, dateutil assumes the month comes first. When the first number cannot be a month (for example, 27), it falls back to reading the string as day-first.

from dateutil import parser
# US format (Month/Day)
date_string_us = "10/27/2025"
dt_us = parser.parse(date_string_us)
print(f"Parsed US date: {dt_us}") # Interprets as Oct 27
# European format (Day/Month)
date_string_eu = "27/10/2025"
dt_eu = parser.parse(date_string_eu)
print(f"Parsed EU date: {dt_eu}") # Interprets as Oct 27
# Handles various separators and formats
date_string_various = "27-Oct-2025"
dt_various = parser.parse(date_string_various)
print(f"Parsed various date: {dt_various}")
date_string_iso = "20251027"
dt_iso = parser.parse(date_string_iso)
print(f"Parsed ISO-like date: {dt_iso}")

Output:

Parsed US date: 2025-10-27 00:00:00
Parsed EU date: 2025-10-27 00:00:00
Parsed various date: 2025-10-27 00:00:00
Parsed ISO-like date: 2025-10-27 00:00:00

Handling Ambiguity with dayfirst

Sometimes a date like 01/02/2025 is truly ambiguous. You can give dateutil a hint.

from dateutil import parser
# Ambiguous date
ambiguous_date = "01/02/2025"
# Default (guesses US format)
print(f"Default (guesses US): {parser.parse(ambiguous_date)}")
# Force day-first (European format)
print(f"Day-first: {parser.parse(ambiguous_date, dayfirst=True)}")
# Force month-first (US format)
print(f"Month-first: {parser.parse(ambiguous_date, dayfirst=False)}")

Output:

Default (guesses US): 2025-01-02 00:00:00
Day-first: 2025-02-01 00:00:00
Month-first: 2025-01-02 00:00:00
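
For user input, it can also help to handle strings that dateutil cannot interpret at all. The following is a minimal sketch, assuming python-dateutil is installed. The helper name parse_or_none is hypothetical.

```python
from dateutil import parser

def parse_or_none(text, dayfirst=False):
    """Return a datetime, or None when dateutil cannot interpret the text."""
    try:
        return parser.parse(text, dayfirst=dayfirst)
    except (ValueError, OverflowError):
        # Recent dateutil versions raise ParserError, a ValueError subclass
        return None

print(parse_or_none("01/02/2025", dayfirst=True))  # 2025-02-01 00:00:00
print(parse_or_none("not a date"))                 # None
```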

Pandas: pd.to_datetime()

If you are working with data in a Pandas DataFrame, pd.to_datetime() is the most efficient and convenient method. It's fast and can handle entire columns of date strings at once.

First, install pandas:

pip install pandas

How it works: It's very similar to dateutil.parser in that it can infer formats, but it's optimized for DataFrames. It also has more robust error handling.

Example: Parsing a DataFrame Column

import pandas as pd
# Create a sample DataFrame
data = {
    'event_date': ['2025-10-25', '26-Oct-2025', '20251027', 'October 28, 2025'],
    'event_name': ['Launch', 'Review', 'Deploy', 'Meeting']
}
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)
print("\n")
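# Convert the 'event_date' column to datetime objects
# format='mixed' (pandas 2.0+) infers the format of each string individually;
# without it, pandas 2.x infers one format from the first value and raises on the rest
df['event_date_parsed'] = pd.to_datetime(df['event_date'], format='mixed')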
print("DataFrame with Parsed Dates:")
print(df)
print("\n")
# You can now easily access date components
df['year'] = df['event_date_parsed'].dt.year
df['month'] = df['event_date_parsed'].dt.month
df['day_of_week'] = df['event_date_parsed'].dt.day_name()
print("DataFrame with Date Components:")
print(df)

Output:

Original DataFrame:
         event_date  event_name
0        2025-10-25      Launch
1       26-Oct-2025      Review
2          20251027      Deploy
3  October 28, 2025     Meeting


DataFrame with Parsed Dates:
         event_date  event_name  event_date_parsed
0        2025-10-25      Launch         2025-10-25
1       26-Oct-2025      Review         2025-10-26
2          20251027      Deploy         2025-10-27
3  October 28, 2025     Meeting         2025-10-28


DataFrame with Date Components:
         event_date  event_name  event_date_parsed  year  month  day_of_week
0        2025-10-25      Launch         2025-10-25  2025     10     Saturday
1       26-Oct-2025      Review         2025-10-26  2025     10       Sunday
2          20251027      Deploy         2025-10-27  2025     10       Monday
3  October 28, 2025     Meeting         2025-10-28  2025     10      Tuesday
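
The error handling mentioned earlier can be sketched as follows. With errors='coerce', values that fail to parse become NaT instead of raising an exception. The sample Series is made up for illustration.

```python
import pandas as pd

# Hypothetical column with one malformed entry
raw = pd.Series(["2025-10-25", "not a date", "2025-10-27"])

# With an explicit format, values that do not match become NaT instead of raising
parsed = pd.to_datetime(raw, format="%Y-%m-%d", errors="coerce")

print(parsed.isna().tolist())           # [False, True, False]
print(parsed.dropna().dt.day.tolist())  # [25, 27]
```

Checking isna() afterwards is a simple way to count or inspect the rows that could not be parsed before dropping or repairing them.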

Summary: Which Method Should I Use?

datetime.strptime()
When to use:
- You have a known, fixed format.
- You need strict control and validation.
- You don't want external dependencies.
Pros:
- Very fast.
- Strict and predictable.
- Part of the standard library.
Cons:
- Requires manual format codes.
- Fails if the string format is even slightly different.

dateutil.parser.parse()
When to use:
- Parsing user input, logs, or messy data with unknown or varying formats.
- When you need flexibility and convenience.
Pros:
- Extremely flexible.
- Infers formats automatically.
- Handles many edge cases.
Cons:
- Slower than strptime.
- Can make mistakes with ambiguous dates (though you can guide it).
- Requires an external library (pip install).

pd.to_datetime()
When to use:
- You are working with Pandas DataFrames.
- Parsing entire columns of dates.
Pros:
- Highly optimized for performance on large datasets.
- Integrates seamlessly with Pandas operations (.dt accessor).
- Good error handling (errors='coerce').
Cons:
- Overkill for parsing a single date string.
- Requires the Pandas library.

Final Recommendation:

  • For simple, predictable, and high-performance parsing, use datetime.strptime().
  • For messy, unpredictable, or user-generated data, use dateutil.parser.parse().
  • If your data is already in a Pandas DataFrame, use pd.to_datetime().