杰瑞科技汇

Python如何处理Excel中的日期数据?

Of course! Working with dates in Excel using Python is a common task, but it can be tricky because Excel and Python handle dates differently.

Python如何处理Excel中的日期数据?-图1
(图片来源网络,侵删)

Here’s a comprehensive guide covering the most popular libraries (pandas, openpyxl, xlsxwriter) and explaining the core concepts.

The Core Problem: Date Serial Numbers

The most important thing to understand is that Excel stores dates as numbers, not as a "date" type.

  • Excel's System: It starts counting from January 1, 1900. This day is serial number 1. January 2, 1900, is 2, and so on. For dates after 1900, it's the number of days since that starting point. Times are stored as fractional parts (e.g., 0.5 for noon).
  • Python's System: Python's standard datetime object is a true date and time representation. It's not a single number.

Your job when moving between Excel and Python is often to convert this Excel serial number to a Python datetime object, and vice-versa.


Method 1: The Easiest Way with pandas

pandas is the go-to library for data analysis in Python. Its Excel reading/writing capabilities are built on top of other libraries but provide a very user-friendly interface.

Python如何处理Excel中的日期数据?-图2
(图片来源网络,侵删)

Reading Dates with pandas.read_excel()

By default, pandas will try to parse columns that look like dates into its own Timestamp objects (which are very similar to datetime objects).

Setup: First, let's create a sample Excel file named dates_sample.xlsx.

Date Value
2025-10-26 100
2025-10-27 150
2025-10-28 200

Python Code to Read:

import pandas as pd
# Read the Excel file
# pandas will automatically parse the 'Date' column into datetime objects
df = pd.read_excel('dates_sample.xlsx', engine='openpyxl')
print("--- DataFrame Head ---")
print(df.head())
print("\n--- Data Types ---")
# Check the data type of the 'Date' column
print(df.info())
# Now you can use pandas datetime functions
print("\n--- Extracting Year, Month, Day ---")
df['Year'] = df['Date'].dt.year
df['Month'] = df['Date'].dt.month
df['Day'] = df['Date'].dt.day
print(df)

Output:

Python如何处理Excel中的日期数据?-图3
(图片来源网络,侵删)
--- DataFrame Head ---
        Date  Value
0 2025-10-26    100
1 2025-10-27    150
2 2025-10-28    200
--- Data Types ---
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 2 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   Date    3 non-null      datetime64[ns]
 1   Value   3 non-null      int64
dtypes: datetime64[ns](1), int64(1)
memory usage: 176.0 bytes
None
--- Extracting Year, Month, Day ---
        Date  Value  Year  Month  Day
0 2025-10-26    100  2025     10   26
1 2025-10-27    150  2025     10   27
2 2025-10-28    200  2025     10   28

What if it doesn't work? If pandas reads the date as a regular string or number, you can force it to parse a column using the parse_dates argument:

# Force pandas to parse the 'Date' column
df = pd.read_excel('dates_sample.xlsx', parse_dates=['Date'], engine='openpyxl')

Writing Dates with pandas.DataFrame.to_excel()

When you write a DataFrame with a datetime column to Excel, pandas automatically converts it back to Excel's date format.

import pandas as pd
from datetime import datetime
# Create a new DataFrame with datetime objects
data = {
    'Transaction_Date': [datetime(2025, 1, 15), datetime(2025, 2, 20), datetime(2025, 3, 10)],
    'Amount': [50.00, 120.50, 75.25]
}
df_to_write = pd.DataFrame(data)
# Write to a new Excel file
df_to_write.to_excel('output_with_pandas.xlsx', index=False, engine='openpyxl')
print("DataFrame written to 'output_with_pandas.xlsx'")

When you open output_with_pandas.xlsx, the dates will be displayed correctly as dates, not as numbers or text.


Method 2: Using openpyxl (More Control)

openpyxl is a library for reading and writing .xlsx files. It gives you more granular control than pandas but requires you to handle date conversions manually.

Reading Dates with openpyxl

You need to check if a cell's data type is a date and then convert its value.

from openpyxl import load_workbook
from datetime import datetime, date
# Load the workbook and select the active sheet
wb = load_workbook('dates_sample.xlsx')
ws = wb.active
print("--- Reading cell by cell ---")
for row in ws.iter_rows(values_only=True):
    # row will be a tuple like (datetime.date(2025, 10, 26), 100)
    # openpyxl often converts dates to Python's native datetime.date objects
    print(row)
# To get the raw Excel serial number, you can access the cell's internal value
print("\n--- Reading raw Excel serial number ---")
cell = ws['A1']
print(f"Cell A1 value as Python object: {cell.value}") # Will be datetime.date(2025, 10, 26)
print(f"Cell A1 internal value (Excel serial): {cell.internal_value}") # Will be 45222.0

Key Point: openpyxl is smart enough to often convert the Excel serial number back into a Python datetime or date object for you, which is very convenient.

Writing Dates with openpyxl

You need to create a datetime object and assign it to the cell. openpyxl will automatically handle the conversion to the Excel serial number.

from openpyxl import Workbook
from datetime import datetime
# Create a new workbook and select the active sheet
wb = Workbook()
ws = wb.active= "Sales Data"
# Write headers
ws.append(['Sale Date', 'Product', 'Price'])
# Write data with datetime objects
ws.append([datetime(2025, 5, 15), 'Laptop', 1200])
ws.append([datetime(2025, 5, 16), 'Mouse', 25])
ws.append([datetime(2025, 5, 17), 'Keyboard', 75])
# Save the file
wb.save('output_with_openpyxl.xlsx')
print("Workbook 'output_with_openpyxl.xlsx' created.")

Method 3: Using xlsxwriter (Writing Only)

xlsxwriter is a powerful library for creating new Excel files from scratch. It has excellent support for formatting. It cannot modify existing files.

The key is to use the xlsxwriter write_datetime() method.

import xlsxwriter
from datetime import datetime
# Create a new Excel file and add a worksheet
workbook = xlsxwriter.Workbook('output_with_xlsxwriter.xlsx')
worksheet = workbook.add_worksheet()
# Add a format for the date cells (optional but recommended)
date_format = workbook.add_format({'num_format': 'yyyy-mm-dd'})
# Write headers
worksheet.write('A1', 'Date')
worksheet.write('B1', 'Event')
# Write data using write_datetime()
# It takes (row, col, datetime_object, format_object)
worksheet.write_datetime(1, 0, datetime(2025, 8, 1), date_format)
worksheet.write_datetime(2, 0, datetime(2025, 8, 2), date_format)
worksheet.write_datetime(3, 0, datetime(2025, 8, 3), date_format)
# Write some regular data
worksheet.write(1, 1, 'Project Kickoff')
worksheet.write(2, 1, 'First Milestone')
worksheet.write(3, 1, 'Final Review')
# Close the workbook to save the file
workbook.close()
print("Workbook 'output_with_xlsxwriter.xlsx' created.")

Why write_datetime()? Using worksheet.write(1, 0, datetime(...)) might sometimes work, but write_datetime() is the correct, explicit method that guarantees Excel will interpret the value as a date. The num_format argument controls how the date is displayed in Excel.


Summary and Cheat Sheet

Task pandas openpyxl xlsxwriter
Read Excel pd.read_excel() load_workbook() Not for reading
Write Excel df.to_excel() workbook.save() workbook.close()
Read Date Automatic, or parse_dates=['col'] cell.value (often a datetime object) N/A
Write Date Assign datetime to DataFrame cell Assign datetime object to cell Use worksheet.write_datetime()
Best For Data analysis, quick reads/writes Reading & modifying existing files Creating new, formatted files

Installation

You'll need to install these libraries. openpyxl is required by pandas for .xlsx files.

pip install pandas openpyxl xlsxwriter
分享:
扫描分享到社交APP
上一篇
下一篇