杰瑞科技汇

How to Rename Columns Efficiently in Python pandas

In pandas, renaming is a common task that applies to several kinds of labels: the row index of a DataFrame, its column names, or the index labels of a Series.


This guide covers the main ways to rename labels in pandas, from simple to more advanced.


Renaming Columns of a DataFrame

This is the most frequent use case. You have a DataFrame and want to change the names of its columns.

Method A: df.rename() (Recommended & Most Flexible)

This is usually the best choice because it is explicit: you state exactly which labels change, and the original DataFrame stays untouched unless you pass inplace=True.

Key Features:

  • You can rename specific columns by providing a dictionary of {old_name: new_name}.
  • You can rename every column by passing a function, or a dictionary that covers all the old names. A bare list of new names is not accepted.
  • By default, it returns a new DataFrame, leaving the original unchanged.

Example 1: Renaming Specific Columns

import pandas as pd
# Sample DataFrame
data = {'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)
# Original DataFrame:
#    A  B  C
# 0  1  4  7
# 1  2  5  8
# 2  3  6  9
# Rename columns 'A' and 'C'
df_renamed = df.rename(columns={'A': 'Alpha', 'C': 'Charlie'})
print("\nRenamed DataFrame:")
print(df_renamed)
# Renamed DataFrame:
#    Alpha  B  Charlie
# 0      1  4        7
# 1      2  5        8
# 2      3  6        9
# The original DataFrame is unchanged
print("\nOriginal DataFrame is still the same:")
print(df)
# Original DataFrame is still the same:
#    A  B  C
# 0  1  4  7
# 1  2  5  8
# 2  3  6  9
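One behavior worth knowing about the dictionary form: keys that do not match any existing column are skipped without a warning, so a typo in an old name leaves the DataFrame unchanged. The sketch below shows this and the errors='raise' option, which is available in reasonably recent pandas versions. The variable names df_typo and raised are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# A typo in the old name ('a' instead of 'A') is silently ignored by default
df_typo = df.rename(columns={'a': 'Alpha'})
print(list(df_typo.columns))  # ['A', 'B', 'C']

# errors='raise' turns the silent no-op into a KeyError
try:
    df.rename(columns={'a': 'Alpha'}, errors='raise')
    raised = False
except KeyError:
    raised = True
print(raised)  # True
```

Passing errors='raise' is a reasonable default in scripts where a missing column should stop the run instead of going unnoticed.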

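Example 2: Renaming All Columns

rename() does not accept a list of new names; passing one raises a TypeError. To rename every column, pair the existing names with the new ones in a dictionary. The list must contain one name per column, because zip() silently stops at the shorter sequence.

# A list of new column names, in the same order as the existing columns
new_columns = ['X', 'Y', 'Z']
# Pair each old name with its new name, then rename all columns
df_all_renamed = df.rename(columns=dict(zip(df.columns, new_columns)))
print(df_all_renamed)
#    X  Y  Z
# 0  1  4  7
# 1  2  5  8
# 2  3  6  9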
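As an alternative for replacing every column name at once, DataFrame.set_axis() takes a plain list and returns a new DataFrame in current pandas versions. The sketch below assumes the same three-column sample data; the names df_all and length_error are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# Replace all column labels by position; the original is left unchanged
df_all = df.set_axis(['X', 'Y', 'Z'], axis=1)
print(list(df_all.columns))  # ['X', 'Y', 'Z']
print(list(df.columns))      # ['A', 'B', 'C']

# A list of the wrong length is rejected instead of being applied partially
try:
    df.set_axis(['X', 'Y'], axis=1)
    length_error = False
except ValueError:
    length_error = True
print(length_error)  # True
```

Unlike a dictionary built with zip(), this approach fails loudly when the number of names does not match the number of columns.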

Example 3: Using inplace=True


Pass inplace=True if you want to modify the DataFrame directly instead of getting a new one back.

print("DataFrame before inplace rename:")
print(df)
# Rename 'B' to 'Beta' directly
df.rename(columns={'B': 'Beta'}, inplace=True)
print("\nDataFrame after inplace rename:")
print(df)
# DataFrame before inplace rename:
#    A  B  C
# 0  1  4  7
# 1  2  5  8
# 2  3  6  9
#
# DataFrame after inplace rename:
#    A  Beta  C
# 0  1     4  7
# 1  2     5  8
# 2  3     6  9
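A common pitfall with inplace=True is assigning the return value back to the variable. With inplace=True, rename() returns None, so that assignment would discard the DataFrame. A minimal sketch, with result as an illustrative name:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# With inplace=True the DataFrame is modified and nothing is returned
result = df.rename(columns={'B': 'Beta'}, inplace=True)
print(result)            # None
print(list(df.columns))  # ['A', 'Beta', 'C']
```

Writing df = df.rename(..., inplace=True) would therefore leave df set to None.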

Method B: Direct Assignment (Simple but Risky)

You can assign a new list of column names directly to the df.columns attribute. This is fast, but it replaces every name by position: pandas raises a ValueError if the list has the wrong length, and a list in the wrong order silently mislabels your columns.

print("Original DataFrame:")
print(df)
# Create a new list of column names
new_cols = ['Col1', 'Col2', 'Col3']
# Assign directly to the columns attribute
df.columns = new_cols
print("\nDataFrame after direct assignment:")
print(df)
# Original DataFrame:
#    A  Beta  C
# 0  1     4  7
# 1  2     5  8
# 2  3     6  9
#
# DataFrame after direct assignment:
#    Col1  Col2  Col3
# 0     1     4     7
# 1     2     5     8
# 2     3     6     9
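The two failure modes of direct assignment can be sketched as follows. The data and the name length_error are illustrative; the point is that a wrong length is caught, while a wrong order is not.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# Wrong length: pandas refuses the assignment and leaves the columns as they were
try:
    df.columns = ['Col1', 'Col2']
    length_error = False
except ValueError:
    length_error = True
print(length_error)      # True
print(list(df.columns))  # ['A', 'B', 'C']

# Wrong order: the assignment succeeds, but the labels no longer match the data
df.columns = ['C', 'B', 'A']
print(df['C'].tolist())  # [1, 2, 3], which was originally column 'A'
```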

Renaming the Index (Row Labels)

The same rename() method is used, but you target the index parameter instead of columns.

# Copy the current index into a regular 'index' column so the relabeling is easy to see
df = df.reset_index()
print("Original DataFrame with index:")
print(df)
#    index  Col1  Col2  Col3
# 0      0     1     4     7
# 1      1     2     5     8
# 2      2     3     6     9
# Rename the index labels
df_renamed_index = df.rename(index={0: 'Row_1', 1: 'Row_2', 2: 'Row_3'})
print("\nDataFrame with renamed index:")
print(df_renamed_index)
#        index  Col1  Col2  Col3
# Row_1      0     1     4     7
# Row_2      1     2     5     8
# Row_3      2     3     6     9
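Because index and columns are separate parameters of the same method, both can be renamed in one call, and either one also accepts a function. A small sketch on fresh sample data (df_both and df_func are illustrative names):

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})

# Rename row labels and column labels in a single call
df_both = df.rename(index={0: 'Row_1', 1: 'Row_2'}, columns={'A': 'Alpha'})
print(df_both)
#        Alpha  B
# Row_1      1  3
# Row_2      2  4

# A function is applied to every index label
df_func = df.rename(index=lambda i: f'Row_{i + 1}')
print(list(df_func.index))  # ['Row_1', 'Row_2']
```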

Renaming Labels in a Series

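A Series is a single column of data with an index. The logic matches renaming the index of a DataFrame: pass a dictionary (or a function) to rename(), and any label that is not in the dictionary is left unchanged.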

# Create a sample Series
s = pd.Series([10, 20, 30], index=['a', 'b', 'c'])
print("Original Series:")
print(s)
# Original Series:
# a    10
# b    20
# c    30
# dtype: int64
# Rename the Series' index labels
s_renamed = s.rename({'a': 'apple', 'b': 'ball'})
print("\nRenamed Series:")
print(s_renamed)
# Renamed Series:
# apple    10
# ball     20
# c        30
# dtype: int64
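One difference from the DataFrame case is worth a sketch: when Series.rename() receives a scalar such as a string instead of a dictionary or function, it changes the name of the Series itself and leaves the index labels alone. The names score and points below are illustrative.

```python
import pandas as pd

s = pd.Series([10, 20, 30], index=['a', 'b', 'c'], name='score')

# A dictionary relabels the index; the Series name is untouched
s_labels = s.rename({'a': 'apple'})
print(list(s_labels.index))  # ['apple', 'b', 'c']
print(s_labels.name)         # score

# A scalar changes the Series name; the index is untouched
s_named = s.rename('points')
print(list(s_named.index))   # ['a', 'b', 'c']
print(s_named.name)          # points
```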

Advanced Renaming with Functions

For more complex renaming, you can pass a function (like a lambda or a standard Python function) to rename(). This function will be applied to every column or index label.

Example: Converting Column Names to Uppercase

df = pd.DataFrame({'first name': ['John', 'Jane'], 'last name': ['Doe', 'Doe']})
print("Original DataFrame:")
print(df)
#   first name last name
# 0       John      Doe
# 1       Jane      Doe
# Pass the built-in str.upper method to convert column names to uppercase
df_upper = df.rename(columns=str.upper)
print("\nDataFrame with uppercase columns:")
print(df_upper)
#   FIRST NAME LAST NAME
# 0       John      Doe
# 1       Jane      Doe

Example: Replacing Spaces with Underscores

# Use a lambda function to replace spaces
df_snake_case = df.rename(columns=lambda x: x.replace(' ', '_'))
print("\nDataFrame with snake_case columns:")
print(df_snake_case)
#   first_name last_name
# 0       John      Doe
# 1       Jane      Doe
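When the rule has several steps, a named function can be easier to read and reuse than a lambda. The sketch below assumes a hypothetical helper called to_snake_case that trims whitespace, lowercases, and replaces spaces; it is not part of pandas.

```python
import pandas as pd


def to_snake_case(label):
    """Hypothetical helper: trim, lowercase, and replace spaces with underscores."""
    return str(label).strip().lower().replace(' ', '_')


df = pd.DataFrame({' First Name': ['John', 'Jane'], 'Last Name ': ['Doe', 'Doe']})

# rename() calls the function once per column label
df_clean = df.rename(columns=to_snake_case)
print(list(df_clean.columns))  # ['first_name', 'last_name']
```

Converting each label with str() first keeps the helper from failing on non-string column labels such as integers.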

Summary: Which Method to Use?

| Task | Recommended Method | Why? |
| --- | --- | --- |
| Rename specific columns | df.rename(columns={'old': 'new'}) | Safest and most readable. Explicitly states what is changing. Does not alter the original DataFrame by default. |
| Rename all columns | df.columns = ['new1', 'new2', ...] | Fast and simple. Use when you have a complete list of new names and are certain of the order. |
| Rename the index | df.rename(index={0: 'row1'}) | Consistent with column renaming. Explicit and clear. |
| Rename based on a pattern or rule | df.rename(columns=lambda x: x.upper()) | Powerful and flexible for applying a transformation to all labels. |
| Modify a DataFrame in place | Avoid inplace=True; use df = df.rename(...) | Modern best practice is to avoid inplace. Assigning the result is more predictable and easier to debug. |
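One practical reason to prefer assigning the result over inplace=True is that rename() then fits into a method chain. A minimal sketch, with raw, clean, given_name, and full_name as illustrative names:

```python
import pandas as pd

raw = pd.DataFrame({'first name': ['John', 'Jane'], 'last name': ['Doe', 'Doe']})

clean = (
    raw
    # Rule-based pass: replace spaces in every column name
    .rename(columns=lambda name: name.replace(' ', '_'))
    # Targeted pass: rename one specific column
    .rename(columns={'first_name': 'given_name'})
    # Later steps can rely on the new names
    .assign(full_name=lambda d: d['given_name'] + ' ' + d['last_name'])
)

print(list(clean.columns))  # ['given_name', 'last_name', 'full_name']
print(list(raw.columns))    # ['first name', 'last name']
```

Each step returns a new DataFrame, so raw stays available for inspection if a later step misbehaves.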