How to Efficiently Drop Specified Columns in Python? - 杰瑞科技汇

Dropping columns is a very common operation in data analysis with Python. The primary library for this is pandas.

Here’s a comprehensive guide covering the most common methods, from the simplest to more advanced use cases.

The Setup: First, Create a Sample DataFrame

All the examples below will use this sample DataFrame. It's good practice to create a small, reproducible example like this to follow along.

import pandas as pd
import numpy as np
# Create a sample DataFrame
data = {
    'student_id': [101, 102, 103, 104],
    'first_name': ['Alice', 'Bob', 'Charlie', 'David'],
    'last_name': ['Smith', 'Johnson', 'Brown', 'Lee'],
    'age': [20, 21, 19, 22],
    'major': ['Physics', 'Math', 'Chemistry', 'Biology'],
    'grade_level': ['Sophomore', 'Junior', 'Freshman', 'Senior'],
    'tuition_fee': [10000, 10500, 9800, 11000],
    'has_scholarship': [True, False, True, False]
}
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)

Original DataFrame:

   student_id first_name last_name  age     major grade_level  tuition_fee  has_scholarship
0         101      Alice     Smith   20    Physics    Sophomore         10000             True
1         102        Bob   Johnson   21      Math       Junior         10500            False
2         103    Charlie     Brown   19  Chemistry    Freshman          9800             True
3         104      David       Lee   22    Biology      Senior         11000            False

Method 1: `df.drop()` (The Most Common Method)

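This is the standard and most flexible way to drop columns. The key is to tell pandas that the labels are column names, either with the columns= keyword or by passing the labels together with axis=1.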

axis=0 refers to rows.
axis=1 refers to columns.

Syntax

df.drop(columns=['column_name1', 'column_name2'], inplace=False)
# Equivalent form using labels and axis:
df.drop(['column_name1', 'column_name2'], axis=1, inplace=False)

Parameters

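labels: A single label or a list-like object of the names to drop. Used together with axis.
axis: Set to 1 (or 'columns') so that labels is interpreted as column names. The default, axis=0, looks for row labels instead.
columns: A single label or a list-like object of column names. Equivalent to passing labels with axis=1, so axis is not needed when columns is used.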
inplace: A boolean.
- inplace=False (default): Returns a new DataFrame with the columns dropped. The original df is unchanged. This is safer and generally recommended.
- inplace=True: Modifies the original DataFrame directly and returns None. It is often assumed to save memory on very large DataFrames, but pandas may still build a new object internally, so the saving is not guaranteed, and in-place mutation can lead to bugs if you're not careful.
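As a quick check of the parameters above, here is a minimal sketch on a small hypothetical frame (`demo`, which is not part of the sample data). It shows that `columns=` and `labels` with `axis=1` give the same result, and that `inplace=True` returns `None`.

```python
import pandas as pd

# Hypothetical mini frame, used only for this illustration
demo = pd.DataFrame({'a': [1, 2], 'b': [3, 4], 'c': [5, 6]})

# Two equivalent spellings of "drop column b"
by_columns = demo.drop(columns=['b'])
by_axis = demo.drop(['b'], axis=1)
print(by_columns.equals(by_axis))   # True

# The default inplace=False leaves the original untouched
print(list(demo.columns))           # ['a', 'b', 'c']

# inplace=True mutates the frame and returns None
result = demo.drop(columns=['b'], inplace=True)
print(result)                       # None
print(list(demo.columns))           # ['a', 'c']
```

A common mistake is writing `demo = demo.drop(..., inplace=True)`, which rebinds `demo` to `None`.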

Example 1: Dropping a Single Column

Let's drop the last_name column. We'll rely on the default inplace=False to show that the original DataFrame remains unchanged.

# Create a copy to demonstrate inplace=False
df_copy = df.copy()
# Drop the 'last_name' column
df_dropped = df_copy.drop(columns=['last_name'])
print("\nDataFrame after dropping 'last_name' (inplace=False):")
print(df_dropped)
print("\nOriginal DataFrame is unchanged:")
print(df_copy)

Output:

DataFrame after dropping 'last_name' (inplace=False):
   student_id first_name  age     major grade_level  tuition_fee  has_scholarship
0         101      Alice   20    Physics    Sophomore         10000             True
1         102        Bob   21      Math       Junior         10500            False
2         103    Charlie   19  Chemistry    Freshman          9800             True
3         104      David   22    Biology      Senior         11000            False
Original DataFrame is unchanged:
   student_id first_name last_name  age     major grade_level  tuition_fee  has_scholarship
0         101      Alice     Smith   20    Physics    Sophomore         10000             True
1         102        Bob   Johnson   21      Math       Junior         10500            False
2         103    Charlie     Brown   19  Chemistry    Freshman          9800             True
3         104      David       Lee   22    Biology      Senior         11000            False

Example 2: Dropping Multiple Columns

You can pass a list of column names to the columns argument.

# Drop 'first_name' and 'last_name' columns
df_dropped_multiple = df.drop(columns=['first_name', 'last_name'])
print("\nDataFrame after dropping multiple columns:")
print(df_dropped_multiple)

Output:

DataFrame after dropping multiple columns:
   student_id  age     major grade_level  tuition_fee  has_scholarship
0         101   20    Physics    Sophomore         10000             True
1         102   21      Math       Junior         10500            False
2         103   19  Chemistry    Freshman          9800             True
3         104   22    Biology      Senior         11000            False
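When the list of columns to drop is long or comes from configuration, some names may not exist in a given DataFrame. By default `drop()` raises a `KeyError` in that case. The sketch below, on a hypothetical two-column frame named `demo`, shows the `errors='ignore'` option, which skips labels that are not present.

```python
import pandas as pd

# Hypothetical frame that lacks a 'last_name' column
demo = pd.DataFrame({'student_id': [101, 102], 'age': [20, 21]})

# With the default errors='raise', a missing label raises KeyError
try:
    demo.drop(columns=['age', 'last_name'])
    raised = False
except KeyError:
    raised = True
print(raised)                  # True

# errors='ignore' drops the labels that exist and skips the rest
trimmed = demo.drop(columns=['age', 'last_name'], errors='ignore')
print(list(trimmed.columns))   # ['student_id']
```

The trade-off is that `errors='ignore'` also hides typos in column names, so it is best reserved for cases where missing columns are genuinely expected.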

Example 3: Using `inplace=True`

This modifies the DataFrame directly. Use with caution!

# Modifying the original df
df.drop(columns=['tuition_fee', 'has_scholarship'], inplace=True)
print("\nOriginal DataFrame after inplace=True:")
print(df)

Output:

Original DataFrame after inplace=True:
   student_id first_name last_name  age     major grade_level
0         101      Alice     Smith   20    Physics    Sophomore
1         102        Bob   Johnson   21      Math       Junior
2         103    Charlie     Brown   19  Chemistry    Freshman
3         104      David       Lee   22    Biology      Senior

Notice that the original df is now permanently changed.
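If the goal is simply to have `df` refer to the trimmed data, reassigning the result of `drop()` reaches the same end state without `inplace=True`. The following is a minimal sketch on hypothetical data (`demo` and `renamed` are illustrative names); it also shows that the returned DataFrame can be used in a method chain, which `inplace=True` does not allow because it returns `None`.

```python
import pandas as pd

# Hypothetical frame for illustration
demo = pd.DataFrame({
    'student_id': [101, 102],
    'tuition_fee': [10000, 10500],
    'has_scholarship': [True, False],
})

# Rebinding the name gives the same end state as inplace=True
demo = demo.drop(columns=['tuition_fee', 'has_scholarship'])
print(list(demo.columns))      # ['student_id']

# Because drop() returns a DataFrame, it can sit inside a method chain
renamed = (
    pd.DataFrame({'student_id': [101], 'tuition_fee': [10000]})
    .drop(columns=['tuition_fee'])
    .rename(columns={'student_id': 'id'})
)
print(list(renamed.columns))   # ['id']
```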

Method 2: Selecting Columns to Keep (Often More Robust)

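Instead of thinking about what to remove, you can think about what to keep. This is often safer, especially in automated scripts: the code does not depend on the names of the unwanted columns, so it won't error out if a column you expected to drop is missing, and any unexpected extra columns are excluded automatically. Note that it will still raise a KeyError if a column you want to keep is missing.

You select the columns you want and assign the result to a variable. Column selection returns a new DataFrame and has no inplace option.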

# Let's restore the original df first
df = pd.DataFrame(data)
# Select only the columns you want to keep
df_kept = df[['student_id', 'first_name', 'age', 'major']]
print("\nDataFrame keeping only selected columns:")
print(df_kept)

Output:

DataFrame keeping only selected columns:
   student_id first_name  age     major
0         101      Alice   20    Physics
1         102        Bob   21      Math
2         103    Charlie   19  Chemistry
3         104      David   22    Biology
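One caveat of the keep-list approach is that `df[[...]]` raises a `KeyError` when a listed column is absent. If that can happen, `DataFrame.filter(items=...)` is one tolerant alternative: it keeps only the listed columns that actually exist. The sketch below uses a hypothetical frame `demo` and a hypothetical list `wanted`.

```python
import pandas as pd

# Hypothetical frame that has no 'major' column
demo = pd.DataFrame({
    'student_id': [101, 102],
    'first_name': ['Alice', 'Bob'],
    'age': [20, 21],
})
wanted = ['student_id', 'age', 'major']

# Plain selection fails because 'major' is missing
try:
    demo[wanted]
    raised = False
except KeyError:
    raised = True
print(raised)               # True

# filter(items=...) silently skips names that are not present
kept = demo.filter(items=wanted)
print(list(kept.columns))   # ['student_id', 'age']
```

Whether silent skipping is desirable depends on the pipeline; in strict validation code the `KeyError` is often the behavior you want.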

Method 3: Dropping Columns Based on a Condition

Sometimes you want to drop columns that meet certain criteria, such as containing only NaN values or having a specific data type.

Example A: Dropping Columns with All `NaN` Values

This is useful for cleaning data after an operation that might have introduced empty columns.

# Add a column of all NaN values
df['empty_col'] = np.nan
print("\nDataFrame with an empty column:")
print(df)
# Drop columns where all values are NaN
df_dropped_nan = df.dropna(axis=1, how='all')
print("\nDataFrame after dropping all-NaN columns:")
print(df_dropped_nan)

Output:

DataFrame with an empty column:
   student_id first_name last_name  age     major grade_level  tuition_fee  has_scholarship  empty_col
0         101      Alice     Smith   20    Physics    Sophomore         10000             True        NaN
1         102        Bob   Johnson   21      Math       Junior         10500            False        NaN
2         103    Charlie     Brown   19  Chemistry    Freshman          9800             True        NaN
3         104      David       Lee   22    Biology      Senior         11000            False        NaN
DataFrame after dropping all-NaN columns:
   student_id first_name last_name  age     major grade_level  tuition_fee  has_scholarship
0         101      Alice     Smith   20    Physics    Sophomore         10000             True
1         102        Bob   Johnson   21      Math       Junior         10500            False
2         103    Charlie     Brown   19  Chemistry    Freshman          9800             True
3         104      David       Lee   22    Biology      Senior         11000            False
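`how='all'` only removes columns that are completely empty. For columns that are mostly empty, `dropna()` also accepts a `thresh` argument: the minimum number of non-missing values a column needs in order to be kept. The sketch below uses hypothetical data and an assumed cutoff of 50%, chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with partially filled columns
demo = pd.DataFrame({
    'student_id': [101, 102, 103, 104],
    'notes': [np.nan, np.nan, np.nan, 'late'],   # 1 of 4 values present
    'age': [20, np.nan, 19, 22],                 # 3 of 4 values present
})

# Keep columns with at least 50% non-missing values (assumed cutoff)
min_non_null = int(len(demo) * 0.5)              # 2
mostly_filled = demo.dropna(axis=1, thresh=min_non_null)
print(list(mostly_filled.columns))               # ['student_id', 'age']

# how='any' is the strictest option: drop a column if it has any NaN
complete_only = demo.dropna(axis=1, how='any')
print(list(complete_only.columns))               # ['student_id']
```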

Example B: Dropping Columns by Data Type

You can filter columns based on their dtype.

# Let's restore the original df
df = pd.DataFrame(data)
# Identify columns to drop (e.g., all object/string columns)
cols_to_drop = df.select_dtypes(include=['object']).columns
# Drop those columns
df_dropped_type = df.drop(columns=cols_to_drop)
print("\nDataFrame after dropping all object/string columns:")
print(df_dropped_type)

Output:

DataFrame after dropping all object/string columns:
   student_id  age  tuition_fee  has_scholarship
0         101   20         10000             True
1         102   21         10500            False
2         103   19          9800             True
3         104   22         11000            False
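The same type-based cleanup can be written from the "keep" side by letting `select_dtypes()` return the result directly, using `include` or `exclude`. This is a minimal sketch on a hypothetical frame `demo`; note that `'number'` does not match boolean columns.

```python
import pandas as pd

# Hypothetical frame with mixed dtypes
demo = pd.DataFrame({
    'student_id': [101, 102],
    'first_name': ['Alice', 'Bob'],
    'tuition_fee': [10000.0, 10500.0],
    'has_scholarship': [True, False],
})

# Keep only numeric columns (bool is not treated as 'number')
numeric_only = demo.select_dtypes(include='number')
print(list(numeric_only.columns))   # ['student_id', 'tuition_fee']

# Keep everything except boolean columns
non_bool = demo.select_dtypes(exclude='bool')
print(list(non_bool.columns))       # ['student_id', 'first_name', 'tuition_fee']
```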

Summary: Which Method to Use?

| Scenario | Recommended Method | Why? |
| --- | --- | --- |
| Dropping a known list of columns | `df.drop(columns=[...])` | Clear, explicit, and safe: the default `inplace=False` doesn't modify the original data. |
| You are sure which columns to remove and want to modify the existing object | `df.drop(columns=[...], inplace=True)` | Modifies the DataFrame in place and returns `None`. Any memory saving on very large DataFrames is not guaranteed. |
| The set of unwanted columns might change or is dynamic | `df_kept = df[[col1, col2, ...]]` | More robust: doesn't depend on the unwanted columns existing, though every column you keep must be present. |
| Dropping columns that are entirely empty, or that contain any `NaN` | `df.dropna(axis=1, how='all')` or `df.dropna(axis=1, how='any')` | The idiomatic way to handle `NaN`-based filtering. |
| Dropping columns based on their data type (e.g., all strings) | `df.drop(columns=df.select_dtypes(...).columns)` | Powerful and flexible for type-based cleaning. |
