How to Use the pandas drop() Method Correctly - 杰瑞科技汇

The drop() method in pandas is a fundamental tool for cleaning and reshaping a DataFrame. It removes rows or columns based on their labels.

Here’s a comprehensive guide covering the most common use cases, from basic to advanced.

The Core Idea: Labels vs. Positions

First, it's crucial to understand that drop() works with labels, not integer positions.

Label: The name of the row (index) or column.
Position: The 0-based integer location (e.g., the 1st column, 2nd row).

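drop() has no positional mode. The axis parameter only chooses between rows and columns; it does not switch to positions. To drop by position, look up the label at that position first (for example df.index[1] or df.columns[0]) and pass that label to drop().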
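The label-versus-position distinction can be sketched as follows. The frame df_pos and its string index are made up for illustration, so that labels and positions are visibly different.

```python
import pandas as pd

# Hypothetical frame with string row labels, so labels and positions differ
df_pos = pd.DataFrame(
    {"Name": ["Alice", "Bob", "Charlie"], "Age": [25, 30, 35]},
    index=["a", "b", "c"],
)

# Translate positions into labels, then drop by label
without_second_row = df_pos.drop(index=df_pos.index[1])      # position 1 is label "b"
without_first_col = df_pos.drop(columns=df_pos.columns[0])   # position 0 is label "Name"
without_rows_0_2 = df_pos.drop(index=df_pos.index[[0, 2]])   # several positions at once

print(without_second_row.index.tolist())   # ['a', 'c']
print(without_first_col.columns.tolist())  # ['Age']
print(without_rows_0_2.index.tolist())     # ['b']
```

One caveat: if the index contains duplicate labels, dropping the label found at one position removes every row that carries that label.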

Basic Syntax

DataFrame.drop(labels=None, axis=0, index=None, columns=None, level=None, inplace=False, errors='raise')

The most important parameters are:

labels: The row or column labels to drop.
axis: Specifies whether to drop from the index (axis=0, the default) or from the columns (axis=1).
inplace: If True, modifies the DataFrame directly. If False (the default), returns a new DataFrame with the rows/columns dropped.
columns: Column labels to drop. drop(columns=x) is equivalent to drop(labels=x, axis=1).
index: Row labels to drop. drop(index=x) is equivalent to drop(labels=x, axis=0).
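As a quick check of the parameter list above, the two spellings can be compared directly. This is a minimal sketch; df_demo is a hypothetical frame used only here.

```python
import pandas as pd

# Hypothetical frame for comparing the two calling styles
df_demo = pd.DataFrame(
    {"Name": ["Alice", "Bob"], "Age": [25, 30], "City": ["NY", "LA"]}
)

# Columns: labels + axis=1 versus the columns keyword
via_axis = df_demo.drop("Age", axis=1)
via_columns = df_demo.drop(columns="Age")
print(via_axis.equals(via_columns))  # True

# Rows: labels + axis=0 versus the index keyword
rows_via_axis = df_demo.drop(0, axis=0)
rows_via_index = df_demo.drop(index=0)
print(rows_via_axis.equals(rows_via_index))  # True
```

Use one style per call: pandas raises a ValueError if labels is combined with index or columns.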

Dropping Rows

Let's start with a sample DataFrame.

import pandas as pd
import numpy as np
# Create a sample DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
        'Age': [25, 30, 35, 40, 28],
        'City': ['NY', 'LA', 'SF', 'Chicago', 'Boston']}
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)

Original DataFrame:

      Name  Age     City
0    Alice   25       NY
1      Bob   30       LA
2  Charlie   35       SF
3    David   40  Chicago
4      Eve   28    Boston

a) Dropping a Single Row by Label

Use the row's index label. Let's drop the row with index 2 ('Charlie').

# By default, axis=0 (rows), so it's optional
df_dropped_row = df.drop(labels=2)
print("\nDataFrame after dropping row with label '2':")
print(df_dropped_row)

Result:

    Name  Age     City
0  Alice   25       NY
1    Bob   30       LA
3  David   40  Chicago
4    Eve   28    Boston

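Notice that index label 2 is now missing; the remaining labels are not renumbered. If you want a fresh 0-based index, call reset_index(drop=True) on the result. Without drop=True, the old index is kept as an extra column.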
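A short sketch of renumbering after a drop, using a hypothetical frame df_small:

```python
import pandas as pd

# Hypothetical frame with the default 0..3 index
df_small = pd.DataFrame(
    {"Name": ["Alice", "Bob", "Charlie", "David"], "Age": [25, 30, 35, 40]}
)

dropped = df_small.drop(index=2)             # leaves a gap in the index
renumbered = dropped.reset_index(drop=True)  # drop=True discards the old labels
with_old_index = dropped.reset_index()       # old labels become an 'index' column

print(dropped.index.tolist())           # [0, 1, 3]
print(renumbered.index.tolist())        # [0, 1, 2]
print(with_old_index.columns.tolist())  # ['index', 'Name', 'Age']
```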

b) Dropping Multiple Rows by Label

Pass a list of labels to the labels parameter.

# Drop rows with labels 1 and 3
df_dropped_rows = df.drop(labels=[1, 3])
print("\nDataFrame after dropping rows with labels '1' and '3':")
print(df_dropped_rows)

Result:

      Name  Age    City
0    Alice   25      NY
2  Charlie   35      SF
4      Eve   28  Boston
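Row labels often come from a condition rather than from a hand-written list. The sketch below, on a hypothetical frame df_people, collects the labels of matching rows and passes them to drop(); plain boolean filtering gives the same result and is usually the simpler choice.

```python
import pandas as pd

# Hypothetical frame mirroring the sample data
df_people = pd.DataFrame(
    {"Name": ["Alice", "Bob", "Charlie", "David", "Eve"],
     "Age": [25, 30, 35, 40, 28]}
)

# Collect the labels of the rows to remove, then drop them by label
labels_over_30 = df_people[df_people["Age"] > 30].index
dropped = df_people.drop(index=labels_over_30)

# Equivalent: keep only the rows that do not match the condition
filtered = df_people[df_people["Age"] <= 30]

print(dropped["Name"].tolist())  # ['Alice', 'Bob', 'Eve']
print(dropped.equals(filtered))  # True
```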

Dropping Columns

Now, let's remove columns. The most common way is to use the columns parameter or axis=1.

a) Dropping a Single Column

Let's drop the Age column.

# Method 1: Using the 'columns' parameter (recommended for clarity)
df_dropped_col = df.drop(columns='Age')
# Method 2: Using 'axis=1'
# df_dropped_col = df.drop(labels='Age', axis=1)
print("\nDataFrame after dropping the 'Age' column:")
print(df_dropped_col)

Result:

      Name     City
0    Alice       NY
1      Bob       LA
2  Charlie       SF
3    David  Chicago
4      Eve    Boston

b) Dropping Multiple Columns

Pass a list of column names to the columns parameter.

# Drop the 'Age' and 'City' columns
df_dropped_cols = df.drop(columns=['Age', 'City'])
print("\nDataFrame after dropping 'Age' and 'City' columns:")
print(df_dropped_cols)

Result:

      Name
0    Alice
1      Bob
2  Charlie
3    David
4      Eve
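Because index and columns are separate keywords, rows and columns can be removed in a single call. A minimal sketch with a hypothetical frame df_both:

```python
import pandas as pd

# Hypothetical frame mirroring the sample data
df_both = pd.DataFrame(
    {"Name": ["Alice", "Bob", "Charlie", "David", "Eve"],
     "Age": [25, 30, 35, 40, 28],
     "City": ["NY", "LA", "SF", "Chicago", "Boston"]}
)

# Remove the first and last rows and the 'City' column in one call
trimmed = df_both.drop(index=[0, 4], columns=["City"])

print(trimmed.shape)             # (3, 2)
print(trimmed.index.tolist())    # [1, 2, 3]
print(trimmed.columns.tolist())  # ['Name', 'Age']
```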

The `inplace` Parameter

This is a critical concept. It determines whether you modify the existing DataFrame or create a new one.

inplace=False (Default): Returns a new DataFrame. The original df is unchanged.
inplace=True: Modifies the DataFrame in place. It returns None, and the original df is permanently changed.

Example with `inplace=True`

print("Original DataFrame before inplace operation:")
print(df)
# Drop the 'City' column from the original DataFrame
df.drop(columns='City', inplace=True)
print("\nOriginal DataFrame after inplace operation:")
print(df)

Output:

Original DataFrame before inplace operation:
      Name  Age     City
0    Alice   25       NY
1      Bob   30       LA
2  Charlie   35       SF
3    David   40  Chicago
4      Eve   28    Boston
Original DataFrame after inplace operation:
      Name  Age
0    Alice   25
1      Bob   30
2  Charlie   35
3    David   40
4      Eve   28

Warning: Using inplace=True can be risky. If you make a mistake, the change is permanent and you might need to reload your data. For this reason, many experienced developers prefer the default (inplace=False) for its safety and clarity.
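One way to stay on the safe side is to assign the result of drop() to a new name, so the original stays available. The sketch below contrasts that with an in-place drop on a copy; df_orig, df_new, and working are hypothetical names.

```python
import pandas as pd

# Hypothetical frame for comparing the two styles
df_orig = pd.DataFrame(
    {"Name": ["Alice", "Bob"], "Age": [25, 30], "City": ["NY", "LA"]}
)

# Default behaviour: a new DataFrame is returned, df_orig is untouched
df_new = df_orig.drop(columns="City")
print(df_orig.columns.tolist())  # ['Name', 'Age', 'City']
print(df_new.columns.tolist())   # ['Name', 'Age']

# In-place behaviour, tried on a copy so the original survives
working = df_orig.copy()
working.drop(columns="City", inplace=True)
print(working.columns.tolist())  # ['Name', 'Age']
```

A common mistake is to write df = df.drop(..., inplace=True). The source above notes that the in-place call returns None, so that assignment would replace the DataFrame with None.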

Advanced Use Cases

a) Handling Errors

What if you try to drop a label that doesn't exist? By default, pandas raises a KeyError.

# This will raise a KeyError
# df.drop(columns='Country')

You can change this behavior with the errors parameter:

errors='raise' (default): Raises an error.
errors='ignore': Silently skips labels that are not found and drops only the ones that exist.

# This will not raise an error, but also won't change the DataFrame
df.drop(columns='Country', errors='ignore')
print("\nDataFrame after trying to drop a non-existent column (ignored):")
print(df)
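The difference between the two errors settings shows most clearly with a list that mixes existing and missing labels. A minimal sketch, with a hypothetical frame df_err:

```python
import pandas as pd

# Hypothetical frame; 'Country' does not exist in it
df_err = pd.DataFrame({"Name": ["Alice", "Bob"], "Age": [25, 30]})

# errors='raise' (default): one missing label aborts the whole call
raised = False
try:
    df_err.drop(columns=["Age", "Country"])
except KeyError as exc:
    raised = True
    print("KeyError:", exc)

# errors='ignore': existing labels are dropped, missing ones are skipped
partly = df_err.drop(columns=["Age", "Country"], errors="ignore")
print(partly.columns.tolist())  # ['Name']
```

Note that errors='ignore' also hides typos in label names, so it is best reserved for cases where a missing label is genuinely expected.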

b) Dropping from a MultiIndex DataFrame

If your DataFrame has a hierarchical index, you can drop from a specific level.

# Create a MultiIndex DataFrame
index = pd.MultiIndex.from_tuples([('A', 1), ('A', 2), ('B', 1), ('B', 2)], names=['letter', 'number'])
df_multi = pd.DataFrame({'Data': [10, 20, 30, 40]}, index=index)
print("\nOriginal MultiIndex DataFrame:")
print(df_multi)
# Drop all rows where the 'letter' level is 'A'
df_multi_dropped = df_multi.drop(labels='A', level='letter')
print("\nMultiIndex DataFrame after dropping level 'letter' == 'A':")
print(df_multi_dropped)

Result:

Original MultiIndex DataFrame:
               Data
letter number
A      1         10
       2         20
B      1         30
       2         40

MultiIndex DataFrame after dropping level 'letter' == 'A':
               Data
letter number
B      1         30
       2         40
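Two further MultiIndex patterns are sketched below: dropping one exact row by passing the full tuple, and dropping by a label on the inner level. The frame is rebuilt under the hypothetical name df_mi so the snippet runs on its own.

```python
import pandas as pd

# Same structure as the MultiIndex example, rebuilt so this runs standalone
index = pd.MultiIndex.from_tuples(
    [("A", 1), ("A", 2), ("B", 1), ("B", 2)], names=["letter", "number"]
)
df_mi = pd.DataFrame({"Data": [10, 20, 30, 40]}, index=index)

# Drop exactly one row by giving the complete index tuple
one_row_gone = df_mi.drop(index=("A", 2))
print(one_row_gone["Data"].tolist())   # [10, 30, 40]

# Drop every row whose inner level 'number' equals 1
number_1_gone = df_mi.drop(index=1, level="number")
print(number_1_gone["Data"].tolist())  # [20, 40]
```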

Common Alternatives to `drop()`

While drop() is the most direct method, other functions achieve similar goals.

a) Dropping Missing Values (`dropna()`)

This is a specialized way to drop rows or columns that contain missing values, such as NaN (Not a Number) or None.

# Create a DataFrame with missing values
df_nan = pd.DataFrame({'A': [1, 2, np.nan], 'B': [5, np.nan, np.nan], 'C': [1, 2, 3]})
print("\nDataFrame with NaNs:")
print(df_nan)
# Drop any row that has at least one NaN value
print("\nDataFrame after dropping rows with any NaNs:")
print(df_nan.dropna())
# Drop any column that has at least one NaN value
print("\nDataFrame after dropping columns with any NaNs:")
print(df_nan.dropna(axis=1))
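dropna() also accepts parameters that control how strict the filter is. The sketch below uses the same data as above and shows subset, how, and thresh; which one fits depends on the dataset, so treat these as illustrations rather than recommendations.

```python
import numpy as np
import pandas as pd

# Same data as the dropna() example above
df_nan = pd.DataFrame(
    {"A": [1, 2, np.nan], "B": [5, np.nan, np.nan], "C": [1, 2, 3]}
)

# Only look at column 'A' when deciding which rows to drop
by_subset = df_nan.dropna(subset=["A"])
print(by_subset.index.tolist())    # [0, 1]

# Drop a row only if 'A' and 'B' are both missing
both_missing = df_nan.dropna(subset=["A", "B"], how="all")
print(both_missing.index.tolist())  # [0, 1]

# Keep rows that have at least 3 non-missing values
by_thresh = df_nan.dropna(thresh=3)
print(by_thresh.index.tolist())    # [0]
```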

b) Dropping Duplicates (`drop_duplicates()`)

This is used to remove duplicate rows.

# Create a DataFrame with duplicate rows
df_dup = pd.DataFrame({'A': [1, 2, 2, 3], 'B': [4, 5, 5, 6]})
print("\nDataFrame with duplicates:")
print(df_dup)
# Keep the first occurrence of each duplicate
print("\nDataFrame after dropping duplicates:")
print(df_dup.drop_duplicates())
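By default drop_duplicates() compares entire rows. The sketch below, on a hypothetical frame df_keys, restricts the comparison to one column with subset and shows the three keep options.

```python
import pandas as pd

# Hypothetical frame: column 'A' repeats the value 2, column 'B' differs
df_keys = pd.DataFrame({"A": [1, 2, 2, 3], "B": [4, 5, 9, 6]})

# Compare only column 'A'; keep the first row of each duplicate group
keep_first = df_keys.drop_duplicates(subset=["A"])
print(keep_first.index.tolist())  # [0, 1, 3]

# Keep the last row of each duplicate group instead
keep_last = df_keys.drop_duplicates(subset=["A"], keep="last")
print(keep_last.index.tolist())   # [0, 2, 3]

# keep=False removes every row that has a duplicate
keep_none = df_keys.drop_duplicates(subset=["A"], keep=False)
print(keep_none.index.tolist())   # [0, 3]
```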

Summary Table

Goal	Method	Example
Drop a single row by index	`drop()`	`df.drop(labels=2)`
Drop multiple rows by index	`drop()`	`df.drop(labels=[1, 3])`
Drop a single column by name	`drop()`	`df.drop(columns='Age')`
Drop multiple columns by name	`drop()`	`df.drop(columns=['Age', 'City'])`
Modify the DataFrame directly	`drop()`	`df.drop(..., inplace=True)`
Drop rows with missing values	`dropna()`	`df.dropna()`
Drop duplicate rows	`drop_duplicates()`	`df.drop_duplicates()`
