Of course! The drop() method in pandas is a fundamental tool for cleaning and reshaping your DataFrame. It allows you to remove rows or columns based on their labels.

Here’s a comprehensive guide covering the most common use cases, from basic to advanced.
The Core Idea: Labels vs. Positions
First, it's crucial to understand that drop() works with labels, not integer positions.
- Label: The name of the row (index) or column.
- Position: The 0-based integer location (e.g., the 1st column, 2nd row).
The axis parameter does not switch drop() to positions; it only chooses between rows and columns. To drop by position, look up the label at that position first (for example, df.index[1] or df.columns[0]) and pass that label to drop().
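To make the label/position distinction concrete, here is a minimal sketch of dropping by position. The frame `df_pos` and its string index are made up for illustration, so that labels and positions are visibly different.

```python
import pandas as pd

# Hypothetical frame with string row labels, so labels != positions
df_pos = pd.DataFrame(
    {"Name": ["Alice", "Bob", "Charlie"], "Age": [25, 30, 35]},
    index=["a", "b", "c"],
)

# drop() takes labels, so translate position 1 into its label first
second_row_label = df_pos.index[1]
without_second_row = df_pos.drop(index=second_row_label)

# Same idea for columns: position 0 holds the label 'Name'
without_first_col = df_pos.drop(columns=df_pos.columns[0])

print(without_second_row)
print(without_first_col)
```

Passing a slice such as `df_pos.index[:2]` works the same way when several positions need to go.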
Basic Syntax
DataFrame.drop(labels=None, axis=0, index=None, columns=None, level=None, inplace=False, errors='raise')
The most important parameters are:

- labels: The row or column labels to drop.
- axis: Specifies whether to drop from the index (axis=0, the default) or from the columns (axis=1).
- inplace: If True, modifies the DataFrame directly. If False (the default), returns a new DataFrame with the rows/columns dropped.
- columns: A convenient way to specify you want to drop columns (equivalent to axis=1).
- index: A convenient way to specify you want to drop rows (equivalent to axis=0).
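As a quick illustration of how these parameters relate, the sketch below compares the labels/axis form with the columns keyword, and then combines index and columns in a single call. The frame `df_demo` is a made-up example.

```python
import pandas as pd

# Hypothetical three-row frame for illustration
df_demo = pd.DataFrame(
    {"Name": ["Alice", "Bob", "Charlie"],
     "Age": [25, 30, 35],
     "City": ["NY", "LA", "SF"]}
)

# These two calls remove the same column
by_axis = df_demo.drop(labels="Age", axis=1)
by_keyword = df_demo.drop(columns="Age")

# index= and columns= can be combined to drop rows and columns at once
trimmed = df_demo.drop(index=[0], columns=["City"])

print(by_axis.equals(by_keyword))
print(trimmed)
```

Note that labels cannot be mixed with index or columns in the same call; pandas raises a ValueError in that case.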
Dropping Rows
Let's start with a sample DataFrame.
import pandas as pd
import numpy as np
# Create a sample DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
'Age': [25, 30, 35, 40, 28],
'City': ['NY', 'LA', 'SF', 'Chicago', 'Boston']}
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)
Original DataFrame:
Name Age City
0 Alice 25 NY
1 Bob 30 LA
2 Charlie 35 SF
3 David 40 Chicago
4 Eve 28 Boston
a) Dropping a Single Row by Label
Use the row's index label. Let's drop the row with index 2 ('Charlie').
# By default, axis=0 (rows), so it's optional
df_dropped_row = df.drop(labels=2)
print("\nDataFrame after dropping row with label '2':")
print(df_dropped_row)
Result:

Name Age City
0 Alice 25 NY
1 Bob 30 LA
3 David 40 Chicago
4 Eve 28 Boston
Notice that index 2 is now missing. To renumber the rows, call df_dropped_row.reset_index(drop=True); without drop=True, the old index is kept as a new column.
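The gap left in the index can be closed after the drop. A minimal sketch, using a small made-up frame `df_small`:

```python
import pandas as pd

# Hypothetical frame with the default 0..3 index
df_small = pd.DataFrame(
    {"Name": ["Alice", "Bob", "Charlie", "David"], "Age": [25, 30, 35, 40]}
)

dropped = df_small.drop(index=2)

# drop=True discards the old index instead of turning it into a column
renumbered = dropped.reset_index(drop=True)

print(list(dropped.index))
print(list(renumbered.index))
```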
b) Dropping Multiple Rows by Label
Pass a list of labels to the labels parameter.
# Drop rows with labels 1 and 3
df_dropped_rows = df.drop(labels=[1, 3])
print("\nDataFrame after dropping rows with labels '1' and '3':")
print(df_dropped_rows)
Result:
Name Age City
0 Alice 25 NY
2 Charlie 35 SF
4 Eve 28 Boston
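The list of labels does not have to be typed by hand. One common pattern, sketched here with a made-up frame `df_people`, is to compute the labels from a condition and pass them to drop().

```python
import pandas as pd

# Hypothetical frame mirroring the sample data above
df_people = pd.DataFrame(
    {"Name": ["Alice", "Bob", "Charlie", "David", "Eve"],
     "Age": [25, 30, 35, 40, 28]}
)

# Collect the index labels of the rows to remove, then drop them
over_30_labels = df_people[df_people["Age"] > 30].index
df_30_or_under = df_people.drop(index=over_30_labels)

# Boolean indexing keeps the complementary rows and gives the same frame
same_result = df_people[df_people["Age"] <= 30]

print(df_30_or_under)
```

For a simple filter like this, boolean indexing is usually the shorter option; drop() is handy when the labels already exist as a list or Index.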
Dropping Columns
Now, let's remove columns. The most common way is to use the columns parameter or axis=1.
a) Dropping a Single Column
Let's drop the Age column.
# Method 1: Using the 'columns' parameter (recommended for clarity)
df_dropped_col = df.drop(columns='Age')
# Method 2: Using 'axis=1'
# df_dropped_col = df.drop(labels='Age', axis=1)
print("\nDataFrame after dropping the 'Age' column:")
print(df_dropped_col)
Result:
Name City
0 Alice NY
1 Bob LA
2 Charlie SF
3 David Chicago
4 Eve Boston
b) Dropping Multiple Columns
Pass a list of column names to the columns parameter.
# Drop the 'Age' and 'City' columns
df_dropped_cols = df.drop(columns=['Age', 'City'])
print("\nDataFrame after dropping 'Age' and 'City' columns:")
print(df_dropped_cols)
Result:
Name
0 Alice
1 Bob
2 Charlie
3 David
4 Eve
The inplace Parameter
This is a critical concept. It determines whether you modify the existing DataFrame or create a new one.
- inplace=False (default): Returns a new DataFrame. The original df is unchanged.
- inplace=True: Modifies the DataFrame in place. It returns None, and the original df is permanently changed.
Example with inplace=True
print("Original DataFrame before inplace operation:")
print(df)
# Drop the 'City' column from the original DataFrame
df.drop(columns='City', inplace=True)
print("\nOriginal DataFrame after inplace operation:")
print(df)
Output:
Original DataFrame before inplace operation:
Name Age City
0 Alice 25 NY
1 Bob 30 LA
2 Charlie 35 SF
3 David 40 Chicago
4 Eve 28 Boston
Original DataFrame after inplace operation:
Name Age
0 Alice 25
1 Bob 30
2 Charlie 35
3 David 40
4 Eve 28
Warning: Using inplace=True can be risky. If you make a mistake, the change is permanent and you might need to reload your data. For this reason, many experienced developers prefer the default (inplace=False) for its safety and clarity.
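A related pitfall is assigning the result of an inplace call, which yields None. The sketch below, using made-up frames `df_a` and `df_b`, shows the return value and the reassignment style that avoids the problem.

```python
import pandas as pd

# Hypothetical frames for illustration
df_a = pd.DataFrame({"Name": ["Alice", "Bob"], "City": ["NY", "LA"]})
df_b = df_a.copy()

# inplace=True mutates df_a and returns None
result = df_a.drop(columns="City", inplace=True)
print(result)

# Reassignment style: df_b is rebound to the new DataFrame
df_b = df_b.drop(columns="City")
print(list(df_b.columns))
```

Writing `df = df.drop(..., inplace=True)` would therefore replace the DataFrame with None.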
Advanced Use Cases
a) Handling Errors
What if you try to drop a label that doesn't exist? By default, pandas raises a KeyError.
# This will raise a KeyError
# df.drop(columns='Country')
You can change this behavior with the errors parameter:
- errors='raise' (default): Raises an error.
- errors='ignore': Skips any label that is not found and drops only the labels that exist.
# This will not raise an error, but also won't change the DataFrame
df.drop(columns='Country', errors='ignore')
print("\nDataFrame after trying to drop a non-existent column (ignored):")
print(df)
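The errors parameter matters most when a list mixes existing and missing labels. A minimal sketch, with a made-up frame `df_err`:

```python
import pandas as pd

# Hypothetical frame; 'Country' is deliberately not a column
df_err = pd.DataFrame(
    {"Name": ["Alice", "Bob"], "Age": [25, 30], "City": ["NY", "LA"]}
)

# Default behavior: one missing label makes the whole call fail
caught = False
try:
    df_err.drop(columns=["Age", "Country"])
except KeyError:
    caught = True

# errors='ignore': 'Age' is dropped, 'Country' is silently skipped
cleaned = df_err.drop(columns=["Age", "Country"], errors="ignore")

print(caught)
print(list(cleaned.columns))
```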
b) Dropping from a MultiIndex DataFrame
If your DataFrame has a hierarchical index, you can drop from a specific level.
# Create a MultiIndex DataFrame
index = pd.MultiIndex.from_tuples([('A', 1), ('A', 2), ('B', 1), ('B', 2)], names=['letter', 'number'])
df_multi = pd.DataFrame({'Data': [10, 20, 30, 40]}, index=index)
print("\nOriginal MultiIndex DataFrame:")
print(df_multi)
# Drop all rows where the 'letter' level is 'A'
df_multi_dropped = df_multi.drop(labels='A', level='letter')
print("\nMultiIndex DataFrame after dropping level 'letter' == 'A':")
print(df_multi_dropped)
Result:
Original MultiIndex DataFrame:
Data
letter number
A 1 10
2 20
B 1 30
2 40
MultiIndex DataFrame after dropping level 'letter' == 'A':
Data
letter number
B 1 30
2 40
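Two other MultiIndex variations are sketched below: dropping one specific row by its full tuple, and dropping by a different level. The frame `df_mi` rebuilds the example above so the block runs on its own.

```python
import pandas as pd

# Same shape as the MultiIndex example above, rebuilt to be self-contained
mi = pd.MultiIndex.from_tuples(
    [("A", 1), ("A", 2), ("B", 1), ("B", 2)], names=["letter", "number"]
)
df_mi = pd.DataFrame({"Data": [10, 20, 30, 40]}, index=mi)

# Drop a single row by giving its full (letter, number) tuple
one_row_gone = df_mi.drop(index=("A", 2))

# Drop every row whose 'number' level equals 1
number_1_gone = df_mi.drop(index=1, level="number")

print(one_row_gone)
print(number_1_gone)
```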
Common Alternatives to drop()
While drop() is the most direct method, other functions achieve similar goals.
a) Dropping Missing Values (dropna())
This is a specialized way to drop rows or columns that contain missing values, such as NaN (Not a Number) or None.
# Create a DataFrame with missing values
df_nan = pd.DataFrame({'A': [1, 2, np.nan], 'B': [5, np.nan, np.nan], 'C': [1, 2, 3]})
print("\nDataFrame with NaNs:")
print(df_nan)
# Drop any row that has at least one NaN value
print("\nDataFrame after dropping rows with any NaNs:")
print(df_nan.dropna())
# Drop any column that has at least one NaN value
print("\nDataFrame after dropping columns with any NaNs:")
print(df_nan.dropna(axis=1))
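dropna() also has parameters for finer control. The sketch below reuses the same data to show how, subset, and thresh; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Same data as the example above, rebuilt to be self-contained
df_nan = pd.DataFrame(
    {"A": [1, 2, np.nan], "B": [5, np.nan, np.nan], "C": [1, 2, 3]}
)

# how='all': drop a row only if every value in it is missing
only_all_nan_rows_dropped = df_nan.dropna(how="all")

# subset: look for missing values in column 'A' only
a_present = df_nan.dropna(subset=["A"])

# thresh: keep rows that have at least 2 non-missing values
at_least_two = df_nan.dropna(thresh=2)

print(len(only_all_nan_rows_dropped))
print(list(a_present.index))
print(list(at_least_two.index))
```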
b) Dropping Duplicates (drop_duplicates())
This is used to remove duplicate rows.
# Create a DataFrame with duplicate rows
df_dup = pd.DataFrame({'A': [1, 2, 2, 3], 'B': [4, 5, 5, 6]})
print("\nDataFrame with duplicates:")
print(df_dup)
# Keep the first occurrence of each duplicate
print("\nDataFrame after dropping duplicates:")
print(df_dup.drop_duplicates())
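By default, drop_duplicates() compares entire rows. The sketch below uses made-up data to show the subset and keep parameters, which restrict the comparison to chosen columns and control which occurrence survives.

```python
import pandas as pd

# Hypothetical data: column 'A' repeats the value 2, but 'B' differs
df_sub = pd.DataFrame({"A": [1, 2, 2, 3], "B": [4, 5, 9, 6]})

# Compare on 'A' only; keep the first occurrence (the default)
keep_first = df_sub.drop_duplicates(subset=["A"])

# Keep the last occurrence instead
keep_last = df_sub.drop_duplicates(subset=["A"], keep="last")

# keep=False removes every row that has a duplicate
no_repeats = df_sub.drop_duplicates(subset=["A"], keep=False)

print(list(keep_first.index))
print(list(keep_last.index))
print(list(no_repeats.index))
```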
Summary Table
| Goal | Method | Example |
|---|---|---|
| Drop a single row by index | drop() | df.drop(labels=2) |
| Drop multiple rows by index | drop() | df.drop(labels=[1, 3]) |
| Drop a single column by name | drop() | df.drop(columns='Age') |
| Drop multiple columns by name | drop() | df.drop(columns=['Age', 'City']) |
| Modify the DataFrame directly | drop() | df.drop(..., inplace=True) |
| Drop rows with missing values | dropna() | df.dropna() |
| Drop duplicate rows | drop_duplicates() | df.drop_duplicates() |
