Of course! The sub() method in pandas performs element-wise subtraction on Series and DataFrames, aligning the operands by label first. It's part of a family of flexible arithmetic methods that includes add(), mul(), div(), floordiv(), mod(), and pow().

Let's break down sub() with clear examples.
Basic Concept
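The sub() method subtracts one object from another, element by element. Its signatures are:
DataFrame.sub(other, axis='columns', level=None, fill_value=None)
Series.sub(other, level=None, fill_value=None, axis=0)
- other: The value or object you want to subtract. This can be a scalar (single number), a list, a tuple, a Series, or a DataFrame.
- axis: For a DataFrame, specifies which labels a Series-like other is matched against. For a Series, only 0 is meaningful.
  - 'columns' (or 1): Match other against the column labels, so it is subtracted from each row. This is the DataFrame default.
  - 'index' (or 0): Match other against the row labels, so it is subtracted from each column.
- level: For hierarchical (MultiIndex) indices, this specifies the level to match on; values are broadcast across the remaining levels.
- fill_value: A value to substitute when one of the two objects has a missing value (NaN) at a label, including labels that exist in only one object. If None (default), NaN will propagate. If both objects are missing at a label, the result is NaN regardless.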
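Since level is the least obvious of these parameters, here is a minimal sketch of how it might be used. The data, the level names group and item, and the offsets Series are all made up for illustration.

```python
import pandas as pd

# Hypothetical data: a Series with a two-level MultiIndex (group, item)
idx = pd.MultiIndex.from_product([["x", "y"], [1, 2]], names=["group", "item"])
s = pd.Series([10, 20, 30, 40], index=idx)

# One offset per group, indexed by the 'group' level only
offsets = pd.Series([1, 2], index=pd.Index(["x", "y"], name="group"))

# Match on the 'group' level and broadcast each offset across its items
result = s.sub(offsets, level="group")
print(result)
```

Every entry in group x has 1 subtracted and every entry in group y has 2 subtracted, without having to repeat the offsets for each item.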
sub() on a Series
The simplest case is subtracting a scalar or another Series from a Series.
Example A: Subtracting a Scalar
You can subtract a single number from every element in the Series.

import pandas as pd
import numpy as np
# Create a sample Series
s = pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd'])
# Subtract 5 from every element
result = s.sub(5)
print("Original Series:")
print(s)
print("\nSeries after subtracting 5:")
print(result)
Output:
Original Series:
a 10
b 20
c 30
d 40
dtype: int64
Series after subtracting 5:
a 5
b 15
c 25
d 35
dtype: int64
Example B: Subtracting Another Series (Element-wise)
When you subtract another Series, pandas aligns the data based on the index, not the position. This is a fundamental pandas concept.
import pandas as pd
# Create two Series with different indices
s1 = pd.Series([100, 200, 300], index=['a', 'b', 'c'])
s2 = pd.Series([10, 20, 30], index=['b', 'a', 'd'])
# Subtract s2 from s1. Note the alignment.
result = s1.sub(s2)
print("Series 1:")
print(s1)
print("\nSeries 2:")
print(s2)
print("\nResult of s1.sub(s2):")
print(result)
Output:
Series 1:
a 100
b 200
c 300
dtype: int64
Series 2:
b 10
a 20
d 30
dtype: int64
Result of s1.sub(s2):
a 80.0 # 100 (s1['a']) - 20 (s2['a'])
b 190.0 # 200 (s1['b']) - 10 (s2['b'])
c NaN # No matching index in s2
d NaN # No matching index in s1
dtype: float64
Notice how a and b were correctly matched by index, and the result for c and d is NaN because there was no corresponding value in the other Series.
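If you actually want position-based subtraction and the two Series have the same length, one option is to strip the labels from the second operand first. This is a small sketch reusing the same s1 and s2 as above:

```python
import pandas as pd

s1 = pd.Series([100, 200, 300], index=["a", "b", "c"])
s2 = pd.Series([10, 20, 30], index=["b", "a", "d"])

# A NumPy array carries no labels, so the subtraction is purely positional
by_position = s1.sub(s2.to_numpy())
print(by_position)
# a     90
# b    180
# c    270
# dtype: int64
```

The result keeps the index of s1, because s1 is the only operand that still has labels.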

sub() on a DataFrame
You can perform subtraction between DataFrames and scalars, or between two DataFrames.
Example A: Subtracting a Scalar from a DataFrame
The scalar is subtracted from every element in the DataFrame.
import pandas as pd
df = pd.DataFrame({
'A': [1, 2, 3],
'B': [10, 20, 30]
}, index=['row1', 'row2', 'row3'])
# Subtract 1 from every element
result_df = df.sub(1)
print("Original DataFrame:")
print(df)
print("\nDataFrame after subtracting 1:")
print(result_df)
Output:
Original DataFrame:
A B
row1 1 10
row2 2 20
row3 3 30
DataFrame after subtracting 1:
A B
row1 0 9
row2 1 19
row3 2 29
Example B: Subtracting Two DataFrames (Alignment is Key)
Just like with Series, pandas aligns DataFrames based on both index and column labels.
import pandas as pd
df1 = pd.DataFrame({
'A': [10, 20],
'B': [100, 200]
}, index=['x', 'y'])
df2 = pd.DataFrame({
'A': [1, 2],
'C': [5, 6] # Note different column name 'C'
}, index=['y', 'z'])
# Subtract df2 from df1
result_df = df1.sub(df2)
print("DataFrame 1:")
print(df1)
print("\nDataFrame 2:")
print(df2)
print("\nResult of df1.sub(df2):")
print(result_df)
Output:
DataFrame 1:
A B
x 10 100
y 20 200
DataFrame 2:
A C
y 1 5
z 2 6
Result of df1.sub(df2):
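      A   B   C
x   NaN NaN NaN
y  19.0 NaN NaN
z   NaN NaN NaN
- The value at ('y', 'A') is 20 - 1 = 19. It is the only cell present in both DataFrames.
- The value at ('y', 'B') is 200 - NaN = NaN, because df2 has no column 'B'.
- The value at ('x', 'A') is 10 - NaN = NaN, because df2 has no row 'x'.
- The value at ('z', 'C') is NaN - 6 = NaN, because df1 has neither row 'z' nor column 'C'.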
Example C: Using the axis Parameter
The axis parameter controls which DataFrame labels a Series is aligned with when you subtract it.
- axis=1 (or 'columns'): The Series index is matched against the column labels, so the Series is subtracted from each row. This is the default.
- axis=0 (or 'index'): The Series index is matched against the row labels, so the Series is subtracted from each column.
import pandas as pd
df = pd.DataFrame({
    'A': [10, 20, 30],
    'B': [100, 200, 300]
}, index=['r1', 'r2', 'r3'])
# One value per column, indexed by column label
per_column = pd.Series([1, 2], index=['A', 'B'])
# One value per row, indexed by row label
per_row = pd.Series([1, 2, 3], index=['r1', 'r2', 'r3'])
# 1. axis=1 (the default): match the Series index to the columns
# Subtracts 1 from column 'A' and 2 from column 'B' in every row
result_axis1 = df.sub(per_column, axis=1)
print("Subtraction with axis=1 (Series matched to columns):")
print(result_axis1)
# 2. axis=0: match the Series index to the row index
# Subtracts 1 from row 'r1', 2 from row 'r2', and 3 from row 'r3'
result_axis0 = df.sub(per_row, axis=0)
print("\nSubtraction with axis=0 (Series matched to rows):")
print(result_axis0)
Output:
Subtraction with axis=1 (Series matched to columns):
     A    B
r1   9   98
r2  19  198
r3  29  298

Subtraction with axis=0 (Series matched to rows):
     A    B
r1   9   99
r2  18  198
r3  27  297
- axis=1: In every row, it computes df['A'] - per_column['A'] and df['B'] - per_column['B'].
- axis=0: In every column, it computes the value at 'r1' minus per_row['r1'], the value at 'r2' minus per_row['r2'], and so on.
The Series index must match the labels on the chosen axis. Passing per_column with axis=0 would align 'A' and 'B' against 'r1', 'r2', and 'r3', find no matches, and return a DataFrame of NaN over the union of those labels.
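A common place where the choice of axis matters is centering data around a mean. The sketch below reuses the same small DataFrame; the names col_centered and row_centered are just illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {"A": [10, 20, 30], "B": [100, 200, 300]},
    index=["r1", "r2", "r3"],
)

# df.mean() is indexed by column label, so the default axis applies
col_centered = df.sub(df.mean())

# df.mean(axis=1) is indexed by row label, so axis=0 is needed
row_centered = df.sub(df.mean(axis=1), axis=0)

print(col_centered)
print(row_centered)
```

In both calls the rule is the same: pick the axis whose labels match the index of the Series being subtracted.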
The fill_value Parameter
The fill_value parameter helps you avoid NaN results when labels don't align. Wherever exactly one of the two objects is missing a value at a label, fill_value is substituted for it before subtracting. If both objects are missing at that label, the result is still NaN.
import pandas as pd
s1 = pd.Series([100, 200, 300], index=['a', 'b', 'c'])
s2 = pd.Series([10, 20, 30], index=['b', 'a', 'd'])
# Without fill_value, we get NaNs
result_nan = s1.sub(s2)
print("Result without fill_value:")
print(result_nan)
# With fill_value=0, missing values are treated as 0
result_fill = s1.sub(s2, fill_value=0)
print("\nResult with fill_value=0:")
print(result_fill)
Output:
Result without fill_value:
a 80.0
b 190.0
c NaN
d NaN
dtype: float64
Result with fill_value=0:
a 80.0 # 100 - 20
b 190.0 # 200 - 10
c 300.0 # 300 - 0
d -30.0 # 0 - 30
dtype: float64
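The same parameter works on DataFrames, where it is easier to see the case in which a cell stays NaN. This sketch reuses df1 and df2 from the DataFrame alignment example:

```python
import pandas as pd

df1 = pd.DataFrame({"A": [10, 20], "B": [100, 200]}, index=["x", "y"])
df2 = pd.DataFrame({"A": [1, 2], "C": [5, 6]}, index=["y", "z"])

# Cells present in only one frame are treated as 0 before subtracting
result = df1.sub(df2, fill_value=0)
print(result)

# ('x', 'C') and ('z', 'B') exist in neither frame, so they remain NaN
print(result.isna())
```

For instance, ('y', 'C') becomes 0 - 5 = -5 and ('y', 'B') becomes 200 - 0 = 200, while cells that neither DataFrame defines have nothing to fill from.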
Summary: sub() vs. the - Operator
For most simple cases, you can use the standard - operator, which gives the same result as sub() called with its default arguments.
| Method | Syntax | Notes |
|---|---|---|
| sub() method | df.sub(other) | More explicit. Allows for additional parameters like axis, fill_value, and level. |
| - operator | df - other | More concise and Pythonic. Perfect for simple cases. |
For example, df - 5 is functionally identical to df.sub(5). The choice is often a matter of style and whether you need the extra parameters of the sub() method.
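A quick way to convince yourself of the equivalence is to compare the two results directly. The sketch below also shows rsub(), the reflected counterpart of sub(), for when the DataFrame is on the right-hand side:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [10, 20, 30]})

# The operator and the method agree when no extra arguments are needed
same_forward = (df - 5).equals(df.sub(5))

# rsub() is the reflected form: df.rsub(5) computes 5 - df
same_reflected = (5 - df).equals(df.rsub(5))

print(same_forward, same_reflected)
```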
