杰瑞科技汇

What Are the Ways to Sort a Series in Python?

Sorting a Series is a common task in Python, and the Pandas library is the standard tool for it. Pandas provides two flexible methods for sorting Series objects: .sort_values() and .sort_index().


Here’s a comprehensive guide covering the most common sorting scenarios.

The Setup: Creating a Pandas Series

First, let's make sure you have Pandas installed and create a sample Series to work with.

# If you don't have pandas installed, run this in your terminal:
pip install pandas

Now, in your Python script or notebook:

import pandas as pd
import numpy as np # For creating missing values (NaN)
# Create a sample Series
data = [10, 30, 20, 50, 40, np.nan, 15]
labels = ['c', 'a', 'f', 'b', 'e', 'd', 'g']
my_series = pd.Series(data, index=labels)
print("Original Series:")
print(my_series)

Output:

Original Series:
c    10.0
a    30.0
f    20.0
b    50.0
e    40.0
d     NaN
g    15.0
dtype: float64

Sorting by Values (.sort_values())

This is the most common method. It sorts the Series by its values (numbers, strings, or any other comparable data); each index label stays attached to its value.

Basic Sort (Ascending Order)

By default, .sort_values() sorts in ascending order (ascending=True).

sorted_ascending = my_series.sort_values()
print("\nSeries sorted by values (Ascending):")
print(sorted_ascending)

Output:

Series sorted by values (Ascending):
c    10.0
g    15.0
f    20.0
a    30.0
e    40.0
b    50.0
d     NaN
dtype: float64

Note: By default, NaN (missing) values are placed at the end of the sorted Series (na_position='last').

Sort in Descending Order

To sort from largest to smallest, set ascending=False.

sorted_descending = my_series.sort_values(ascending=False)
print("\nSeries sorted by values (Descending):")
print(sorted_descending)

Output:

Series sorted by values (Descending):
b    50.0
e    40.0
a    30.0
f    20.0
g    15.0
c    10.0
d     NaN
dtype: float64
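
A common use of a descending sort is picking the largest few values. The following is a minimal sketch using the same sample data; the variable names s and top3 are chosen here for illustration only.

```python
import numpy as np
import pandas as pd

# Same sample data as in the setup section
s = pd.Series([10, 30, 20, 50, 40, np.nan, 15],
              index=['c', 'a', 'f', 'b', 'e', 'd', 'g'])

# Sort from largest to smallest, then keep the first three rows.
# NaN sorts to the end by default, so it never lands in the top rows here.
top3 = s.sort_values(ascending=False).head(3)
print(top3)
```

For this data, s.nlargest(3) should give the same three rows without sorting the whole Series.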

Handling Missing Values (na_position)

You can control where the NaN values appear using the na_position parameter. It can be 'last' (default) or 'first', and it applies regardless of the sort direction.

# Place NaN values at the beginning
sorted_nan_first = my_series.sort_values(ascending=True, na_position='first')
print("\nSeries sorted with NaN at the beginning (Ascending):")
print(sorted_nan_first)

Output:

Series sorted with NaN at the beginning (Ascending):
d     NaN
c    10.0
g    15.0
f    20.0
a    30.0
e    40.0
b    50.0
dtype: float64
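
As a small sketch of how na_position interacts with the sort direction, the example below combines it with a descending sort and also shows the alternative of dropping missing values before sorting. The names s, desc_nan_first and no_nan are illustrative only.

```python
import numpy as np
import pandas as pd

s = pd.Series([10, 30, 20, 50, 40, np.nan, 15],
              index=['c', 'a', 'f', 'b', 'e', 'd', 'g'])

# na_position is independent of ascending: NaN goes first, then 50 down to 10
desc_nan_first = s.sort_values(ascending=False, na_position='first')
print(desc_nan_first.index.tolist())  # ['d', 'b', 'e', 'a', 'f', 'g', 'c']

# Alternatively, remove missing values before sorting
no_nan = s.dropna().sort_values()
print(no_nan.index.tolist())  # ['c', 'g', 'f', 'a', 'e', 'b']
```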

Sorting by Index (.sort_index())

Sometimes you want to sort the Series based on its index (the labels) instead of its values.

Basic Index Sort (Ascending)

sorted_index_ascending = my_series.sort_index()
print("\nSeries sorted by index (Ascending):")
print(sorted_index_ascending)

Output:

Series sorted by index (Ascending):
a    30.0
b    50.0
c    10.0
d     NaN
e    40.0
f    20.0
g    15.0
dtype: float64

Sort Index in Descending Order

sorted_index_descending = my_series.sort_index(ascending=False)
print("\nSeries sorted by index (Descending):")
print(sorted_index_descending)

Output:

Series sorted by index (Descending):
g    15.0
f    20.0
e    40.0
d     NaN
c    10.0
b    50.0
a    30.0
dtype: float64
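
The two methods are often used together. The sketch below sorts by value, restores label order with .sort_index(), and then shows ignore_index=True, which discards the labels and renumbers the result. It assumes pandas 1.0 or later for ignore_index; the variable names are illustrative.

```python
import pandas as pd

s = pd.Series([30, 10, 20], index=['x', 'y', 'z'])

# Labels travel with their values: the order becomes y, z, x
by_value = s.sort_values()

# Sorting by index brings the Series back to label order: x, y, z
restored = by_value.sort_index()

# ignore_index=True drops the labels and renumbers the result 0..n-1
renumbered = s.sort_values(ignore_index=True)
print(renumbered)
```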

In-Place Sorting vs. Returning a New Series

It's crucial to understand the difference between these two behaviors:

  • Default (Returns a new Series): Most methods in Pandas, including .sort_values() and .sort_index(), return a new, sorted Series and leave the original Series unchanged. This is generally safer as it prevents accidental data modification.

    sorted_series = my_series.sort_values()
    print("\nOriginal Series (unchanged):")
    print(my_series)
    print("\nNew Sorted Series:")
    print(sorted_series)
  • In-Place (Modifies the original): If you want to modify the original Series directly, pass inplace=True. In that case the method returns None, so do not assign its result back to a variable.

    print("\nOriginal Series before in-place sort:")
    print(my_series)
    # This modifies my_series directly
    my_series.sort_values(ascending=False, inplace=True)
    print("\nOriginal Series after in-place sort:")
    print(my_series)
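
A frequent mistake is assigning the result of an in-place sort. The short sketch below shows what happens, along with the more common pattern of rebinding the name to the returned Series. The names s, t and result are illustrative.

```python
import pandas as pd

s = pd.Series([3, 1, 2], index=['a', 'b', 'c'])

# With inplace=True, s itself is reordered and nothing is returned
result = s.sort_values(inplace=True)
print(result)      # None
print(s.tolist())  # [1, 2, 3]

# The usual alternative: keep the default and rebind the name
t = pd.Series([3, 1, 2], index=['a', 'b', 'c'])
t = t.sort_values()
print(t.tolist())  # [1, 2, 3]
```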

Advanced Sorting: Sorting by a Key Function

What if you have a Series of strings and you want to sort them in a custom way, like ignoring case?

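You can use the key argument in .sort_values() (available since pandas 1.1). The key function is called once with the whole Series, not once per element, and must return a Series of the same length; the sort is then performed on the returned values. This is why the example below uses the vectorized x.str.lower().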

# Create a Series of strings with mixed case
fruits = pd.Series(['Apple', 'banana', 'cherry', 'apricot', 'Blueberry'])
print("Original Fruits Series:")
print(fruits)
# Sort by the lowercase version of each string
sorted_fruits = fruits.sort_values(key=lambda x: x.str.lower())
print("\nFruits Series sorted case-insensitively:")
print(sorted_fruits)

Output:

Original Fruits Series:
0        Apple
1       banana
2       cherry
3      apricot
4    Blueberry
dtype: object

Fruits Series sorted case-insensitively:
0        Apple
3      apricot
1       banana
4    Blueberry
2       cherry
dtype: object
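
The same mechanism works for other vectorized transformations. As a sketch, the example below sorts strings by length and then uses key with .sort_index() to sort labels case-insensitively. It assumes pandas 1.1 or later; the data and variable names are made up for illustration.

```python
import pandas as pd

words = pd.Series(['kiwi', 'fig', 'banana', 'grape'])

# key receives the whole Series and returns a Series of string lengths
by_length = words.sort_values(key=lambda s: s.str.len())
print(by_length.tolist())  # ['fig', 'kiwi', 'grape', 'banana']

# sort_index accepts key as well; here it receives the index labels
labelled = pd.Series([1, 2, 3], index=['b', 'a', 'C'])
by_label = labelled.sort_index(key=lambda idx: idx.str.lower())
print(by_label.index.tolist())  # ['a', 'b', 'C']
```

Without the key, the plain index sort would put 'C' first, because uppercase letters compare lower than lowercase ones.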

Summary of Key Methods

Method | Description | Common Parameters
series.sort_values() | Sorts the Series by its values. | ascending=True/False, na_position='first'/'last', inplace=True/False, key=function
series.sort_index() | Sorts the Series by its index. | ascending=True/False, inplace=True/False

Complete Example: Putting It All Together

import pandas as pd
import numpy as np
# 1. Create Data
data = {'Sales': [500, 230, 800, 150, 900, np.nan], 'Product': ['A', 'C', 'B', 'A', 'D', 'C']}
df = pd.DataFrame(data)
series_to_sort = df['Sales']
print("--- Original Series ---")
print(series_to_sort)
# 2. Sort by values (Ascending)
print("\n--- Sorted by Values (Ascending) ---")
print(series_to_sort.sort_values())
# 3. Sort by values (Descending, NaNs at the end)
print("\n--- Sorted by Values (Descending, NaN last) ---")
print(series_to_sort.sort_values(ascending=False, na_position='last'))
# 4. Sort by index
print("\n--- Sorted by Index ---")
print(series_to_sort.sort_index())
# 5. In-place sort on a copy (sorting a column's Series never reorders the DataFrame's rows)
sales_copy = df['Sales'].copy()
sales_copy.sort_values(ascending=True, inplace=True)
print("\n--- Copy of the Sales column after in-place sort ---")
print(sales_copy)
print("\n--- Original DataFrame (row order unchanged) ---")
print(df)
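
Sorting a single column taken from a DataFrame does not reorder the DataFrame's rows. If the goal is to reorder whole rows by that column, one option is DataFrame.sort_values with the by argument. The sketch below rebuilds the same data; df_sorted and sorted_sales are illustrative names.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Sales': [500, 230, 800, 150, 900, np.nan],
                   'Product': ['A', 'C', 'B', 'A', 'D', 'C']})

# Sorting the column alone returns a new Series; df keeps its row order
sorted_sales = df['Sales'].sort_values()

# To reorder whole rows, sort the DataFrame by that column instead
df_sorted = df.sort_values(by='Sales')
print(df_sorted)
```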

This guide should cover almost all your needs for sorting a Pandas Series. Remember that .sort_values() is your go-to for sorting by data, and .sort_index() is for sorting by labels.
