杰瑞科技汇

How to Efficiently Remove Duplicates from a Python List

In Python, there are several ways to get the unique elements from a list. The best method depends on whether you need to preserve the original order of the elements.

Here’s a breakdown of the most common and effective methods, from the most Pythonic to the most explicit.

Summary: Which Method to Use?

| Method | Preserves Order? | Best For... | Key Characteristic |
|---|---|---|---|
| list(dict.fromkeys(my_list)) | Yes | The best general-purpose, modern, and readable way. | Most Pythonic and efficient for preserving order. |
| set(my_list) | No | When you don't need order and just want unique items. | Fastest and most concise for simple uniqueness. |
| collections.OrderedDict | Yes | Python 2 compatibility or when you need an explicit object. | A classic, slightly more verbose way to preserve order. |
| Manual Loop | Yes | When you need to perform other logic while checking for uniqueness. | The most explicit and flexible approach. |
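
The speed claims in the table are easy to check on your own machine. The sketch below times the four approaches with the standard timeit module. The helper names (dedupe_dict and so on) and the test data are illustrative assumptions, not part of any library, and the absolute numbers will vary with Python version and hardware.

```python
import random
import timeit
from collections import OrderedDict

# Hypothetical test data: 10,000 integers drawn from a small range,
# so the list contains many duplicates.
random.seed(0)
data = [random.randrange(1000) for _ in range(10_000)]

def dedupe_dict(items):
    return list(dict.fromkeys(items))

def dedupe_set(items):
    return list(set(items))

def dedupe_ordered_dict(items):
    return list(OrderedDict.fromkeys(items))

def dedupe_loop(items):
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result

# Time 50 runs of each helper and print the total per helper.
timings = {}
for func in (dedupe_dict, dedupe_set, dedupe_ordered_dict, dedupe_loop):
    timings[func.__name__] = timeit.timeit(lambda: func(data), number=50)
    print(f"{func.__name__:<20} {timings[func.__name__]:.4f} s")
```

Treat the printed figures as a rough comparison only; for real workloads, measure with data that resembles your own.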

Method 1: The Best & Most Pythonic Way (Preserves Order)

This is the recommended approach for modern Python (3.7+). It's concise, readable, and efficient.

It works by creating a dictionary from the list's items. Dictionary keys are inherently unique, and since Python 3.7, standard dictionaries preserve the insertion order.

my_list = [1, 2, 5, 2, 'a', 'b', 'a', 5]
# Create a dictionary from the list and convert its keys back to a list
unique_list = list(dict.fromkeys(my_list))
print(unique_list)
# Output: [1, 2, 5, 'a', 'b']

Why this works:

  1. dict.fromkeys(my_list) creates a dictionary where each element of my_list becomes a key.
    • Duplicates are automatically handled because keys must be unique.
    • The first occurrence of a value is kept, and subsequent ones are ignored.
  2. list(...) converts the keys of this new dictionary back into a list.
  3. In Python 3.7+, the order of keys in a dictionary is the same as the order they were inserted, so the original order is preserved.
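
The steps above can be made visible by keeping the intermediate dictionary around. This small sketch also shows one consequence of key uniqueness: keys are matched by hash and equality, so values that compare equal (1, 1.0, and True) collapse into the first one seen. The variable names are illustrative.

```python
mixed = [1, 1.0, True, 'a', 'A', 2]

# Step 1: build the intermediate dictionary; every value defaults to None.
as_dict = dict.fromkeys(mixed)
print(as_dict)
# Output: {1: None, 'a': None, 'A': None, 2: None}

# Step 2: iterating a dictionary yields its keys in insertion order.
unique = list(as_dict)
print(unique)
# Output: [1, 'a', 'A', 2]
```

Note that 'a' and 'A' are different strings and are both kept, while 1.0 and True are dropped because they are equal to the earlier key 1.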

Method 2: The Fastest Way (Does Not Preserve Order)

If you don't care about the original order of the elements, using a set is the most direct and performant method. A set is a data structure that only stores unique elements.

my_list = [1, 2, 5, 2, 'a', 'b', 'a', 5]
# Convert the list to a set to get unique elements, then back to a list
unique_list = list(set(my_list))
print(unique_list)
# Possible Output: [1, 2, 5, 'b', 'a']
# (Note: The order is not guaranteed!)

Important Caveat: Sets are inherently unordered collections. When you convert the set back to a list, the order of elements will be arbitrary. Do not use this method if the order of elements matters.
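
If you only need a reproducible result rather than the original order, one option is to sort the set. This sketch assumes all elements are mutually comparable (for example, all integers); a list that mixes integers and strings would raise a TypeError when sorted in Python 3.

```python
numbers = [3, 1, 2, 3, 1, 10, 2]

# set() removes duplicates; sorted() returns a new list in ascending order.
unique_sorted = sorted(set(numbers))
print(unique_sorted)
# Output: [1, 2, 3, 10]
```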


Method 3: The Classic Way (Preserves Order)

This method is similar to the first one but uses collections.OrderedDict. It's a great alternative, especially if you're working with older versions of Python (before 3.7) where standard dictionaries did not guarantee order.

from collections import OrderedDict
my_list = [1, 2, 5, 2, 'a', 'b', 'a', 5]
# Create an OrderedDict from the list and convert it to a list
unique_list = list(OrderedDict.fromkeys(my_list))
print(unique_list)
# Output: [1, 2, 5, 'a', 'b']

This works identically to the dict.fromkeys() method but is more explicit about the intention to preserve order.
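
If the same code has to run on both old and new interpreters, one possible pattern is to pick the mapping type once, based on the running version. This is only a sketch; the names _ordered_mapping and unique_in_order are hypothetical.

```python
import sys
from collections import OrderedDict

# Plain dict guarantees insertion order from Python 3.7 onward;
# fall back to OrderedDict on anything older.
_ordered_mapping = dict if sys.version_info >= (3, 7) else OrderedDict

def unique_in_order(items):
    """Return the unique items of an iterable, keeping first occurrences."""
    return list(_ordered_mapping.fromkeys(items))

print(unique_in_order([1, 2, 5, 2, 'a', 'b', 'a', 5]))
# Output: [1, 2, 5, 'a', 'b']
```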

Method 4: The Manual Loop (Most Explicit)

This approach is more verbose but can be useful if you need to perform other actions on the elements as you check for uniqueness. It's also a very clear and easy-to-understand method for beginners.

my_list = [1, 2, 5, 2, 'a', 'b', 'a', 5]
unique_list = []
seen = set() # Use a set for fast lookups
for item in my_list:
    if item not in seen:
        unique_list.append(item)
        seen.add(item)
print(unique_list)
# Output: [1, 2, 5, 'a', 'b']

Why this works:

  1. We initialize an empty unique_list to store our result and a seen set to keep track of items we've already encountered.
  2. We loop through the original my_list.
  3. For each item, we check if it's in the seen set. Checking for an item in a set is extremely fast (O(1) on average).
  4. If the item is not in seen, it's the first time we've encountered it, so we append it to unique_list and add it to the seen set.
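
The flexibility of the loop is easier to see once it is wrapped in a reusable function. The sketch below is loosely modeled on the unique_everseen recipe in the itertools documentation; the name unique_by and its key parameter are illustrative assumptions. The key function decides what counts as "the same" item.

```python
def unique_by(iterable, key=None):
    """Yield items in order, skipping any whose key has been seen before."""
    seen = set()
    for item in iterable:
        # Without a key function, the item itself is the marker.
        marker = item if key is None else key(item)
        if marker not in seen:
            seen.add(marker)
            yield item

words = ['Apple', 'apple', 'Banana', 'APPLE', 'banana', 'cherry']

# Case-insensitive de-duplication: the first spelling of each word wins.
print(list(unique_by(words, key=str.lower)))
# Output: ['Apple', 'Banana', 'cherry']
```

Because it is a generator, it also works on large or lazy iterables without building the full result up front.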

Bonus: For Lists of Dictionaries or Objects

If your list contains dictionaries (or other mutable, unhashable objects), you can't use them directly as dictionary keys or set items: Python raises "TypeError: unhashable type". You need to convert each one into a hashable representation first, such as a tuple of its items. This only works if the values inside each dictionary are themselves hashable.

list_of_dicts = [
    {'id': 1, 'name': 'Alice'},
    {'id': 2, 'name': 'Bob'},
    {'id': 1, 'name': 'Alice'}, # Duplicate
    {'id': 3, 'name': 'Charlie'}
]
# Build a hashable key for each dictionary: a tuple of its sorted items.
# Sorting makes the key independent of each dictionary's key order.
unique_tuples = {tuple(sorted(d.items())) for d in list_of_dicts}
# Convert the set of tuples back to a list of dictionaries
unique_list_of_dicts = [dict(t) for t in unique_tuples]
print(unique_list_of_dicts)
# Possible output (order is not guaranteed because a set is used):
# [{'id': 1, 'name': 'Alice'}, {'id': 2, 'name': 'Bob'}, {'id': 3, 'name': 'Charlie'}]

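Note: This method does not preserve order, because the intermediate collection is a set. To preserve order with dictionaries or other complex objects, use the manual loop (Method 4) and store a hashable key derived from each object in the seen set instead of the object itself.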
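
As a minimal sketch of the order-preserving variant, the loop from Method 4 can track a hashable key for each dictionary. It assumes the dictionary values are hashable and the keys are sortable (plain strings here); the names marker and ordered_unique_dicts are illustrative.

```python
list_of_dicts = [
    {'id': 1, 'name': 'Alice'},
    {'id': 2, 'name': 'Bob'},
    {'id': 1, 'name': 'Alice'},   # Exact duplicate
    {'name': 'Bob', 'id': 2},     # Same content, different key order
    {'id': 3, 'name': 'Charlie'},
]

seen = set()
ordered_unique_dicts = []
for d in list_of_dicts:
    # A sorted tuple of items is hashable and ignores key order.
    marker = tuple(sorted(d.items()))
    if marker not in seen:
        seen.add(marker)
        ordered_unique_dicts.append(d)

print(ordered_unique_dicts)
# Output: [{'id': 1, 'name': 'Alice'}, {'id': 2, 'name': 'Bob'}, {'id': 3, 'name': 'Charlie'}]
```

The result holds the original dictionary objects, not copies, so later changes to them will show up in both lists.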
