杰瑞科技汇

java hashset string

Let's dive deep into using String objects with HashSet in Java. This is a common and important topic because it highlights some of the most fundamental concepts of the Java Collections Framework.

Here’s a comprehensive guide covering:

  1. The Basics: How to use a HashSet of Strings.
  2. The Crucial Concept: How HashSet works with Strings (.equals() and .hashCode()).
  3. A Detailed Example: Putting it all together.
  4. Common Pitfalls: What happens if you get it wrong?
  5. When to Use It: Best practices.

The Basics: Creating and Using a HashSet<String>

A HashSet is a collection that does not allow duplicate elements. It provides constant-time performance (O(1)) for basic operations like add(), remove(), and contains(), assuming the hash function disperses the elements well.

To use a HashSet for String objects, you simply specify the generic type String.

import java.util.HashSet;
import java.util.Set;
public class HashSetOfStringExample {
    public static void main(String[] args) {
        // Create a HashSet that will store String objects
        Set<String> uniqueFruits = new HashSet<>();
        // 1. Add elements using the add() method
        System.out.println("Adding fruits to the set...");
        uniqueFruits.add("Apple");
        uniqueFruits.add("Banana");
        uniqueFruits.add("Orange");
        uniqueFruits.add("Apple"); // Adding a duplicate
        System.out.println("Set contents: " + uniqueFruits);
        // Possible output: Set contents: [Apple, Orange, Banana]
        // Note: The order is not guaranteed! HashSet does not maintain insertion order.
        System.out.println("------------------------------------");
        // 2. Check if an element exists using contains()
        boolean hasApple = uniqueFruits.contains("Apple");
        System.out.println("Does the set contain 'Apple'? " + hasApple); // true
        boolean hasGrape = uniqueFruits.contains("Grape");
        System.out.println("Does the set contain 'Grape'? " + hasGrape); // false
        System.out.println("------------------------------------");
        // 3. Remove an element using remove()
        System.out.println("Removing 'Banana'...");
        uniqueFruits.remove("Banana");
        System.out.println("Set contents after removal: " + uniqueFruits); // e.g. [Apple, Orange]
        System.out.println("------------------------------------");
        // 4. Get the size of the set
        System.out.println("Size of the set: " + uniqueFruits.size()); // 2
    }
}
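
One detail the example above does not show is that add() reports whether the set actually changed. The following is a minimal sketch of using that return value to count duplicates; the class name AddReturnValueDemo and the helper countDuplicates are hypothetical names chosen for illustration.

```java
import java.util.HashSet;
import java.util.Set;

class AddReturnValueDemo {
    // Hypothetical helper: counts how many of the given words repeat an earlier word.
    static int countDuplicates(String... words) {
        Set<String> seen = new HashSet<>();
        int duplicates = 0;
        for (String word : words) {
            // add() returns false when the set already contains an equal element
            if (!seen.add(word)) {
                duplicates++;
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        // "Apple" appears twice, so exactly one add() call is rejected
        System.out.println(countDuplicates("Apple", "Banana", "Orange", "Apple")); // prints 1
    }
}
```

Checking the return value of add() avoids a separate contains() call before each insertion.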

The Crucial Concept: How HashSet Works with Strings

This is the most important part to understand. A HashSet is not a magic box; it uses two methods from the objects it stores to determine uniqueness and to find them quickly:

  1. public int hashCode()
  2. public boolean equals(Object obj)

How HashSet Uses Them (The "Hashing" Process)

When you add an element to a HashSet (e.g., uniqueFruits.add("Apple")), this is what happens:

  1. Calculate the Hash Code: The HashSet calls the hashCode() method on the object you're adding ("Apple"). This method returns an int, which is the input for choosing a bucket. The String class computes it from the characters in the string. Crucially, if two strings are equal, their hash codes must be equal.
  2. Find the Bucket: The HashSet reduces this hash code to an index into its internal array of "buckets". Objects with the same hash code always land in the same bucket, and objects with different hash codes can still end up sharing one.
  3. Check for Duplicates (The equals() part):
    • If the bucket is empty, the new object is simply placed in it.
    • If the bucket is not empty, the HashSet must check if the object you're adding is a duplicate of any object already in that bucket. It does this by iterating through the objects in the bucket and calling the equals() method on each one.
    • If newObject.equals(existingObject) returns true for any object in the bucket, the HashSet concludes it's a duplicate and does not add the new object.
    • If the loop finishes and no equals() comparison returned true, the new object is added to the bucket.
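
The steps above can be sketched as a toy set. This is only an illustration under simplifying assumptions, not how java.util.HashSet is actually written: the real class is backed by a HashMap, resizes its table, and accepts null. The class name ToyStringSet and the helper bucketFor are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class ToyStringSet {
    // A fixed number of buckets; each bucket is a simple list of strings
    private final List<List<String>> buckets = new ArrayList<>();
    private int size = 0;

    ToyStringSet(int bucketCount) {
        for (int i = 0; i < bucketCount; i++) {
            buckets.add(new ArrayList<>());
        }
    }

    // Steps 1 and 2: compute the hash code and reduce it to a bucket index
    private List<String> bucketFor(String value) {
        int index = Math.floorMod(value.hashCode(), buckets.size());
        return buckets.get(index);
    }

    // Step 3: scan the bucket with equals(); add only if nothing matches
    boolean add(String value) {
        List<String> bucket = bucketFor(value);
        for (String existing : bucket) {
            if (existing.equals(value)) {
                return false; // duplicate, set unchanged
            }
        }
        bucket.add(value);
        size++;
        return true;
    }

    boolean contains(String value) {
        // List.contains() compares elements with equals()
        return bucketFor(value).contains(value);
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        ToyStringSet set = new ToyStringSet(16);
        System.out.println(set.add("Apple"));             // true
        System.out.println(set.add(new String("Apple"))); // false: equal content
        System.out.println(set.size());                   // 1
    }
}
```

Because only one bucket is scanned per operation, lookups stay fast as long as the hash codes spread the elements across buckets.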

Why is this important for String?

The good news is: You don't have to do anything special!

The String class in Java is immutable and has correctly implemented both equals() and hashCode().

  • s1.equals(s2) returns true only if s1 and s2 contain the exact same sequence of characters.
  • s1.hashCode() is calculated in a way that guarantees that if s1.equals(s2) is true, then s1.hashCode() == s2.hashCode().

This contract is essential for HashSet and HashMap to work correctly.
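
To make the link between characters and hash code concrete, here is a small sketch that recomputes the hash the way the String documentation describes it: s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1], using int arithmetic. The class name StringHashFormula and the method manualHash are hypothetical names for this illustration.

```java
class StringHashFormula {
    // Recomputes the documented String hash with a running multiply-by-31 loop.
    // int overflow wraps around, exactly as it does inside String.hashCode().
    static int manualHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        String text = "hello";
        System.out.println(manualHash(text)); // 99162322
        System.out.println(text.hashCode());  // 99162322
        // A distinct String object with the same characters hashes identically
        System.out.println(manualHash(text) == new String("hello").hashCode()); // true
    }
}
```

Since the result depends only on the character sequence, two equal strings cannot produce different hash codes.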

A Detailed Example: Seeing hashCode() and equals() in Action

Let's first watch the String class honor this contract. The Common Pitfalls section further down then shows how a custom mutable class can break a HashSet.

Scenario A: The Correct Way (Using String)

import java.util.HashSet;
import java.util.Set;
public class StringCorrectness {
    public static void main(String[] args) {
        Set<String> stringSet = new HashSet<>();
        String s1 = new String("hello");
        String s2 = new String("hello");
        String s3 = new String("world");
        System.out.println("s1.hashCode(): " + s1.hashCode()); // Will be the same for s1 and s2
        System.out.println("s2.hashCode(): " + s2.hashCode());
        System.out.println("s3.hashCode(): " + s3.hashCode());
        System.out.println("s1.equals(s2): " + s1.equals(s2)); // true
        System.out.println("s1.equals(s3): " + s1.equals(s3)); // false
        stringSet.add(s1);
        stringSet.add(s2); // This will be correctly identified as a duplicate
        stringSet.add(s3);
        System.out.println("Final Set Size: " + stringSet.size()); // Correctly 2
        System.out.println("Final Set Contents: " + stringSet); // [hello, world] (or [world, hello])
    }
}

Output:

s1.hashCode(): 99162322
s2.hashCode(): 99162322
s3.hashCode(): 113318802
s1.equals(s2): true
s1.equals(s3): false
Final Set Size: 2
Final Set Contents: [world, hello]

As you can see, because s1 and s2 are equal and have the same hash code, s2 was correctly rejected as a duplicate.
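
The reverse does not hold: equal hash codes do not imply equal strings. As a small sketch of why the equals() check is still needed, the strings "Aa" and "BB" share the hash code 2112 yet are different, so a HashSet keeps both. The class name HashCollisionDemo and the helper sizeAfterAdding are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

class HashCollisionDemo {
    // Hypothetical helper: adds every value to a fresh HashSet and returns its size.
    static int sizeAfterAdding(String... values) {
        Set<String> set = new HashSet<>();
        for (String value : values) {
            set.add(value);
        }
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode());   // 2112
        System.out.println("BB".hashCode());   // 2112
        System.out.println("Aa".equals("BB")); // false
        // Same bucket, but equals() tells them apart, so both are stored
        System.out.println(sizeAfterAdding("Aa", "BB")); // 2
    }
}
```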


Common Pitfalls (And Why String Avoids Them)

Imagine if you had a mutable class and you changed an object after it was put into a HashSet.

import java.util.HashSet;
import java.util.Objects;
import java.util.Set;
// WARNING: This code deliberately demonstrates a bug. Do not mutate objects stored in a HashSet.
public class MutableBrokenExample {
    public static void main(String[] args) {
        Set<MutablePerson> people = new HashSet<>();
        MutablePerson p1 = new MutablePerson("Alice", 30);
        people.add(p1);
        System.out.println("Set contains Alice (before change): " + people.contains(p1)); // true
        // Now, we change the state of the object that is IN the set
        p1.setName("Alice Smith");
        System.out.println("Set contains Alice (after change): " + people.contains(p1)); // false!
        System.out.println("Set size: " + people.size()); // Still 1, but the element is "lost"
    }
}
class MutablePerson {
    private String name;
    private int age;
    public MutablePerson(String name, int age) {
        this.name = name;
        this.age = age;
    }
    public void setName(String name) { this.name = name; }
    public String getName() { return name; }
    // One possible implementation: equals() and hashCode() are both based on 'name' only
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof MutablePerson)) return false;
        return Objects.equals(name, ((MutablePerson) o).name);
    }
    @Override
    public int hashCode() {
        return Objects.hashCode(name);
    }
}

Why this breaks:

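  1. You add p1 ("Alice") to the set. The set computes its hash code from "Alice" and stores the object in the corresponding bucket.
  2. You change p1's name to "Alice Smith". The set is not notified, so the object stays in the bucket that was chosen for "Alice".
  3. The set and the variable p1 refer to the same object, but its hashCode() now returns a different value from the one the set used when storing it.
  4. When you call people.contains(p1), the set computes the hash code for "Alice Smith" and searches under that value. The object is still filed under the hash of "Alice", so the lookup fails even though the object is in the set.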
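
If a stored object really has to change, one common workaround is to remove it, modify it, and add it back so the set files it under its new hash code. The sketch below assumes a simplified Person class with name-based equality; Person, SafeRenameDemo and renameInSet are hypothetical names used only for this illustration.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class SafeRenameDemo {
    // Hypothetical stand-in for a mutable class whose equality depends on 'name'
    static final class Person {
        private String name;

        Person(String name) {
            this.name = name;
        }

        void setName(String name) {
            this.name = name;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Person && Objects.equals(name, ((Person) o).name);
        }

        @Override
        public int hashCode() {
            return Objects.hashCode(name);
        }
    }

    // Remove first (while the old hash still matches), then mutate, then re-add.
    // Returns whether the set contains the person afterwards.
    static boolean renameInSet(Set<Person> people, Person person, String newName) {
        boolean wasPresent = people.remove(person);
        person.setName(newName);
        if (wasPresent) {
            people.add(person);
        }
        return people.contains(person);
    }

    public static void main(String[] args) {
        Set<Person> people = new HashSet<>();
        Person p1 = new Person("Alice");
        people.add(p1);
        System.out.println(renameInSet(people, p1, "Alice Smith")); // true
        System.out.println(people.size());                          // 1
    }
}
```

Where possible, keeping the fields used by equals() and hashCode() final avoids the problem altogether.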

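Why String is safe:

A String cannot be changed after it is created, so its hash code stays the same for as long as it sits in the set. Methods such as concat() or toUpperCase() return a new String and leave the stored object untouched, which means a String in a HashSet can never end up filed under a stale hash code.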
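
As a small illustration of immutability at work, the sketch below "changes" a string after adding it to a set and checks that the stored element is unaffected. The class name ImmutableStringDemo and the method stillFound are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

class ImmutableStringDemo {
    // Returns true if the original string is still found after the variable is reassigned.
    static boolean stillFound() {
        Set<String> names = new HashSet<>();
        String name = "Alice";
        names.add(name);
        // concat() builds a new String; the object stored in the set is not modified
        name = name.concat(" Smith");
        return names.contains("Alice") && !names.contains(name);
    }

    public static void main(String[] args) {
        System.out.println(stillFound()); // true
    }
}
```

Only the local variable points to a new object; the set still holds the original "Alice" under its original hash code.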
