java.util.regex is Java's standard package for pattern matching in strings. Its two main classes, Pattern and Matcher, work together to find and manipulate text based on regular expressions.

Here’s a comprehensive guide covering the key concepts, classes, and practical examples.
The Core Components
The java.util.regex package revolves around three key classes:
- Pattern: A compiled representation of a regular expression. It has no public constructor; instead, you call the static Pattern.compile() method to compile your regex string into a Pattern object. This is efficient because you can compile a pattern once and use it many times.
- Matcher: An engine that performs match operations on a character sequence (a String or any other CharSequence) by interpreting a Pattern. You get a Matcher object by calling the matcher() method on a Pattern object.
- PatternSyntaxException: An unchecked exception that indicates a syntax error in a regular expression pattern.
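As a minimal sketch of how the three classes fit together, the snippet below compiles a regex, obtains a Matcher from it, and catches PatternSyntaxException for a malformed pattern. The class name CoreComponentsSketch and the helper compileOrNull are made up for illustration; only Pattern, Matcher, and PatternSyntaxException come from the package itself.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class CoreComponentsSketch {

    // Hypothetical helper: compiles a regex, or returns null if its syntax is invalid.
    public static Pattern compileOrNull(String regex) {
        try {
            return Pattern.compile(regex);
        } catch (PatternSyntaxException e) {
            // getDescription() and getIndex() report what went wrong and where.
            System.out.println("Invalid regex: " + e.getDescription()
                    + " near index " + e.getIndex());
            return null;
        }
    }

    public static void main(String[] args) {
        Pattern digits = compileOrNull("\\d+");
        Matcher matcher = digits.matcher("Order 66");
        System.out.println(matcher.find());          // true: "66" matches \d+

        // "(\\d+" has an unclosed group, so compilation fails and null is returned.
        System.out.println(compileOrNull("(\\d+"));  // null
    }
}
```

Because PatternSyntaxException is unchecked, the try/catch is optional; it is mainly useful when the regex comes from user input or configuration.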
Basic Workflow
The typical workflow for using regex in Java is:
- Define your regular expression as a String.
- Compile the regex string into a Pattern object using Pattern.compile().
- Create a Matcher object by calling matcher() on the Pattern, passing in the input string you want to search.
- Perform matching operations (e.g., find, replace) using the Matcher object. Splitting is done through Pattern.split() rather than through the Matcher.
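The workflow above can be sketched in a few lines. The class WorkflowSketch and the helper countMatches are illustrative names, not part of the package; the comments map each statement back to the corresponding step.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WorkflowSketch {

    // Hypothetical helper: counts how many times the regex matches in the input.
    public static int countMatches(String regex, String input) {
        Pattern pattern = Pattern.compile(regex);   // Step 2: compile the regex
        Matcher matcher = pattern.matcher(input);   // Step 3: create a matcher for the input
        int count = 0;
        while (matcher.find()) {                    // Step 4: perform matching operations
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String regex = "cat";                       // Step 1: define the regex as a String
        System.out.println(countMatches(regex, "cat, concat, category")); // 3
    }
}
```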
Key Matcher Methods
The Matcher class is where the action happens. Here are the most commonly used methods:

| Method | Description | Example |
|---|---|---|
| boolean find() | Attempts to find the next subsequence of the input sequence that matches the pattern. Returns true if one is found. | while (m.find()) { ... } |
| boolean matches() | Attempts to match the entire region against the pattern. Returns true only if the whole string matches. | if (m.matches()) { ... } |
| boolean lookingAt() | Attempts to match the pattern starting at the beginning of the region. Returns true if a prefix of the string matches; unlike matches(), the rest of the string may be anything. | if (m.lookingAt()) { ... } |
| String group() | Returns the subsequence of the input sequence matched by the previous successful find(), matches(), or lookingAt() call. | System.out.println(m.group()); |
| String group(int group) | Returns the input subsequence captured by the given group during the previous match. Group 0 is the entire match. | System.out.println(m.group(1)); |
| int start() / int end() | Return the start index (inclusive) and end index (exclusive) of the previous match. | System.out.println("Found at " + m.start() + "-" + m.end()); |
| String replaceAll(String replacement) | Replaces every subsequence of the input sequence that matches the pattern with the given replacement string. | String result = m.replaceAll("-"); |
| String replaceFirst(String replacement) | Replaces the first subsequence of the input sequence that matches the pattern. | String result = m.replaceFirst("-"); |
| String[] split(CharSequence input) | Splits the given input sequence around matches of the pattern. Note that this is a method of Pattern, not Matcher; it is listed here because it is usually used alongside the methods above. | String[] parts = p.split("a b c"); (where p is a Pattern) |
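To make the differences between these methods concrete, here is a small sketch that runs matches(), lookingAt(), and find() against the same input, and then splits a string. Note that split is called on a Pattern, since Matcher has no split method. The class MatcherMethodsSketch and its helpers are hypothetical names chosen for this example.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatcherMethodsSketch {

    // Hypothetical helper: lists every match as "text@start-end".
    public static String describeMatches(Pattern pattern, String input) {
        StringBuilder sb = new StringBuilder();
        Matcher m = pattern.matcher(input);
        while (m.find()) {
            sb.append(m.group()).append('@')
              .append(m.start()).append('-').append(m.end()).append(' ');
        }
        return sb.toString().trim();
    }

    // Hypothetical helper: splits on a comma followed by optional whitespace.
    public static String[] splitOnCommas(String input) {
        return Pattern.compile(",\\s*").split(input);
    }

    public static void main(String[] args) {
        Pattern digits = Pattern.compile("\\d+");
        String input = "123abc456";

        System.out.println(digits.matcher(input).matches());   // false: the whole input is not digits
        System.out.println(digits.matcher(input).lookingAt()); // true: the input starts with digits
        System.out.println(describeMatches(digits, input));    // 123@0-3 456@6-9

        System.out.println(Arrays.toString(splitOnCommas("a, b,c"))); // [a, b, c]
    }
}
```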
Practical Examples
Example 1: Finding All Occurrences
This example finds all words that start with "J" followed by one or more letters.
import java.util.regex.*;

public class RegexFindExample {
    public static void main(String[] args) {
        String text = "Java is great. JavaScript is also good. Just kidding!";
        // Regex: \bJ[a-zA-Z]+\b
        //   \b        - Word boundary
        //   J         - The letter 'J'
        //   [a-zA-Z]+ - One or more letters (upper case included, so "JavaScript" matches as a whole word)
        //   \b        - Word boundary
        String regex = "\\bJ[a-zA-Z]+\\b";
        // 1. Compile the regex
        Pattern pattern = Pattern.compile(regex);
        // 2. Create a matcher
        Matcher matcher = pattern.matcher(text);
        // 3. Find all matches
        System.out.println("Words starting with 'J':");
        while (matcher.find()) {
            // matcher.group() returns the matched string
            System.out.println("Found: " + matcher.group());
            // matcher.start() and matcher.end() give the position (end is exclusive)
            System.out.println(" at index: " + matcher.start() + " to " + matcher.end());
        }
    }
}
Output:
Words starting with 'J':
Found: Java
at index: 0 to 4
Found: JavaScript
at index: 15 to 25
Found: Just
at index: 40 to 44
Example 2: Extracting Groups (Capturing Parentheses)
This example parses a date string in YYYY-MM-DD format and extracts the year, month, and day.
import java.util.regex.*;

public class RegexGroupExample {
    public static void main(String[] args) {
        String dateText = "The event is on 2025-10-27 and another on 1999-01-01.";
        // Regex: (\d{4})-(\d{2})-(\d{2})
        //   (\d{4}) - Group 1: Four digits (Year)
        //   -       - A literal hyphen
        //   (\d{2}) - Group 2: Two digits (Month)
        //   -       - A literal hyphen
        //   (\d{2}) - Group 3: Two digits (Day)
        String regex = "(\\d{4})-(\\d{2})-(\\d{2})";
        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher(dateText);
        while (matcher.find()) {
            // The entire match is group 0
            System.out.println("Full date found: " + matcher.group(0));
            // Group 1 is the first captured group (the parentheses)
            String year = matcher.group(1);
            String month = matcher.group(2);
            String day = matcher.group(3);
            System.out.println(" Year: " + year);
            System.out.println(" Month: " + month);
            System.out.println(" Day: " + day);
            System.out.println("-------------------");
        }
    }
}
Output:
Full date found: 2025-10-27
Year: 2025
Month: 10
Day: 27
-------------------
Full date found: 1999-01-01
Year: 1999
Month: 01
Day: 01
-------------------
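Numbered groups get harder to follow as a pattern grows. As an optional variation on the example above, Java (since version 7) also supports named groups, written (?<name>...) and read back with group(String). The sketch below assumes the same date format; the class NamedGroupSketch and the helper firstYear are illustrative names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NamedGroupSketch {

    // Same date pattern as before, with a name attached to each group.
    private static final Pattern DATE =
            Pattern.compile("(?<year>\\d{4})-(?<month>\\d{2})-(?<day>\\d{2})");

    // Hypothetical helper: returns the year of the first date found, or null if there is none.
    public static String firstYear(String text) {
        Matcher matcher = DATE.matcher(text);
        return matcher.find() ? matcher.group("year") : null;
    }

    public static void main(String[] args) {
        System.out.println(firstYear("The event is on 2025-10-27.")); // 2025
        System.out.println(firstYear("no date here"));                // null
    }
}
```

Named groups are still numbered, so group(1) and group("year") return the same text here.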
Example 3: Validation (matches() vs find())
It's crucial to understand the difference between matches() and find().
- matches(): The entire string must match the pattern.
- find(): The pattern must be found somewhere in the string.
import java.util.regex.*;

public class RegexValidationExample {
    public static void main(String[] args) {
        String email = "test.user@example.com";
        String invalidEmail = "this is not an email";
        // Regex for a simple email validation
        String regex = "[a-zA-Z0-9._%-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}";
        Pattern pattern = Pattern.compile(regex);
        Matcher matcher1 = pattern.matcher(email);
        Matcher matcher2 = pattern.matcher(invalidEmail);
        // Using find() - will be true if the email pattern is anywhere in the string
        System.out.println("Does '" + email + "' contain a valid email pattern? " + matcher1.find());
        System.out.println("Does '" + invalidEmail + "' contain a valid email pattern? " + matcher2.find());
        System.out.println("-----------------------------------");
        // Using matches() - will only be true if the WHOLE string is a valid email
        System.out.println("Is '" + email + "' a valid email? " + matcher1.matches());
        System.out.println("Is '" + invalidEmail + "' a valid email? " + matcher2.matches());
    }
}
Output:
Does 'test.user@example.com' contain a valid email pattern? true
Does 'this is not an email' contain a valid email pattern? false
-----------------------------------
Is 'test.user@example.com' a valid email? true
Is 'this is not an email' a valid email? false
Example 4: Replacing Text
This example collapses each run of whitespace in a string into a single space.
import java.util.regex.*;

public class RegexReplaceExample {
    public static void main(String[] args) {
        String messyText = "This   has    multiple     spaces.";
        // Regex: \s+ - one or more whitespace characters
        String regex = "\\s+";
        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher(messyText);
        // Replace every run of one or more whitespace characters with a single space
        String cleanText = matcher.replaceAll(" ");
        System.out.println("Original: '" + messyText + "'");
        System.out.println("Cleaned: '" + cleanText + "'");
    }
}
Output:
Original: 'This   has    multiple     spaces.'
Cleaned: 'This has multiple spaces.'
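The replacement string can also refer back to captured groups using $1, $2, and so on. As a sketch that reuses the date pattern from Example 2, the snippet below rewrites YYYY-MM-DD dates as DD/MM/YYYY. The class GroupReplacementSketch and the helper toDayFirst are hypothetical names.

```java
import java.util.regex.Pattern;

public class GroupReplacementSketch {

    private static final Pattern ISO_DATE =
            Pattern.compile("(\\d{4})-(\\d{2})-(\\d{2})");

    // Hypothetical helper: rewrites every YYYY-MM-DD date as DD/MM/YYYY.
    public static String toDayFirst(String text) {
        // $1, $2, $3 refer to the year, month, and day groups of each match.
        return ISO_DATE.matcher(text).replaceAll("$3/$2/$1");
    }

    public static void main(String[] args) {
        System.out.println(toDayFirst("Released on 2025-10-27.")); // Released on 27/10/2025.
    }
}
```

Because $ and \ are special in replacement strings, a replacement that should be taken literally can be passed through Matcher.quoteReplacement() first.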
Common Regex Patterns (Quick Reference)
| Pattern | Description | Example |
|---|---|---|
| . | Any character (except line terminators) | c.t matches "cat", "cut", "c8t" |
| ^ | Start of the string | ^Hello matches "Hello world" |
| $ | End of the string | world$ matches "Hello world" |
| \d | Any digit (0-9) | \d{3} matches "123" |
| \D | Any non-digit | \D+ matches "abc!" |
| \w | Word character (a-z, A-Z, 0-9, _) | \w+ matches "user_name_123" |
| \W | Non-word character | \W matches "@" or " " |
| \s | Whitespace character (space, tab, newline) | \s+ matches " " or "\t" |
| \S | Non-whitespace character | \S+ matches "Java" |
| [abc] | Any character in the set (a, b, or c) | [aeiou] matches "a", "e", "i" |
| [^abc] | Any character NOT in the set | [^0-9] matches "a", "!" |
| a\|b | Either 'a' or 'b' | cat\|dog matches "cat" or "dog" |
| * | Zero or more of the preceding element | a* matches "", "a", "aa" |
| + | One or more of the preceding element | a+ matches "a", "aa" |
| ? | Zero or one of the preceding element | colou?r matches "color" or "colour" |
| {n} | Exactly n times | \d{3} matches exactly 3 digits |
| {n,} | At least n times | \d{2,} matches 2 or more digits |
| {n,m} | Between n and m times | \d{1,3} matches 1, 2, or 3 digits |
| (abc) | Capturing group | (test)+ matches "testtest" |
| \b | Word boundary | \bword\b matches "word" but not "sword" |
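A few rows of the table can be checked directly in code. The sketch below uses find(), so each pattern only needs to match somewhere in the input; remember that every backslash must be doubled inside a Java string literal. The class QuickReferenceSketch and the helper containsMatch are illustrative names.

```java
import java.util.regex.Pattern;

public class QuickReferenceSketch {

    // Hypothetical helper: true if the regex matches anywhere in the input.
    public static boolean containsMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(containsMatch("colou?r", "colour"));           // true: optional 'u'
        System.out.println(containsMatch("\\bword\\b", "a word here"));   // true: whole word
        System.out.println(containsMatch("\\bword\\b", "sword"));         // false: no boundary before 'w'
        System.out.println(containsMatch("^Hello", "Hello world"));       // true: anchored at the start
        System.out.println(containsMatch("world$", "Hello world"));       // true: anchored at the end
        System.out.println(containsMatch("\\d{3}", "12"));                // false: only two digits
    }
}
```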
Performance Considerations
- Compile Once, Use Many: If you use the same regular expression in a loop or in several places, compile it into a Pattern object once, up front. This is much faster than recompiling the regex string on every iteration.
- Use String.split() for Simple Cases: For simple splitting, String.split(String regex) is a convenient shortcut. It internally compiles the regex and splits the string for you. For complex or repeated operations, calling split() on a precompiled Pattern is more explicit and efficient.
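One common way to apply the compile-once advice is to keep the Pattern in a static final field and reuse it on every call. The sketch below does this for a word counter; the class PrecompiledPatternSketch and the helper countWords are hypothetical names used only for illustration.

```java
import java.util.regex.Pattern;

public class PrecompiledPatternSketch {

    // Compiled once when the class is loaded, then reused by every call.
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    // Hypothetical helper: counts whitespace-separated words in a line.
    public static int countWords(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return 0;
        }
        return WHITESPACE.split(trimmed).length;
    }

    public static void main(String[] args) {
        String[] lines = { "  compile once,   use many  ", "", "single" };
        for (String line : lines) {
            // No regex compilation happens inside the loop.
            System.out.println(countWords(line));
        }
    }
}
```

Pattern objects are immutable and safe to share between threads, which is what makes a shared constant like this workable; Matcher objects, by contrast, should not be shared.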
Learning Regular Expressions
Regular expressions are a skill in themselves. Here are some great resources to learn them:
- Regex101: An excellent online tool for building, testing, and learning regex. It provides explanations for each part of your pattern.
- RegExr: Another fantastic interactive tool with a helpful reference library.
- Oracle Java Documentation: The official documentation for java.util.regex is always a good place to check for class details.
