Of course! In Java, handling strings that represent URLs is a common task. The standard library provides dedicated classes for this purpose, which are much more robust than simple string manipulation.

Here’s a comprehensive guide covering everything from basic representation to advanced parsing and encoding.
The Modern Approach: java.net.URI and java.net.URL
For any serious URL handling, you should use the classes from the java.net package. They are designed to correctly parse, construct, and manipulate URLs according to internet standards (java.net.URI implements RFC 2396 as amended by RFC 2732, the predecessor of RFC 3986).
java.net.URI vs. java.net.URL
It's important to know the difference:
- URI (Uniform Resource Identifier): This is a parser and formatter. Its job is to break a URL string into its components (scheme, host, path, query, etc.) or build one from components. It does not guarantee that the resource actually exists; it is purely for syntactic manipulation. This is the class to reach for in most cases.
- URL (Uniform Resource Locator): This is a pointer. It is a separate class (it does not extend URI; you convert between the two with URI.toURL() and URL.toURI()) that adds the ability to connect to the resource it points to (e.g., open a stream to download a file). It's a higher-level concept that implies network access. Use URL when you actually need to fetch content.
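The difference between the two classes can be seen in a minimal sketch. The host name `no-such-host.invalid` and the sample URN are made up for illustration. Both strings parse as URIs because parsing is purely syntactic, but only a URI that is absolute and whose scheme has a protocol handler in the JDK can be converted to a URL.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

public class UriVsUrlSketch {

    // Returns true if the URI can be turned into a java.net.URL, i.e. it is
    // absolute and the JDK has a protocol handler for its scheme.
    static boolean convertibleToUrl(URI uri) {
        try {
            URL url = uri.toURL(); // conversion only; no connection is opened
            return url != null;
        } catch (MalformedURLException | IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical host: it does not need to exist for parsing to succeed
        URI web = URI.create("https://no-such-host.invalid/report.pdf");
        // A URN is a valid URI but names a resource instead of locating it
        URI urn = URI.create("urn:isbn:0451450523");

        System.out.println(web.getHost());          // no-such-host.invalid
        System.out.println(convertibleToUrl(web));  // true
        System.out.println(convertibleToUrl(urn));  // false on a default JDK
    }
}
```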
Parsing a URL into its Components (using URI)
This is the most common use case. You have a URL string and you want to extract its parts.
import java.net.URI;
import java.net.URISyntaxException;

public class UriExample {
    public static void main(String[] args) {
        String urlString = "https://www.example.com:8080/path/to/resource?name=John+Doe&age=30#section1";
        try {
            // 1. Create a URI object from the string
            URI uri = new URI(urlString);

            // 2. Extract the components
            String scheme = uri.getScheme();     // "https"
            String host = uri.getHost();         // "www.example.com"
            int port = uri.getPort();            // 8080 (-1 if the string contains no port)
            String path = uri.getPath();         // "/path/to/resource"
            String query = uri.getQuery();       // "name=John+Doe&age=30"
            String fragment = uri.getFragment(); // "section1"

            System.out.println("Original URL: " + urlString);
            System.out.println("---------------------------------");
            System.out.println("Scheme: " + scheme);
            System.out.println("Host: " + host);
            System.out.println("Port: " + (port == -1 ? "(default)" : port));
            System.out.println("Path: " + path);
            System.out.println("Query: " + query);
            System.out.println("Fragment: " + fragment);
        } catch (URISyntaxException e) {
            System.err.println("Invalid URL syntax: " + e.getMessage());
        }
    }
}
Output:
Original URL: https://www.example.com:8080/path/to/resource?name=John+Doe&age=30#section1
---------------------------------
Scheme: https
Host: www.example.com
Port: 8080
Path: /path/to/resource
Query: name=John+Doe&age=30
Fragment: section1
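One detail worth knowing when extracting components: getPath() and getQuery() return percent-decoded text, while getRawPath() and getRawQuery() return the text exactly as it appears in the URL. The sketch below uses a made-up URL to show the difference. If you plan to split a query string on & yourself, the raw form is usually the safer starting point, because an encoded & inside a value is still distinguishable from a separator.

```java
import java.net.URI;

public class RawVsDecodedSketch {
    public static void main(String[] args) {
        // %20 is an encoded space, %26 is an encoded '&' (sample URL, not a real endpoint)
        URI uri = URI.create("https://example.com/my%20docs/file.txt?q=salt%26pepper&page=2");

        System.out.println(uri.getPath());     // /my docs/file.txt
        System.out.println(uri.getRawPath());  // /my%20docs/file.txt
        System.out.println(uri.getQuery());    // q=salt&pepper&page=2
        System.out.println(uri.getRawQuery()); // q=salt%26pepper&page=2
    }
}
```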
Constructing a URL from Components (using URI)
You can also build a URL string from its individual parts. This is much safer than concatenating strings.
import java.net.URI;
import java.net.URISyntaxException;

public class UriBuilderExample {
    public static void main(String[] args) {
        String scheme = "https";
        String host = "api.example.com";
        String path = "/v1/users";
        String query = "sort=asc&limit=10";
        try {
            // The five-argument constructor takes: scheme, authority, path, query, fragment.
            // A plain host name is a valid authority, so it can be passed directly.
            URI uri = new URI(scheme, host, path, query, null);

            // The toString() method gives you the well-formed URL string
            String constructedUrl = uri.toString();
            System.out.println("Constructed URL: " + constructedUrl);
        } catch (URISyntaxException e) {
            System.err.println("Failed to construct URI: " + e.getMessage());
        }
    }
}
Output:
Constructed URL: https://api.example.com/v1/users?sort=asc&limit=10
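Part of what makes the constructor safer than concatenation is that it percent-encodes characters that are illegal in each component. The sketch below illustrates this with made-up paths and queries; the `build` helper is hypothetical, not part of the JDK. It also shows a pitfall: the multi-argument constructors always encode a literal %, so text that is already percent-encoded gets encoded a second time.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class UriQuotingSketch {

    // Hypothetical helper: builds an https URI from unencoded parts
    // using the five-argument constructor.
    static String build(String host, String path, String query) {
        try {
            return new URI("https", host, path, query, null).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Spaces are illegal in a URI, so the constructor encodes them as %20
        System.out.println(build("example.com", "/my docs/file.txt", "q=hello world"));
        // prints https://example.com/my%20docs/file.txt?q=hello%20world

        // '%' is always encoded, so an already-encoded "%26" becomes "%2526"
        System.out.println(build("example.com", "/search", "q=salt%26pepper"));
        // prints https://example.com/search?q=salt%2526pepper
    }
}
```

In other words, pass raw, unencoded text to these constructors. If your components are already encoded, the single-argument new URI(String) or URI.create(String) leaves them untouched.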
URL Encoding and Decoding (CRITICAL!)
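URLs can only contain a limited set of characters. Characters outside that set (such as spaces), and reserved characters like & and ? when they appear as data rather than as delimiters, must be "percent-encoded" (e.g., a space becomes %20).
The java.net.URLEncoder and java.net.URLDecoder classes handle this. Crucially, URLEncoder is for form data (query parameters), not for encoding the entire URL path: it follows the application/x-www-form-urlencoded rules, so it writes a space as + rather than %20.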
Encoding Query Parameters
Use URLEncoder for key-value pairs in the query string.
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class UrlEncodingExample {
    public static void main(String[] args) {
        String name = "John Doe";
        String city = "New York";
        String value = "special&chars?here";
        try {
            // Encode each parameter separately
            String encodedName = URLEncoder.encode(name, StandardCharsets.UTF_8.name());
            String encodedCity = URLEncoder.encode(city, StandardCharsets.UTF_8.name());
            String encodedValue = URLEncoder.encode(value, StandardCharsets.UTF_8.name());

            // Now, construct the URL
            String baseUrl = "https://example.com/search";
            String query = String.format("name=%s&city=%s&value=%s", encodedName, encodedCity, encodedValue);
            String finalUrl = baseUrl + "?" + query;

            System.out.println("Original Name: " + name);
            System.out.println("Encoded Name: " + encodedName); // "John+Doe" (URLEncoder writes a space as "+", never "%20")
            System.out.println("Final URL: " + finalUrl);
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always supported, so this should not happen
            e.printStackTrace();
        }
    }
}
Output:
Original Name: John Doe
Encoded Name: John+Doe
Final URL: https://example.com/search?name=John+Doe&city=New+York&value=special%26chars%3Fhere
Notice how &, ?, and spaces are correctly encoded for use in a query string.
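When there are more than a couple of parameters, the per-parameter encoding shown here can be wrapped in a small helper. The following is a minimal sketch, not a library API: `toQueryString` and the sample map are made up for illustration, and a LinkedHashMap is assumed so that the parameter order is predictable.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class QueryBuilderSketch {

    // Hypothetical helper: encodes every key and value, then joins the pairs with '&'.
    static String toQueryString(Map<String, String> params) {
        StringJoiner joiner = new StringJoiner("&");
        for (Map.Entry<String, String> entry : params.entrySet()) {
            joiner.add(encode(entry.getKey()) + "=" + encode(entry.getValue()));
        }
        return joiner.toString();
    }

    private static String encode(String text) {
        try {
            return URLEncoder.encode(text, StandardCharsets.UTF_8.name());
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always supported, so this branch is not expected to run
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("name", "John Doe");
        params.put("value", "special&chars?here");

        System.out.println("https://example.com/search?" + toQueryString(params));
        // prints https://example.com/search?name=John+Doe&value=special%26chars%3Fhere
    }
}
```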
Decoding Query Parameters
Use URLDecoder to reverse the process.
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

public class UrlDecodingExample {
    public static void main(String[] args) {
        String encodedQuery = "name=John+Doe&city=New+York&value=special%26chars%3Fhere";
        try {
            // URLDecoder expects a "+" to be a space, which is correct for form data
            String decodedQuery = URLDecoder.decode(encodedQuery, StandardCharsets.UTF_8.name());

            System.out.println("Encoded Query: " + encodedQuery);
            System.out.println("Decoded Query: " + decodedQuery);
            // Prints "name=John Doe&city=New York&value=special&chars?here".
            // Decoding the whole query like this is fine for display, but not for parsing:
            // the decoded "&" inside the value now looks like a separator. To build a map,
            // split the encoded query on "&" and "=" first, then decode each piece.
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always supported, so this should not happen
            e.printStackTrace();
        }
    }
}
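To turn an encoded query string into key/value pairs, the order of operations matters: split on the delimiters while the string is still encoded, then decode each key and value. A minimal sketch follows; `parseQuery` is a hypothetical helper, and keeping only the last value for a repeated key is a simplifying assumption.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class QueryParserSketch {

    // Hypothetical helper: splits a raw (still encoded) query string into decoded pairs.
    // Simplification: if a key repeats, the last value wins.
    static Map<String, String> parseQuery(String rawQuery) {
        Map<String, String> params = new LinkedHashMap<>();
        if (rawQuery == null || rawQuery.isEmpty()) {
            return params;
        }
        for (String pair : rawQuery.split("&")) {
            if (pair.isEmpty()) {
                continue; // tolerate "a=1&&b=2"
            }
            String[] keyValue = pair.split("=", 2); // limit 2 keeps '=' inside the value
            String key = decode(keyValue[0]);
            String value = keyValue.length > 1 ? decode(keyValue[1]) : "";
            params.put(key, value);
        }
        return params;
    }

    private static String decode(String text) {
        try {
            return URLDecoder.decode(text, StandardCharsets.UTF_8.name());
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always supported, so this branch is not expected to run
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> params =
                parseQuery("name=John+Doe&city=New+York&value=special%26chars%3Fhere");
        System.out.println(params.get("name"));  // John Doe
        System.out.println(params.get("value")); // special&chars?here
    }
}
```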
Making a Network Request (using URL)
When you need to actually access the resource, you use the URL class to open a connection.
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class UrlConnectionExample {
    public static void main(String[] args) {
        // Use a reliable public API for this example
        String urlString = "https://api.github.com/users/octocat";
        try {
            // 1. Create a URL object. Going through URI avoids the URL(String)
            //    constructor, which is deprecated since Java 20.
            URL url = URI.create(urlString).toURL();

            // 2. Open a stream to the URL.
            //    url.openStream() is shorthand for url.openConnection().getInputStream();
            //    for HTTP/HTTPS the underlying connection is an HttpURLConnection.
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
                // 3. Read the response line by line
                String line;
                StringBuilder response = new StringBuilder();
                while ((line = reader.readLine()) != null) {
                    response.append(line);
                }
                System.out.println("Response from GitHub API:");
                System.out.println(response.toString());
            }
        } catch (IOException e) {
            System.err.println("Error connecting to URL: " + e.getMessage());
        }
    }
}
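For anything beyond a quick read, it usually helps to work with the HttpURLConnection directly so you can set timeouts and check the status code. The sketch below is one possible shape, not a definitive client. To stay runnable without internet access, it starts a throwaway local server with the JDK's built-in com.sun.net.httpserver.HttpServer; the `/hello` path, the JSON payload, the 5-second timeouts, and the helper names are all assumptions made for the example.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class HttpGetSketch {

    // Performs a GET request and returns the body; fails if the status is not 200.
    static String get(URL url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(5_000); // milliseconds; arbitrary value for the sketch
        conn.setReadTimeout(5_000);
        try {
            int status = conn.getResponseCode();
            if (status != HttpURLConnection.HTTP_OK) {
                throw new IOException("Unexpected HTTP status: " + status);
            }
            StringBuilder body = new StringBuilder();
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    body.append(line).append('\n');
                }
            }
            return body.toString();
        } finally {
            conn.disconnect();
        }
    }

    // Starts a local test server on a free port, fetches /hello from it, and stops it.
    static String fetchFromLocalServer() {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/hello", exchange -> {
                byte[] payload = "{\"greeting\":\"hi\"}".getBytes(StandardCharsets.UTF_8);
                exchange.sendResponseHeaders(200, payload.length);
                try (OutputStream out = exchange.getResponseBody()) {
                    out.write(payload);
                }
            });
            server.start();
            try {
                int port = server.getAddress().getPort();
                URL url = URI.create("http://127.0.0.1:" + port + "/hello").toURL();
                return get(url);
            } finally {
                server.stop(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(fetchFromLocalServer());
    }
}
```

Against a real service you would call get(...) with the service's URL instead of the local one; the local server exists only to make the example self-contained.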
The Old Way: String Manipulation (Avoid This!)
Developers sometimes try to handle URLs with plain String methods such as indexOf and substring. This is brittle and error-prone. It does not handle edge cases like explicit ports, encoded characters, query strings and fragments, or different schemes correctly.
Example of what NOT to do:
// BAD PRACTICE - DO NOT DO THIS
String url = "https://www.example.com/path/to/resource?name=John+Doe";
String path = url.substring(url.indexOf("www.example.com") + "www.example.com".length());
// This is fragile and will break with many real-world URLs.
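To make the fragility concrete, here is a small comparison. It wraps the substring approach in a hypothetical `naiveAfterHost` helper and runs it on a sample URL that adds a port and a fragment; the URI-based version returns just the path.

```java
import java.net.URI;

public class NaiveVsUriSketch {

    // The substring approach: everything after the host name is treated as the "path".
    static String naiveAfterHost(String url, String host) {
        return url.substring(url.indexOf(host) + host.length());
    }

    // The URI approach: the parser knows where the path starts and ends.
    static String uriPath(String url) {
        return URI.create(url).getPath();
    }

    public static void main(String[] args) {
        String url = "https://www.example.com:8080/path/to/resource?name=John+Doe#top";

        System.out.println(naiveAfterHost(url, "www.example.com"));
        // prints :8080/path/to/resource?name=John+Doe#top

        System.out.println(uriPath(url));
        // prints /path/to/resource
    }
}
```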
Summary: When to Use Which
| Task | Recommended Class | Why? |
|---|---|---|
| Parse a URL string into components. | java.net.URI | It's a parser, not a network client. It's fast and safe. |
| Build a URL string from components. | java.net.URI | Correctly assembles parts and handles encoding/escaping. |
| Encode/decode query parameters. | URLEncoder/URLDecoder | Designed specifically for application/x-www-form-urlencoded data. |
| Fetch content from a URL (HTTP GET). | java.net.URL | Represents a network locator and can open connections. |
| Validate if a string is a well-formed URL. | java.net.URI | The constructor will throw URISyntaxException if it's not. |
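For the validation row, note that the URI constructor also accepts relative references, so a bare word parses without an exception. A stricter check can be sketched as follows, assuming that "well-formed URL" means an http or https URI with a host; `isWellFormedHttpUrl` is a hypothetical helper and the rule itself is an assumption you may want to adjust.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class UrlValidationSketch {

    // Returns true only for syntactically valid http(s) URIs that contain a host.
    static boolean isWellFormedHttpUrl(String candidate) {
        if (candidate == null) {
            return false;
        }
        try {
            URI uri = new URI(candidate);
            String scheme = uri.getScheme();
            boolean httpLike = "http".equalsIgnoreCase(scheme) || "https".equalsIgnoreCase(scheme);
            return httpLike && uri.getHost() != null;
        } catch (URISyntaxException e) {
            return false; // e.g. an unencoded space
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormedHttpUrl("https://example.com/a?b=c")); // true
        System.out.println(isWellFormedHttpUrl("just-some-text"));            // false: relative reference
        System.out.println(isWellFormedHttpUrl("https://exa mple.com"));      // false: illegal space
        System.out.println(isWellFormedHttpUrl("ftp://example.com/file"));    // false: other scheme
    }
}
```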
