Java 判断字符串是否是汉字
在Java中,判断一个字符串是否完全由汉字组成,或者判断字符串中是否包含汉字,可以通过以下几种方法实现:

使用正则表达式
public class ChineseCharacterChecker {
public static boolean isAllChinese(String str) {
if (str == null || str.isEmpty()) {
return false;
}
return str.matches("^[\\u4e00-\\u9fa5]+$");
}
public static boolean containsChinese(String str) {
if (str == null || str.isEmpty()) {
return false;
}
return str.matches(".*[\\u4e00-\\u9fa5].*");
}
public static void main(String[] args) {
String test1 = "汉字测试";
String test2 = "Hello汉字";
String test3 = "Hello123";
System.out.println(isAllChinese(test1)); // true
System.out.println(isAllChinese(test2)); // false
System.out.println(isAllChinese(test3)); // false
System.out.println(containsChinese(test1)); // true
System.out.println(containsChinese(test2)); // true
System.out.println(containsChinese(test3)); // false
}
}
使用Character类
public class ChineseCharacterChecker {
public static boolean isAllChinese(String str) {
if (str == null || str.isEmpty()) {
return false;
}
for (char c : str.toCharArray()) {
if (!isChinese(c)) {
return false;
}
}
return true;
}
public static boolean containsChinese(String str) {
if (str == null || str.isEmpty()) {
return false;
}
for (char c : str.toCharArray()) {
if (isChinese(c)) {
return true;
}
}
return false;
}
private static boolean isChinese(char c) {
Character.UnicodeBlock ub = Character.UnicodeBlock.of(c);
return ub == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS
|| ub == Character.UnicodeBlock.CJK_COMPATIBILITY_IDEOGRAPHS
|| ub == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_A
|| ub == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_B
|| ub == Character.UnicodeBlock.CJK_SYMBOLS_AND_PUNCTUATION
|| ub == Character.UnicodeBlock.HALFWIDTH_AND_FULLWIDTH_FORMS
|| ub == Character.UnicodeBlock.GENERAL_PUNCTUATION;
}
public static void main(String[] args) {
String test1 = "汉字测试";
String test2 = "Hello汉字";
String test3 = "Hello123";
System.out.println(isAllChinese(test1)); // true
System.out.println(isAllChinese(test2)); // false
System.out.println(isAllChinese(test3)); // false
System.out.println(containsChinese(test1)); // true
System.out.println(containsChinese(test2)); // true
System.out.println(containsChinese(test3)); // false
}
}
使用Apache Commons Lang库
如果你使用Apache Commons Lang库,可以使用StringUtils类:
import org.apache.commons.lang3.StringUtils;
import org.apache.commons.lang3.CharUtils;
public class ChineseCharacterChecker {
public static boolean isAllChinese(String str) {
if (str == null || str.isEmpty()) {
return false;
}
for (char c : str.toCharArray()) {
if (!CharUtils.isAscii(c) && !isChinese(c)) {
return false;
}
}
return true;
}
private static boolean isChinese(char c) {
Character.UnicodeBlock ub = Character.UnicodeBlock.of(c);
return ub == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS
|| ub == Character.UnicodeBlock.CJK_COMPATIBILITY_IDEOGRAPHS
|| ub == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_A
|| ub == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_B
|| ub == Character.UnicodeBlock.CJK_SYMBOLS_AND_PUNCTUATION
|| ub == Character.UnicodeBlock.HALFWIDTH_AND_FULLWIDTH_FORMS
|| ub == Character.UnicodeBlock.GENERAL_PUNCTUATION;
}
public static void main(String[] args) {
String test1 = "汉字测试";
String test2 = "Hello汉字";
String test3 = "Hello123";
System.out.println(isAllChinese(test1)); // true
System.out.println(isAllChinese(test2)); // false
System.out.println(isAllChinese(test3)); // false
}
}
注意事项
- 汉字的Unicode范围大致在
\u4e00到\u9fa5之间,但这个范围并不完全覆盖所有汉字 - 上述方法中的
isChinese方法考虑了更多的Unicode块,可以识别更广泛的汉字字符 - 如果你需要更精确的汉字判断,可以考虑使用专门的汉字识别库或数据库
- 对于性能敏感的场景,正则表达式方法可能比循环遍历字符更高效
选择哪种方法取决于你的具体需求和项目环境,正则表达式方法通常是最简洁的,而Character类方法则提供了更细粒度的控制。

