The Java™ Tutorials
Predefined Character Classes

The Pattern API contains a number of useful predefined character classes, which offer convenient shorthands for commonly used regular expressions:

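Construct   Description
.           Any character (may or may not match line terminators)
\d          A digit: [0-9]
\D          A non-digit: [^0-9]
\s          A whitespace character: [ \t\n\x0B\f\r]
\S          A non-whitespace character: [^\s]
\w          A word character: [a-zA-Z_0-9]
\W          A non-word character: [^\w]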
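
The note that the dot "may or may not match line terminators" depends on the flags the pattern is compiled with. The following is a minimal sketch of that behavior; the class name DotDemo and the helper dotMatches are hypothetical names introduced only for illustration. It relies on the standard Pattern.DOTALL flag.

```java
import java.util.regex.Pattern;

class DotDemo {
    // Hypothetical helper: true if the whole input matches a single "."
    // when the pattern is compiled with the given flags.
    static boolean dotMatches(String input, int flags) {
        return Pattern.compile(".", flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(dotMatches("a", 0));               // true: ordinary character
        System.out.println(dotMatches("\n", 0));              // false: line terminator, default mode
        System.out.println(dotMatches("\n", Pattern.DOTALL)); // true: line terminator, DOTALL mode
    }
}
```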

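In the table above, each construct in the left-hand column is shorthand for the character class in the right-hand column. For example, \d means a range of digits (0-9), and \w means a word character (any lowercase letter, any uppercase letter, the underscore character, or any digit). Use the predefined classes whenever possible. They make your code easier to read and eliminate errors introduced by malformed character classes.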
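
One way to convince yourself that a shorthand and its explicit character class are interchangeable is to compare them character by character. The sketch below does this over the ASCII range only and assumes the default compilation flags (in particular, no Pattern.UNICODE_CHARACTER_CLASS); the class ShorthandDemo and the method disagreements are hypothetical names.

```java
import java.util.regex.Pattern;

class ShorthandDemo {
    // Counts the ASCII characters (0..127) on which the two single-character
    // patterns disagree; 0 means they behave identically over that range.
    static int disagreements(String shorthand, String explicit) {
        Pattern a = Pattern.compile(shorthand);
        Pattern b = Pattern.compile(explicit);
        int count = 0;
        for (char c = 0; c < 128; c++) {
            String s = String.valueOf(c);
            if (a.matcher(s).matches() != b.matcher(s).matches()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(disagreements("\\d", "[0-9]"));                // 0
        System.out.println(disagreements("\\s", "[ \\t\\n\\x0B\\f\\r]")); // 0
        System.out.println(disagreements("\\w", "[a-zA-Z_0-9]"));         // 0
    }
}
```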

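Constructs beginning with a backslash are called escaped constructs. We previewed escaped constructs in the String Literals section, where we mentioned the use of the backslash and of \Q and \E for quotation. If you use an escaped construct within a string literal, you must precede the backslash with another backslash for the string to compile. For example: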

 
private final String REGEX = "\\d"; // a single digit

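In this example, \d is the regular expression; the extra backslash is required for the code to compile. The test harness, however, reads expressions directly from the Console, so the extra backslash is unnecessary there.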
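
To make the distinction between the source text and the resulting string concrete, the short sketch below prints what the regex engine actually receives. The class name EscapeDemo is a hypothetical name used only for this illustration.

```java
import java.util.regex.Pattern;

class EscapeDemo {
    // The source code contains two backslashes; the resulting string holds one.
    static final String REGEX = "\\d";

    public static void main(String[] args) {
        System.out.println(REGEX.length());              // 2: a backslash and the letter d
        System.out.println(REGEX);                       // \d, which is what the regex engine sees
        System.out.println(Pattern.matches(REGEX, "7")); // true: "7" is a single digit
    }
}
```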

The following examples demonstrate the use of predefined character classes.

 
Enter your regex: .
Enter input string to search: @
I found the text "@" starting at index 0 and ending at index 1.

Enter your regex: .
Enter input string to search: 1
I found the text "1" starting at index 0 and ending at index 1.

Enter your regex: .
Enter input string to search: a
I found the text "a" starting at index 0 and ending at index 1.

Enter your regex: \d
Enter input string to search: 1
I found the text "1" starting at index 0 and ending at index 1.

Enter your regex: \d
Enter input string to search: a
No match found.

Enter your regex: \D
Enter input string to search: 1
No match found.

Enter your regex: \D
Enter input string to search: a
I found the text "a" starting at index 0 and ending at index 1.

Enter your regex: \s
Enter input string to search:  
I found the text " " starting at index 0 and ending at index 1.

Enter your regex: \s
Enter input string to search: a
No match found.

Enter your regex: \S
Enter input string to search:  
No match found.

Enter your regex: \S
Enter input string to search: a
I found the text "a" starting at index 0 and ending at index 1.

Enter your regex: \w
Enter input string to search: a
I found the text "a" starting at index 0 and ending at index 1.

Enter your regex: \w
Enter input string to search: !
No match found.

Enter your regex: \W
Enter input string to search: a
No match found.

Enter your regex: \W
Enter input string to search: !
I found the text "!" starting at index 0 and ending at index 1.
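
The sessions above were produced interactively. As a rough, non-interactive approximation, the sketch below produces messages in the same format using Matcher.find(), start(), and end(). It is not the tutorial's test harness: the class HarnessSketch and the method search are hypothetical, and because the expressions appear in string literals here, the backslashes are doubled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HarnessSketch {
    // Returns one line per match, or a single "No match found." line.
    static List<String> search(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        List<String> lines = new ArrayList<>();
        while (matcher.find()) {
            lines.add(String.format(
                "I found the text \"%s\" starting at index %d and ending at index %d.",
                matcher.group(), matcher.start(), matcher.end()));
        }
        if (lines.isEmpty()) {
            lines.add("No match found.");
        }
        return lines;
    }

    public static void main(String[] args) {
        search("\\d", "1").forEach(System.out::println);
        search("\\d", "a").forEach(System.out::println);
        search("\\W", "!").forEach(System.out::println);
    }
}
```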

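In the first three examples, the regular expression is simply . (the "dot" metacharacter), which indicates "any character". The match therefore succeeds in all three cases (a randomly selected @ character, a digit, and a letter). Each of the remaining examples uses a single regular expression construct from the Predefined Character Classes table. You can refer to that table to work out the logic behind each match:

\d matches all digits
\s matches whitespace characters
\w matches word characters

Alternatively, a capital letter means the opposite:

\D matches non-digits
\S matches non-whitespace characters
\W matches non-word characters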
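
The lowercase/uppercase relationship can be checked mechanically: for any single character, exactly one construct of each pair should match. The sketch below tests this over the ASCII range with default flags; ComplementDemo and areComplements are hypothetical names.

```java
import java.util.regex.Pattern;

class ComplementDemo {
    // True if, for every ASCII character, exactly one of the two patterns matches.
    static boolean areComplements(String lower, String upper) {
        Pattern p = Pattern.compile(lower);
        Pattern q = Pattern.compile(upper);
        for (char c = 0; c < 128; c++) {
            String s = String.valueOf(c);
            if (p.matcher(s).matches() == q.matcher(s).matches()) {
                return false; // both matched or neither matched
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(areComplements("\\d", "\\D")); // true
        System.out.println(areComplements("\\s", "\\S")); // true
        System.out.println(areComplements("\\w", "\\W")); // true
    }
}
```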

