Baeldung
baeldung.com › home › java › core java › guide to escaping characters in java regexps
Guide to Escaping Characters in Java RegExps | Baeldung
July 22, 2024 - This means that in the previous example, we don’t want to let the pattern foo. to have a match in the input String. How would we handle a situation like this? The answer is that we need to escape the dot (.) character so that its special meaning is ignored. Let’s dig into it in more detail in the next section. According to the Java API documentation for regular expressions, there are two ways in which we can escape characters that have special meaning.
SSOJet
ssojet.com › escaping › regex-escaping-in-java
Regex Escaping in Java | Escaping Techniques in Programming
Consider finding lines that end with the literal string "end.". The appropriate Java regex string would be "end\\.$". The \. tells the regex engine to match a literal dot, and the $ anchors the match to the end of the line. A common gotcha is forgetting to escape the backslash for the Java string literal, leading to invalid regex patterns.
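As a minimal sketch of the snippet above (the class and method names here are hypothetical, chosen only for illustration), the escaped pattern "end\\.$" can be compared with the unescaped "end.$":

```java
import java.util.regex.Pattern;

class EndDotDemo {
    // Java literal "end\\.$" is the regex end\.$ : a literal "end." at the end of the input
    private static final Pattern ESCAPED = Pattern.compile("end\\.$");
    // Without the escape, the dot matches any character
    private static final Pattern UNESCAPED = Pattern.compile("end.$");

    public static boolean endsWithEndDot(String line) {
        return ESCAPED.matcher(line).find();
    }

    public static boolean endsWithEndAnyChar(String line) {
        return UNESCAPED.matcher(line).find();
    }

    public static void main(String[] args) {
        System.out.println(endsWithEndDot("This is the end."));     // true
        System.out.println(endsWithEndDot("This is the endX"));     // false
        System.out.println(endsWithEndAnyChar("This is the endX")); // true
    }
}
```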
Top answer
1 of 8
43

I wrote this pattern:

Pattern SPECIAL_REGEX_CHARS = Pattern.compile("[{}()\\[\\].+*?^$\\\\|]");

And use it in this method:

String escapeSpecialRegexChars(String str) {

    return SPECIAL_REGEX_CHARS.matcher(str).replaceAll("\\\\$0");
}

Then you can use it like this, for example:

Pattern toSafePattern(String text)
{
    return Pattern.compile(".*" + escapeSpecialRegexChars(text) + ".*");
}

We needed to do that because, after escaping, we add some regex expressions. If not, you can simply use \Q and \E:

Pattern toSafePattern(String text)
{
    return Pattern.compile(".*\\Q" + text + "\\E.*");
}
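
For reference, here is a self-contained sketch of the first approach from this answer. The wrapper class name SafePatternDemo and the sample inputs are assumptions added for illustration; the pattern and the replacement string are the ones shown above.

```java
import java.util.regex.Pattern;

class SafePatternDemo {
    // Character class listing the regex metacharacters to escape
    private static final Pattern SPECIAL_REGEX_CHARS =
            Pattern.compile("[{}()\\[\\].+*?^$\\\\|]");

    // Prefix every metacharacter with a backslash ($0 is the matched character)
    public static String escapeSpecialRegexChars(String str) {
        return SPECIAL_REGEX_CHARS.matcher(str).replaceAll("\\\\$0");
    }

    // Escape the text first, then surround it with real regex syntax
    public static Pattern toSafePattern(String text) {
        return Pattern.compile(".*" + escapeSpecialRegexChars(text) + ".*");
    }

    public static void main(String[] args) {
        System.out.println(escapeSpecialRegexChars("1+1=2?"));  // 1\+1=2\?
        Pattern p = toSafePattern("a.b");
        System.out.println(p.matcher("xx a.b yy").matches());   // true
        System.out.println(p.matcher("xx aXb yy").matches());   // false
    }
}
```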
2 of 8
41

Is there any method in Java or any open source library for escaping (not quoting) a special character (meta-character), in order to use it as a regular expression?

If you are looking for a way to create constants that you can use in your regex patterns, then just prepending them with "\\" should work but there is no nice Pattern.escape('.') function to help with this.

So if you are trying to match "\\d" (the string \d instead of a decimal character) then you would do:

// this will match on \d as opposed to a decimal character
String matchBackslashD = "\\\\d";
// as opposed to
String matchDecimalDigit = "\\d";

The 4 backslashes in the Java string turn into 2 backslashes in the regex pattern. 2 backslashes in a regex pattern match the backslash itself. Prepending a backslash to any special character turns it into a normal character instead of a special one.

String matchPeriod = "\\.";
String matchPlus = "\\+";
String matchParens = "\\(\\)";
... 
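
A quick way to check these backslash counts is to run the patterns against sample input. The sketch below is only illustrative: the class name BackslashDemo and the helper matchesWhole are hypothetical, and Pattern.matches requires the whole input to match.

```java
import java.util.regex.Pattern;

class BackslashDemo {
    // True if the entire input matches the regex
    public static boolean matchesWhole(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Java literal "\\\\d" is the regex \\d : a literal backslash followed by 'd'
        System.out.println(matchesWhole("\\\\d", "\\d")); // true
        System.out.println(matchesWhole("\\\\d", "7"));   // false
        // Java literal "\\d" is the regex \d : any decimal digit
        System.out.println(matchesWhole("\\d", "7"));     // true
        // Escaped metacharacters match themselves only
        System.out.println(matchesWhole("\\.", "."));     // true
        System.out.println(matchesWhole("\\.", "x"));     // false
        System.out.println(matchesWhole("\\(\\)", "()")); // true
    }
}
```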

In your post you use the Pattern.quote(string) method. This method wraps your pattern between "\\Q" and "\\E" so you can match a string even if it happens to have a special regex character in it (+, ., \\d, etc.)
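
A short sketch of Pattern.quote in use follows. The class name QuoteDemo and the helper containsLiteral are hypothetical. Note that Pattern.quote also copes with input that itself contains \E, which a hand-written "\\Q" + text + "\\E" concatenation would not.

```java
import java.util.regex.Pattern;

class QuoteDemo {
    // Search for needle as a literal string, whatever characters it contains
    public static boolean containsLiteral(String haystack, String needle) {
        return Pattern.compile(Pattern.quote(needle)).matcher(haystack).find();
    }

    public static void main(String[] args) {
        System.out.println(Pattern.quote("1+1"));                 // \Q1+1\E
        System.out.println(containsLiteral("price: 1+1", "1+1")); // true
        // Unquoted, the regex 1+1 means "one or more 1s, then a 1"
        System.out.println(Pattern.compile("1+1").matcher("price: 1+1").find()); // false
        // The needle here is the four characters \Eb. and is still matched literally
        System.out.println(containsLiteral("a\\Eb.c", "\\Eb.")); // true
    }
}
```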

Abareplace
abareplace.com › blog › escape-regexp
Which special characters must be escaped in regular expressions? — Aba Search & Replace
There is the Pattern.quote method for inserting a string into a regular expression. It surrounds the string with \Q and \E, which escapes multiple characters in Java regexes (borrowed from Perl).
Jenkov
jenkov.com › tutorials › java-regex › index.html
Java Regex - Java Regular Expressions
You can match non-word characters with the predefined character class [\W] (uppercase W). Since the \ character is also an escape character in Java, you need two backslashes in the Java string to get a \w in the regular expression.
Tabnine
tabnine.com › home page › code › java › java.util.regex.pattern
Java Examples & Tutorials of Pattern.escape (java.util.regex) | Tabnine
public static final String INVALID_CHARACTERS = "^#% {}|"; private static final Pattern INVALID_PATTERN = Pattern.compile("["+Pattern.escape(INVALID_CHARACTERS)+"]");
TutorialsPoint
tutorialspoint.com › java-program-to-illustrate-escaping-characters-in-regex
Java Program to Illustrate Escaping Characters in Regex
The primary method to escape special characters in Java regular expression is by using the backslash. However, since the backslash is also an escape character in Java strings, you need to use double backslashes (\\) in your regex patterns.
Medium
medium.com › sina-ahmadi › java-regex-6e4d073aab85
Java RegEx. special characters issue in Java split… | by Sina Ahmadi | My journey as a software developer | Medium
June 20, 2018 - To escape a character in Java, you should use two backslashes “\\”. I have done the below steps to escape the asterisk character and fix this issue in my code: Replace all special characters using Java’s “replaceAll” method in the ...
JRebel
jrebel.com › blog › java-regular-expressions-cheat-sheet
Java Regular Expressions (Regex) Cheat Sheet | JRebel
A regular character in the Java Regex syntax matches that character in the text. If you create a Pattern with Pattern.compile("a"), it will match only the String "a". There is also an escape character, which is the backslash "\".
Oracle
docs.oracle.com › javase › 8 › docs › api › java › util › regex › Pattern.html
Pattern (Java Platform SE 8 )
October 20, 2025 - Unicode escape sequences such as \u2014 in Java source code are processed as described in section 3.3 of The Java™ Language Specification. Such escape sequences are also implemented directly by the regular-expression parser so that Unicode escapes can be used in expressions that are read from files or from the keyboard. Thus the strings "\u2014" and "\\u2014", while not equal, compile into the same pattern, which matches the character with hexadecimal value 0x2014.
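The point about the two string forms can be checked with a small sketch (the class name UnicodeEscapeDemo and the helper matchesEmDash are hypothetical). In the first literal the Java compiler resolves the escape; in the second the regex parser does.

```java
import java.util.regex.Pattern;

class UnicodeEscapeDemo {
    // True if the regex matches a string consisting of the single character U+2014
    public static boolean matchesEmDash(String regex) {
        return Pattern.matches(regex, "\u2014");
    }

    public static void main(String[] args) {
        // One character versus six characters, so the strings are not equal
        System.out.println("\u2014".equals("\\u2014")); // false
        System.out.println("\u2014".length());          // 1
        System.out.println("\\u2014".length());         // 6
        // Both compile into a pattern that matches U+2014
        System.out.println(matchesEmDash("\u2014"));    // true
        System.out.println(matchesEmDash("\\u2014"));   // true
    }
}
```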
O'Reilly
oreilly.com › library › view › java-9-regular › 9781787288706 › c7b9c597-5e7d-4822-be8b-7d4dc08a6c58.xhtml
Double escaping in a Java String when defining regular expressions - Java 9 Regular Expressions [Book]
July 25, 2017 - In Java, all the regular expressions are entered as a String type, where \ acts as an escape character and is used to interpret certain special characters such as \t, \n, and so on.
Author   Anubhava Srivastava
Published   2017
Pages   158
Regular-Expressions.info
regular-expressions.info › java.html
Using Regular Expressions in Java
In regular expressions, the backslash is also an escape character. The regular expression \\ matches a single backslash. This regular expression as a Java string, becomes "\\\\". That’s right: 4 backslashes to match a single one. The regex \w matches a word character.
GeeksforGeeks
geeksforgeeks.org › java › java-program-to-illustrate-escaping-characters-in-regex
Java Program to Illustrate Escaping Characters in Regex - GeeksforGeeks
September 30, 2021 - ... // Java Program to Illustrate Escaping Characters in Java // Regex Using \Q and \E for escaping // Importing required classes import java.io.*; import java.util.regex.*; // Main class class GFG { // Main driver method public static void ...
MojoAuth
mojoauth.com › escaping › regex-escaping-in-java
Regex Escaping in Java | Escaping Methods in Programming Languages
In Java, regex escaping involves using a backslash (\) before a special character to indicate that it should be treated literally. Common special characters that often require escaping include: ... To escape these characters in Java, you would ...
Oracle
docs.oracle.com › javase › 7 › docs › api › java › util › regex › Pattern.html
Pattern (Java Platform SE 7 )
Unicode escape sequences such as \u2014 in Java source code are processed as described in section 3.3 of The Java™ Language Specification. Such escape sequences are also implemented directly by the regular-expression parser so that Unicode escapes can be used in expressions that are read from files or from the keyboard. Thus the strings "\u2014" and "\\u2014", while not equal, compile into the same pattern, which matches the character with hexadecimal value 0x2014.
Top answer
1 of 2
3

There is no difference in the current scenario. The usual string escape sequences are formed with a single backslash followed by a valid escape char ("\n", "\r", etc.), and regex escape sequences are formed with a literal backslash (that is, a double backslash in the Java string literal) followed by a valid regex escape char ("\\n", "\\d", etc.).

"\n" (an escape sequence) is a literal LF (newline) and "\\n" is a regex escape sequence that matches an LF symbol.

"\r" (an escape sequence) is a literal CR (carriage return) and "\\r" is a regex escape sequence that matches a CR symbol.

"\t" (an escape sequence) is a literal tab symbol and "\\t" is a regex escape sequence that matches a tab symbol.

See the list in the Java regex docs for the supported list of regex escapes.

However, if you use a Pattern.COMMENTS flag (used to introduce comments and format a pattern nicely, making the regex engine ignore all unescaped whitespace in the pattern), you will need to either use "\\n" or "\\\n" to define a newline (LF) in the Java string literal and "\\r" or "\\\r" to define a carriage return (CR).

See a Java test:

String s = "\n";
System.out.println(s.replaceAll("\n", "LF")); // => LF
System.out.println(s.replaceAll("\\n", "LF")); // => LF
System.out.println(s.replaceAll("(?x)\\n", "LF")); // => LF
System.out.println(s.replaceAll("(?x)\\\n", "LF")); // => LF
System.out.println(s.replaceAll("(?x)\n", "<LF>")); 
// => <LF>
//<LF>

Why is the last one producing <LF>+newline+<LF>? Because "(?x)\n" is equal to "", an empty pattern, and it matches an empty space before the newline and after it.

2 of 2
0

Yes, they are different. The Java compiler handles Unicode escapes specially, as described in section 3.3 of The Java Language Specification:

The Java programming language specifies a standard way of transforming a program written in Unicode into ASCII that changes a program into a form that can be processed by ASCII-based tools. The transformation involves converting any Unicode escapes in the source text of the program to ASCII by adding an extra u - for example, \uxxxx becomes \uuxxxx - while simultaneously converting non- ASCII characters in the source text to Unicode escapes containing a single u each.

Here is how this affects \n vs \\n, according to the Javadoc:

It is therefore necessary to double backslashes in string literals that represent regular expressions to protect them from interpretation by the Java bytecode compiler.

An example from the same doc:

The string literal "\b", for example, matches a single backspace character when interpreted as a regular expression, while "\\b" matches a word boundary. The string literal "\(hello\)" is illegal and leads to a compile-time error; in order to match the string (hello) the string literal "\\(hello\\)" must be used.
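
The distinction in that quote can be tried out with a short sketch (the class name WordBoundaryDemo, the helper hasWholeWordCat, and the sample strings are hypothetical):

```java
import java.util.regex.Pattern;

class WordBoundaryDemo {
    // Java literal "\\bcat\\b" is the regex \bcat\b : "cat" as a whole word
    private static final Pattern WHOLE_WORD_CAT = Pattern.compile("\\bcat\\b");

    public static boolean hasWholeWordCat(String text) {
        return WHOLE_WORD_CAT.matcher(text).find();
    }

    public static void main(String[] args) {
        // "\b" is a one-character string holding a backspace (U+0008);
        // used as a regex it matches exactly that character
        System.out.println(Pattern.matches("\b", "\b"));     // true
        System.out.println(hasWholeWordCat("a cat sat"));    // true
        System.out.println(hasWholeWordCat("concatenate"));  // false
        // Doubled backslashes give literal parentheses
        System.out.println(Pattern.matches("\\(hello\\)", "(hello)")); // true
    }
}
```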