
Java regex "[.]" vs "."

Tags: java, regex

I'm trying to use some regex in Java and I came across this when debugging my code.

What's the difference between [.] and .?

I was surprised that .at would match "cat" but [.]at wouldn't.

GDanger asked Sep 05 '13 15:09




2 Answers

[.] matches a literal dot (.), while an unescaped . matches any character except a line terminator such as \n (unless DOTALL mode is enabled).

You can also use \. (written "\\." in a Java string literal) to match a literal dot.
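To make the difference concrete, here is a minimal sketch using java.util.regex. The class name DotVsBracketDemo and the helper fullMatch are made up for illustration; Pattern.matches requires the whole input to match, which is the behaviour assumed here.

```java
import java.util.regex.Pattern;

class DotVsBracketDemo {
    // Hypothetical helper: true only when the entire input matches the regex.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Unescaped dot: matches any single character, so "cat" matches.
        System.out.println(fullMatch(".at", "cat"));    // true
        // Dot inside a character class: only a literal dot matches.
        System.out.println(fullMatch("[.]at", "cat"));  // false
        System.out.println(fullMatch("[.]at", ".at"));  // true
        // Escaped dot; the backslash is doubled in a Java string literal.
        System.out.println(fullMatch("\\.at", ".at"));  // true
        // By default the dot does not match a line terminator...
        System.out.println(fullMatch(".at", "\nat"));   // false
        // ...but it does once DOTALL is enabled.
        System.out.println(
            Pattern.compile(".at", Pattern.DOTALL).matcher("\nat").matches()); // true
    }
}
```

If you want "contains" semantics rather than a full match, Matcher.find() would be the call to use instead of matches().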

falsetru answered Sep 29 '22 13:09



The [ and ] are metacharacters that let you define a character class. Most regex metacharacters, including the dot, lose their special meaning inside square brackets and are matched literally. You can include multiple characters as well:

[.=*&^$] // Matches any single character from the list '.','=','*','&','^','$'

There are two specific things you need to know about the [...] syntax:

  • The ^ symbol at the beginning of the class has a special meaning: it inverts what the class matches. For example, [^.] matches any character except a dot (.).
  • A dash - between two characters means any code point from the first to the second, inclusive. For example, [A-Z] matches any single uppercase ASCII letter. You can use the dash multiple times; for example, [A-Za-z0-9] means "any single upper- or lower-case ASCII letter or a digit".

The two constructs above (^ and -) are common to nearly all regex engines; some engines (such as Java's) define additional character-class syntax of their own.
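The character-class rules above can be checked with a short sketch. The class name CharClassDemo and the helper fullMatch are hypothetical names chosen for this example; each call tests a single-character input against one class.

```java
import java.util.regex.Pattern;

class CharClassDemo {
    // Hypothetical helper: true only when the entire input matches the regex.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Inside the class, '.', '*', '$' and a non-leading '^' are literal.
        System.out.println(fullMatch("[.=*&^$]", "*"));    // true
        System.out.println(fullMatch("[.=*&^$]", "^"));    // true
        System.out.println(fullMatch("[.=*&^$]", "a"));    // false
        // A leading '^' negates the class.
        System.out.println(fullMatch("[^.]", "a"));        // true
        System.out.println(fullMatch("[^.]", "."));        // false
        // Dashes define ranges; several ranges can share one class.
        System.out.println(fullMatch("[A-Za-z0-9]", "q")); // true
        System.out.println(fullMatch("[A-Za-z0-9]", "_")); // false
    }
}
```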

Sergey Kalinichenko answered Sep 29 '22 12:09
