
What is the regular expression for matching text that contains no whitespace in between?

Tags: string, regex

I am looking for a regular expression that matches text containing no whitespace in between, although the text may or may not have whitespace at its start or end.

Tasawer Khan asked Jun 11 '10 06:06



1 Answer

You want something like this (see it in action on rubular.com):

^\s*\S+\s*$

Explanation:

  • ^ is the beginning of the string anchor
  • $ is the end of the string anchor
  • \s is the character class for whitespace
  • \S is the negation of \s (note the upper and lower case difference)
  • * is "zero-or-more" repetition
  • + is "one-or-more" repetition
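As a quick check of the pattern above, here is a minimal Python sketch. The helper name has_no_inner_whitespace and the sample strings are made up for illustration; the original answer was tested on rubular.com (Ruby), so treat this as one possible translation rather than the only way to apply the pattern.

```python
import re

# Pattern from the answer: optional leading whitespace, one or more
# non-whitespace characters, optional trailing whitespace.
NO_INNER_SPACE = re.compile(r"^\s*\S+\s*$")


def has_no_inner_whitespace(text: str) -> bool:
    """Return True if text is one run of non-whitespace, optionally padded.

    Hypothetical helper name, used only for this illustration.
    """
    return NO_INNER_SPACE.match(text) is not None


if __name__ == "__main__":
    for sample in ["hello", "  hello  ", "hello world", "", "   "]:
        print(repr(sample), has_no_inner_whitespace(sample))
```

Only the first two samples are accepted: "hello world" has whitespace in the middle, and the last two contain no non-whitespace character for \S+ to match.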

References

  • regular-expressions.info/Anchors, Character Classes and Repetition

Can the "text" part be empty?

The above pattern does NOT match, say, an empty string. The original specification isn't very clear if this is the intended behavior, but if an empty "text" is allowed, then simply use \S* instead, i.e. match zero-or-more (instead of one-or-more) repetition of \S.

Thus, this pattern (same as above except * is used instead of +)

^\s*\S*\s*$

will also match:

  • The empty string (i.e. the string whose length is 0)
  • Non-empty strings consisting of nothing but whitespace characters
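The difference between the two patterns can be seen side by side in a short Python sketch. The names STRICT, LENIENT and compare are assumptions chosen for this example, not part of the original answer.

```python
import re

# STRICT requires at least one non-whitespace character (\S+);
# LENIENT also allows the "text" part to be empty (\S*).
STRICT = re.compile(r"^\s*\S+\s*$")
LENIENT = re.compile(r"^\s*\S*\s*$")


def compare(text: str) -> tuple[bool, bool]:
    """Return (strict_matches, lenient_matches) for the given text."""
    return (STRICT.match(text) is not None, LENIENT.match(text) is not None)


if __name__ == "__main__":
    for sample in ["", "   ", " word ", "two words"]:
        print(repr(sample), compare(sample))
```

The empty string and the whitespace-only string are accepted only by the lenient pattern, " word " is accepted by both, and "two words" is rejected by both because of the space in the middle.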

What counts as "text" characters?

The above patterns use \S to define the "text" characters, i.e. anything but whitespace. This includes punctuation and symbols, so the string " #@^$^* " matches both patterns. It's not clear if this is the desired behavior: it's possible that only letters should count as "text", in which case " ==== AWESOMENESS ==== " would be a desired match, yet neither pattern above accepts it because of the spaces around the word.

The same approach still works for this case; we simply need to be more specific with our character class definitions.

For example, this pattern:

/^[^a-z]*[a-z]*[^a-z]*$/i

Will match (as seen on rubular.com):

 ==== AWESOMENESS ====

But not:

 ==== NOT AWESOME ==== 

Note that the ^ metacharacter, when used as the first character in a character class definition, no longer means the beginning of the string anchor, but rather a negation of the character class definition.

Note also the use of the /i modifier in the pattern: this enables case insensitive matching. The actual syntax may vary between languages/flavors.
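In Python, for instance, the /i modifier has no literal-syntax equivalent; a rough sketch would pass re.IGNORECASE or use the embedded (?i) flag instead. The names WITH_FLAG, WITH_INLINE and is_single_word are hypothetical and only serve this illustration.

```python
import re

# Same pattern as /^[^a-z]*[a-z]*[^a-z]*$/i, written two equivalent ways:
# with the re.IGNORECASE flag, and with the embedded (?i) flag.
WITH_FLAG = re.compile(r"^[^a-z]*[a-z]*[^a-z]*$", re.IGNORECASE)
WITH_INLINE = re.compile(r"(?i)^[^a-z]*[a-z]*[^a-z]*$")


def is_single_word(text: str) -> bool:
    """Return True if text contains at most one contiguous run of letters."""
    return WITH_FLAG.match(text) is not None


if __name__ == "__main__":
    for sample in [" ==== AWESOMENESS ==== ", " ==== NOT AWESOME ==== "]:
        print(repr(sample), is_single_word(sample),
              WITH_INLINE.match(sample) is not None)
```

The first sample is accepted and the second is rejected, since "NOT" and "AWESOME" form two separate runs of letters. Because [a-z]* may match zero letters, a string with no letters at all, such as "====", is accepted as well.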

References

  • regular-expressions.info/Modifiers
  • java.util.regex.Pattern.CASE_INSENSITIVE -- embedded flag is (?i)
polygenelubricants answered Nov 13 '22 17:11