
Regex - Trimming whitespace from start and end of line [closed]

Tags:

regex

Link: http://regexone.com/example/5

It asks: Write a simple regular expression to capture the content of each line, without the extra whitespace.

What I have is a mess with a bunch of \S+. Is there an elegant way to solve this problem?

sojim asked Apr 30 '13 06:04




1 Answer

Writing regular expressions may seem like a black art, but it's actually quite simple; the most important step is to identify with surgical precision exactly what you do and do not want to match, then say just what you mean (no more and no less).

Another tip: when using the * or + quantifiers, especially with "wildcard" characters like ., always remember that part of the regex may "run past" the part you wanted to match, perhaps matching the entire string. Often, the simplest solution is to use a reluctant (lazy) quantifier like *? or +? instead. (The most common regexp bugs are those which make the regexp match when you didn't want it to, or match more than you wanted it to.)
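To make the "run past" problem concrete, here is a small sketch in Python (the sample string and variable names are made up for illustration; the same behaviour should hold in most regex flavours that support reluctant quantifiers):

```python
import re

# Hypothetical sample line with leading and trailing spaces.
line = "   hello world   "

# Greedy: (.+) runs all the way to the end of the line, swallowing the
# trailing spaces; the final \s* is then satisfied by matching nothing.
greedy = re.match(r"^\s*(.+)\s*$", line).group(1)

# Reluctant: (.+?) takes as little as it can, so the final \s* is the
# part that ends up consuming the trailing spaces.
lazy = re.match(r"^\s*(.+?)\s*$", line).group(1)

print(repr(greedy))  # 'hello world   '
print(repr(lazy))    # 'hello world'
```

Only the quantifier inside the capture group differs between the two patterns, which is why this kind of bug is easy to overlook.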

In this case, you want "the content of each line, without extra whitespace". That's not quite precise enough. What is "extra whitespace"? Trailing and leading whitespace? If so...

Let's express that in completely precise, non-ambiguous terms. What you basically have is:

  1. A region of whitespace characters (possibly empty)
  2. Either:
     a) Nothing.
     b) A single non-whitespace character.
     c) A region which starts and ends with non-whitespace characters.
  3. A region of whitespace characters (possibly empty)

Can you express that as a regex? Try doing so and posting it here, then I'll give you some feedback.
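If you want something to compare your attempt against, here is one possible rendering of the three parts listed above, written as a commented Python pattern. Treat it as a sketch, not the only answer: the names TRIM and trimmed are made up for this example, and it assumes each input is a single line with no newline in the middle.

```python
import re

# One possible translation of the three regions into a regex.
# re.VERBOSE lets the pattern carry comments; literal spaces in it are ignored.
TRIM = re.compile(
    r"""
    ^\s*                        # 1. leading whitespace (possibly empty)
    (                           # 2. the captured content, which is either:
      (?: \S (?: .* \S )? )?    #    nothing, one non-whitespace character,
    )                           #    or a run that starts and ends with one
    \s*$                        # 3. trailing whitespace (possibly empty)
    """,
    re.VERBOSE,
)

def trimmed(line):
    """Return the content of a single line without surrounding whitespace."""
    match = TRIM.match(line)
    # For single-line input the pattern always matches; None is returned
    # only if the input contains a newline in the middle.
    return match.group(1) if match else None

print(repr(trimmed("   hello world   ")))  # 'hello world'
print(repr(trimmed("\tx ")))               # 'x'
print(repr(trimmed("    ")))               # ''
```

Because the captured part must both start and end with \S, the greedy .* in the middle cannot run past the content into the trailing whitespace, so no reluctant quantifier is needed here.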

Alex D answered Sep 18 '22 13:09
