
Regex to match trailing whitespace, but not lines that are entirely whitespace (indent placeholders)

I've been trying to construct a Ruby regex that matches trailing whitespace, but not indentation placeholders (lines that contain only whitespace), so I can gsub it out.

I had this /\b[\t ]+$/ and it was working a treat until I realised it only works when the last visible character on the line is a word character ([a-zA-Z0-9_]), so lines ending in ), } or ; are missed. :-( So I evolved it into this /(?!^[\t ]+)[\t ]+$/ and it seems to be getting better, but it still doesn't work properly. I've spent hours trying to get this to work, to no avail. Please help.

Here's some test text that's easy to paste into Rubular. The whitespace on the indent-only lines gets stripped when posting, so you'll need to add a few spaces and/or tabs back. Once lines 3 and 4 contain whitespace again, the regex shouldn't match on lines 3-5, 7 or 9.

some test test  
some test test      


  some other test (text)
  some other test (text)  
  likely here{ dfdf }
  likely here{ dfdf }        
  and this ;
  and this ;  

Alternatively, is there a simpler / more elegant way to do this?

tjmcewan asked Apr 19 '10 15:04


2 Answers

If you're using Ruby 1.9 or later, you can use a look-behind:

/(?<=\S)[\t ]+$/

Unfortunately, look-behind isn't supported in older versions of Ruby, so there you'll have to capture the preceding character and put it back:

str.gsub(/(\S)[\t ]+$/) { $1 }
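
As a quick check, here is a minimal sketch that runs both variants against a small sample modelled on the question's text. The sample string and the variable names are made up for illustration.

```ruby
# Hypothetical sample: some lines have trailing whitespace, and two lines
# are whitespace-only "indent placeholders" that must be left alone.
sample = "some test test  \n  \n\t\n  and this ;\n  and this ;  \n"

# Ruby 1.9+: the look-behind requires a non-whitespace character just
# before the run of tabs/spaces, without making it part of the match.
lookbehind = sample.gsub(/(?<=\S)[\t ]+$/, '')

# Older Rubies: capture the non-whitespace character and put it back.
captured = sample.gsub(/(\S)[\t ]+$/) { $1 }

# What we want: trailing whitespace gone, placeholder lines untouched.
expected = "some test test\n  \n\t\n  and this ;\n  and this ;\n"

p lookbehind == expected  # => true
p captured == expected    # => true
```

Both variants leave the whitespace-only lines alone because \S never matches the newline that precedes them.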
mckeed answered Nov 01 '22 07:11


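Your first expression is close; you just need to change the \b to a negated character class and put the captured character back in the replacement. This should work better:

/([^\t \n])[\t ]+$/

In plain words, this matches all trailing tabs and spaces that follow a character that is not a tab, a space or a newline. The newline has to be excluded as well, otherwise the line break before a whitespace-only line counts as the preceding character and the indent placeholder gets matched too. Since that preceding character is part of the match, replace with \1 rather than with an empty string.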
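
A minimal sketch of this approach with gsub, using a made-up sample string. Because the preceding character is captured, it is written back with \1. The sketch also compares a class that excludes the newline with one that does not, since a negated class matches "\n" unless told otherwise.

```ruby
# Hypothetical sample: line 2 is an indent placeholder (spaces only).
sample = "foo(bar)  \n    \n  baz;\t \n"

# Class excludes tab, space and newline, so the captured character is
# always a visible character on the same line.
strict = sample.gsub(/([^\t \n])[\t ]+$/, '\1')

# Class excludes only tab and space: "\n" can be captured as the
# preceding character, so the placeholder line is stripped as well.
loose = sample.gsub(/([^\t ])[\t ]+$/, '\1')

p strict  # => "foo(bar)\n    \n  baz;\n"
p loose   # => "foo(bar)\n\n  baz;\n"
```

The only difference between the two results is the second line: the strict class keeps its four spaces, the loose one removes them.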

Mike Pelley answered Nov 01 '22 07:11