
Regexes for Developers

I've been trying to figure out a regex that lets me search for a particular string while automatically skipping comments. Does anyone have a regex like this, or know of one? It doesn't need to be sophisticated enough to skip #if 0 blocks; I just want it to skip over // line comments and /* ... */ block comments. The converse, searching only inside comment blocks, would be very useful too.

Environment: VS 2003

Onorio Catenacci asked Mar 02 '23 08:03


2 Answers

This is a harder problem than it first appears, since you need to handle comment tokens inside strings, comment tokens that are themselves commented out, and so on.

I wrote a string and comment parser for C#, let me see if I can dig out something that will help... I'll update if I find anything.

EDIT: OK, so I found my old 'codemasker' project. It turns out that I did this in stages, not with a single regex. Basically, I inch through a source file looking for start tokens; when I find one, I look for the matching end token and mask everything in between. This takes the context of the start token into account: if you find a "string start" token, you can safely ignore comment tokens until you reach the end of the string, and vice versa. Once the code is masked (I used GUIDs as masks and a hashtable to keep track of them), you can safely do your search and replace, and finally restore the masked code.
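For anyone who wants to try the mask-then-search idea, here is a minimal sketch in Python. It is not the codemasker code (that was C#), just one way the staged approach might look. The names `TOKEN`, `mask`, `unmask`, `replace_outside_comments` and `find_in_comments` are all made up for this example, and the token patterns assume plain C-style comments and literals (no raw strings, no backslash-continued `//` comments, no `#if 0` handling).

```python
import re
import uuid

# Hypothetical token patterns for C-like source; assumptions, not codemasker's.
COMMENT = r"//[^\n]*|/\*.*?\*/"   # line comment or block comment
STRING = r'"(?:\\.|[^"\\\n])*"'   # double-quoted literal, escapes respected
CHAR = r"'(?:\\.|[^'\\\n])*'"     # character literal, escapes respected

# Scanning left to right, whichever token starts first wins, so a "//" inside
# a string is swallowed by the string and never treated as a comment.
TOKEN = re.compile(
    f"(?P<comment>{COMMENT})|(?P<literal>{STRING}|{CHAR})", re.DOTALL
)


def mask(source, kinds=("comment",)):
    """Swap every token of the given kinds for a unique placeholder."""
    saved = {}

    def swap(match):
        if match.lastgroup not in kinds:
            return match.group(0)  # leave this token in place
        key = f"__MASK_{uuid.uuid4().hex}__"
        saved[key] = match.group(0)
        return key

    return TOKEN.sub(swap, source), saved


def unmask(text, saved):
    """Put the original text back in place of each placeholder."""
    for key, original in saved.items():
        text = text.replace(key, original)
    return text


def replace_outside_comments(source, pattern, repl):
    """Search and replace everywhere except inside comments."""
    masked, saved = mask(source, kinds=("comment",))
    return unmask(re.sub(pattern, repl, masked), saved)


def find_in_comments(source, pattern):
    """The converse: return matches that occur inside comments only."""
    return [
        hit.group(0)
        for token in TOKEN.finditer(source)
        if token.lastgroup == "comment"
        for hit in re.finditer(pattern, token.group(0))
    ]
```

Calling `replace_outside_comments(source, r"\bfoo\b", "bar")` renames `foo` in code and in string literals but leaves comments alone; pass `kinds=("comment", "literal")` to `mask` if strings should be protected as well. One caveat with this sketch: a search pattern loose enough to match the placeholder text itself would corrupt the restore step.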

Hope that helps.

Ed Guiness answered Mar 05 '23 16:03


Be especially careful with strings. Strings often have escape sequences which you also have to respect while you're finding the end of them.

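For example, take `"This is \"a test\""`: you cannot blindly look for the next double quote to terminate the string. Also beware of `"This is \\"`, which shows that you cannot just say "unless the double quote is preceded by a backslash": here the backslash before the closing quote is itself escaped, so that quote really does end the string.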
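To make the escape problem concrete, here is a small Python sketch. `string_end` and `NAIVE` are hypothetical names for this example, and it assumes C-style escapes where a backslash always escapes exactly the next character. The scanner walks the literal and skips whatever follows each backslash, while `NAIVE` is the tempting "quote not preceded by a backslash" regex, included only for contrast.

```python
import re

# The tempting but wrong rule: a quote ends the string unless the character
# right before it is a backslash.
NAIVE = re.compile(r'".*?(?<!\\)"')


def string_end(text, start):
    """Return the index just past the closing quote of the string literal
    that opens at text[start], or -1 if the literal is never closed."""
    if text[start] != '"':
        raise ValueError("text[start] must be an opening double quote")
    i = start + 1
    while i < len(text):
        ch = text[i]
        if ch == "\\":
            i += 2  # skip the backslash and the character it escapes
        elif ch == '"':
            return i + 1
        else:
            i += 1
    return -1
```

On a literal that ends in an escaped backslash, followed by more code, `string_end` stops at the right quote, whereas `NAIVE` rejects the real closing quote (it is preceded by a backslash) and runs on to the next quote it finds.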

In summary, make some brutal unit tests!

Jason Cohen answered Mar 05 '23 16:03