I've been trying to figure out a regex to allow me to search for a particular string while automatically skipping comments. Anyone have an RE like this or know of one? It doesn't even need to be sophisticated enough to skip `#if 0` blocks; I just want it to skip over `//` and `/*` blocks. The converse, that is, searching only inside comment blocks, would be very useful too.
Environment: VS 2003
This is a harder problem than it might at first appear, since you need to consider comment tokens inside strings, comment tokens that are themselves commented out etc.
I wrote a string and comment parser for C#, let me see if I can dig out something that will help... I'll update if I find anything.
EDIT: ... ok, so I found my old 'codemasker' project. Turns out that I did this in stages, not with a single regex. Basically I inch through a source file looking for start tokens; when I find one, I then look for the matching end token and mask everything in between. This takes into account the context of the start token: if you find a token for "string start" then you can safely ignore comment tokens until you find the end of the string, and vice versa. Once the code is masked (I used GUIDs as masks, and a hashtable to keep track), you can safely do your search and replace, then finally restore the masked code.
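To make the idea concrete, here is a minimal Python sketch of the mask / search / restore approach. It is not the original 'codemasker' code (that was C# and worked in stages); all names here (`TOKEN_RE`, `mask`, `unmask`, `replace_outside_comments`, `find_in_comments`) are made up for illustration. It assumes C-style `//` and `/* */` comments and simple double-quoted strings with backslash escapes, and it does not handle `#if 0` blocks, C# verbatim strings, or C++ raw strings.

```python
import re
import uuid

# Strings and char literals are matched too, so that comment tokens
# inside them are consumed as part of the literal and never treated
# as the start of a comment.
TOKEN_RE = re.compile(
    r'''
      (?P<string>"(?:\\.|[^"\\\n])*")   # double-quoted string with escapes
    | (?P<char>'(?:\\.|[^'\\\n])*')     # character literal
    | (?P<line>//[^\n]*)                # line comment
    | (?P<block>/\*.*?\*/)              # block comment
    ''',
    re.VERBOSE | re.DOTALL,
)

COMMENTS = {"line", "block"}


def mask(source, kinds):
    """Replace every token whose kind is in `kinds` with a unique placeholder."""
    saved = {}

    def repl(match):
        if match.lastgroup not in kinds:
            return match.group(0)  # e.g. a string: leave it searchable
        key = "__MASK_" + uuid.uuid4().hex + "__"
        saved[key] = match.group(0)
        return key

    return TOKEN_RE.sub(repl, source), saved


def unmask(text, saved):
    """Put the original text back in place of each placeholder."""
    for key, original in saved.items():
        text = text.replace(key, original)
    return text


def replace_outside_comments(source, old, new):
    """Replace `old` with `new` everywhere except inside comments."""
    masked, saved = mask(source, COMMENTS)
    return unmask(masked.replace(old, new), saved)


def find_in_comments(source, needle):
    """Return the offsets of `needle` that fall inside comments."""
    hits = []
    for match in TOKEN_RE.finditer(source):
        if match.lastgroup in COMMENTS:
            pos = match.group(0).find(needle)
            while pos != -1:
                hits.append(match.start() + pos)
                pos = match.group(0).find(needle, pos + 1)
    return hits


src = 'int foo = 1; // foo here\nconst char *s = "// foo"; /* foo */\n'
print(replace_outside_comments(src, "foo", "bar"))
print(find_in_comments(src, "foo"))
```

In this example the `foo` in the declaration and the one inside the string literal are replaced, while the two inside comments are left alone; `find_in_comments` reports only those two. One caveat of this sketch: a search term that happens to match part of a placeholder (for instance `MASK`) would corrupt the restore step, so a real tool should pick placeholders that cannot collide with the search.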
Hope that helps.
Be especially careful with strings. Strings often have escape sequences which you also have to respect while you're finding the end of them.
So e.g. `"This is \"a test\""`. You cannot blindly look for a double-quote to terminate. Also beware of `"This is \\"`, which shows that you cannot just say "unless double-quote is preceded by a backslash."
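One way to get this right is to walk the string left to right and skip the character after every backslash, rather than looking backwards from each quote. Below is a small Python sketch of that, with the two tricky cases as checks. The function name `find_string_end` is hypothetical, and it assumes plain C-style escapes only (no C# verbatim strings or C++ raw strings).

```python
def find_string_end(text, start):
    """Return the index of the quote that closes the string opening at
    text[start], or -1 if the string is never terminated."""
    if text[start] != '"':
        raise ValueError("text[start] must be an opening double-quote")
    i = start + 1
    while i < len(text):
        ch = text[i]
        if ch == "\\":
            i += 2  # skip the backslash and whatever it escapes
            continue
        if ch == '"':
            return i
        i += 1
    return -1


# Escaped quotes inside the string do not terminate it.
s1 = r'"This is \"a test\""'
assert find_string_end(s1, 0) == len(s1) - 1

# The final quote follows a backslash, but that backslash is itself
# escaped, so the quote really does terminate the string.
s2 = r'"This is \\"'
assert find_string_end(s2, 0) == len(s2) - 1

# Here the last quote is escaped, so the string is unterminated.
s3 = r'"This is \"'
assert find_string_end(s3, 0) == -1
```

Because the scan consumes escape pairs as it goes, a string ending in an escaped backslash is handled without any special lookbehind rule.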
In summary, make some brutal unit tests!