Regex find catch blocks without log

I am using regex with PowerGREP to search through a bunch of files. I am working with Java files, and my goal is to find all catch blocks that do not contain the word log within the block so that I can add logging. There are a lot of files, so going through them manually isn't really feasible.

Examples of what should be found

catch (Exception e) {
    //comment#
    int math = 1 +2 * (3);
    String email = "[email protected]";
    anothermethod.call();
    //no logging
}  

and

catch(AnotherException e ) {}    //no logging

Examples of what should NOT be found

catch(AnotherException e ) {  
     //some code
     log.error("Error message");
     //some more code 
}

and

catch(BadE_xception e) { log.error(e); }      

I am not very experienced with regex, but this is what I have so far:

start of catch block: catch\s*\(\s*\w*\s+\w*\s*\)\s*\{.*?

but I am not sure where to go from there to specify that the block does not contain log. If you have ideas on how to do this without regex, that works perfectly well for me too. Thanks

jlars62 asked Jul 10 '13 19:07


1 Answer

You can at least handle a finite number of nesting levels.

For the no-nested case, modifying the end of your expression:

catch\s*\(\s*\w*\s+\w*\s*\)\s*\{(?:(?!\blog\b)[^}])*\}
                                ^^^^^^^^^^^^^^^^^^^^^^

Let's break this down.

  1. We're strictly looking at non-} characters; hence [^}]. Once we find the first }, we're done.
  2. The (?!foo) is called a negative lookahead assertion. It means, "This point is not followed by foo."
  3. The \b is a word boundary. Surrounding log in \bs ensures that we don't catch "false positives" like "clog" and "logical". You want the sole word, "log".
  4. The (?:foo) is a way to group an expression without capturing. This isn't important; for now pretend it's the same as (foo). Its purpose is so that the whole group can be quantified by the *.
  5. Putting it all together: we are checking character by character, each one not being a }, and each one not being the start of the whole word, log. The lookahead comes before the character class on purpose: if it came after, the position right after the opening { would never be checked, and a block like {log.error(e);} would slip through.

That ensures that the word log is nowhere within the non-nested catch block.
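
If you want to sanity-check the non-nested expression outside PowerGREP, here is a minimal Java sketch. It assumes java.util.regex treats these constructs (lookahead, \b, non-capturing groups) the same way PowerGREP does; the class and method names are made up for illustration. The sketch places the negative lookahead in front of each consumed character, so a log sitting directly after the opening brace is rejected as well.

```java
import java.util.regex.Pattern;

final class FlatCatchDemo {
    // Non-nested pattern: a catch header, then a body made of characters
    // that are not '}' and that do not start the whole word "log".
    static final Pattern NO_LOG_CATCH = Pattern.compile(
        "catch\\s*\\(\\s*\\w*\\s+\\w*\\s*\\)\\s*\\{(?:(?!\\blog\\b)[^}])*\\}");

    // Hypothetical helper: true if the source has a catch block without "log".
    static boolean hasUnloggedCatch(String source) {
        return NO_LOG_CATCH.matcher(source).find();
    }

    public static void main(String[] args) {
        // Examples taken from the question, plus two edge cases.
        System.out.println(hasUnloggedCatch("catch(AnotherException e ) {}"));            // true
        System.out.println(hasUnloggedCatch("catch(BadE_xception e) { log.error(e); }")); // false
        System.out.println(hasUnloggedCatch("catch (Exception e) {log.error(e);}"));      // false
        System.out.println(hasUnloggedCatch("catch (Exception e) { clog.flush(); }"));    // true
    }
}
```

The last call shows the effect of the word boundaries: clog is not treated as log, so that block is still reported.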

Now, moving on to the nested cases. As @TimPietzcker pointed out, PowerGREP doesn't support recursive expressions yet, but for your purposes a finite number of nesting levels may be enough. Here's the expression for one level of nesting:

catch\s*\(\s*\w*\s+\w*\s*\)\s*\{(?:(?!\blog\b)[^{}]|\{(?:(?!\blog\b)[^}])*\})*\}
                                                ^  ^========================

We've added the { character to the class of characters we don't accept at the top level. When we encounter it, we switch via alternation (|) to the nested case, which, as the part underlined by = signs shows, has the same form as the non-nested inner expression wrapped in its own \{ and \}. The lookahead is placed before each character so that a log directly after an opening or closing brace is rejected too. You can keep nesting this way as many times as you'd like, to match any fixed number of balanced nesting levels.
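
To see what the extra alternation buys you, here is a small Java sketch comparing the non-nested and one-level expressions on a catch block that contains an if block. It assumes java.util.regex as a stand-in for PowerGREP, uses hypothetical class and method names, and writes the lookahead in front of each consumed character.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NestedCatchDemo {
    // Shared catch header, up to and including the opening brace.
    static final String HEAD = "catch\\s*\\(\\s*\\w*\\s+\\w*\\s*\\)\\s*\\{";

    // Non-nested version: stops at the first '}' it sees.
    static final Pattern FLAT = Pattern.compile(
        HEAD + "(?:(?!\\blog\\b)[^}])*\\}");

    // One level of nesting: a '{' switches to the inner alternative.
    static final Pattern ONE_LEVEL = Pattern.compile(
        HEAD + "(?:(?!\\blog\\b)[^{}]|\\{(?:(?!\\blog\\b)[^}])*\\})*\\}");

    // Hypothetical helper: text of the first match, or null if none.
    static String firstMatch(Pattern p, String source) {
        Matcher m = p.matcher(source);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String src = "catch (Exception e) { if (x) { y(); } z(); }";
        System.out.println(firstMatch(FLAT, src));      // cut off at the inner '}'
        System.out.println(firstMatch(ONE_LEVEL, src)); // the whole catch block
    }
}
```

The non-nested pattern still matches here, but only up to the closing brace of the if block, so anything after that brace (including a log call) is never inspected. The one-level pattern consumes the full block.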


Here's the template for 10 levels of nesting, which should be sufficient for most applications of this sort.

catch\s*\(\s*\w*\s+\w*\s*\)\s*\{(?:SEED|\{(?:SEED|\{(?:SEED|\{(?:SEED|\{(?:SEED|\{(?:SEED|\{(?:SEED|\{(?:SEED|\{(?:SEED|\{(?:SEED|\{(?:SEED)*\})*\})*\})*\})*\})*\})*\})*\})*\})*\})*\}

where SEED is the recursion seed, (?!\blog\b)[^{}] (the lookahead comes first so that a log directly after a brace is rejected as well). I've written it this way so it's visually easier to remove or add nesting levels as desired. Expanded, the above becomes:

catch\s*\(\s*\w*\s+\w*\s*\)\s*\{(?:(?!\blog\b)[^{}]|\{(?:(?!\blog\b)[^{}]|\{(?:(?!\blog\b)[^{}]|\{(?:(?!\blog\b)[^{}]|\{(?:(?!\blog\b)[^{}]|\{(?:(?!\blog\b)[^{}]|\{(?:(?!\blog\b)[^{}]|\{(?:(?!\blog\b)[^{}]|\{(?:(?!\blog\b)[^{}]|\{(?:(?!\blog\b)[^{}]|\{(?:(?!\blog\b)[^{}])*\})*\})*\})*\})*\})*\})*\})*\})*\})*\})*\}
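
Rather than expanding the template by hand, you could generate it. Below is a rough Java sketch that wraps the seed once per nesting level; the class name, the build method and its levels parameter are all hypothetical, and the seed is written with the lookahead in front of the character class. The resulting string can be pasted into PowerGREP or compiled with java.util.regex.

```java
import java.util.regex.Pattern;

final class NestedPatternBuilder {
    // Catch header, up to and including the opening brace.
    static final String HEAD = "catch\\s*\\(\\s*\\w*\\s+\\w*\\s*\\)\\s*\\{";

    // Recursion seed: one character that is not a brace and does not start "log".
    static final String SEED = "(?!\\blog\\b)[^{}]";

    // Builds the expression for the given number of nested brace levels.
    static String build(int levels) {
        String body = "(?:" + SEED + ")*";               // innermost level
        for (int i = 0; i < levels; i++) {
            // Each pass adds one enclosing level around the previous body.
            body = "(?:" + SEED + "|\\{" + body + "\\})*";
        }
        return HEAD + body + "\\}";
    }

    public static void main(String[] args) {
        System.out.println(build(10)); // same shape as the 10-level template
        Pattern p = Pattern.compile(build(2));
        System.out.println(p.matcher("catch (E e) { a { b { c } } }").find());
    }
}
```

Since each alternative starts with a different character (a brace or a non-brace), the engine never has two ways to consume the same text, so adding levels should not cause runaway backtracking.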

slackwing answered Oct 04 '22 05:10