
RegEx to detect if a line doesn't end in a semicolon

Tags:

regex

I'm trying to run through some code files and find lines that don't end in a semicolon.

I currently have this: ^(?:(?!;).)*$ from a bunch of Googling, and it works just fine. But now I want to expand it so that it ignores leading whitespace and skips lines that start with specific keywords such as package, or that are just an opening or closing brace.

The end goal is to take something like this:

package example
{
    public class Example
    {
        var i = 0

        var j = 1;

        // other functions and stuff
    }
}

And for the pattern to show me that var i = 0 is missing a semicolon. That's just an example; the missing semicolon could be anywhere in the class.

Any ideas? I've been fiddling for over an hour but no luck.

Thanks.

Bruce asked Jun 10 '12 00:06


4 Answers

Try this:

^\s*(?!package|public|class|//|[{}]).*(?<!;\s*)$

When tested in PowerShell:

PS> (gc file.txt) -match '^\s*(?!package|public|class|//|[{}]).*(?<!;\s*)$'
        var i = 0 
PS> 
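
For readers not on .NET, here is a hedged Python sketch of the same idea. Python's re module only accepts fixed-width lookbehind, so (?<!;\s*) is replaced by requiring the last non-whitespace character to be something other than a semicolon. The leading \s* is also moved inside the lookahead so that it cannot backtrack past the keyword check. The function name and the skip list are illustrative assumptions, not part of the answer above.

```python
import re

# Tokens that mark a line to skip (illustrative list, not exhaustive).
SKIP = r"package|public|class|//|[{}]"

# ^(?!\s*(?:SKIP))  the first non-blank token is not one of the skipped ones
# .*[^;\s]          the last non-whitespace character is not ';'
# \s*$              optional trailing whitespace up to the end of the line
MISSING_SEMI = re.compile(rf"^(?!\s*(?:{SKIP})).*[^;\s]\s*$")


def lines_missing_semicolon(text):
    """Return (line_number, line) pairs for lines that look unterminated."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), 1)
        if MISSING_SEMI.match(line)
    ]


if __name__ == "__main__":
    sample = (
        "package example\n"
        "{\n"
        "    public class Example\n"
        "    {\n"
        "        var i = 0\n"
        "\n"
        "        var j = 1;\n"
        "    }\n"
        "}\n"
    )
    for number, line in lines_missing_semicolon(sample):
        print(number, line)
```

Blank lines are not reported, because [^;\s] needs at least one non-whitespace character on the line.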
Damian Powell answered Sep 24 '22 16:09


If you want a line that doesn't end in a semicolon, you can ask for any amount of anything .* followed by one character that is neither a semicolon nor whitespace [^;\s], possibly followed by some whitespace \s*, up to the end of the line $. (Excluding whitespace from the character class matters: with plain [^;], a trailing space after a semicolon would itself count as the non-semicolon character.) So you have:

.*[^;\s]\s*$

Now if you don't want whitespace at the beginning, you need to ask for the beginning of the line ^ followed by any character that isn't whitespace [^\s] followed by the regex from earlier:

^[^\s].*[^;\s]\s*$

If you don't want it to start with a keyword like package or, say, class, or with whitespace, you can assert that the line doesn't begin with any of those three things. The regex that matches any of them is (?:\s|package|class), and the negative lookahead (?!\s|package|class) succeeds, without consuming any characters, only where none of them comes next. Note the !. So you now have:

^(?!\s|package|class).*[^;\s]\s*$
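
As a rough illustration of how the three stages behave, here is a small Python sketch. The sample strings and the helper name stages are made up for the example. The final character class is [^;\s], so a semicolon followed by trailing spaces still counts as terminated.

```python
import re

# Stage 1: the last non-whitespace character is not ';'
ENDS_WITHOUT_SEMI = re.compile(r".*[^;\s]\s*$")
# Stage 2: same, and the line must not begin with whitespace
NO_LEADING_SPACE = re.compile(r"^[^\s].*[^;\s]\s*$")
# Stage 3: same, and the line must not begin with 'package' or 'class'
NO_KEYWORD = re.compile(r"^(?!\s|package|class).*[^;\s]\s*$")


def stages(line):
    """Return whether each stage matches the given line (hypothetical helper)."""
    return (
        bool(ENDS_WITHOUT_SEMI.match(line)),
        bool(NO_LEADING_SPACE.match(line)),
        bool(NO_KEYWORD.match(line)),
    )


if __name__ == "__main__":
    for line in ["var i = 0", "var j = 1; ", "  var k = 2", "package example"]:
        print(repr(line), stages(line))
```

Note that stages 2 and 3 reject indented lines such as "  var k = 2" outright. If the goal is to skip leading whitespace rather than reject it, one option is to move the whitespace into the lookahead, as in ^(?!\s*(?:package|class)).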
Eliot Ball answered Sep 23 '22 16:09


The key to capturing this complicated concept in a regex is to first understand how your regular expression engine/interpreter handles the following concepts:

  1. positive lookahead
  2. negative lookahead
  3. positive lookbehind
  4. negative lookbehind

Once you know how your engine treats these, you can write a pattern that captures the text you want only when what comes before and after it is exactly as you specify. In Ruby, for example:

str.scan(/^\s*(?=\S)(?!package.+\n|public.+\n|\/\/|\{|\})(.+)(?<![;\s])\s*$/)
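
The four lookaround forms listed above can be tried out in isolation. The following is a minimal Python sketch; the sample text and the helper name starts are assumptions made for the illustration. Python's re module requires lookbehind patterns to have a fixed width, which is one of the engine differences worth checking.

```python
import re

text = "foobar foobaz quxbar"


def starts(pattern, string):
    """Return the start index of every match (hypothetical helper)."""
    return [m.start() for m in re.finditer(pattern, string)]


# 1. Positive lookahead: 'foo' only when 'bar' comes next
print(starts(r"foo(?=bar)", text))   # [0]
# 2. Negative lookahead: 'foo' only when 'bar' does not come next
print(starts(r"foo(?!bar)", text))   # [7]
# 3. Positive lookbehind: 'bar' only when preceded by 'foo'
print(starts(r"(?<=foo)bar", text))  # [3]
# 4. Negative lookbehind: 'bar' only when not preceded by 'foo'
print(starts(r"(?<!foo)bar", text))  # [17]
```

None of the lookarounds consume characters, so each reported index is where the plain 'foo' or 'bar' begins.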
calvin answered Sep 24 '22 16:09


This is the regular expression I use, with Vim's regular expression engine, to highlight lines of Java code that don't end in a semicolon and aren't among the kinds of lines that aren't supposed to end in one.

\(.\+[^; ]$\)\(^.*public.*\|.*//.*\|.*interface.*\|.*for.*\|.*class.*\|.*try.*\|^\s*if\s\+.*\|.*private.*\|.*new.*\|.*else.*\|.*while.*\|.*protected.*$\)\@<!
   ^          ^                                                                                                                                           ^
   |          |                                                                                                                 negative lookbehind feature 
   |          |
   |          2.  But not where such matches are preceded by these keywords
   |
   |
   1. Group of at least one character preceding a missing semicolon

Mnemonics for deciphering glyphs:

^          beginning of line
.*         any number of any character
\+         at least one of the preceding item
[^ ... ]   any character except those listed
$          end of line
\( ... \)  group
\|         alternation ("or")
\@<!       negative lookbehind

Which roughly translates to:

Find all lines that don't end in a semicolon and don't have any of the above keywords or expressions to the left of the line ending. It's not perfect and probably won't hold up against obfuscated Java, but for simple Java programs it highlights the lines that should end in a semicolon but don't.
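
Outside Vim, roughly the same logic can be written as two separate checks instead of one lookbehind. The Python sketch below is an approximate reading of the pattern, not an exact translation of Vim's matching rules. The function name needs_semicolon is an assumption, and the keyword list is copied from the pattern above.

```python
import re

# Keywords from the Vim pattern; a line containing any of them is skipped.
# Like the original, this is a plain substring test, so 'for' also hits 'format'.
KEYWORDS = ("public", "//", "interface", "for", "class", "try",
            "private", "new", "else", "while", "protected")

# Same shape as \(.\+[^; ]$\): at least one character, then a final
# character that is neither ';' nor a space.
ENDS_OPEN = re.compile(r".+[^; ]$")
# Same shape as ^\s*if\s\+.* : a line that starts with 'if'.
IF_LINE = re.compile(r"^\s*if\s+")


def needs_semicolon(line):
    """Return True if the line looks like it is missing a trailing ';'."""
    if not ENDS_OPEN.match(line):
        return False
    if IF_LINE.match(line):
        return False
    return not any(keyword in line for keyword in KEYWORDS)
```

Splitting the test this way avoids the lookbehind entirely, at the cost of being a function rather than a single pattern that an editor can highlight.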

A link that helped me with the concepts I needed:

https://jbodah.github.io/blog/2016/11/01/positivenegative-lookaheadlookbehind-vim/

Eric Leschinski answered Sep 21 '22 16:09