I'm trying to run through some code files and find lines that don't end in a semicolon.
I currently have this: ^(?:(?!;).)*$
from a bunch of Googling, and it works just fine. But now I want to expand on it so that it ignores leading whitespace and skips lines that start with specific keywords like package, as well as lines holding only an opening or closing brace.
The end goal is to take something like this:
package example
{
public class Example
{
var i = 0
var j = 1;
// other functions and stuff
}
}
And for the pattern to show me that var i = 0 is missing a semicolon. That's just an example; the missing semicolon could be anywhere in the class.
Any ideas? I've been fiddling for over an hour but no luck.
Thanks.
Try this:
^\s*(?!package|public|class|//|[{}]).*(?<!;\s*)$
When tested in PowerShell:
PS> (gc file.txt) -match '^\s*(?!package|public|class|//|[{}]).*(?<!;\s*)$'
var i = 0
PS>
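The lookbehind (?<!;\s*) has variable width, which .NET (and therefore PowerShell) accepts but several other engines reject. As a rough sketch of the same idea for such an engine, here is a Python version that uses only lookaheads. The function name and the indented sample are made up for illustration. Note one deliberate difference: \s* sits inside the first lookahead (with an extra $ alternative for blank lines), so that indented braces and keywords cannot slip through when the engine backtracks into the indentation. Treat it as one possible adaptation, not a drop-in equivalent.

```python
import re

# Python's re module rejects variable-width lookbehind such as (?<!;\s*),
# so the "does not end in a semicolon" check is written as a lookahead.
MISSING_SEMICOLON = re.compile(
    r"^(?!\s*(?:package\b|public\b|class\b|//|[{}]|$))"  # skip keywords, comments, braces, blank lines
    r"(?!.*;\s*$)"                                       # skip lines that already end in ';'
    r".*$"
)

# Hypothetical sample, indented the way real source usually is.
SAMPLE = """\
package example
{
    public class Example
    {
        var i = 0
        var j = 1;
        // other functions and stuff
    }
}
"""


def lines_missing_semicolon(source):
    """Return the stripped lines that the pattern flags."""
    return [line.strip() for line in source.splitlines() if MISSING_SEMICOLON.match(line)]


print(lines_missing_semicolon(SAMPLE))  # ['var i = 0']
```

The text is checked line by line, much as -match does on the array of lines that gc returns.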
If you want a line that doesn't end in a semicolon, you can ask for any amount of anything, .*, followed by one character that is neither a semicolon nor whitespace, [^;\s], followed possibly by some whitespace, \s*, and then the end of the line, $. (A plain [^;] would also match a trailing space, so a line like "var j = 1; " would be reported by mistake.) So you have:
.*[^;\s]\s*$
Now if you don't want the line to start with whitespace, ask for the beginning of the line, ^, followed by any character that isn't whitespace, [^\s], followed by the regex from earlier:
^[^\s].*[^;\s]\s*$
If you don't want it to start with a keyword like package or, say, class, or with whitespace, you need a negative lookahead. The regex that matches any of those three things is (?:\s|package|class), and the regex that asserts none of them comes next, without consuming any characters, is (?!\s|package|class). Note the !. So you now have:
^(?!\s|package|class).*[^;\s]\s*$
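To see what the final pattern does and does not catch, here is a small Python sketch. The sample lines and the helper name are invented for illustration, and the last-character class used here is [^;\s], so that trailing spaces after a semicolon are not mistaken for the last character.

```python
import re

# Final pattern from the steps above: no leading whitespace or keyword,
# and the last non-whitespace character is not a semicolon.
PATTERN = re.compile(r"^(?!\s|package|class).*[^;\s]\s*$")

# Hypothetical input lines, one case per line.
LINES = [
    "package example",  # rejected: starts with a keyword
    "class Example",    # rejected: starts with a keyword
    "var i = 0",        # matched: no semicolon at the end
    "var j = 1;",       # rejected: ends in a semicolon
    "var k = 2;  ",     # rejected: semicolon followed only by whitespace
    "    var m = 3",    # rejected: starts with whitespace
]


def flagged(lines):
    """Return the lines the pattern matches."""
    return [line for line in lines if PATTERN.match(line)]


print(flagged(LINES))  # ['var i = 0']
```

As the last case shows, this pattern skips indented lines altogether rather than ignoring their indentation.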
The key to capturing this complicated concept in a regex is to first understand how your regular expression engine/interpreter handles the following concepts: positive lookahead, negative lookahead, positive lookbehind, and negative lookbehind.
Then you can begin to understand how to capture what you want, but only in such cases where what's ahead and what's behind is exactly as you specify.
str.scan(/^\s*(?=\S)(?!package.+\n|public.+\n|\/\/|\{|\})(.+)(?<!;)\s*$/)
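For readers without Ruby at hand, roughly the same expression can be tried with Python's re module. This is a sketch under a few assumptions: the sample text and the function name are invented, and re.MULTILINE is added because Python's ^ and $ only match at line boundaries with that flag, whereas Ruby's always do. The comments mark which part is which kind of lookaround.

```python
import re

PATTERN = re.compile(
    r"^\s*"                                 # leading whitespace
    r"(?=\S)"                               # positive lookahead: real content starts here
    r"(?!package.+\n|public.+\n|//|\{|\})"  # negative lookahead: not a line we want to skip
    r"(.+)"                                 # capture the line content
    r"(?<!;)"                               # negative lookbehind: captured text does not end in ';'
    r"\s*$",
    re.MULTILINE,
)

# Hypothetical sample text.
SAMPLE = """\
package example
{
    public class Example
    {
        var i = 0
        var j = 1;
        // other functions and stuff
    }
}
"""


def scan(source):
    """Return the captured content of every flagged line."""
    return PATTERN.findall(source)


print(scan(SAMPLE))  # ['var i = 0']
```

Where Ruby's scan returns an array of capture arrays, findall with a single group returns a flat list of strings.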
This is the regular expression I'm using, with vim's regular expression engine, to highlight lines of Java code that don't end in a semicolon and aren't among the kinds of lines that are not supposed to end in one.
\(.\+[^; ]$\)\(^.*public.*\|.*//.*\|.*interface.*\|.*for.*\|.*class.*\|.*try.*\|^\s*if\s\+.*\|.*private.*\|.*new.*\|.*else.*\|.*while.*\|.*protected.*$\)\@<!
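Reading it left to right:
1. \(.\+[^; ]$\) is a group of at least one character preceding a missing semicolon (the last character is neither a semicolon nor a space).
2. \(^.*public.*\| ... \|.*protected.*$\) says: but not where such matches are preceded by these keywords.
3. \@<! at the very end is vim's negative lookbehind feature, applied to group 2.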
Mnemonics for deciphering glyphs:
^          beginning of line
.*         any amount of any char
\+         at least one
[^ ... ]   everything but
$          end of line
\( ... \)  group
\|         delimiter between alternatives ("or")
\@<!       negative lookbehind
Which roughly translates to:
Find me all lines that don't end in a semicolon and don't have any of the above keywords/expressions to the left of that point. It's not perfect and probably doesn't hold up to obfuscated Java, but for simple Java programs it highlights the lines that should have semicolons at the end but don't.
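The same rough rule can be tried outside vim without any lookbehind, by testing the two conditions separately. The Python sketch below is only an approximation of the vim pattern under that assumption; the function name and the sample lines are invented for illustration. It also shows one of the imperfections mentioned above: an indented closing brace is still flagged.

```python
import re

# Condition 1: at least one character, then a last character that is
# neither a semicolon nor a space.
ENDS_WITHOUT_SEMICOLON = re.compile(r".+[^; ]$")

# Condition 2: keywords taken from the vim pattern; a line containing any
# of them (or starting with "if") is skipped.
SKIP = re.compile(
    r"public|//|interface|for|class|try|^\s*if\s+|private|new|else|while|protected"
)

# Hypothetical Java-like input lines.
SAMPLE_LINES = [
    "public class Example {",
    "    int i = 0",
    "    int j = 1;",
    "    for (int k = 0; k < 3; k++) {",
    "    }",
    "}",
]


def lines_to_highlight(lines):
    """Return lines that end without a semicolon and contain no skip keyword."""
    return [
        line
        for line in lines
        if ENDS_WITHOUT_SEMICOLON.search(line) and not SKIP.search(line)
    ]


print(lines_to_highlight(SAMPLE_LINES))  # ['    int i = 0', '    }']
```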
Link that helped me get the concepts I needed:
https://jbodah.github.io/blog/2016/11/01/positivenegative-lookaheadlookbehind-vim/