I am reading a file that contains a lot of information like the sample shown below:
type dw_3 from u_dw within w_pg6p0012_01
boolean visible = false
integer x = 1797
integer y = 388
integer width = 887
integer height = 112
integer taborder = 0
boolean bringtotop = true
string dataobject = "d_pg6p0012_14"
end type
type dw_3 from u_dw within w_pg6p0012_01
integer x = 1797
integer y = 388
integer width = 887
integer height = 112
integer taborder = 0
boolean bringtotop = true
string dataobject = "d_pg6p0012_14"
end type
I wrote this regex: (?i)type dw_\d\s+(.*?)\s+within(.*?)\s+(?!boolean visible = false)(.*)
I want to extract all the strings that do not contain "boolean visible = false", but my regex returns all of them.
I also tried the suggestions from many similar posts on Stack Overflow, with the same result. Please suggest a way to do this.
Solution: (?i)type\\s+dw_(\\d+|\\w+)\\s+from\\s+.*?within\\s+.*?\\s+(string|integer)?\\s+.*\\s+.*\\s+.*\\s+.*?\\s+.*?\\s+.*?\\s*string\\s+dataobject\\s+=\\s+(.*?)\\s+end\\s+type
This works well in a regex checker, but when I tried it in Java it kept running without producing any output.
To match any character except a list of excluded characters, use a negated character class: put the excluded characters between [^ and ]. For example, the pattern [^abc] matches any single character except a, b, or c. The caret ^ must immediately follow the [, or else it stands for just itself.
In Java source code, the string literal "\b" is a backspace character (char 0x08), so when it is used as a regex it matches a literal backspace. To get the regex word boundary, write "\\b" instead.
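The two points above (negated character classes, and the difference between "\b" and "\\b" in Java source) can be tried out with a small sketch like the following. The class and method names are hypothetical and exist only for illustration.

```java
import java.util.regex.Pattern;

public class NegatedClassDemo {
    // Removes every character that is NOT a, b, or c.
    public static String keepOnlyAbc(String s) {
        return s.replaceAll("[^abc]", "");
    }

    // "\\b" in Java source is the regex word boundary \b.
    // Pattern.quote keeps any regex metacharacters in 'word' literal.
    public static boolean hasWholeWord(String text, String word) {
        return Pattern.compile("\\b" + Pattern.quote(word) + "\\b")
                .matcher(text)
                .find();
    }

    public static void main(String[] args) {
        System.out.println(keepOnlyAbc("cabbage"));
        System.out.println(hasWholeWord("a cat sat", "cat"));
        System.out.println(hasWholeWord("concatenate", "cat"));
        // "\b" (single backslash) is one backspace character, code point 8.
        System.out.println((int) "\b".charAt(0));
    }
}
```

Note that [^abc] still matches line breaks and any other character outside the set, so it excludes single characters, not whole phrases such as "boolean visible = false".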
You can use this RegEx
(\s*boolean visible = false)|(.*)
This basically defines two capture groups:
The first capture group, (\s*boolean visible = false), will catch the line boolean visible = false.
The second capture group, (.*), will capture everything else, that is, everything not captured by the first group.
When you extract the result, read only the second group and ignore the first one.
Edit
Here's an example for clarification:
In this example, see the output, which no longer contains the line boolean visible = false.
Output
type dw_3 from u_dw within w_pg6p0012_01
integer x = 1797
integer y = 388
integer width = 887
integer height = 112
integer taborder = 0
boolean bringtotop = true
string dataobject = "d_pg6p0012_14"
end type
type dw_3 from u_dw within w_pg6p0012_01
integer x = 1797
integer y = 388
integer width = 887
integer height = 112
integer taborder = 0
boolean bringtotop = true
string dataobject = "d_pg6p0012_14"
end type
Java Implementation
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexTut3 {
    public static void main(String[] args) {
        String file = getOriginalFileContents();
        Pattern pattern = Pattern.compile("(\\s*boolean visible = false)|(.*)");
        Matcher matcher = pattern.matcher(file);
        while (matcher.find()) {
            // Group 1 holds the unwanted line, so ignore it and print only group 2.
            String kept = matcher.group(2);
            // Skip the zero-length matches that (.*) produces at line breaks,
            // as well as whitespace-only lines.
            if (kept != null && !kept.trim().isEmpty()) System.out.println(kept);
        }
    }

    // This method just returns the file contents as displayed in the question.
    private static String getOriginalFileContents() {
        String s = " type dw_3 from u_dw within w_pg6p0012_01\n" +
                " boolean visible = false\n" +
                " integer x = 1797\n" +
                " integer y = 388\n" +
                " integer width = 887\n" +
                " integer height = 112\n" +
                " integer taborder = 0\n" +
                " boolean bringtotop = true\n" +
                " string dataobject = \"d_pg6p0012_14\"\n" +
                " end type\n" +
                " \n" +
                " type dw_3 from u_dw within w_pg6p0012_01\n" +
                " integer x = 1797\n" +
                " integer y = 388\n" +
                " integer width = 887\n" +
                " integer height = 112\n" +
                " integer taborder = 0\n" +
                " boolean bringtotop = true\n" +
                " string dataobject = \"d_pg6p0012_14\"\n" +
                " end type";
        return s;
    }
}
It will be much easier (and more readable) if you make a regex to match "boolean visible = false" and then exclude those lines that contain a match for it.
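Pattern pattern = Pattern.compile("boolean visible = false");
// filepath is a java.nio.file.Path; try-with-resources closes the underlying file
try (Stream<String> lines = Files.lines(filepath)) {
    lines.filter(line -> !pattern.matcher(line).find()) // note the "!"
         .forEach(/* do stuff */);
}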
Notes:
- With Files#lines(Path), it is not necessary to break apart separate lines in the regex. This is already done for us.
- The Matcher#find() method returns whether the given character sequence contains a match for the regex anywhere in it. I believe this is what you want.
EDIT:
Now, if you are just really intent on using a pure regex, then try this:
^((?!boolean visible = false).)+$
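This will match an entire (non-empty) line if-and-only-if it does not contain "boolean visible = false" anywhere within it. No fancy backreferences / capture group semantics needed to extract the desired text. If you apply it to the whole file at once rather than line by line, compile it with Pattern.MULTILINE so that ^ and $ match at line boundaries.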
See proof by unit tests here: https://regex101.com/r/dbzdMB/1
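For reference, here is a minimal sketch of how that pure-regex approach might look in Java when the whole file is held in one string. The class name, the method name, and the shortened sample text are assumptions made for illustration; the only essential part is the Pattern.MULTILINE flag.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LineFilterDemo {
    // MULTILINE makes ^ and $ match at every line boundary inside the input,
    // not only at the very start and end of the string.
    private static final Pattern LINE_WITHOUT_TARGET =
            Pattern.compile("^((?!boolean visible = false).)+$", Pattern.MULTILINE);

    // Returns every non-empty line that does not contain the target text.
    public static List<String> linesWithoutTarget(String text) {
        List<String> kept = new ArrayList<>();
        Matcher m = LINE_WITHOUT_TARGET.matcher(text);
        while (m.find()) {
            kept.add(m.group());
        }
        return kept;
    }

    public static void main(String[] args) {
        // Shortened sample based on the data in the question.
        String text = "type dw_3 from u_dw within w_pg6p0012_01\n"
                + "boolean visible = false\n"
                + "integer x = 1797\n"
                + "end type";
        linesWithoutTarget(text).forEach(System.out::println);
    }
}
```

Because the lookahead is re-checked before every character, this pattern does more work per line than the plain find() filter shown earlier, which is one reason to prefer the filter for large files.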
EDIT #2:
Alternatively, if all you are trying to do is to get the file text without any "boolean visible = false", then you could simply replace every instance of that target string with the empty string.
Pattern pattern = Pattern.compile("boolean visible = false");
Matcher matcher = pattern.matcher(fileAsCharSequence); // e.g. StringBuilder
String output = matcher.replaceAll("");
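If the goal is instead to keep whole "type ... end type" blocks and drop the ones that contain the line, which is one reading of the question, a two-step approach may be simpler than a single large regex: match each block with a short lazy pattern, then test it with a plain contains(). The following is only a sketch under that assumption; the class name, method name, block pattern, and shortened sample are hypothetical and not taken from the answers above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BlockFilterDemo {
    // One "type dw_... end type" block. DOTALL lets . cross line breaks,
    // and the lazy .*? stops at the first "end type" that follows.
    private static final Pattern BLOCK =
            Pattern.compile("(?i)type\\s+dw_\\w+\\b.*?\\bend\\s+type", Pattern.DOTALL);

    // Returns the blocks that do not contain the "boolean visible = false" line.
    public static List<String> visibleBlocks(String text) {
        List<String> kept = new ArrayList<>();
        Matcher m = BLOCK.matcher(text);
        while (m.find()) {
            String block = m.group();
            if (!block.contains("boolean visible = false")) {
                kept.add(block);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Shortened sample based on the data in the question.
        String text = "type dw_3 from u_dw within w_pg6p0012_01\n"
                + "boolean visible = false\n"
                + "string dataobject = \"d_pg6p0012_14\"\n"
                + "end type\n"
                + "type dw_3 from u_dw within w_pg6p0012_01\n"
                + "integer x = 1797\n"
                + "string dataobject = \"d_pg6p0012_14\"\n"
                + "end type";
        visibleBlocks(text).forEach(System.out::println);
    }
}
```

The block pattern contains a single lazy quantifier, so it avoids the long chains of .* and \s+ that can make a regex appear to run forever in Java on non-matching input.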