I'm a bit of a newbie, but I'm trying to let an external .txt file that is read by a Java program contain some comments at the beginning, so that others can easily edit it and add more to it. But if the file contains # (the sign designating a comment line), it just fails with a "Format error in file" error (the IOException, so the line is getting past that first "if"...). Can someone help?
Here's the portion of the code that skips comment lines in the .txt file, which is opened earlier in the program:
while ((line = br.readLine()) != null) {
    line = line.trim();
    if (line.length() < 1 || line.charAt(0) == '#') { // ignore comments
        continue;
    }
    final String[] parts = line.split("=");
    if (parts.length != 2) {
        throw new IOException("Format error in file "
            + JLanguageTool.getDataBroker().getFromRulesDirAsUrl(getFileName())
            + ", line: " + line);
    }
The input.txt file breaks it at the first line:
#This is a Test
ឲ្យ|ឱ្យ=អោយ
កំពស់=កម្ពស់
កម្នាញ់=កំណាញ់
And here is the actual error:
Caused by: java.io.IOException: Format error in file file:/D:/Documents......./coherency.txt, line: #This is a Test
    at rules.km.KhmerSimpleReplaceRule.loadWords(KhmerSimpleReplaceRule.java:165)
    at rules.km.KhmerSimpleReplaceRule.loadWords(KhmerSimpleReplaceRule.java:82)
    ...33 more
And the stack trace error:
Caused by: java.io.IOException: Format error in file [Ljava.lang.StackTraceElement;@1cb2795
    at km.KhmerSimpleReplaceRule.loadWords(KhmereSimpleReplaceRule.java: 169)
There may be a UTF-8 byte order mark (BOM) in front of your first visible character. Most editors will not show these bytes, since they only use them to detect the encoding of the content, and Java doesn't remove the UTF-8 byte order mark (unlike for UTF-16 and UTF-32). If there really is a UTF-8 BOM, you'll have to remove these three bytes yourself.
For more details see Java-Bug 6378911.
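As a rough sketch of what removing the BOM yourself could look like: when the file is decoded as UTF-8, the three BOM bytes (EF BB BF) arrive as the single character U+FEFF at the start of the first line, and trim() does not remove it. The class and method names below (BomAwareLines, stripBom, dataLines) are made up for illustration and are not part of the asker's code or of LanguageTool.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class BomAwareLines {
    // U+FEFF is what the UTF-8 BOM bytes EF BB BF decode to.
    private static final char BOM = '\uFEFF';

    /** Returns the line without a leading BOM character, if one is present. */
    static String stripBom(String line) {
        if (line != null && !line.isEmpty() && line.charAt(0) == BOM) {
            return line.substring(1);
        }
        return line;
    }

    /** Returns the trimmed non-blank, non-comment lines; the BOM is stripped from the first line only. */
    static List<String> dataLines(List<String> rawLines) {
        List<String> result = new ArrayList<String>();
        boolean first = true;
        for (String raw : rawLines) {
            String line = first ? stripBom(raw) : raw;
            first = false;
            line = line.trim();
            if (line.isEmpty() || line.charAt(0) == '#') { // ignore blanks and comments
                continue;
            }
            result.add(line);
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical file content: a BOM, a comment line, then one rule.
        List<String> raw = Arrays.asList("\uFEFF#This is a Test", "កំពស់=កម្ពស់");
        System.out.println(dataLines(raw).size()); // prints 1
    }
}
```

In the original loop the same idea would amount to calling something like stripBom on the first line returned by br.readLine() before the trim.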
This should have worked unless there is whitespace in front of the #. You can try this code:
if (line.trim().startsWith("#")) { // ignore comments
    continue;
}
That should work unless the # isn't actually the first non-whitespace character on the line (or you have a non-comment line somewhere with either no = or more than one = in it).
I can only suggest you show us the entire exception, which will include the actual offending line. You might also want to make it:
+ ", line: [" + line + "]");
so you're sure there are no leading or trailing spaces. In addition, output line.codePointAt(0) in the exception as well. It may be a language or wrong-Unicode-code-point problem.
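A small sketch of that diagnostic, in case it helps: it brackets the line and prints the first code point in U+XXXX form, so an invisible leading character becomes obvious. The LineDiagnostics class and its describe method are hypothetical names used only for this example.

```java
final class LineDiagnostics {
    /** Builds a message that brackets the line and shows its first code point as U+XXXX. */
    static String describe(String line) {
        String first = line.isEmpty()
            ? "none (empty line)"
            : String.format("U+%04X", line.codePointAt(0));
        return "line: [" + line + "], first code point: " + first;
    }

    public static void main(String[] args) {
        // A line that really starts with '#' reports U+0023.
        System.out.println(describe("#This is a Test"));
        // A line that starts with an invisible BOM reports U+FEFF instead.
        System.out.println(describe("\uFEFF#This is a Test"));
    }
}
```

If the reported code point is anything other than U+0023, the # is not really the first character, even if the line looks fine in an editor.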
You might also consider making your code more flexible to allow comments at the ends of lines as well. That's a simple matter of stripping out everything from the first # to the end of the line before the trim, and will allow things like:
password = xyzzy # super sekrit sauce from zork
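One possible way to do that stripping, as a minimal sketch (TrailingComments and stripComment are illustrative names, not from the original code):

```java
final class TrailingComments {
    /** Removes everything from the first '#' to the end of the line, then trims the rest. */
    static String stripComment(String line) {
        int hash = line.indexOf('#');
        String content = (hash >= 0) ? line.substring(0, hash) : line;
        return content.trim();
    }

    public static void main(String[] args) {
        // The trailing comment is dropped and only the setting remains.
        System.out.println(stripComment("password = xyzzy # super sekrit sauce from zork"));
        // A whole-line comment becomes an empty string, which the loop can skip.
        System.out.println(stripComment("#This is a Test").isEmpty());
    }
}
```

With this in place, the existing check reduces to skipping empty lines. The trade-off is that a value can no longer contain a literal #, so it only fits if # never appears in the real data.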
Your code seems correct at first... I can see several options, for example: # is not the first character of that given line. A stack trace and an input file might help...