 

Why Doesn't This Java Code Skip Lines with #?

Tags:

java

file-io

I'm a bit of a newbie, but I'm trying to let an external .txt file that is read by a Java program carry some comments at the beginning, so that others can easily edit it and add to it. But if the file contains a line starting with # (the character that marks a comment line), loading fails with a "Format error in file" error (the IOException), so the line is getting past that first "if". Can someone help?

Here's the portion of the code that is supposed to skip comment lines in the .txt file, which is opened earlier in the program:

   while ((line = br.readLine()) != null) {
    line = line.trim();
    if (line.length() < 1 || line.charAt(0) == '#') { // ignore comments
     continue;
    }
    final String[] parts = line.split("=");
    if (parts.length != 2) {
     throw new IOException("Format error in file "
       + JLanguageTool.getDataBroker().getFromRulesDirAsUrl(getFileName())
       + ", line: " + line);
    }

The input.txt file breaks it at the first line:

#This is a Test
ឲ្យ|ឱ្យ=អោយ
កំពស់=កម្ពស់
កម្នាញ់=កំណាញ់

And here is the actual error:

Caused by: java.io.IOException: Format error in file file:/D:/Documents......./coherency.txt, line: #This is a Test
    at rules.km.KhmerSimpleReplaceRule.loadWords(KhmerSimpleReplaceRule.java:165)
    at rules.km.KhmerSimpleReplaceRule.loadWords(KhmerSimpleReplaceRule.java:82)
    ... 33 more

And the stack trace error:

Caused by: java.io.IOException: Format error in file [Ljava.lang.StackTraceElement;@1cb2795
    at km.KhmerSimpleReplaceRule.loadWords(KhmereSimpleReplaceRule.java: 169)

Nathan asked Jan 13 '11 08:01


4 Answers

There may be a UTF-8 byte order mark (BOM) in front of your first visible character. Most editors will not show this character, since they only use it to detect the encoding of the content, and Java doesn't remove the UTF-8 byte order mark (unlike for UTF-16 and UTF-32). If there really is a UTF-8 BOM, you'll have to remove these three bytes yourself.

For more details see Java-Bug 6378911.
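As a rough sketch of that removal: once the file has been decoded as UTF-8, the three BOM bytes show up as the single character U+FEFF at the start of the first line, and `trim()` does not strip it, so the `#` check sees the wrong first character. The class and method names below (`BomSkippingReader`, `stripBom`, `readEntries`) are made up for illustration and are not part of the asker's project; the sample text is also invented.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class BomSkippingReader {
    // A UTF-8 BOM (bytes EF BB BF) decodes to this single character.
    private static final char BOM = '\uFEFF';

    // Returns the line without a leading BOM character, if one is present.
    static String stripBom(String line) {
        if (line != null && !line.isEmpty() && line.charAt(0) == BOM) {
            return line.substring(1);
        }
        return line;
    }

    // Collects all non-blank, non-comment lines, mirroring the loop in the question.
    static List<String> readEntries(BufferedReader br) throws IOException {
        List<String> entries = new ArrayList<>();
        String line;
        boolean first = true;
        while ((line = br.readLine()) != null) {
            if (first) {
                line = stripBom(line); // a BOM can only occur at the very start of the file
                first = false;
            }
            line = line.trim();
            if (line.isEmpty() || line.charAt(0) == '#') {
                continue; // ignore blank lines and comments
            }
            entries.add(line);
        }
        return entries;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical file content: a BOM, a comment line, and one entry.
        String text = "\uFEFF#This is a Test\nkey=value\n";
        System.out.println(readEntries(new BufferedReader(new StringReader(text))));
        // prints [key=value]
    }
}
```

This assumes the reader was opened with an explicit UTF-8 charset; with a different decoding the BOM bytes would appear as other characters and this check would not match.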

tigger answered Nov 15 '22 13:11


This should have worked unless there is whitespace. You can try this code.

if (line.trim().startsWith("#")) { // ignore comments
   continue;
}

fastcodejava answered Nov 15 '22 11:11


That should work unless the # isn't actually the first non-whitespace character on the line (or you have a non-comment line somewhere with either no or more than one = in it).

I can only suggest you show us the entire exception which will include the actual offending line in it. You might also want to make it:

+ ", line: [" + line + "]");

so you're sure there are no leading or trailing spaces. In addition, output line.codePointAt(0) in the exception as well: it may be an encoding problem where the first character is not the Unicode code point you expect.
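A minimal sketch of that diagnostic, assuming a hypothetical helper named `describe` (not something from the asker's code) that builds the tail of the exception message:

```java
public class LineDiagnostics {
    // Builds a message fragment that brackets the line and names its first code point.
    static String describe(String line) {
        String first = line.isEmpty()
                ? "none"
                : String.format("U+%04X", line.codePointAt(0));
        return "line: [" + line + "], first code point: " + first;
    }

    public static void main(String[] args) {
        // A plain comment line starts with '#', which is U+0023.
        System.out.println(describe("#This is a Test"));
        // With an invisible byte order mark in front, the first code point is U+FEFF.
        System.out.println(describe("\uFEFF#This is a Test"));
    }
}
```

If the reported code point is anything other than U+0023 for a line that looks like it starts with #, an invisible character is sitting in front of it.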

You might also consider making your code more flexible to allow comments at the ends of lines as well. That's a simple matter of stripping out everything from the first # to the end of the line before the trim, and will allow things like:

password = xyzzy # super sekrit sauce from zork
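
One way this stripping could look, as a sketch only: `stripComment` is an illustrative name, and the approach assumes that # never appears inside a real key or value, since everything after the first # is discarded.

```java
public class InlineComments {
    // Drops everything from the first '#' to the end of the line, then trims.
    static String stripComment(String line) {
        int hash = line.indexOf('#');
        String content = (hash >= 0) ? line.substring(0, hash) : line;
        return content.trim();
    }

    public static void main(String[] args) {
        System.out.println(stripComment("password = xyzzy # super sekrit sauce from zork"));
        // prints password = xyzzy
        System.out.println(stripComment("#This is a Test").isEmpty());
        // prints true, so a full-line comment becomes an empty line that the length check skips
    }
}
```

With this in place, the existing check for an empty line is enough to skip full-line comments as well.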

paxdiablo answered Nov 15 '22 11:11


Your code seems correct at first glance... I can see several options:

  1. If the file is actually a properties file, you can read it properly with java.util.Properties.
  2. You have an error in a line following the comment.
  3. The # is not the first character of that given line.

A stack trace and an input file might help...
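
For the first option, a minimal sketch of loading key=value text with `java.util.Properties` might look like the following. The class name `PropertiesSketch`, the `parse` helper and the sample entry are invented for illustration. Note that `Properties` is more lenient than the loop in the question: it also accepts `:` or whitespace as a separator and does not raise an error for a line without `=`, so it is not a drop-in replacement for the strict format check.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

public class PropertiesSketch {
    // Parses key=value text; lines starting with '#' or '!' are treated as comments.
    static Properties parse(String text) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(text));
        return props;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical file content: one comment line and one entry.
        Properties props = parse("#This is a Test\ncolour=color\n");
        System.out.println(props.size());                // prints 1
        System.out.println(props.getProperty("colour")); // prints color
    }
}
```

For a real file, `load` would be given a `Reader` opened with the file's actual charset. A leading byte order mark would still need to be removed first, otherwise the first line is not recognized as a comment.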

Lukas Eder answered Nov 15 '22 13:11