Java scanner not going through entire file

Question

I'm writing a program in Java and one of the things that I need to do is to create a set of every valid location for a shortest path problem. The locations are defined in a .txt file that follows a strict pattern (one entry per line, no extra whitespace) and is perfect for using .nextLine to get the data. My problem is that 241 lines into the file (out of 432) the scanner stops working 3/4 of the way through an entry and doesn't recognize any new lines.

My code:

    // initialize state space
    private static Set<String> posible(String posLoc) throws FileNotFoundException {
        Scanner s = new Scanner(new File(posLoc));
        Set<String> result = new TreeSet<String>();
        String availalbe;
        while (s.hasNextLine()) {
            availalbe = s.nextLine();
            result.add(availalbe);
        }
        s.close();
        return result;
    }

The Data

Shenlong Gundam
Altron Gundam
Tallgee[scanner stops reading here]se
Tallgeese II
Leo (Ground)
Leo (Space)

Of course, "scanner stops reading here" is not in the data, I'm just marking where scanner stops reading the file. This is 3068 bytes into the file, but that shouldn't affect anything because in the same program, with nearly identical code, I'm reading a 261-line, 14KB .txt file that encodes the paths. Any help would be appreciated.

Thank you.

Hovercraft Full Of Eels · Accepted Answer

There's a problem with Scanner reading your file, but I'm not sure what it is. It mistakenly believes that it has reached the end of the file when it has not, possibly due to some funky character encoding. Try using a BufferedReader object that wraps a FileReader object instead.

e.g.,

   private static Set<String> posible2(String posLoc) {
      Set<String> result = new TreeSet<String>();
      BufferedReader br = null;
      try {
         br = new BufferedReader(new FileReader(new File(posLoc)));
         String availalbe;
         while((availalbe = br.readLine()) != null) {
             result.add(availalbe);            
         }
      } catch (FileNotFoundException e) {
         e.printStackTrace();
      } catch (IOException e) {
         e.printStackTrace();
      } finally {
         if (br != null) {
            try {
               br.close();
            } catch (IOException e) {
               e.printStackTrace();
            }
         }
      }
      return result;
  }
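
For what it's worth, the same BufferedReader approach can be written more compactly on Java 7 and later with try-with-resources. This is only a sketch and not part of the original answer: the class name `LineSetReader`, the method `readLines` and its `charset` parameter are made up for illustration. It leans on the fact that `InputStreamReader` substitutes U+FFFD for bytes it cannot decode rather than stopping, which would explain why this route gets through the whole file.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not from the original answer.
class LineSetReader {

    // Reads every line of the file into a sorted set, decoding with the given charset.
    static Set<String> readLines(File file, Charset charset) {
        Set<String> result = new TreeSet<>();
        // try-with-resources closes the reader even if readLine() throws
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(new FileInputStream(file), charset))) {
            String line;
            while ((line = br.readLine()) != null) {
                result.add(line);
            }
        } catch (IOException e) {
            // Surface I/O problems instead of quietly returning a partial set
            throw new UncheckedIOException(e);
        }
        return result;
    }
}
```

It could be called as `LineSetReader.readLines(new File(posLoc), StandardCharsets.UTF_8)`. Lines containing undecodable bytes still come back, just with replacement characters in them.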

Edit
I tried reducing your problem to its bare minimum, and just this was enough to elicit the problem:

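   public static void main(String[] args) {
      // FILE_POS is assumed to be a constant holding the path to the data file
      try {
         Scanner scanner = new Scanner(new File(FILE_POS));
         int count = 0;
         while (scanner.hasNextLine()) {
            String line = scanner.nextLine();
            System.out.printf("%3d: %s %n", count, line);
            count++;
         }
         scanner.close();
      } catch (FileNotFoundException e) {
         e.printStackTrace();
      }
   }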

I checked the Scanner object with a printf:

   System.out.printf("Str: %-35s size%5d; Has next line? %b%n", availalbe, result.size(), s.hasNextLine());

and it showed that the Scanner thought the file had ended. I was in the process of progressively deleting lines from the data file to see which line(s) caused the problem, but I will leave that to you.

Learner123 · Answer

I encountered the same problem and this is what I did to fix it:

1. Saved the file I was reading from as UTF-8.
2. Created a new Scanner like below, specifying the encoding:

   Scanner scanner = new Scanner(new File("C:/IDSBRIEF/GuidData/"+sFileName),"UTF-8");

The Aa of Ron · Answer

I was having the same problem. The scanner would not read to the end of a file, actually stopping right in the middle of a word. I thought it was a problem with some limit set on the scanner, but I took note of the comment from rfeak about character encoding.

I re-saved the .txt file I was reading as UTF-8, and that solved the problem. It turns out that Notepad had defaulted to ANSI.
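
If re-saving the file is not practical, telling Scanner which encoding the file actually uses should have the same effect. A minimal sketch, assuming the file really is Windows-1252 (what Notepad's "ANSI" usually means on Western European Windows installs; other locales use other code pages). The class and method names are invented for the example.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical helper for illustration only.
class AnsiFileReader {

    // Reads all lines, decoding the bytes as Windows-1252 rather than the platform default.
    static List<String> readLines(File file) throws FileNotFoundException {
        List<String> lines = new ArrayList<>();
        // "windows-1252" is the canonical Java charset name; "Cp1252" is an alias
        try (Scanner scanner = new Scanner(file, "windows-1252")) {
            while (scanner.hasNextLine()) {
                lines.add(scanner.nextLine());
            }
        }
        return lines;
    }
}
```

The trade-off is that the charset name is now hard-coded, so this only helps when you know where the file came from.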

golimar · Answer

My case:

In my main program (A), it always reads 16384 bytes from a 41021-byte file. The character where it stops is in the middle of a line of normal printable text.
If I create a small separate program (B) with only the Scanner and print the lines, it reads the whole file.
Specifying "UTF-8" in (A) still reads 16384 bytes.
Specifying "ASCII" in (A) still reads 16384 bytes.
Specifying "Cp1252" in (A) reads the whole file.
My input .txt files are sent by users, and I can't be sure that they will write them in any particular encoding.

Conclusions

Scanner seems to read the file block by block and write the correctly read data into the returned String, but when it finds a block in a different encoding than it expects, it exits silently (ouch) and returns the partial string.
The .txt file I'm trying to read is Cp1252, my (A) source file is UTF-8 and my (B) source file is Cp1252, so that's why (B) worked without specifying an encoding.
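
The silent exit can be made visible. Scanner keeps the last IOException thrown by its underlying source, and `ioException()` returns it. The sketch below reads a file and then turns a recorded failure into an exception instead of handing back a truncated result. The class and method names are hypothetical, and the claim that a File-based Scanner stops on bytes that are malformed for the named charset is an assumption worth checking on your own JDK.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical helper for illustration only.
class CheckedScannerRead {

    // Reads all lines with the given charset name; throws if Scanner stopped
    // because of an I/O or decoding error rather than a genuine end of file.
    static List<String> readLines(File file, String charsetName)
            throws FileNotFoundException {
        List<String> lines = new ArrayList<>();
        try (Scanner scanner = new Scanner(file, charsetName)) {
            while (scanner.hasNextLine()) {
                lines.add(scanner.nextLine());
            }
            // Scanner swallows IOExceptions from its source; ask for the last one
            IOException failure = scanner.ioException();
            if (failure != null) {
                throw new UncheckedIOException(
                        "Scanner stopped early after " + lines.size() + " lines", failure);
            }
        }
        return lines;
    }
}
```

Because the decoder works on whole buffers, the number of lines read before the failure need not line up with where the offending byte sits, which fits the "stops at 16384 bytes in the middle of normal text" symptom.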

Solution

Forget about Scanner and use

    String fullFileContents = new String(Files.readAllBytes(myFile.toPath()));

Of course, non-ASCII characters can't be reliably read like this because you don't know the encoding, but the ASCII characters will be read correctly as long as the file's encoding is ASCII-compatible. Use it if you only need the ASCII characters in the file and the non-ASCII part can be discarded.
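
A sketch of that idea with the charset spelled out, so the result does not depend on the platform default. It decodes as ISO-8859-1, which maps every byte to exactly one character and therefore cannot fail, then drops everything outside the ASCII range. The class and method names are made up for the example, and it assumes a single-byte or UTF-8 style encoding where ASCII text is stored as plain ASCII bytes.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical helper for illustration only.
class AsciiOnlyReader {

    // Returns the file contents with every non-ASCII character removed.
    static String readAscii(File myFile) throws IOException {
        byte[] bytes = Files.readAllBytes(myFile.toPath());
        // ISO-8859-1 maps each byte to exactly one char, so decoding never aborts
        String decoded = new String(bytes, StandardCharsets.ISO_8859_1);
        // Keep only code points 0x00-0x7F; bytes of unknown encoding are discarded
        return decoded.replaceAll("[^\\x00-\\x7F]", "");
    }
}
```

The returned String can then be split on line breaks, for example with `split("\\R")` on Java 8 and later.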

Tags: java, file-io, java.util.scanner

Fizzmaister

4 Answers

Hovercraft Full Of Eels

Learner123

The Aa of Ron

golimar
