
Why does Scala crash when reading my CSV?

The file is here:

http://dl.dropbox.com/u/12337149/history.csv

I try to read the data as follows:

  import java.io.File
  import scala.io.Source

  for (line <- Source.fromFile(new File(file)).getLines()) {
    println(line)
  }

I get the following error:

Exception in thread "main" java.nio.charset.MalformedInputException: Input length = 1
    at java.nio.charset.CoderResult.throwException(CoderResult.java:260)
    at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:319)
    at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:158)
    at java.io.InputStreamReader.read(InputStreamReader.java:167)
    at java.io.BufferedReader.fill(BufferedReader.java:136)
    at java.io.BufferedReader.readLine(BufferedReader.java:299)
    at java.io.BufferedReader.readLine(BufferedReader.java:362)
    at scala.io.BufferedSource$BufferedLineIterator.<init>(BufferedSource.scala:32)
    at scala.io.BufferedSource.getLines(BufferedSource.scala:43)
    at com.alluvia.reports.RunIGConverter$$anonfun$main$1.apply(RunIGConverter.scala:17)
    at com.alluvia.reports.RunIGConverter$$anonfun$main$1.apply(RunIGConverter.scala:15)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:34)
    at scala.collection.mutable.ArrayOps.foreach(ArrayOps.scala:38)
    at com.alluvia.reports.RunIGConverter$.main(RunIGConverter.scala:15)
    at com.alluvia.reports.RunIGConverter.main(RunIGConverter.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:120)

The file opens just fine in Excel. I think it is some kind of encoding issue, but I do not know the workaround.

deltanovember asked Aug 20 '11 07:08



1 Answer

I'd try the ISO8859_1 encoding, or Cp1252 if that doesn't work, like so:

Source.fromFile(new File(file), "ISO-8859-1").getLines()
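For a fuller picture, here is a minimal sketch of the same call wrapped in a helper that also closes the file. The object name `ReadCsvWithEncoding`, the helper `readLines`, and the temporary sample file are assumptions made for illustration; they are not part of the original question or answer. The sample contains the byte 0xE9, which is "é" in ISO-8859-1 but not a valid sequence in UTF-8, so it is the kind of input that can trigger a MalformedInputException when decoded as UTF-8.

```scala
import java.io.File
import java.nio.charset.StandardCharsets
import java.nio.file.Files
import scala.io.Source

object ReadCsvWithEncoding {
  // Read every line of `file`, decoding its bytes with the named charset.
  def readLines(file: File, encoding: String): List[String] = {
    val source = Source.fromFile(file, encoding)
    try source.getLines().toList // materialise the lines before closing
    finally source.close()
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical stand-in for history.csv: a single line with one non-ASCII byte.
    val tmp = File.createTempFile("history-sample", ".csv")
    tmp.deleteOnExit()
    Files.write(tmp.toPath, "name,caf\u00e9".getBytes(StandardCharsets.ISO_8859_1))

    readLines(tmp, "ISO-8859-1").foreach(println)
  }
}
```

Because ISO-8859-1 maps every possible byte to a character, decoding with it should never throw. The trade-off is that a file in some other encoding will be read without error but may show the wrong characters.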

Sun's Java documentation lists the encodings it supports. I forget whether you're supposed to use the nio or io names. (As you can see, my answer uses both.)
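On the nio-versus-io naming question, a quick check suggests that either spelling is accepted, on the assumption that the encoding string ends up in `java.nio.charset.Charset.forName`, which resolves aliases as well as canonical names. The object `CharsetNames` and the helper `canonical` below are hypothetical names used only for this sketch.

```scala
import java.nio.charset.Charset

object CharsetNames {
  // Resolve any supported name or alias to the JVM's canonical (nio) charset name.
  def canonical(name: String): String = Charset.forName(name).name()

  def main(args: Array[String]): Unit = {
    // "ISO8859_1" and "Cp1252" are the legacy java.io spellings used in the answer.
    for (n <- Seq("ISO8859_1", "ISO-8859-1", "Cp1252"))
      println(s"$n -> ${canonical(n)}")
  }
}
```

On a typical JVM, both ISO8859_1 and ISO-8859-1 resolve to ISO-8859-1, and Cp1252 should resolve to windows-1252, so the choice of spelling is mostly a matter of taste.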

Rex Kerr answered Oct 19 '22 23:10
