I have a 132 kb file (you can't really say it's big) and I'm trying to read it from the Scala REPL, but I can't read past 2048 char because it gives me a java.nio.charset.MalformedInputException
exception
These are the steps I take:
val it = scala.io.Source.fromFile("docs/categorizer/usig_calles.json") // this is ok
it.take(2048).mkString // this is ok too
it.take(1).mkString // BANG!
java.nio.charset.MalformedInputException: Input length = 1
at java.nio.charset.CoderResult.throwException(CoderResult.java:277)
at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:338)
at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:177)
at java.io.InputStreamReader.read(InputStreamReader.java:184)
Any idea what could be wrong?
--
Apparently the problem was that the file was not UTF encoded
I saved it as UTF and everything works, I just issue mkString on the iterator and it retrieves the whole contents of the file
The strange thing is that the error only aroused passing the first 2048 chars...
Cannot be certain without the file, but the documentation on the exception indicates it is thrown "when an input byte sequence is not legal for given charset, or an input character sequence is not a legal sixteen-bit Unicode sequence." (MalformedInputException javadoc)
I suspect that at 2049 is the first character encountered that is not valid with whatever the default JVM character encoding is in you environment. Consider explicitly stating the character encoding of the file using one of the overloads to fromFile.
If the application will be cross platform, you should know that the default character encoding on the JVM does vary by platform, so if you operate with a specific encoding you either want to explicitly set it as a command line parameter when launching of your application, or specify it at each call using the appropriate overload.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With