
java.nio.charset.MalformedInputException when reading a stream

I use the following code to read data, and it throws java.nio.charset.MalformedInputException. I can open the file normally, but it does include non-ASCII characters. Is there any way I can fix this problem?

  Source.fromInputStream(stream).getLines foreach { line =>
    // store items on the fly
    lineParser(line.trim) match {
      case None => // no-op
      case Some(pair) => // some-op
    }   
  }   
  stream.close()

The stream construction code is here:

  def getStream(path: String) = {
    if (!fileExists(path)) {
      None
    } else {
      val fileURL = new URL(path)
      val urlConnection = fileURL.openConnection
      Some(urlConnection.getInputStream())
    }
  }
user398384 asked Jul 30 '11 19:07



2 Answers

Try Source.fromInputStream(stream)(io.Codec("UTF-8")) or whatever charset you need.
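
As a minimal sketch of that suggestion: the helper below passes the codec explicitly instead of letting Source fall back to the platform default. The object name ExplicitCodec and the method readLines are made up for illustration and are not part of the question's code.

```scala
import java.io.{ByteArrayInputStream, InputStream}
import java.nio.charset.StandardCharsets
import scala.io.{Codec, Source}

// Hypothetical helper, named here only for illustration.
object ExplicitCodec {
  // Read every line from the stream, decoding with the codec the caller
  // names rather than the platform default.
  def readLines(stream: InputStream, codec: Codec): List[String] = {
    val source = Source.fromInputStream(stream)(codec)
    try source.getLines().toList
    finally source.close() // also closes the underlying stream
  }
}
```

For example, `ExplicitCodec.readLines(stream, Codec("ISO-8859-1"))` would decode a Latin-1 stream; which charset is correct depends on how the data was actually written.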

huynhjl answered Sep 28 '22 22:09



Jean-Laurent is very likely right that Source.fromInputStream is using an encoding that doesn't match your stream: probably the platform default, i.e. windows-1252 (closely related to ISO-8859-1) on Western-locale Windows, UTF-8 on recent Linux distros, and IIUC MacRoman on Macs. Since you got an encoding exception, it was most likely defaulting to UTF-8, which is a fairly rigid scheme that rejects invalid byte sequences, while the file was in some other encoding (most likely ISO-8859-1).
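
To illustrate the mismatch described above, here is a small sketch that decodes the same ISO-8859-1 bytes with two different codecs. The object name DecodeMismatch and the method decode are hypothetical, and the sample text is invented; the point is only that a lone 0xE9 byte is valid Latin-1 but not valid UTF-8.

```scala
import java.io.ByteArrayInputStream
import java.nio.charset.StandardCharsets
import scala.io.{Codec, Source}
import scala.util.Try

// Hypothetical demo object, not taken from the question.
object DecodeMismatch {
  // Decode the bytes with the given codec, capturing any decoding failure
  // as a Failure instead of letting it propagate.
  def decode(bytes: Array[Byte], codec: Codec): Try[String] =
    Try {
      val source = Source.fromInputStream(new ByteArrayInputStream(bytes))(codec)
      try source.mkString
      finally source.close()
    }
}
```

With `val latin1 = "caf\u00e9 au lait".getBytes(StandardCharsets.ISO_8859_1)`, calling `DecodeMismatch.decode(latin1, Codec("UTF-8"))` should fail with java.nio.charset.MalformedInputException, while `DecodeMismatch.decode(latin1, Codec("ISO-8859-1"))` should succeed.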

Broadly, there's no way to tell a priori what character encoding was used to generate some bitstream; you need some out-of-band mechanism to communicate it. In the case of HTTP responses, you can often get it from the charset parameter of the Content-Type header, though some web apps get it wrong. If the file is XML, the encoding is commonly declared in the XML declaration at the top. Some file formats specify a single standard encoding. It's all over the map, really.
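
For the HTTP case, one possible approach is to read the charset parameter from the connection's Content-Type and fall back to a caller-chosen charset when it is missing or unrecognised. This is only a sketch: the names ContentTypeCharset, fromContentType and sourceFor are invented for illustration, and the regular expression is a simplification rather than a full header parser.

```scala
import java.net.URLConnection
import java.nio.charset.{Charset, StandardCharsets}
import scala.io.{Codec, Source}
import scala.util.Try

// Hypothetical helper; the names are illustrative only.
object ContentTypeCharset {
  // Matches a charset parameter such as `charset=UTF-8` or `charset="utf-8"`.
  private val CharsetParam = """(?i)charset\s*=\s*"?([^\s;"]+)""".r

  // Returns the declared charset, or None if the header is absent,
  // has no charset parameter, or names a charset this JVM does not know.
  def fromContentType(contentType: String): Option[Charset] =
    for {
      header  <- Option(contentType) // getContentType may return null
      m       <- CharsetParam.findFirstMatchIn(header)
      charset <- Try(Charset.forName(m.group(1))).toOption
    } yield charset

  // Opens a Source using the declared charset, else the supplied fallback.
  def sourceFor(conn: URLConnection, fallback: Charset): Source = {
    val charset = fromContentType(conn.getContentType).getOrElse(fallback)
    Source.fromInputStream(conn.getInputStream)(Codec(charset))
  }
}
```

In the question's getStream, something like `ContentTypeCharset.sourceFor(urlConnection, StandardCharsets.UTF_8)` could stand in for the bare getInputStream call. Since servers sometimes declare the wrong charset, the result is a best guess rather than a guarantee.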

Your best bet, in the absence of any integration requirement, is to use UTF-8 explicitly everywhere and not rely on the platform default encoding.
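
A rough sketch of what "UTF-8 explicitly everywhere" can look like on both the writing and the reading side, assuming you control both ends. The object Utf8RoundTrip and its methods are hypothetical names, not an existing API.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import java.nio.file.{Files, Path}
import scala.io.{Codec, Source}

// Hypothetical helper showing the same explicit charset on both sides.
object Utf8RoundTrip {
  // Encode explicitly as UTF-8 instead of calling getBytes() with no argument.
  def write(path: Path, text: String): Unit = {
    Files.write(path, text.getBytes(UTF_8))
    ()
  }

  // Decode explicitly as UTF-8 instead of relying on the implicit default codec.
  def read(path: Path): String = {
    val source = Source.fromFile(path.toFile)(Codec.UTF8)
    try source.mkString
    finally source.close()
  }
}
```

Because neither side consults the platform default, text written this way should read back the same on any machine, whatever its locale settings.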

Alex Cruise answered Sep 28 '22 23:09
