Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to read UTF8 encoded file using RandomAccessFile?

I have text file that was encoded with UTF8 (for language specific characters). I need to use RandomAccessFile to seek specific position and read from.

I want read line-by-line.

String str = myreader.readLine(); //returns wrong text, not decoded 
String str myreader.readUTF(); //An exception occurred: java.io.EOFException
like image 555
kenny Avatar asked Apr 01 '12 13:04

kenny


People also ask

Can UTF-8 be read as ASCII?

UTF-8 is not a character set but an encoding used with Unicode. It happens to be compatible with ASCII too, because the codes used for multiple byte encodings lie in the part of the ASCII character set that is unused.

Does UTF-8 use 8bits?

UTF-8 is an 8-bit variable width encoding. The first 128 characters in the Unicode, when represented with UTF-8 encoding have the representation as the characters in ASCII.

What is RandomAccessFile?

RandomAccessFile(File file, String mode) Creates a random access file stream to read from, and optionally to write to, the file specified by the File argument. RandomAccessFile(String name, String mode) Creates a random access file stream to read from, and optionally to write to, a file with the specified name.


1 Answers

You can convert string, read by readLine to UTF8, using following code:

public static void main(String[] args) throws IOException {
    RandomAccessFile raf = new RandomAccessFile(new File("MyFile.txt"), "r");
    String line = raf.readLine();
    String utf8 = new String(line.getBytes("ISO-8859-1"), "UTF-8");
    System.out.println("Line: " + line);
    System.out.println("UTF8: " + utf8);
}

Content of MyFile.txt: (UTF-8 Encoding)

Привет из Украины

Console output:

Line: ÐÑÐ¸Ð²ÐµÑ Ð¸Ð· УкÑаинÑ
UTF8: Привет из Украины
like image 90
picoworm Avatar answered Oct 18 '22 21:10

picoworm