Read/write .txt file with special characters

I open Notepad (Windows) and write

Some lines with special characters
Special: Žđšćč

and go to Save As... "someFile.txt" with Encoding set to UTF-8.

In Java I have

FileInputStream fis = new FileInputStream(new File("someFile.txt"));
InputStreamReader isr = new InputStreamReader(fis, "UTF-8");
BufferedReader in = new BufferedReader(isr);

String line;
while((line = in.readLine()) != null) {
    printLine(line);
}
in.close();

But I get question marks and similar "special" characters. Why?

EDIT: I have this input (one line in .txt file)

665,Žđšćč

and this code

FileInputStream fis = new FileInputStream(new File(fileName));
InputStreamReader isr = new InputStreamReader(fis, "UTF-8");
BufferedReader in = new BufferedReader(isr);

String line;
while((line = in.readLine()) != null) {
    Toast.makeText(mContext, line, Toast.LENGTH_LONG).show();

    Pattern p = Pattern.compile(",");
    String[] article = p.split(line);

    Toast.makeText(mContext, article[0], Toast.LENGTH_LONG).show();
    Toast.makeText(mContext, Integer.parseInt(article[0]), Toast.LENGTH_LONG).show();
}
in.close();

The Toast output is fine (for those unfamiliar with Android, a Toast is a pop-up that shows a given text on screen). The console shows "weird characters", probably because of the encoding used by the console window. But parsing the integer fails, even though the Toast output looks fine, and the console says this:

[Image: Problem?]

It seems the String contains some "weird" characters that Toast does not render, but when I try to parse it, it crashes. Suggestions?

If I save the file as ANSI in Notepad, integer parsing works and there are no weird characters as in the picture above, but of course my special characters are lost.

svenkapudija asked Jan 04 '11 19:01


1 Answer

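Notepad does not save the file cleanly for this purpose: with Encoding set to UTF-8 it most likely writes a byte order mark (BOM) at the start of the file, which Java reads as an invisible character. I had a similar problem, so I used Notepad++ instead and selected UTF-8 encoding there. After that, my program no longer crashed when applying String library methods to the text, unlike when I created the text file in Notepad.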
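
If changing the editor is not an option, the reading code can tolerate such a file instead. The following is a minimal sketch, not a definitive fix: it assumes the stray character is a leading UTF-8 byte order mark (bytes EF BB BF, decoded as U+FEFF), which Java's InputStreamReader passes through unchanged. The class name BomSafeReader and the method readLines are hypothetical names chosen for illustration, and the input is simulated in memory rather than read from someFile.txt.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the original question.
class BomSafeReader {
    // U+FEFF is what the UTF-8 BOM bytes EF BB BF decode to.
    private static final char BOM = '\uFEFF';

    /** Reads all lines as UTF-8 and drops a BOM at the very start, if present. */
    static List<String> readLines(InputStream in) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            boolean first = true;
            while ((line = reader.readLine()) != null) {
                // Only the first line of the stream can carry the BOM.
                if (first && !line.isEmpty() && line.charAt(0) == BOM) {
                    line = line.substring(1);
                }
                first = false;
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Simulated file content: a BOM followed by the line from the question
        // (the special characters are written as Unicode escapes).
        byte[] data = "\uFEFF665,\u017D\u0111\u0161\u0107\u010D\n"
                .getBytes(StandardCharsets.UTF_8);
        for (String line : readLines(new ByteArrayInputStream(data))) {
            String[] article = line.split(",");
            // Parses cleanly because the BOM was removed from article[0].
            int id = Integer.parseInt(article[0]);
            System.out.println(id + " -> " + article[1]);
        }
    }
}
```

With a real file, new FileInputStream(fileName) can be passed to readLines in place of the in-memory stream. Without the BOM check, article[0] would begin with the invisible U+FEFF character and Integer.parseInt would throw a NumberFormatException.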

user929404 answered Nov 03 '22 19:11