
Read a CSV file in UTF-8 format

Tags: java, csv, utf-8

I am reading a CSV file in Java, adding a new column with new information, and exporting it back to a CSV file. My problem is reading the CSV file as UTF-8. I read it line by line and store each line in a StringBuilder, but when I print a line I can see that the text is not being read as UTF-8 but as ANSI. I tried both System.out.print and a PrintStream set to UTF-8, and the text still appears in ANSI. This is my code:

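    // sbusers and sbusers2 are StringBuilders (declared elsewhere, not shown).
    BufferedReader br = null;
    try {
        br = new BufferedReader(new InputStreamReader(new FileInputStream(
                "./users.csv"), "UTF8"));
        // Create the UTF-8 console stream once instead of on every iteration.
        PrintStream ps = new PrintStream(System.out, true, "UTF-8");
        String line;
        while ((line = br.readLine()) != null) {
            // Skip this specific address.
            if (line.contains("[email protected]")) {
                continue;
            }
            // Keep only the header row and rows that contain an e-mail address.
            if (!line.contains("@") && !line.contains("FirstName")) {
                continue;
            }
            ps.print(line + "\n");
            sbusers.append(line);
            sbusers.append("\n");
            sbusers2.append(line);
            sbusers2.append(",");
        }
    } catch (IOException e) {
        System.out.println("Failed to read users file.");
    } finally {
        // Close the reader even when reading fails part-way through.
        if (br != null) {
            try {
                br.close();
            } catch (IOException ignored) {
                // Nothing useful can be done if closing fails.
            }
        }
    }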

It prints text like "Professor -P�s". Since the file isn't being read correctly, the output written to the new file is also in ANSI.

Ricardo asked Sep 30 '13 17:09



2 Answers

Are you sure your CSV is UTF-8 encoded? My guess is that it's not. Try reading the file as ISO-8859-1, but keep the output as UTF-8. (Both "UTF8" and "UTF-8" tend to work as charset names, but you should use "UTF-8", as @Marcelo suggested.)
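
One way to test this guess is sketched below. It is an illustration under stated assumptions, not the asker's code: the class name `CsvTranscode`, the helper names `isValidUtf8` and `latin1ToUtf8`, and the demo text are all hypothetical. The first helper checks whether a file's bytes decode cleanly as UTF-8; the second reads a file as ISO-8859-1 and writes it back out as UTF-8.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class CsvTranscode {

    // Returns true if the bytes decode as UTF-8 without any malformed sequence.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Reads `in` as ISO-8859-1 and writes it to `out` as UTF-8, line by line.
    static void latin1ToUtf8(Path in, Path out) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(in, StandardCharsets.ISO_8859_1);
             BufferedWriter writer = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                writer.write(line);
                writer.newLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Demo input (an assumption): one line with an accented letter, saved as ISO-8859-1.
        Path in = Files.createTempFile("users", ".csv");
        Path out = Files.createTempFile("users-utf8", ".csv");
        Files.write(in, "Professor -P\u00f3s\n".getBytes(StandardCharsets.ISO_8859_1));

        System.out.println(isValidUtf8(Files.readAllBytes(in)));  // false
        latin1ToUtf8(in, out);
        System.out.println(isValidUtf8(Files.readAllBytes(out))); // true
    }
}
```

If `isValidUtf8` returns false for the real `users.csv`, the file is not UTF-8, and decoding it with a UTF-8 reader is what produces the replacement character in the printed output. Whether ISO-8859-1 is the right source encoding is still a guess; Windows-1252 is another common candidate for files described as "ANSI".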

Sam Barnum answered Nov 04 '22 08:11


In the line:

    br = new BufferedReader(new InputStreamReader(new FileInputStream("./users.csv"), "UTF8"));

Your charset name should be "UTF-8" (the canonical name), not "UTF8".
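
As a small check of how Java treats the two spellings, the sketch below resolves both names with `Charset.forName` and compares them. The class name `CharsetNames` and the helper `sameCharset` are hypothetical names chosen for illustration. In the standard JDK, "UTF8" is an alias of the charset whose canonical name is "UTF-8", so both spellings should resolve to the same charset. Using the `StandardCharsets.UTF_8` constant avoids the spelling question entirely.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetNames {

    // Returns true if both names resolve to the same charset.
    static boolean sameCharset(String first, String second) {
        return Charset.forName(first).equals(Charset.forName(second));
    }

    public static void main(String[] args) {
        Charset byAlias = Charset.forName("UTF8");

        System.out.println(byAlias.name());                         // UTF-8
        System.out.println(sameCharset("UTF8", "UTF-8"));           // true
        System.out.println(byAlias.equals(StandardCharsets.UTF_8)); // true
    }
}
```

If this prints `true` on your JVM, changing "UTF8" to "UTF-8" is a worthwhile cleanup but will not by itself change how the file is decoded.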

Marcelo answered Nov 04 '22 08:11