Please have a look at the following code
import java.io.*;

public class CSVConverter {
    private File csvFile;
    private BufferedReader reader;
    private StringBuffer strBuffer;
    private BufferedWriter writer;
    int startNumber = 0;
    private String strString[];

    public CSVConverter(String location, int startNumber) {
        csvFile = new File(location);
        strBuffer = new StringBuffer("");
        this.startNumber = startNumber;

        //Read
        try {
            reader = new BufferedReader(new FileReader(csvFile));
            String line = "";
            while((line=reader.readLine())!=null) {
                String[] array = line.split(",");
                String inputQuery = "insertQuery["+startNumber+"] = \"insert into WordList_Table ('Engl','Port','EnglishH','PortugueseH','Numbe','NumberOf','NumberOfTime','NumberOfTimesPor')values('"+array[0]+"','"+array[2]+"','"+array[1]+"','"+array[3]+"',0,0,0,0)\"";
                strBuffer.append(inputQuery+";"+"\r\n");
                startNumber++;
            }
        } catch(Exception e) {
            e.printStackTrace();
        }

        System.out.println(strBuffer.toString());

        //Write
        try {
            File file = new File("C:/Users/list.txt");
            FileWriter filewrite = new FileWriter(file);
            if(!file.exists()) {
                file.createNewFile();
            }
            writer = new BufferedWriter(filewrite);
            writer.write(strBuffer.toString());
            writer.flush();
            writer.close();
        } catch(Exception e) {
            e.printStackTrace();
        }
    }

    public static void main(String[]args) {
        new CSVConverter("C:/Users/list.csv",90);
    }
}
I am trying to read a CSV file, edit the text in code, and write it back to a .txt
file. My issue is that I have Portuguese words, so the file should be read and written using ANSI
format. Right now some Portuguese words are replaced with symbols in the output file.
How can I read and write text data into a file in ANSI format in Java?
ANSI (aka Windows-1252/WinLatin1) is a character encoding of the Latin alphabet, fairly similar to ISO-8859-1.
Despite its name, ANSI format is not a standard of the American National Standards Institute; "ANSI" is the traditional Microsoft name for its Windows code pages. Windows-1252 is an 8-bit superset of ASCII (the American Standard Code for Information Interchange): the first 128 values are identical to ASCII, and the upper 128 hold accented letters and other symbols.
In comparison, UTF-8 is more flexible because it is a variable-length encoding that can represent every Unicode character, using 1 to 4 bytes per character (the original design allowed up to 6, but the current standard limits it to 4). Because ANSI uses only one byte (8 bits) per character, it can represent at most 256 characters.
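To make the size difference concrete, here is a small sketch that encodes the Portuguese fragment "ção" in both encodings and then misreads the ANSI bytes as UTF-8. The class name EncodingSizes and the helper encodedLength are made up for this illustration, and the sketch assumes the JDK in use provides the windows-1252 charset (common JDKs do).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class EncodingSizes {
    // Number of bytes needed to store the text in the given charset.
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        // "ção" written with Unicode escapes so the example does not
        // depend on the encoding of this source file.
        String word = "\u00e7\u00e3o";
        Charset ansi = Charset.forName("windows-1252");

        System.out.println(encodedLength(word, ansi));                   // 3
        System.out.println(encodedLength(word, StandardCharsets.UTF_8)); // 5

        // Decoding Windows-1252 bytes as UTF-8 corrupts the accented letters.
        byte[] ansiBytes = word.getBytes(ansi);
        String misread = new String(ansiBytes, StandardCharsets.UTF_8);
        System.out.println(misread.equals(word));                        // false
    }
}
```

A mismatch of this kind between the encoding of the file and the encoding used to decode or encode it is the usual reason accented letters turn into symbols.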
To read a text file with a specific encoding you can use a FileInputStream in conjunction with an InputStreamReader. The right Java encoding for Windows ANSI is Cp1252.
reader = new BufferedReader(new InputStreamReader(new FileInputStream(csvFile), "Cp1252"));
To write a text file with a specific character encoding you can use a FileOutputStream together with an OutputStreamWriter.
writer = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(file), "Cp1252"));
The classes InputStreamReader and OutputStreamWriter translate between byte-oriented streams and text with a specific character encoding.
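Putting the two pieces together, a minimal round-trip sketch might look like the following. The class AnsiRoundTrip, its methods writeLines and readLines, and the temporary file are assumptions made for this example and are not part of the code in the question; try-with-resources is used so the streams are closed even if an exception is thrown.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, for illustration only.
class AnsiRoundTrip {
    // Java's name for the Windows ANSI encoding.
    static final String ANSI = "Cp1252";

    // Writes each line to the file as Cp1252 bytes, ending lines with CRLF.
    static void writeLines(File file, List<String> lines) throws IOException {
        try (BufferedWriter writer = new BufferedWriter(
                new OutputStreamWriter(new FileOutputStream(file), ANSI))) {
            for (String line : lines) {
                writer.write(line);
                writer.write("\r\n");
            }
        }
    }

    // Reads the file back, decoding its bytes as Cp1252.
    static List<String> readLines(File file) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(file), ANSI))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("wordlist", ".txt");
        file.deleteOnExit();

        // "coração" and "manhã", written with Unicode escapes.
        List<String> words = Arrays.asList("cora\u00e7\u00e3o", "manh\u00e3");
        writeLines(file, words);

        System.out.println(readLines(file).equals(words)); // true
    }
}
```

In the question's CSVConverter, the same idea would mean replacing the FileReader and FileWriter with these wrapped streams, since those two classes fall back to the platform default encoding when none is given.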