
csv parser reading headers

Tags: java, parsing, csv

I'm working on a CSV parser and want to read the headers separately from the rest of the file. My current code reads every line of the CSV file the same way, but I need the headers read separately. Please help me with this. Here is my code:

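import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.util.StringTokenizer;

public class csv {

    private void csvRead(File file)
    {
        // try-with-resources closes the reader and writer even if an exception is thrown
        try (BufferedReader br = new BufferedReader(new FileReader(file));
             BufferedWriter writer = new BufferedWriter(new FileWriter(new File("csv.txt"))))
        {
            String strLine;
            while ((strLine = br.readLine()) != null)
            {
                // Every line, including the header line, is handled the same way
                StringTokenizer st = new StringTokenizer(strLine, ",");
                int tokenNumber = 0;
                while (st.hasMoreTokens())
                {
                    tokenNumber++;
                    writer.write(tokenNumber + "  " + st.nextToken());
                    writer.newLine();
                }
                writer.flush();
            }
        }
        catch (IOException e)
        {
            // Report the failure instead of silently discarding it
            e.printStackTrace();
        }
    }
}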
Avinash asked Jun 26 '12 15:06


2 Answers

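Apache Commons CSV provides a withHeader() method on CSVFormat. With this option the parser treats the first record as the header, so the remaining records can be read by column name.

CSVFormat format = CSVFormat.newFormat(',').withHeader();
Map<String, Integer> headerMap = dataCSVParser.getHeaderMap();

Here dataCSVParser is the CSVParser created with that format (see the full example below). getHeaderMap() returns every header name mapped to its column index.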

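import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CopyOnWriteArrayList;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

public class CSVFileReaderEx {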
    public static void main(String[] args){
        readFile();
    }

    public static void readFile(){
         List<Map<String, String>> csvInputList = new CopyOnWriteArrayList<>();
         List<Map<String, Integer>> headerList = new CopyOnWriteArrayList<>();

         String fileName = "C:/test.csv";
         CSVFormat format = CSVFormat.newFormat(',').withHeader();

          try (BufferedReader inputReader = new BufferedReader(new FileReader(new File(fileName)));
                  CSVParser dataCSVParser = new CSVParser(inputReader, format); ) {

             List<CSVRecord> csvRecords = dataCSVParser.getRecords();

             Map<String, Integer> headerMap = dataCSVParser.getHeaderMap();
              headerList.add(headerMap);
              headerList.forEach(System.out::println);

             for(CSVRecord record : csvRecords){
                 Map<String, String> inputMap = new LinkedHashMap<>();

                 for(Map.Entry<String, Integer> header : headerMap.entrySet()){
                     inputMap.put(header.getKey(), record.get(header.getValue()));
                 }

                 if (!inputMap.isEmpty()) {
                     csvInputList.add(inputMap);
                } 
             }

             csvInputList.forEach(System.out::println);

          } catch (Exception e) {
             System.out.println(e);
          }
    }
}
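If adding a library is not an option, the same header/data split can be sketched with the JDK alone: read the first line before entering the loop and treat it as the header. The class and method names below (HeaderSplit, Table, read) are made up for illustration, and the sketch assumes simple input with no quoted fields or embedded commas.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of any library.
final class HeaderSplit {

    /** Header names plus the data rows, each row keyed by header name. */
    static final class Table {
        final List<String> headers;
        final List<Map<String, String>> rows;

        Table(List<String> headers, List<Map<String, String>> rows) {
            this.headers = headers;
            this.rows = rows;
        }
    }

    static Table read(Reader source) {
        try (BufferedReader br = new BufferedReader(source)) {
            // The first line is consumed here, so the loop below only sees data rows
            String headerLine = br.readLine();
            if (headerLine == null) {
                return new Table(new ArrayList<>(), new ArrayList<>());
            }
            // A limit of -1 keeps trailing empty columns
            List<String> headers = Arrays.asList(headerLine.split(",", -1));

            List<Map<String, String>> rows = new ArrayList<>();
            String line;
            while ((line = br.readLine()) != null) {
                if (line.isEmpty()) {
                    continue; // skip blank lines
                }
                String[] cells = line.split(",", -1);
                Map<String, String> row = new LinkedHashMap<>();
                for (int i = 0; i < headers.size(); i++) {
                    // Short rows are padded with empty strings
                    row.put(headers.get(i), i < cells.length ? cells[i] : "");
                }
                rows.add(row);
            }
            return new Table(headers, rows);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Table table = read(new StringReader("id,name\n1,Ann\n2,Bob\n"));
        System.out.println(table.headers);
        table.rows.forEach(System.out::println);
    }
}
```

For a file on disk, pass a FileReader instead of the StringReader. As soon as the data may contain quoted values, a real CSV parser is the safer choice.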
Rajashree Gr answered Nov 15 '22 16:11


Please consider using Apache Commons CSV. The library is written according to RFC 4180 (Common Format and MIME Type for Comma-Separated Values (CSV) Files), so it can read lines such as:

"aa,a","b""bb","ccc"

Commons CSV is quite simple to use: there are just 3 classes. Here is a small sample from the documentation:

Parsing of a csv-string having tabs as separators, '"' as an optional value encapsulator, and comments starting with '#':

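 // Commons CSV 1.x: CSVFormat has no public constructor, so the format is built fluently
 CSVFormat format = CSVFormat.newFormat('\t').withQuote('"').withCommentMarker('#');
 Reader in = new StringReader("a\tb\nc\td");
 List<CSVRecord> records = new CSVParser(in, format).getRecords();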

In addition, these formats are already available as constants on CSVFormat:

  • DEFAULT - Standard comma separated format as defined by RFC 4180.
  • EXCEL - Excel file format (using a comma as the value delimiter).
  • MYSQL - Default MySQL format used by the SELECT INTO OUTFILE and LOAD DATA INFILE operations.
  • TDF - Tabulation delimited format.
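To show why a line like "aa,a","b""bb","ccc" cannot simply be split on commas, here is a minimal sketch of the RFC 4180 quoting rules for a single line. The QuotedLine class and its parseLine method are hypothetical names, not part of Commons CSV, and the sketch ignores fields that span several lines, which a real parser also has to handle.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of RFC 4180 quoting for one line; not a full parser.
final class QuotedLine {

    static List<String> parseLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;

        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        // A doubled quote inside a quoted field is one literal quote
                        current.append('"');
                        i++;
                    } else {
                        // A single quote closes the quoted field
                        inQuotes = false;
                    }
                } else {
                    // Commas inside quotes belong to the field
                    current.append(c);
                }
            } else if (c == '"') {
                inQuotes = true;
            } else if (c == ',') {
                // An unquoted comma ends the current field
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        parseLine("\"aa,a\",\"b\"\"bb\",\"ccc\"").forEach(System.out::println);
    }
}
```

A StringTokenizer or String.split on "," would cut the first field at its embedded comma, which is the main reason to prefer a library that implements these rules.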
Francisco Spaeth answered Nov 15 '22 15:11