
Merge CSV files into a single file with no repeated headers

Tags: java, csv

I have some CSV files with the same column headers. For example:

File A

header1,header2,header3
one,two,three
four,five,six

File B

header1,header2,header3
seven,eight,nine
ten,eleven,twelve

I want to merge them into one file, with the header row at the top and no header rows anywhere else:

header1,header2,header3
one,two,three
four,five,six
seven,eight,nine
ten,eleven,twelve

What is a good way to achieve this?

MxLDevs asked Aug 02 '13 15:08




2 Answers

This should work. It checks that each file being merged has the same headers as the first file and throws an exception otherwise. Exception handling (closing the streams, etc.) has been left as an exercise.

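// Needs: import java.io.*; import java.util.*;
// FileMergeException is a custom exception type that you have to define yourself.

// Read the header row of the first file; the other files are appended to it.
String[] headers = null;
String firstFile = "/path/to/firstFile.dat";
Scanner scanner = new Scanner(new File(firstFile));

if (scanner.hasNextLine())
    headers = scanner.nextLine().split(",");

scanner.close();

// listOfFilesToBeMerged is assumed to hold every file except firstFile.
Iterator<File> iterFiles = listOfFilesToBeMerged.iterator();
// 'true' opens firstFile in append mode, so its existing rows are kept.
BufferedWriter writer = new BufferedWriter(new FileWriter(firstFile, true));

while (iterFiles.hasNext()) {
  File nextFile = iterFiles.next();
  BufferedReader reader = new BufferedReader(new FileReader(nextFile));

  // The first line of each file is its header row.
  String line = null;
  String[] firstLine = null;
  if ((line = reader.readLine()) != null)
    firstLine = line.split(",");

  if (!Arrays.equals(headers, firstLine))
    throw new FileMergeException("Header mis-match between CSV files: '" +
              firstFile + "' and '" + nextFile.getAbsolutePath() + "'");

  // Copy the remaining lines (the data rows) to the end of firstFile.
  while ((line = reader.readLine()) != null) {
    writer.write(line);
    writer.newLine();
  }

  reader.close();
}
writer.close();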
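
The stream cleanup left as an exercise can be handled with try-with-resources. What follows is a minimal sketch rather than the code above: it writes to a separate target file instead of appending to the first input, and it compares header lines as whole strings. The names CsvMerger and merge, the temporary file names, and the use of IllegalArgumentException in place of FileMergeException are assumptions made for illustration.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class; not part of the original answer.
final class CsvMerger {

    /** Copies all inputs into target, writing the shared header line only once. */
    static void merge(List<Path> inputs, Path target) throws IOException {
        String header = null; // header line of the first non-empty input
        // try-with-resources closes the writer and each reader, even on exceptions.
        try (BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            for (Path input : inputs) {
                try (BufferedReader reader = Files.newBufferedReader(input, StandardCharsets.UTF_8)) {
                    String first = reader.readLine();
                    if (first == null) {
                        continue; // empty file: nothing to copy
                    }
                    if (header == null) {
                        header = first;
                        writer.write(header);
                        writer.newLine();
                    } else if (!header.equals(first)) {
                        throw new IllegalArgumentException("Header mismatch in '" + input
                                + "': expected '" + header + "' but found '" + first + "'");
                    }
                    // Copy the data rows that follow the header.
                    String line;
                    while ((line = reader.readLine()) != null) {
                        writer.write(line);
                        writer.newLine();
                    }
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Recreate File A and File B from the question in a temporary directory.
        Path dir = Files.createTempDirectory("csv-merge");
        Path a = dir.resolve("a.csv");
        Path b = dir.resolve("b.csv");
        Files.write(a, Arrays.asList("header1,header2,header3", "one,two,three", "four,five,six"),
                StandardCharsets.UTF_8);
        Files.write(b, Arrays.asList("header1,header2,header3", "seven,eight,nine", "ten,eleven,twelve"),
                StandardCharsets.UTF_8);

        Path target = dir.resolve("merged.csv");
        merge(Arrays.asList(a, b), target);

        // Prints the header followed by the four data rows.
        Files.readAllLines(target, StandardCharsets.UTF_8).forEach(System.out::println);
    }
}
```

The target must not be one of the input files, because opening it for writing truncates it before it is read.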
Ravi K Thapliyal answered Oct 15 '22 01:10


Here is an example:

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// The class name is only illustrative.
public class MergeCsv {

    public static void main(String[] args) throws IOException {
        List<Path> paths = Arrays.asList(Paths.get("c:/temp/file1.csv"), Paths.get("c:/temp/file2.csv"));
        List<String> mergedLines = getMergedLines(paths);
        Path target = Paths.get("c:/temp/merged.csv");
        Files.write(target, mergedLines, StandardCharsets.UTF_8);
    }

    private static List<String> getMergedLines(List<Path> paths) throws IOException {
        List<String> mergedLines = new ArrayList<>();
        for (Path p : paths) {
            List<String> lines = Files.readAllLines(p, StandardCharsets.UTF_8);
            if (!lines.isEmpty()) {
                if (mergedLines.isEmpty()) {
                    mergedLines.add(lines.get(0)); // add the header only once
                }
                mergedLines.addAll(lines.subList(1, lines.size())); // data rows only
            }
        }
        return mergedLines;
    }
}
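
Note that getMergedLines reads every file fully into memory with Files.readAllLines, which may not suit large inputs. A possible streaming variant is sketched below, assuming Java 8 or later. It uses Files.lines and skips the first line of every file once a header has been written, and like the example above it does not verify that the headers match. The names StreamingCsvMerge and mergeStreaming and the temporary file names are made up for illustration.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical helper class; not part of the original answer.
final class StreamingCsvMerge {

    /** Streams each input into target line by line, keeping only the first header. */
    static void mergeStreaming(List<Path> inputs, Path target) throws IOException {
        try (BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            boolean headerWritten = false;
            for (Path input : inputs) {
                // Files.lines reads lazily; the stream must be closed to release the file.
                try (Stream<String> lines = Files.lines(input, StandardCharsets.UTF_8)) {
                    // Once a header is in the target, drop the first line of each later file.
                    Iterator<String> it = lines.skip(headerWritten ? 1 : 0).iterator();
                    while (it.hasNext()) {
                        writer.write(it.next());
                        writer.newLine();
                        headerWritten = true;
                    }
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Small sample inputs in a temporary directory.
        Path dir = Files.createTempDirectory("csv-stream");
        Path a = dir.resolve("file1.csv");
        Path b = dir.resolve("file2.csv");
        Files.write(a, Arrays.asList("header1,header2,header3", "one,two,three"), StandardCharsets.UTF_8);
        Files.write(b, Arrays.asList("header1,header2,header3", "seven,eight,nine"), StandardCharsets.UTF_8);

        Path target = dir.resolve("merged.csv");
        mergeStreaming(Arrays.asList(a, b), target);
        Files.readAllLines(target, StandardCharsets.UTF_8).forEach(System.out::println);
    }
}
```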
assylias answered Oct 15 '22 00:10