
Using Weka Java Code - How to Convert CSV (without header row) to ARFF Format?

Tags: java, csv, weka, arff

I'm using the Weka Java library to read in a CSV file and convert it to an ARFF file.

The problem is that the CSV file doesn't have a header row, only data. How do I assign attribute names after I bring in the CSV file? (All the columns would be string data types.)

Here is the code I have so far:

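    import java.io.File;
    import weka.core.Instances;
    import weka.core.converters.ArffSaver;
    import weka.core.converters.CSVLoader;

    // Load the CSV file into an Instances object
    CSVLoader loader = new CSVLoader();
    loader.setSource(new File(CSVFilePath));
    Instances data = loader.getDataSet();

    // Write the loaded instances out as an ARFF file
    ArffSaver saver = new ArffSaver();
    saver.setInstances(data);
    saver.setFile(new File(outputFilePath));
    saver.writeBatch();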

I tried looking through the Weka source code to figure this out but I couldn't make heads or tails of it :-(

Greg asked Aug 18 '10 22:08





2 Answers

The short answer is, you can't assign attribute names after you read in the file.

By default, CSVLoader treats the first line of the CSV as the header. If that line is actually a data row, its values become the attribute names and the row is lost as instance data, which is definitely not what you want.

Before running the code above, you need to read the file in, prepend a header row, and save the file again.
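A minimal sketch of that preprocessing step, using only the Java standard library. The class name `CsvHeaderPrepender`, the method `prependHeader`, and the column names passed to it are hypothetical and chosen for illustration. The sketch also assumes the file is UTF-8 and that the column names contain no commas or quotes.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper, not part of Weka.
final class CsvHeaderPrepender {

    /**
     * Copies a headerless CSV file to a new file, writing a header row first.
     * The column names come from the caller; nothing is inferred from the data.
     * The source and target must be different files, because the target is
     * truncated when it is opened for writing.
     */
    static void prependHeader(Path source, Path target, List<String> columnNames)
            throws IOException {
        try (BufferedReader in = Files.newBufferedReader(source, StandardCharsets.UTF_8);
             BufferedWriter out = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            // Header row: assumes the names need no CSV quoting or escaping
            out.write(String.join(",", columnNames));
            out.newLine();

            // Copy the original data rows through unchanged
            String line;
            while ((line = in.readLine()) != null) {
                out.write(line);
                out.newLine();
            }
        }
    }
}
```

The file written this way can then be passed to `CSVLoader.setSource` in place of the original, so that the first line it consumes is the new header and not a data row.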

See my answer to your question on the weka mailing list.

michaeltwofish answered Sep 30 '22 00:09



If the data has no header row, you can pass the -H option to CSVLoader so that the first line is read as data.

    CSVLoader loader = new CSVLoader();
    loader.setSource(new File(CSVFilePath));

    // -H tells CSVLoader that the file has no header row
    String[] options = new String[1];
    options[0] = "-H";
    loader.setOptions(options);

    Instances data = loader.getDataSet();

See: http://weka.sourceforge.net/doc.dev/weka/core/converters/CSVLoader.html

maledr53 answered Sep 30 '22 00:09
