
.arff files with scikit-learn?

I would like to use an Attribute-Relation File Format (.arff) file with scikit-learn for an NLP task. Is this possible? How can I use an .arff file with scikit-learn?

tumbleweed asked Dec 03 '14 05:12





2 Answers

Solution with scipy.io.arff

Code:


from scipy.io import arff
import pandas as pd

# loadarff returns a (records, metadata) tuple
data, meta = arff.loadarff('file.arff')
df = pd.DataFrame(data)
df.head()
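
To get from that DataFrame to something a scikit-learn estimator accepts, one possible sketch follows. The toy ARFF text, the attribute names (`length`, `width`, `label`), the assumption that the last attribute is the target, and the choice of `DecisionTreeClassifier` are all illustrative and not part of the original answer.

```python
import io

import numpy as np
import pandas as pd
from scipy.io import arff
from sklearn.tree import DecisionTreeClassifier

# Tiny made-up ARFF document standing in for 'file.arff' (hypothetical data).
ARFF_TEXT = """@relation toy
@attribute length numeric
@attribute width numeric
@attribute label {short,long}
@data
1.0,0.5,short
1.2,0.4,short
3.1,0.9,long
3.4,1.1,long
"""

# loadarff accepts a path or a file-like object and returns (records, metadata).
records, meta = arff.loadarff(io.StringIO(ARFF_TEXT))
df = pd.DataFrame(records)

# Nominal attributes come back as bytes; decode them to str.
df["label"] = df["label"].str.decode("utf-8")

# Assumption: 'label' is the target and the remaining columns are numeric features.
X = df.drop(columns="label").to_numpy(dtype=float)
y = df["label"].to_numpy()

# Any scikit-learn estimator can be fitted on X and y; a decision tree is used here.
clf = DecisionTreeClassifier(random_state=0).fit(X, y)
predicted = clf.predict(np.array([[1.1, 0.5], [3.2, 1.0]]))
print(predicted)
```

As far as I know, the scipy documentation lists string-type attributes as not implemented in `loadarff`, so an ARFF file that stores raw text for an NLP task may need a different reader.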
Hissaan Ali answered Sep 28 '22 02:09



I really recommend liac-arff. It doesn't load directly to numpy, but the conversion is simple:

import arff, numpy as np
# Open in text mode: liac-arff parses str lines on Python 3
with open('mydataset.arff') as f:
    dataset = arff.load(f)
data = np.array(dataset['data'])
renatopp answered Sep 28 '22 03:09
