I would like to use an Attribute-Relation File Format (ARFF) file with scikit-learn for an NLP task. Is this possible? How can I use an .arff file with scikit-learn?
The liac-arff module implements functions to read and write ARFF files in Python.
Solution with scipy.io.arff
Code:
from scipy.io import arff
import pandas as pd
# loadarff returns a (records, metadata) tuple
data, meta = arff.loadarff('file.arff')
df = pd.DataFrame(data)
df.head()
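To get from that DataFrame to something scikit-learn can consume, a minimal sketch might look like the following. The in-memory ARFF text, its attribute names, and the choice of the last attribute as the class label are all assumptions made for illustration; substitute your own file and label column.

```python
from io import StringIO

import pandas as pd
from scipy.io import arff

# Made-up ARFF content standing in for 'file.arff' (illustrative only).
content = """@relation toy
@attribute width numeric
@attribute height numeric
@attribute color {red,green,blue}
@data
5.0,3.25,blue
4.5,3.75,green
3.0,4.0,red
"""

# loadarff accepts a filename or a file-like object.
data, meta = arff.loadarff(StringIO(content))
df = pd.DataFrame(data)

# Nominal attributes come back as bytes (b'blue'); decode them to str.
for col in df.columns:
    if df[col].dtype == object:
        df[col] = df[col].str.decode("utf-8")

# Assumption: the last attribute is the class label.
X = df.iloc[:, :-1].to_numpy()
y = df.iloc[:, -1].to_numpy()
```

X and y can then be passed to any scikit-learn estimator's fit(X, y).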
I really recommend liac-arff. It doesn't load directly to numpy, but the conversion is simple:
import arff, numpy as np
# On Python 3, liac-arff expects a text-mode file, so open with 'r' rather than 'rb'
with open('mydataset.arff', 'r') as f:
    dataset = arff.load(f)
data = np.array(dataset['data'])