Arff Loader : AttributeError: 'dict' object has no attribute 'data'

I am trying to load a .arff file into a numpy array using the liac-arff library (https://github.com/renatopp/liac-arff).

This is my code.

import arff, numpy as np
dataset = arff.load(open('mydataset.arff', 'rb'))
data = np.array(dataset.data)

When I run it, I get this error:

ArffLoader.py", line 8, in <module>
data = np.array(dataset.data)
AttributeError: 'dict' object has no attribute 'data'

I have seen similar threads, such as "Smartsheet Data Tracker: AttributeError: 'dict' object has no attribute 'append'". I am new to Python and am not able to resolve this issue. How can I fix this?

Erdnase asked Mar 10 '15 14:03


2 Answers

Short version

dataset is a dict. For a dict, you access the values using Python's indexing notation, dataset[key], where key could be a string, integer, float, tuple, or any other immutable data type (it is a bit more complicated than that; see below if you are interested).

In your case, the key is in the form of a string. To access it, you need to give the string you want as an index, like so:

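import arff
import numpy as np

# arff.load() returns a plain dict, so look up 'data' by key, not as an attribute.
# Note: 'rb' is kept from the question; on Python 3, text mode ('r') is likely needed.
dataset = arff.load(open('mydataset.arff', 'rb'))
data = np.array(dataset['data'])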

(You also shouldn't put both imports on the same line, although this is only a readability issue.)
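
To see the difference without needing the .arff file, here is a minimal sketch using a hand-built dict. The keys mirror the ones liac-arff is understood to return ('relation', 'attributes', 'data'), but the values are invented for illustration, so treat it as a stand-in rather than real loader output.

```python
# Hypothetical stand-in for the dict returned by arff.load(); values are invented.
dataset = {
    'relation': 'weather',
    'attributes': [('temperature', 'REAL'), ('play', ['yes', 'no'])],
    'data': [[85.0, 'no'], [70.0, 'yes']],
}

# Attribute access fails, because a dict stores its entries as keys, not attributes.
try:
    dataset.data
except AttributeError as exc:
    error_message = str(exc)
    print(error_message)

# Key access works.
rows = dataset['data']
print(rows)

# When in doubt, list the keys to see what is available.
available_keys = sorted(dataset.keys())
print(available_keys)
```

Printing the keys is usually the quickest way to find out what a library handed back when you expected an object with attributes.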

More detailed explanation

dataset is a dict, which in some languages is called a map or hashtable. In a dict, you access values in a similar way to how you index into a list or array, except the "index" can be any data type that is "hashable" (that is, it can produce a hash, which is ideally a unique identifier for each possible value). This "index" is called a "key". In practice, at least for built-in types and most major packages, only immutable data types are hashable, but there is no actual rule that requires this to be the case.
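
A short sketch of what "hashable" means in practice. The Point class is a hypothetical example, not something from liac-arff; it is only there to show that a mutable object can still be used as a key.

```python
# Immutable built-in types are hashable, so they work as keys.
lookup = {
    'name': 'string key',
    42: 'integer key',
    (1, 2): 'tuple key',
}
print(lookup[(1, 2)])

# A list is mutable and unhashable, so using it as a key raises TypeError.
try:
    lookup[[1, 2]] = 'list key'
except TypeError as exc:
    list_key_error = str(exc)
    print(list_key_error)


# A user-defined class is hashable by default (it is hashed by identity),
# even though its instances are mutable. Point is a hypothetical example class.
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)
lookup[p] = 'mutable object key'
p.x = 100  # mutating the object does not change its identity-based hash
print(lookup[p])
```

The last case is why "immutable" and "hashable" are not quite the same thing: the default hash of an object does not depend on its contents.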

Do you come from MATLAB? If so, then you are probably trying to use MATLAB's struct access technique. You could think of a dict as a much faster, more flexible struct, but the syntax for accessing values is different.
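
If you really prefer struct-like dot access, one option from the standard library is types.SimpleNamespace. This is only a sketch with an invented stand-in dict, and it is not something liac-arff does for you. It also only helps for keys that are valid Python identifiers.

```python
from types import SimpleNamespace

# Hypothetical stand-in for the loaded dataset; values are invented.
dataset = {'relation': 'weather', 'data': [[85.0, 'no'], [70.0, 'yes']]}

# Normal dict access uses square brackets and a key.
rows_by_key = dataset['data']

# Wrapping the dict in a SimpleNamespace gives struct-like dot access.
ns = SimpleNamespace(**dataset)
rows_by_attr = ns.data

# Both names refer to the same underlying list object.
same_object = rows_by_key is rows_by_attr
print(same_object)
```

For most code, plain dataset['data'] is simpler and is what other Python readers will expect.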

TheBlackCat answered Oct 05 '22 22:10


It's easy to load ARFF data into Python using scipy.

from scipy.io import arff
import pandas as pd

# loadarff returns a (data, meta) tuple; data is a NumPy record array
data, meta = arff.loadarff('dataset.arff')
df = pd.DataFrame(data)
df.head()
Thirumal Alagu answered Oct 05 '22 22:10