I notice that this is an issue on GitHub already. Does anyone have any code that converts a Pandas DataFrame to an Orange Table?
Specifically, I have the following table:
   user  hotel  star_rating  user  home_continent  gender
0     1     39          4.0     1               2  female
1     1     44          3.0     1               2  female
2     2     63          4.5     2               3  female
3     2      2          2.0     2               3  female
4     3     26          4.0     3               1    male
5     3     37          5.0     3               1    male
6     3     63          4.5     3               1    male
To convert a pandas DataFrame to an Orange Table, you need to construct a domain, which specifies the column types.
For continuous variables, you only need to provide the name of the variable, but for discrete variables, you also need to provide a list of all possible values.
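As a rough illustration of how those value lists can be collected, here is a pandas-only sketch that makes no Orange calls. The helper name discrete_values is made up for this example, and the small DataFrame is just three of the columns from the question.

```python
import pandas as pd

# Three columns from the question's table, rebuilt by hand for illustration.
df = pd.DataFrame({
    "hotel": [39, 44, 63, 2, 26, 37, 63],
    "star_rating": [4.0, 3.0, 4.5, 2.0, 4.0, 5.0, 4.5],
    "gender": ["female"] * 4 + ["male"] * 3,
})

def discrete_values(series):
    """Hypothetical helper: sorted distinct values of a column, as strings."""
    return [str(v) for v in sorted(series.unique())]

hotel_values = discrete_values(df["hotel"])
gender_values = discrete_values(df["gender"])
```

A continuous column such as star_rating needs no such list, only its name.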
The following code will construct a domain for your DataFrame and convert it to an Orange Table:
import numpy as np
from Orange.feature import Discrete, Continuous  # Orange 2.x module layout
from Orange.data import Domain, Table

# One descriptor per DataFrame column, in column order.
# Discrete features list every possible value as a string.
domain = Domain([
    Discrete('user', values=[str(v) for v in np.unique(df.user)]),
    Discrete('hotel', values=[str(v) for v in np.unique(df.hotel)]),
    Continuous('star_rating'),
    Discrete('user', values=[str(v) for v in np.unique(df.user)]),
    Discrete('home_continent', values=[str(v) for v in np.unique(df.home_continent)]),
    Discrete('gender', values=['male', 'female'])],
    False)  # False: the domain has no class variable

# df.values replaces as_matrix(), which newer pandas releases no longer provide.
table = Table(domain, [list(map(str, row)) for row in df.values])
The map(str, row) step is needed so that Orange knows the data contains the values of discrete features (and not the indices of those values in the values list).
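To make the ambiguity concrete, here is a plain-Python sketch that only mimics the two readings described above and does not call Orange. The values list is the hotel column's distinct values from the question, and the variable names are illustrative.

```python
# Value list of a discrete 'hotel' feature, as strings.
values = ["2", "26", "37", "39", "44", "63"]

raw = 2  # a hotel id as stored in the DataFrame (an integer)

# Reading the integer as a position in the value list picks the third entry.
as_index = values[raw]

# Reading it as a value (after converting to a string) finds hotel "2" itself.
as_value = values[values.index(str(raw))]
```

Here as_index is "37" while as_value is "2", so passing the raw integer would silently point at the wrong hotel.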