I'm performing some classification using user/item/rating data. My issue is how to convert these 3 columns into a matrix with users as rows, items as columns, and the ratings populating the matrix.
User Item ItemRating
1 23 3
2 204 4
1 492 2
3 23 4
and so on. I tried using DataFrame but was getting NULL errors.
To create a NumPy array, you can use the function np.array(). All you need to do to create a simple array is pass a list to it. If you choose to, you can also specify the type of data in your list.
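As a minimal sketch of that description, the snippet below builds an array from a list and then requests an explicit element type through the dtype argument. The values and variable names are made up for illustration.

```python
import numpy as np

# Build a 1-D array from a plain Python list (illustrative values).
ratings = np.array([3, 4, 2, 4])

# Optionally ask for a specific element type via the dtype argument.
ratings_float = np.array([3, 4, 2, 4], dtype=np.float64)

print(ratings.shape)        # one dimension with four elements
print(ratings_float.dtype)  # the dtype requested above
```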
NumPy (short for Numerical Python) provides an efficient interface to store and operate on dense data buffers. In some ways, NumPy arrays are like Python's built-in list type, but NumPy arrays provide much more efficient storage and data operations as the arrays grow larger in size.
If I have to create a 2D array of 1s or 0s, I can use numpy.ones() or numpy.zeros() respectively.
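For instance, a short sketch with arbitrarily chosen shapes: both functions take the shape as a tuple and default to a float dtype unless another one is given.

```python
import numpy as np

# 3x3 matrix of integer zeros; the shape is passed as a tuple.
zeros = np.zeros((3, 3), dtype=int)

# 2x4 matrix of ones; without dtype the elements are float64.
ones = np.ones((2, 4))

print(zeros)
print(ones)
```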
If I understand your idea correctly, this is a pivot. With pandas it looks as follows.
Load data:
import pandas as pd
df = pd.read_csv(fname, sep=r'\s+', header=None)  # fname: path to the whitespace-separated data file
df.columns = ['User','Item','ItemRating']
Pivot it:
>>> df
   User  Item  ItemRating
0     1    23           3
1     2   204           4
2     1   492           2
3     3    23           4
>>> df.pivot(index='User', columns='Item', values='ItemRating')
Item   23  204  492
User
1       3  NaN    2
2     NaN    4  NaN
3       4  NaN  NaN
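If a plain NumPy matrix with zeros in place of NaN is needed, one option is sketched below. It rebuilds the sample data inline instead of reading a file. fillna() and to_numpy() are standard pandas calls (to_numpy() assumes pandas 0.24 or newer), while the names wide and matrix are illustrative.

```python
import pandas as pd

# Sample data from the question, built inline instead of read from a file.
df = pd.DataFrame({
    'User': [1, 2, 1, 3],
    'Item': [23, 204, 492, 23],
    'ItemRating': [3, 4, 2, 4],
})

# Users become rows, items become columns; missing pairs are NaN.
wide = df.pivot(index='User', columns='Item', values='ItemRating')

# Replace NaN with 0 and drop the labels to get a bare 2-D array.
matrix = wide.fillna(0).to_numpy()

print(wide.index.tolist())    # user ids, one per row
print(wide.columns.tolist())  # item ids, one per column
print(matrix)
```

Note that pivot() raises an error when the same (User, Item) pair appears more than once. In that case pivot_table() with an aggregation function such as 'mean' may be a better fit.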
For a NumPy example, let's emulate the file with StringIO:
import numpy as np
from io import StringIO
data ="""1 23 3
2 204 4
1 492 2
3 23 4"""
and load it:
>>> arr = np.genfromtxt(StringIO(data), dtype=int)
>>> arr
array([[  1,  23,   3],
       [  2, 204,   4],
       [  1, 492,   2],
       [  3,  23,   4]])
The pivot is based on this answer:
rows, row_pos = np.unique(arr[:, 0], return_inverse=True)
cols, col_pos = np.unique(arr[:, 1], return_inverse=True)
pivot_table = np.zeros((len(rows), len(cols)), dtype=arr.dtype)
pivot_table[row_pos, col_pos] = arr[:, 2]
and the result:
>>> pivot_table
array([[3, 0, 2],
       [0, 4, 0],
       [4, 0, 0]])
Note that the results differ: in the second approach, missing values are set to zero instead of NaN.
Select the one that suits you better ;)
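If the NaN behaviour of the pandas result is preferred in the NumPy version too, a possible variation (not part of the original answer) is to start from a float array filled with NaN instead of zeros. The name pivot_nan is an illustrative choice, and the sample data is built inline.

```python
import numpy as np

# Same sample data as above: columns are user, item, rating.
arr = np.array([[1, 23, 3],
                [2, 204, 4],
                [1, 492, 2],
                [3, 23, 4]])

# Map each user and item id to its position among the sorted unique ids.
rows, row_pos = np.unique(arr[:, 0], return_inverse=True)
cols, col_pos = np.unique(arr[:, 1], return_inverse=True)

# NaN cannot be stored in an integer array, so the table is float here.
pivot_nan = np.full((len(rows), len(cols)), np.nan)
pivot_nan[row_pos, col_pos] = arr[:, 2]

print(pivot_nan)
```

The trade-off is the dtype: the zero-filled version keeps integers, while this one has to use floats so that missing ratings remain distinguishable from a real rating of 0.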