How to get a 2D NumPy array from a pandas DataFrame? - wrong shape

I want to get a 2D NumPy array from a column of a pandas DataFrame df that holds a NumPy vector in each row. But if I do

df.values.shape

I get (3,) instead of (3, 5)

(assuming that each NumPy vector in the DataFrame has 5 elements, and that the DataFrame has 3 rows).

What is the correct method?

Mostafa asked Dec 27 '14 12:12



1 Answer

Ideally, avoid getting into this situation by finding a different way to define the DataFrame in the first place. However, if your DataFrame looks like this:

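import numpy as np
import pandas as pd

# Each row of column 'foo' holds a length-5 NumPy array (object dtype)
s = pd.Series([np.random.randint(20, size=(5,)) for i in range(3)])
df = pd.DataFrame(s, columns=['foo'])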
#                   foo
# 0   [4, 14, 9, 16, 5]
# 1  [16, 16, 5, 4, 19]
# 2  [7, 10, 15, 13, 2]

then you could convert it to a DataFrame of shape (3,5) by calling pd.DataFrame on a list of arrays:

pd.DataFrame(df['foo'].tolist())
#     0   1   2   3   4
# 0   4  14   9  16   5
# 1  16  16   5   4  19
# 2   7  10  15  13   2

pd.DataFrame(df['foo'].tolist()).values.shape
# (3, 5)
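
If only the NumPy array is needed, the intermediate DataFrame can be skipped. The following is a minimal sketch, not part of the original answer: it assumes every row of the column holds a 1-D array of the same length, and it swaps in np.stack on the list returned by tolist(). The names rows, col and arr are illustrative only. It also shows why the original attempt reports (3,): the column is stored as a 1-D object array whose elements happen to be arrays.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: 3 rows, each holding a length-5 integer vector.
rng = np.random.default_rng(0)
rows = [rng.integers(0, 20, size=5) for _ in range(3)]
df = pd.Series(rows, name='foo').to_frame()

# The column is a 1-D object array of arrays, hence the (3,) shape.
col = df['foo'].values
print(col.dtype, col.shape)

# Stack the per-row vectors into one 2-D numeric array.
# Assumption: all vectors have the same length; otherwise np.stack raises.
arr = np.stack(df['foo'].tolist())
print(arr.shape)
```

np.array(df['foo'].tolist()) should give the same result when the rows have equal length; np.stack is used here because it fails loudly if the lengths differ.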
unutbu answered Nov 14 '22 23:11