
How to create a pandas DataFrame with several numpy 1d arrays?

I've created some NumPy arrays to do some calculations with them (all have the same shape, [100, 1]). Now I want to create a pandas DataFrame in which each array is one column. The names of the arrays should be the column headers of the DataFrame.

In MATLAB I would simply do it like this:

Table = table(array1, array2, array3, ... );

How can I do this in Python?

Thanks in advance!

laurenz asked Jul 30 '17 12:07



2 Answers

Let's say these are your arrays:

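import numpy as np
import pandas as pd

# Unpacking a (3, 100, 1) array along its first axis gives three (100, 1) arrays.
arr1, arr2, arr3 = np.zeros((3, 100, 1))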

arr1.shape
Out: (100, 1)

You can use hstack to stack them and pass the resulting 2D array to the DataFrame constructor:

df = pd.DataFrame(np.hstack((arr1, arr2, arr3)))

df.head()
Out: 
     0    1    2
0  0.0  0.0  0.0
1  0.0  0.0  0.0
2  0.0  0.0  0.0
3  0.0  0.0  0.0
4  0.0  0.0  0.0

Or name the columns as arr1, arr2, ...:

df = pd.DataFrame(np.hstack((arr1, arr2, arr3)), 
                  columns=['arr{}'.format(i+1) for i in range(3)])

which gives

df.head()
Out: 
   arr1  arr2  arr3
0   0.0   0.0   0.0
1   0.0   0.0   0.0
2   0.0   0.0   0.0
3   0.0   0.0   0.0
4   0.0   0.0   0.0
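
If the headers should match the array names, as the question asks, one option is to pass a dict to the DataFrame constructor. The following is a minimal sketch, not part of the original answer. The names array1, array2, array3 are borrowed from the question's MATLAB line, and the fill values are placeholders. ravel() flattens each (100, 1) array to one dimension, because the dict form of the constructor expects one-dimensional values per column:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the question's arrays, each of shape (100, 1).
array1 = np.zeros((100, 1))
array2 = np.ones((100, 1))
array3 = np.full((100, 1), 2.0)

# Map each desired header to its array; ravel() flattens (100, 1) to (100,).
df = pd.DataFrame({
    "array1": array1.ravel(),
    "array2": array2.ravel(),
    "array3": array3.ravel(),
})

print(df.shape)
print(list(df.columns))
```

Python has no built-in way to recover a variable's name from the object it refers to, so the headers have to be written out once, here as the dict keys.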
ayhan answered Sep 23 '22 20:09


A solution using numpy.concatenate to build the 2D array, which is then passed to the DataFrame constructor:

df = pd.DataFrame(np.concatenate([arr1, arr2, arr3], axis=1), columns=['a', 'b', 'c'])
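
The question title mentions 1-D arrays. If the arrays have shape (100,) rather than (100, 1), hstack joins them end to end into one long 1-D array, and concatenate with axis=1 raises an error because a 1-D array has no second axis. A sketch of one way around this, assuming illustrative arrays named a, b, and c, is np.column_stack, which treats 1-D inputs as columns and also accepts (n, 1) inputs:

```python
import numpy as np
import pandas as pd

# Illustrative 1-D arrays of shape (100,).
a = np.arange(100)
b = np.arange(100) * 2
c = np.arange(100) * 3

# column_stack turns each 1-D array into one column of a (100, 3) array.
stacked = np.column_stack((a, b, c))
df = pd.DataFrame(stacked, columns=["a", "b", "c"])

# The same call also works for (100, 1) inputs and gives the same result.
stacked_2d = np.column_stack(
    (a.reshape(-1, 1), b.reshape(-1, 1), c.reshape(-1, 1))
)
same_result = np.array_equal(stacked, stacked_2d)

print(df.shape)
print(same_result)
```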
jezrael answered Sep 22 '22 20:09