
arrays into pandas dataframe columns

I have a program that outputs arrays.

For example:

[[0, 1, 0], [0, 0, 0], [1, 3, 3], [2, 4, 4]]

I would like to turn these arrays into a dataframe using pandas. However, when I do, the values become row values like this:

[image: printed dataframe in which each inner array appears as a row]

As you can see, each array within the overall array becomes its own row. I would like each array within the overall array to become its own column with a column name.

Furthermore, in my use case, the number of arrays within the array is variable. There could be 4 arrays or 70, which means there could be 4 columns or 70. This is problematic when it comes to column names, and I was wondering if there is any way to auto-increment column names in Python.

Check out my attempt below and let me know how I can solve this.

My desired outcome is simply to make each array within the overall array into its own column instead of row and to have titles for the column that increment with each additional array/column.

Thank you so much.

Need help. Please respond!

import numpy as np
import pandas as pd

frame = [[0, 1, 0], [0, 0, 0], [1, 3, 3], [2, 4, 4]]
numpy_data = np.array(frame)

# Each inner array ends up as a row here, not as a column
df = pd.DataFrame(data=numpy_data, columns=["column1", "column2", "column3"])
print(frame)
print(df)
Juliette asked Sep 26 '20 20:09



2 Answers

A possible solution could be transposing and renaming the columns after transforming the numpy array into a dataframe. Here is the code:

import numpy as np
import pandas as pd

frame = [[0, 1, 0], [0, 0, 0], [1, 3, 3], [2, 4, 4]]
numpy_data = np.array(frame)

# Build the dataframe, then transpose it so each inner array becomes a column
df = pd.DataFrame(data=numpy_data).T

# Generate the column names from the actual number of columns
df.columns = [f'mycol{i}' for i in range(df.shape[1])]

print(df)

Output:

   mycol0  mycol1  mycol2  mycol3
0       0       0       1       2
1       1       0       3       4
2       0       0       3       4

Same code for 11 columns:

import numpy as np
import pandas as pd

frame = [[0, 1, 0], [0, 0, 0], [1, 3, 3], [2, 4, 4], [5, 2, 2], [6, 7, 8], [8, 9, 19], [10, 2, 4], [2, 6, 5], [10, 2, 5], [11, 2, 9]]
numpy_data = np.array(frame)

df = pd.DataFrame(data=numpy_data).T
df.columns = [f'mycol{i}' for i in range(df.shape[1])]

print(df)

Output:

   mycol0  mycol1  mycol2  mycol3  mycol4  mycol5  mycol6  mycol7  mycol8  mycol9  mycol10
0       0       0       1       2       5       6       8      10       2      10       11
1       1       0       3       4       2       7       9       2       6       2        2
2       0       0       3       4       2       8      19       4       5       5        9
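
Since the number of arrays varies, the transpose-and-rename steps above could be wrapped in a small function. This is only a sketch: `arrays_to_columns`, `prefix`, and `start` are names made up for illustration, and it assumes every inner array has the same length, as in the examples.

```python
import pandas as pd


def arrays_to_columns(arrays, prefix="mycol", start=0):
    """Hypothetical helper: turn each inner array into one named column."""
    # Each inner array is a row at first, so transpose to get columns.
    df = pd.DataFrame(arrays).T
    # Number the columns from `start`, based on how many columns there are.
    df.columns = [f"{prefix}{i}" for i in range(start, start + df.shape[1])]
    return df


frame = [[0, 1, 0], [0, 0, 0], [1, 3, 3], [2, 4, 4]]
df = arrays_to_columns(frame)
print(df)
```

Passing `start=1` would number the columns from 1 instead of 0, and the same call should work unchanged for 4 arrays or 70.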
Grayrigel answered Oct 24 '22 19:10


You can transpose the array and then call add_prefix to name the columns:

import numpy as np
import pandas as pd

frame = [[0, 1, 0], [0, 0, 0], [1, 3, 3], [2, 4, 4]]

pd.DataFrame(np.array(frame).T).add_prefix('column')

Out:

   column0  column1  column2  column3
0        0        0        1        2
1        1        0        3        4
2        0        0        3        4

This works with any number of arrays:

frame = [[0, 1, 0], [0, 0, 0], [1, 3, 3], [2, 4, 4], [1,0,1], [2,0,3]]

pd.DataFrame(np.array(frame).T).add_prefix('column')

Out:

   column0  column1  column2  column3  column4  column5
0        0        0        1        2        1        2
1        1        0        3        4        0        0
2        0        0        3        4        1        3
Michael Szczesny answered Oct 24 '22 17:10