
Append a list of arrays as column to pandas Data Frame with same column indices

I have a list of one-dimensional NumPy arrays (a_) and a list (l_), and I want a DataFrame with them as its columns. They look like this:

a_: [array([381]), array([376]), array([402]), array([400])...]
l_: [1.5,2.34,4.22,...]

I can do it by:

df_l = pd.DataFrame(l_)
df_a = pd.DataFrame(a_)
df = pd.concat([df_l, df_a], axis=1)

Is there a shorter way of doing it? I tried to use DataFrame.append:

df_l = pd.DataFrame(l_)
df_l = df_l.append(a_)

However, because the column indices are both 0, it adds a_ to the end of the existing column, resulting in a single column. Is there something like this:

l_ = l_.append(a_).reset(columns)

that sets a new column index for the appended array? Obviously, this does not work.

The desired output looks like this:

  0       0
0 1.50    381
1 2.34    376
2 4.22    402 

...

Thanks.

PyLearner asked Mar 16 '15 23:03




1 Answer

Suggestion:

df_l = pd.DataFrame(l_)
df_l['a_'] = pd.Series(a_, index=df_l.index)

Example #1:

L = list(data)
A = list(data)
data_frame = pd.DataFrame(L) 
data_frame['A'] = pd.Series(A, index=data_frame.index)

Example #2 - Same Series length (create series and set index to the same as existing data frame):

In [33]: L = list(range(10))

In [34]: A = list(range(10, 20))

In [35]: data_frame = pd.DataFrame(L,columns=['L'])

In [36]: data_frame['A'] = pd.Series(A, index=data_frame.index)

In [37]: print(data_frame)

   L   A
0  0  10
1  1  11
2  2  12
3  3  13
4  4  14
5  5  15
6  6  16
7  7  17
8  8  18
9  9  19

Example #3 - Different Series lengths (create series and let pandas handle index matching):

In [45]: not_same_length = list(range(50, 55))

In [46]: data_frame['nsl'] = pd.Series(not_same_length)

In [47]: print(data_frame)

   L   A  nsl
0  0  10   50
1  1  11   51
2  2  12   52
3  3  13   53
4  4  14   54
5  5  15  NaN
6  6  16  NaN
7  7  17  NaN
8  8  18  NaN
9  9  19  NaN
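
A rough sketch of what "let pandas handle index matching" means in practice, assuming a reasonably recent pandas. The column names "tail" and "bad" are made up for illustration. A Series is aligned on its index labels, while a plain list has no index and must match the frame's length exactly:

```python
import pandas as pd

data_frame = pd.DataFrame({"L": list(range(10))})

# A shorter Series is aligned on the index; rows without a match become NaN.
short = pd.Series(list(range(50, 55)))
data_frame["nsl"] = short

# Alignment is by label, not by position: this Series lands on rows 8 and 9.
tail = pd.Series([98, 99], index=[8, 9])
data_frame["tail"] = tail

# A plain list carries no index, so a length mismatch raises ValueError.
try:
    data_frame["bad"] = [1, 2, 3]
    list_assignment_failed = False
except ValueError:
    list_assignment_failed = True

print(data_frame)
print(list_assignment_failed)
```

Note that a column containing NaN is stored as float, so recent pandas versions may print the matched values as 50.0 rather than 50.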

Based on your comments, it looks like you want to flatten your list of lists. I'm assuming they are in a list structure, because array() is not a built-in in Python. To do that, you would do the following:

In [63]: A = [[381],[376], [402], [400]]

In [64]: A = [inner_item for item in A for inner_item in item]

In [65]: print(A)

[381, 376, 402, 400]

Then create the Series from the flattened list and follow the steps above to add it to your data frame.
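
If the inner items really are NumPy arrays, as the question states, the same flattening should still work. Here is a minimal sketch using the first three values from the question. The column names "l" and "a" are my own choice, and np.concatenate is an alternative to the list comprehension, not something the question used:

```python
import numpy as np
import pandas as pd

# Sample data shaped like the question's, limited to the values shown there.
a_ = [np.array([381]), np.array([376]), np.array([402])]
l_ = [1.5, 2.34, 4.22]

# Flatten the list of one-element arrays into a single 1-D array.
flat = np.concatenate(a_)

# The list comprehension from above also works on NumPy arrays.
flat_list = [inner_item for item in a_ for inner_item in item]

# Build the frame in one step; "l" and "a" are illustrative column names.
df = pd.DataFrame({"l": l_, "a": flat})
print(df)
```

Building the frame from a dict gives each column its own name, which avoids the two columns both being labelled 0.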

kennes answered Oct 11 '22 12:10