How to insert a DataFrame into another DataFrame in Pandas

Tags: python, pandas

I have two data frames:

import pandas as pd
# DataFrame.from_items was removed in pandas 1.0; a dict keeps the same column order.
rep1 = pd.DataFrame({'Probe': ['x', 'y', 'z'], 'Gene': ['foo', 'bar', 'qux'], 'RP1': [1.00, 23.22, 11.12], 'RP1.pacall': ["A", "B", "C"]})
pg   = rep1[["Probe","Gene"]]

Which produces:

In [105]: rep1
Out[105]:
  Probe Gene    RP1 RP1.pacall
0     x  foo   1.00          A
1     y  bar  23.22          B
2     z  qux  11.12          C
In [107]: pg
Out[107]:
  Probe Gene
0     x  foo
1     y  bar
2     z  qux

What I want to do then is to insert pg into rep1, resulting in:

  Probe Gene    RP1 Probe Gene RP1.pacall
0     x  foo   1.00     x  foo          A
1     y  bar  23.22     y  bar          B
2     z  qux  11.12     z  qux          C

I tried this, but it fails:

In [101]: rep1.insert(1,["Probe","Gene"],pg)
TypeError: unhashable type: 'list'

What's the right way to do it?

pdubois asked Feb 04 '15 10:02



1 Answer

Call pd.concat and pass axis=1 to concatenate column-wise:

In [72]:

pd.concat([rep1,pg], axis=1)
Out[72]:
  Probe Gene    RP1 RP1.pacall Probe Gene
0     x  foo   1.00          A     x  foo
1     y  bar  23.22          B     y  bar
2     z  qux  11.12          C     z  qux

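Note that the result now has duplicate column names, so selecting by label behaves in a slightly odd but correct way: it returns every matching column as a DataFrame rather than a single Series: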

In [73]:

merged = pd.concat([rep1,pg], axis=1)
merged['Probe']
Out[73]:
  Probe Probe
0     x     x
1     y     y
2     z     z
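
If the duplicate labels get in the way, one option is to rename the inserted columns before concatenating. This is only a minimal sketch, assuming the frames from the question; the '_pg' suffix is an arbitrary choice for illustration and does not come from the question:

```python
import pandas as pd

rep1 = pd.DataFrame({
    'Probe': ['x', 'y', 'z'],
    'Gene': ['foo', 'bar', 'qux'],
    'RP1': [1.00, 23.22, 11.12],
    'RP1.pacall': ['A', 'B', 'C'],
})
pg = rep1[['Probe', 'Gene']]

# With duplicate labels, selecting 'Probe' returns both matching columns.
merged = pd.concat([rep1, pg], axis=1)
dup_selection = merged['Probe']

# add_suffix renames pg's columns (hypothetical '_pg' suffix), keeping labels unique.
unique = pd.concat([rep1, pg.add_suffix('_pg')], axis=1)
single = unique['Probe']
```

Here dup_selection is a two-column DataFrame, while single is an ordinary Series.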

To get your specific column ordering, you have to select subsets of the original df's columns and concatenate them around pg (note the double brackets [[ ]], which keep each selection a DataFrame):

In [76]:

pd.concat([rep1[['Probe','Gene','RP1']], pg, rep1[['RP1.pacall']]], axis=1)
Out[76]:
  Probe Gene    RP1 Probe Gene RP1.pacall
0     x  foo   1.00     x  foo          A
1     y  bar  23.22     y  bar          B
2     z  qux  11.12     z  qux          C
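
The same slicing can be done by position instead of by name, which may be handy when the column list is long. The helper below is a hypothetical sketch (insert_frame is not a pandas function) that assumes both frames share the same index:

```python
import pandas as pd

def insert_frame(df, loc, other):
    """Return a new frame with other's columns placed before position loc of df."""
    # Split df's columns at loc and put other between the two halves.
    return pd.concat([df.iloc[:, :loc], other, df.iloc[:, loc:]], axis=1)

rep1 = pd.DataFrame({
    'Probe': ['x', 'y', 'z'],
    'Gene': ['foo', 'bar', 'qux'],
    'RP1': [1.00, 23.22, 11.12],
    'RP1.pacall': ['A', 'B', 'C'],
})
pg = rep1[['Probe', 'Gene']]

# Place pg's columns after the first three columns of rep1.
result = insert_frame(rep1, 3, pg)
```

rep1 itself is left unchanged; the helper returns a new DataFrame.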

There is no insert point as such with concat, merge or join.
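
DataFrame.insert does take a position, but it accepts a single column label per call, which is presumably why passing a list of labels raised the TypeError in the question. As a rough sketch, assuming the frames from the question, the columns can be inserted one at a time; allow_duplicates=True is needed here because 'Probe' and 'Gene' already exist in rep1:

```python
import pandas as pd

rep1 = pd.DataFrame({
    'Probe': ['x', 'y', 'z'],
    'Gene': ['foo', 'bar', 'qux'],
    'RP1': [1.00, 23.22, 11.12],
    'RP1.pacall': ['A', 'B', 'C'],
})
pg = rep1[['Probe', 'Gene']]

# insert works in place, so operate on a copy to keep rep1 intact.
out = rep1.copy()
for offset, col in enumerate(pg.columns):
    # Position 3 is just after 'RP1'; offset keeps pg's column order.
    out.insert(3 + offset, col, pg[col], allow_duplicates=True)
```

Unlike concat, this modifies the frame it is called on, hence the copy.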

EdChum answered Nov 01 '22 11:11