How to append selected columns to pandas dataframe from df with different columns

Tags: python, pandas, dataframe

I want to append df1, df2, and df3 into one df_All, but each DataFrame has different columns. How can I do this in a for loop? (I have other things to do in the loop as well.)

import pandas as pd
import numpy as np

df1 = pd.DataFrame.from_items([('A', [1, 2, 3]), ('B', [4, 5, 6])])
df2 = pd.DataFrame.from_items([('B', [5, 6, 7]), ('A', [8, 9, 10])])
df3 = pd.DataFrame.from_items([('C', [5, 6, 7]), ('D', [8, 9, 10]), ('A',[1,2,3]), ('B',[4,5,7])])
list = ['df1','df2','df3']
df_All = pd.DataFrame()
for i in list:
   # doing something else as well --- 
    df_All = df_All.append(i)


I want df_All to contain only columns A and B. Is there a way to do this in the loop above, something like appending only these two columns?

212

asked Mar 29 '15 22:03

JPC

2 Answers

If I understand what you want, then you need to select just columns 'A' and 'B' from df3 and then use pd.concat:

In [35]:

# DataFrame.from_items was removed in pandas 1.0; plain dicts build the same frames
df1 = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
df2 = pd.DataFrame({'B': [5, 6, 7], 'A': [8, 9, 10]})
df3 = pd.DataFrame({'C': [5, 6, 7], 'D': [8, 9, 10], 'A': [1, 2, 3], 'B': [4, 5, 7]})
df_list = [df1,df2,df3[['A','B']]]
pd.concat(df_list, ignore_index=True)
Out[35]:
    A  B
0   1  4
1   2  5
2   3  6
3   8  5
4   9  6
5  10  7
6   1  4
7   2  5
8   3  7

Note that in your original code this is poor practice:

list = ['df1','df2','df3']

This shadows the built-in type list. And even if you had used a valid variable name such as df_list, you have created a list of strings, not a list of DataFrames, so there is nothing to append.
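To keep the for loop from the question, one option is to collect the selected columns in a plain Python list and concatenate once at the end. This is only a minimal sketch: the names wanted_cols and frames are made up for illustration, and it assumes every DataFrame actually contains both 'A' and 'B'.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
df2 = pd.DataFrame({'B': [5, 6, 7], 'A': [8, 9, 10]})
df3 = pd.DataFrame({'C': [5, 6, 7], 'D': [8, 9, 10],
                    'A': [1, 2, 3], 'B': [4, 5, 7]})

wanted_cols = ['A', 'B']  # hypothetical name: the columns to keep
frames = []               # hypothetical name: holds the per-frame selections

for df in [df1, df2, df3]:  # a list of DataFrames, not a list of strings
    # ... any other per-frame work from the question would go here ...
    frames.append(df[wanted_cols])  # list.append, not DataFrame.append

# Concatenate once after the loop instead of growing a DataFrame row by row.
df_all = pd.concat(frames, ignore_index=True)
print(df_all)
```

Building the list first and calling pd.concat a single time avoids copying the accumulated data on every iteration, and it does not depend on DataFrame.append, which was removed in pandas 2.0.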

If you want to determine the common columns, you can use the Index.intersection method on the columns:

In [39]:

common_cols = df1.columns.intersection(df2.columns).intersection(df3.columns)
common_cols
Out[39]:
Index(['A', 'B'], dtype='object')
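The chained intersection calls above can be generalised to any number of frames. The following is a rough sketch rather than the only way to do it: it folds Index.intersection over the columns with functools.reduce, and the names frames, common_idx and df_common are assumptions introduced here.

```python
from functools import reduce

import pandas as pd

df1 = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
df2 = pd.DataFrame({'B': [5, 6, 7], 'A': [8, 9, 10]})
df3 = pd.DataFrame({'C': [5, 6, 7], 'D': [8, 9, 10],
                    'A': [1, 2, 3], 'B': [4, 5, 7]})

frames = [df1, df2, df3]  # hypothetical name for the list of DataFrames

# Fold Index.intersection across the column Index of every frame.
common_idx = reduce(lambda left, right: left.intersection(right),
                    (df.columns for df in frames))

# Sort so the column order does not depend on the order of the inputs.
common_cols = sorted(common_idx)

df_common = pd.concat([df[common_cols] for df in frames], ignore_index=True)
print(common_cols)
print(df_common)
```

Sorting the result is a design choice made here for predictability; drop it if the column order of the first frame should be preserved.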

178

answered Sep 23 '22 05:09

EdChum

You can also use set.intersection over a generator expression to find the columns common to an arbitrary list of DataFrames:

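df_list = [df1, df2, df3]
# sorted() gives a deterministic column order; the iteration order of a set is arbitrary
common_cols = sorted(set.intersection(*(set(df.columns) for df in df_list)))
df_new = pd.concat([df[common_cols] for df in df_list], ignore_index=True)
>>> df_new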
    A  B
0   1  4
1   2  5
2   3  6
3   8  5
4   9  6
5  10  7
6   1  4
7   2  5
8   3  7

answered Sep 20 '22 05:09

Alexander
