I want to append (merge) all the CSV files in a folder using Python pandas.
For example, say the folder has two CSV files, test1.csv and test2.csv, as follows:
    A_Id    P_Id    CN1     CN2     CN3
    AAA     111     702     709     740
    BBB     222     1727    1734    1778

and

    A_Id    P_Id    CN1     CN2     CN3
    CCC     333     710     750     750
    DDD     444     180     734     778

So the Python script I wrote was as follows:
    #!/usr/bin/python
    import pandas as pd
    import glob

    all_data = pd.DataFrame()
    for f in glob.glob("testfolder/*.csv"):
        df = pd.read_csv(f)
        all_data = all_data.append(df)

    all_data.to_csv('testfolder/combined.csv')

Though combined.csv seems to have all the appended rows, it looks as follows:
         CN1     CN2     CN3     A_Id    P_Id
    0    710     750     750     CCC     333
    1    180     734     778     DDD     444
    0    702     709     740     AAA     111
    1    1727    1734    1778    BBB     222

Whereas it should look like this:
    A_Id    P_Id    CN1     CN2     CN3
    AAA     111     702     709     740
    BBB     222     1727    1734    1778
    CCC     333     110     356     123
    DDD     444     220     256     223

What am I missing? And how can I get rid of the 0s and 1s in the first column?
P.S: Since these are large csv files, I thought of using pandas.
A pandas DataFrame does not necessarily preserve column order when it is built from other DataFrames.
Reorder columns using pandas. Another way to reorder columns is the DataFrame reindex() method: pass the column names, in the order you want them, through the columns= parameter.
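A minimal sketch of the reindex() approach, assuming a small hypothetical frame whose columns came out in the same unwanted order as in the question. The names df, wanted and reordered are illustrative only.

```python
import pandas as pd

# Hypothetical frame with the columns in the unwanted order
df = pd.DataFrame({
    "CN1": [702, 1727],
    "CN2": [709, 1734],
    "CN3": [740, 1778],
    "A_Id": ["AAA", "BBB"],
    "P_Id": [111, 222],
})

# The column order we actually want in the output
wanted = ["A_Id", "P_Id", "CN1", "CN2", "CN3"]

# reindex(columns=...) returns a new frame with the columns in that order
reordered = df.reindex(columns=wanted)
print(list(reordered.columns))
```

Note that reindex() adds a column filled with NaN for any name in the list that the frame does not have, whereas plain selection with df[wanted] would raise a KeyError instead.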
I read elsewhere that DataFrames do not guarantee row order. My experience is that the row order of a CSV file is maintained when it is read. If you transform the DataFrame, the order can be lost. DataFrames do support sorting, if you are not sure.
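As a small illustration of the point about row order, the sketch below reads a CSV from an in-memory string (a stand-in for a real file, with made-up rows) and then sorts explicitly. Using sort_values() followed by reset_index() is one way to get a known order back after a transformation.

```python
import io

import pandas as pd

# In-memory stand-in for a CSV file; the rows are deliberately unsorted
csv_text = "A_Id,P_Id,CN1\nCCC,333,710\nAAA,111,702\nBBB,222,1727\n"
df = pd.read_csv(io.StringIO(csv_text))

# read_csv keeps the rows in the order they appear in the file
rows_as_read = df["A_Id"].tolist()

# If the order matters later, sort explicitly and renumber the index
sorted_df = df.sort_values("A_Id").reset_index(drop=True)
print(rows_as_read)
print(sorted_df["A_Id"].tolist())
```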
Answer: Yes. The order of the merged DataFrames will affect the order of the rows and columns of the result. When using the merge() method, the order of the left keys is preserved.
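The following sketch shows the left-key behaviour on two hypothetical frames (left and right are made-up names and data). With how="left", the rows of the result follow the left frame, and the left frame's columns come first.

```python
import pandas as pd

# Hypothetical frames: the left one is deliberately not sorted by key
left = pd.DataFrame({"A_Id": ["CCC", "AAA", "BBB"], "CN1": [710, 702, 1727]})
right = pd.DataFrame({"A_Id": ["AAA", "BBB", "CCC"], "P_Id": [111, 222, 333]})

# A left merge keeps the key order of the left frame
merged = left.merge(right, on="A_Id", how="left")
print(merged)
```

Swapping the two frames would give rows in the order of right instead, which is why the order in which you merge matters.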
Try this:

    all_data = all_data.append(df)[df.columns.tolist()]
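For newer pandas versions, note that DataFrame.append was deprecated in pandas 1.4 and removed in 2.0, so the loop in the question no longer runs there. Below is a sketch of the same idea using pd.concat, which also addresses the 0s and 1s in the first column. The helper name combine_csvs is hypothetical, the sample files assume comma-separated data, and the demo writes to a temporary folder rather than testfolder.

```python
import glob
import os
import tempfile

import pandas as pd


def combine_csvs(folder, out_name="combined.csv"):
    """Concatenate every CSV in `folder` and write the result to `out_name`."""
    out_path = os.path.join(folder, out_name)
    # sorted() makes the file order deterministic; skip an earlier output file
    paths = [p for p in sorted(glob.glob(os.path.join(folder, "*.csv")))
             if os.path.abspath(p) != os.path.abspath(out_path)]
    frames = [pd.read_csv(p) for p in paths]
    # ignore_index=True renumbers rows 0..n-1 instead of repeating 0, 1 per file
    combined = pd.concat(frames, ignore_index=True)
    # Keep the columns in the order of the first file
    combined = combined[frames[0].columns.tolist()]
    # index=False keeps the row labels out of the written CSV
    combined.to_csv(out_path, index=False)
    return combined


# Demo on throwaway files mirroring test1.csv and test2.csv from the question
with tempfile.TemporaryDirectory() as folder:
    with open(os.path.join(folder, "test1.csv"), "w") as fh:
        fh.write("A_Id,P_Id,CN1,CN2,CN3\nAAA,111,702,709,740\nBBB,222,1727,1734,1778\n")
    with open(os.path.join(folder, "test2.csv"), "w") as fh:
        fh.write("A_Id,P_Id,CN1,CN2,CN3\nCCC,333,710,750,750\nDDD,444,180,734,778\n")
    result = combine_csvs(folder)
    with open(os.path.join(folder, "combined.csv")) as fh:
        written = fh.read()

print(written)
```

Reading all files into a list and concatenating once also avoids copying the accumulated data on every iteration, which matters for large CSV files.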