
Pandas equivalent rbind operation

Tags: python, pandas

I am looping through a bunch of CSV files and, at the end, would like to append all the dataframes into one. All I need is an rbind-type function. I searched around and followed a guide, but I still could not get the result I want.

Sample code is below. The shape of data1 is always (47, 42), but the shape of data_out_final becomes (47, 42), (47, 84), and (47, 126) after the first three files. Ideally, it should be (141, 42). I also checked the index of data1, which is RangeIndex(start=0, stop=47, step=1). I would appreciate any suggestions!

My pandas version is 0.18.1

code

import pandas as pd

appended_data = []
for csv_each in csv_pool:
    data1 = pd.read_csv(csv_each, header=0)
    # do something here (produces data2 from data1)
    appended_data.append(data2)
data_out_final = pd.concat(appended_data, axis=1)

If I use data_out_final = pd.concat(appended_data, axis=0) instead, the shape of data_out_final becomes (141, 94).
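
Both sets of shapes can be reproduced on toy data. The following is only a sketch: the frames, column names (c0, c1, ...) and the single renamed column are made up for illustration and are not the actual CSV contents. It shows that axis=1 places frames side by side, axis=0 stacks them like rbind, and that axis=0 with differing column names takes the union of the names, so the column count grows.

```python
import numpy as np
import pandas as pd

# Three hypothetical frames of 47 rows x 42 columns with identical column names.
cols = [f"c{i}" for i in range(42)]
frames = [pd.DataFrame(np.zeros((47, 42)), columns=cols) for _ in range(3)]

# axis=1 appends columns: the row count stays at 47.
side_by_side = pd.concat(frames, axis=1)

# axis=0 (the default) appends rows, which is the rbind-style result.
stacked = pd.concat(frames, axis=0)

# If one frame names a column differently, axis=0 keeps both names
# and fills the missing cells with NaN.
renamed = frames[1].rename(columns={"c0": "C0"})
mismatched = pd.concat([frames[0], renamed], axis=0)

print(side_by_side.shape, stacked.shape, mismatched.shape)
```

In this sketch, a single mismatched name is enough to add an extra column, which is consistent with a column count larger than 42 after a row-wise concat.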

PS

I more or less figured it out: you have to standardize the column names before calling pd.concat.
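
One way this could look, as a minimal sketch: standardize_columns is a hypothetical helper (not a pandas function), and the rule it applies (strip whitespace, lower-case) is an assumption about how the headers differ. The real files may need a different mapping.

```python
import pandas as pd

def standardize_columns(df):
    """Return a copy with stripped, lower-cased column names (hypothetical helper)."""
    out = df.copy()
    out.columns = [str(c).strip().lower() for c in out.columns]
    return out

# Two made-up frames whose headers differ only in case and trailing whitespace.
a = pd.DataFrame({"Site": ["x", "y"], "Value ": [1, 2]})
b = pd.DataFrame({"site": ["z"], "value": [3]})

# After standardizing, the names match, so rows stack without extra columns.
combined = pd.concat(
    [standardize_columns(a), standardize_columns(b)],
    axis=0,
    ignore_index=True,
)
print(combined.shape)
```

If the files are known to share the same column order, assigning one common list of names to every frame before concatenating would be an alternative.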

TTT asked Aug 08 '16 20:08


2 Answers

>>> df1
          a         b
0 -1.417866 -0.828749
1  0.212349  0.791048
2 -0.451170  0.628584
3  0.612671 -0.995330
4  0.078460 -0.322976
5  1.244803  1.576373
6  1.169629 -1.135926
7 -0.652443  0.506388
8  0.549604 -0.691054
9 -0.512829 -0.959398

>>> df2
          a         b
0 -0.652161  0.940932
1  2.495067  0.004833
2 -2.187792  1.692402
3  1.900738  0.372425
4  0.245976  1.894527
5  0.627297  0.029331
6 -0.828628 -1.600014
7 -0.991835 -0.061202
8  0.543389  0.703457
9 -0.755059  1.239968

>>> pd.concat([df1, df2])
          a         b
0 -1.417866 -0.828749
1  0.212349  0.791048
2 -0.451170  0.628584
3  0.612671 -0.995330
4  0.078460 -0.322976
5  1.244803  1.576373
6  1.169629 -1.135926
7 -0.652443  0.506388
8  0.549604 -0.691054
9 -0.512829 -0.959398
0 -0.652161  0.940932
1  2.495067  0.004833
2 -2.187792  1.692402
3  1.900738  0.372425
4  0.245976  1.894527
5  0.627297  0.029331
6 -0.828628 -1.600014
7 -0.991835 -0.061202
8  0.543389  0.703457
9 -0.755059  1.239968

Unless I'm misinterpreting the question, pd.concat with its default axis=0 is what you need: it stacks the rows of df2 under those of df1.
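
Note that the result above keeps each frame's original row labels, so 0 through 9 appear twice. If a fresh index is wanted, ignore_index=True renumbers the rows. A small sketch with made-up values:

```python
import pandas as pd

df1 = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0]})
df2 = pd.DataFrame({"a": [5.0, 6.0], "b": [7.0, 8.0]})

# Default behaviour: the original row labels are kept, so they repeat.
kept = pd.concat([df1, df2])

# ignore_index=True discards the old labels and builds a new 0..n-1 index.
renumbered = pd.concat([df1, df2], ignore_index=True)

print(list(kept.index))
print(list(renumbered.index))
```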

Asish M. answered Sep 19 '22 07:09


Try: http://pandas.pydata.org/pandas-docs/stable/10min.html?highlight=concat#concat

"pandas provides various facilities for easily combining together Series, DataFrame, and Panel objects with various kinds of set logic for the indexes and relational algebra functionality in the case of join / merge-type operations."
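
As a rough illustration of the "set logic" mentioned in that passage, pd.concat accepts a join argument that controls how non-matching column names are handled when stacking rows. The frames below are made up for the example:

```python
import pandas as pd

left = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
right = pd.DataFrame({"b": [5, 6], "c": [7, 8]})

# join="outer" (the default) keeps the union of the column names, filling gaps with NaN.
outer = pd.concat([left, right], join="outer", ignore_index=True)

# join="inner" keeps only the columns present in every frame.
inner = pd.concat([left, right], join="inner", ignore_index=True)

print(outer.shape, inner.shape)
```

With join="inner", columns that are missing from any input are dropped silently, so it is worth checking the resulting column list.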

Jon answered Sep 18 '22 07:09