
pandas append same series to each column

Tags: python, pandas

Consider the dataframe df

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(5, 3),
                  index=['p0', 'p1', 'p2', 'p3', 'p4'],
                  columns=['A', 'B', 'C'])
df

And the mean of every row:

dm = df.mean(axis=1)
dm.index = ['m0', 'm1', 'm2', 'm3', 'm4']
dm

m0    0.352396
m1    0.606469
m2    0.643022
m3    0.560809
m4    0.776058
dtype: float64

How do I append this series of means to every column of df? I expect the result to look like this:

[image: expected output]

Also, since this will be applied at scale, time is of the essence.

What I used to generate the expected output is:

pd.concat([df, pd.DataFrame({c: dm for c in df.columns})])

Timing (small scale)

[image: timing results, small scale]

Timing (large scale)

[image: timing results, large scale]

asked Jul 03 '16 at 07:07 by piRSquared


1 Answer

You can use concat twice: first to repeat the series once per column, then to append the result to df:

print (pd.concat([dm] * df.shape[1], axis=1, keys=df.columns))
           A         B         C
m0  0.823788  0.823788  0.823788
m1  0.615354  0.615354  0.615354
m2  0.606740  0.606740  0.606740
m3  0.386629  0.386629  0.386629
m4  0.637147  0.637147  0.637147

print (pd.concat([df, pd.concat([dm] * df.shape[1], axis=1, keys=df.columns)]))
           A         B         C
p0  0.789966  0.699837  0.981560
p1  0.415609  0.469310  0.961144
p2  0.920938  0.476615  0.422665
p3  0.323782  0.805231  0.030874
p4  0.761674  0.361134  0.788632
m0  0.823788  0.823788  0.823788
m1  0.615354  0.615354  0.615354
m2  0.606740  0.606740  0.606740
m3  0.386629  0.386629  0.386629
m4  0.637147  0.637147  0.637147

To create the DataFrame that gets appended, you can also use numpy.repeat with numpy.newaxis:

x = dm.values

print (pd.DataFrame(np.repeat(x[:, np.newaxis], df.shape[1], 1), 
                    columns=df.columns,
                    index=dm.index))
           A         B         C
m0  0.399837  0.399837  0.399837
m1  0.890191  0.890191  0.890191
m2  0.580747  0.580747  0.580747
m3  0.354032  0.354032  0.354032
m4  0.329108  0.329108  0.329108

print(pd.concat([df, pd.DataFrame(np.repeat(x[:, np.newaxis], df.shape[1], 1), 
                    columns=df.columns,
                    index=dm.index)]))

           A         B         C
p0  0.087337  0.375891  0.736282
p1  0.777897  0.932047  0.960629
p2  0.945546  0.062647  0.734047
p3  0.247740  0.582076  0.232282
p4  0.078683  0.869736  0.038905
m0  0.399837  0.399837  0.399837
m1  0.890191  0.890191  0.890191
m2  0.580747  0.580747  0.580747
m3  0.354032  0.354032  0.354032
m4  0.329108  0.329108  0.329108    
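
The two constructions above can be checked against each other in one self-contained script. This is a minimal sketch, not part of the original answer: the fixed seed and the names `via_concat`, `via_repeat` and `result` are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Fixed seed so the run is reproducible (an assumption; the original uses np.random.rand).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((5, 3)),
                  index=['p0', 'p1', 'p2', 'p3', 'p4'],
                  columns=['A', 'B', 'C'])

# Row means, relabelled m0..m4 as in the question.
dm = df.mean(axis=1)
dm.index = ['m0', 'm1', 'm2', 'm3', 'm4']

# Approach 1: place the same series side by side, once per column of df.
via_concat = pd.concat([dm] * df.shape[1], axis=1, keys=df.columns)

# Approach 2: add a second axis to the values and repeat along it.
via_repeat = pd.DataFrame(np.repeat(dm.values[:, np.newaxis], df.shape[1], axis=1),
                          index=dm.index, columns=df.columns)

# Both blocks hold the same numbers, so either can be appended to df.
same_values = np.allclose(via_concat.values, via_repeat.values)
result = pd.concat([df, via_repeat])
print(same_values)
print(result)
```

Both approaches build the same 5 x 3 block of means; they differ only in whether pandas or NumPy does the repetition.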

EDIT1:

Another way to create the new DataFrame uses numpy.tile:

dm2 = pd.DataFrame(np.tile(dm.values[:, None], (1, df.shape[1])),
                   index=dm.index, columns=df.columns)
# DataFrame.append was removed in pandas 2.0; pd.concat does the same job
pd.concat([df, dm2])
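
Since the question stresses speed, the approaches can be compared with the standard-library timeit module. The sketch below is only an illustration: the frame size, the helper names `with_dict`, `with_concat` and `with_tile`, and the repeat counts are all hypothetical, and the numbers it prints will vary with machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical "larger" frame; adjust the sizes to match the real workload.
rng = np.random.default_rng(0)
n_rows, n_cols = 1000, 50
df = pd.DataFrame(rng.random((n_rows, n_cols)),
                  index=[f'p{i}' for i in range(n_rows)],
                  columns=[f'c{j}' for j in range(n_cols)])
dm = df.mean(axis=1)
dm.index = [f'm{i}' for i in range(n_rows)]


def with_dict():
    # The questioner's approach: one dict entry per column.
    return pd.concat([df, pd.DataFrame({c: dm for c in df.columns})])


def with_concat():
    # Double concat from the answer.
    return pd.concat([df, pd.concat([dm] * df.shape[1], axis=1, keys=df.columns)])


def with_tile():
    # numpy.tile from the answer.
    tiled = np.tile(dm.values[:, None], (1, df.shape[1]))
    return pd.concat([df, pd.DataFrame(tiled, index=dm.index, columns=df.columns)])


# Best of 3 runs, 10 calls per run, for each approach.
timings = {}
for func in (with_dict, with_concat, with_tile):
    timings[func.__name__] = min(timeit.repeat(func, number=10, repeat=3))
    print(f'{func.__name__}: {timings[func.__name__]:.4f} s for 10 calls')
```

Taking the minimum over several repeats reduces noise from other processes; it says nothing about how the approaches scale, so rerun it at the sizes that matter.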
answered Oct 16 '22 at 05:10 by jezrael