Consider the dataframe df:
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(5, 3),
                  index=['p0', 'p1', 'p2', 'p3', 'p4'],
                  columns=['A', 'B', 'C'])
df
And the mean of every row:
dm = df.mean(1)
dm.index = ['m0', 'm1', 'm2', 'm3', 'm4']
dm
m0 0.352396
m1 0.606469
m2 0.643022
m3 0.560809
m4 0.776058
dtype: float64
How do I append this series of means to every column of df? I expect the results to look like:
Also, since this will be applied at scale, time is of the essence.
What I used to generate the expected output is:
pd.concat([df, pd.DataFrame({c: dm for c in df.columns})])  # DataFrame.iteritems() was removed in pandas 2.0
You can use a double concat:
print (pd.concat([dm] * df.shape[1], axis=1, keys=df.columns))
           A         B         C
m0  0.823788  0.823788  0.823788
m1  0.615354  0.615354  0.615354
m2  0.606740  0.606740  0.606740
m3  0.386629  0.386629  0.386629
m4  0.637147  0.637147  0.637147
print (pd.concat([df, pd.concat([dm] * df.shape[1], axis=1, keys=df.columns)]))
           A         B         C
p0  0.789966  0.699837  0.981560
p1  0.415609  0.469310  0.961144
p2  0.920938  0.476615  0.422665
p3  0.323782  0.805231  0.030874
p4  0.761674  0.361134  0.788632
m0  0.823788  0.823788  0.823788
m1  0.615354  0.615354  0.615354
m2  0.606740  0.606740  0.606740
m3  0.386629  0.386629  0.386629
m4  0.637147  0.637147  0.637147
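For reference, here is the double concat as a self-contained sketch. The fixed seed and the intermediate name `means_block` are my own additions for illustration; they are not part of the original question.

```python
import numpy as np
import pandas as pd

# Fixed seed is an assumption, used only so the sketch is reproducible.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((5, 3)),
                  index=['p0', 'p1', 'p2', 'p3', 'p4'],
                  columns=['A', 'B', 'C'])

# Row means, relabelled m0..m4 as in the question.
dm = df.mean(axis=1)
dm.index = ['m0', 'm1', 'm2', 'm3', 'm4']

# Inner concat: one copy of dm per column of df, labelled with df's column names.
means_block = pd.concat([dm] * df.shape[1], axis=1, keys=df.columns)

# Outer concat: stack the block of means under the original rows.
result = pd.concat([df, means_block])
print(result.shape)  # (10, 3)
```

Every row `m0`..`m4` of `result` holds the same value in all three columns, namely the mean of the matching `p` row.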
To create the DataFrame that gets appended, you can also use numpy.repeat together with numpy.newaxis:
x = dm.values
print (pd.DataFrame(np.repeat(x[:, np.newaxis], df.shape[1], 1),
columns=df.columns,
index=dm.index))
           A         B         C
m0  0.399837  0.399837  0.399837
m1  0.890191  0.890191  0.890191
m2  0.580747  0.580747  0.580747
m3  0.354032  0.354032  0.354032
m4  0.329108  0.329108  0.329108
print(pd.concat([df, pd.DataFrame(np.repeat(x[:, np.newaxis], df.shape[1], 1),
columns=df.columns,
index=dm.index)]))
           A         B         C
p0  0.087337  0.375891  0.736282
p1  0.777897  0.932047  0.960629
p2  0.945546  0.062647  0.734047
p3  0.247740  0.582076  0.232282
p4  0.078683  0.869736  0.038905
m0  0.399837  0.399837  0.399837
m1  0.890191  0.890191  0.890191
m2  0.580747  0.580747  0.580747
m3  0.354032  0.354032  0.354032
m4  0.329108  0.329108  0.329108
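To see what the `np.newaxis` and `np.repeat` steps do on their own, here is a tiny sketch with a made-up three-element array standing in for `dm.values` (the values and the repeat count of 2 are illustrative only):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])     # stand-in for dm.values, shape (3,)

# np.newaxis adds a trailing axis, turning the 1-D array into a column.
col = x[:, np.newaxis]            # shape (3, 1)

# Repeat the single column along axis 1, once per target column.
rep = np.repeat(col, 2, axis=1)   # shape (3, 2)

print(col.shape, rep.shape)
print(rep)
```

Each row of `rep` holds one value of `x` copied across the columns, which is the layout the DataFrame constructor needs.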
EDIT1:
Another solution creates the new DataFrame with numpy.tile:
dm2 = pd.DataFrame(np.tile(dm.values[:, None], (1, df.shape[1])), index=dm.index, columns=df.columns)
pd.concat([df, dm2])  # DataFrame.append was removed in pandas 2.0; pd.concat is the replacement
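Since time matters here, one rough way to compare the three constructions is with `timeit`. The helper names (`via_concat`, `via_repeat`, `via_tile`), the seed, and the repetition count are assumptions of mine, and the timings will vary with machine, pandas version, and data size, so measure on data of realistic shape. The final stacking step uses `pd.concat`, which works on pandas 2.0 and later, where `DataFrame.append` is no longer available.

```python
import timeit

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed, for reproducibility only
df = pd.DataFrame(rng.random((5, 3)),
                  index=['p0', 'p1', 'p2', 'p3', 'p4'],
                  columns=['A', 'B', 'C'])
dm = df.mean(axis=1)
dm.index = ['m0', 'm1', 'm2', 'm3', 'm4']


def via_concat():
    # Double concat: build the block of means column by column, then stack.
    block = pd.concat([dm] * df.shape[1], axis=1, keys=df.columns)
    return pd.concat([df, block])


def via_repeat():
    # numpy.repeat on a column vector, then wrap in a DataFrame and stack.
    data = np.repeat(dm.values[:, np.newaxis], df.shape[1], axis=1)
    return pd.concat([df, pd.DataFrame(data, index=dm.index, columns=df.columns)])


def via_tile():
    # numpy.tile on a column vector, then wrap in a DataFrame and stack.
    data = np.tile(dm.values[:, None], (1, df.shape[1]))
    return pd.concat([df, pd.DataFrame(data, index=dm.index, columns=df.columns)])


# All three variants should produce identical frames.
pd.testing.assert_frame_equal(via_concat(), via_repeat())
pd.testing.assert_frame_equal(via_repeat(), via_tile())

for fn in (via_concat, via_repeat, via_tile):
    seconds = timeit.timeit(fn, number=200)
    print(f"{fn.__name__}: {seconds:.4f} s for 200 runs")
```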