
Better way to add constant column to pandas data frame

Tags:

python

pandas

Currently, when I need to add a constant column to an existing data frame, I do the following. It doesn't seem very elegant to me (specifically the part where I multiply by the length of the data frame). Are there better ways of doing this?

import pandas as pd

testdf = pd.DataFrame({'categories': ['bats', 'balls', 'paddles'],
                       'skus': [50, 5000, 32],
                       'sales': [500, 700, 90]})

testdf['avg_sales_per_sku'] = [testdf.sales.sum() / testdf.skus.sum()] * len(testdf)
Haleemur Ali asked Mar 30 '15 01:03




2 Answers

You can fill the column implicitly by assigning a single scalar value; pandas repeats it for every row.

testdf['avg_sales_per_sku'] = testdf.sales.sum() / testdf.skus.sum() 

From the documentation:

When inserting a scalar value, it will naturally be propagated to fill the column
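
As a minimal sketch of that behaviour, the snippet below rebuilds the question's frame and assigns a scalar in two ways: in place with the index operator, and through `DataFrame.assign`, which returns a new frame and leaves the original untouched. The `source` column and its value `'catalog'` are hypothetical, added only to illustrate a constant string.

```python
import pandas as pd

testdf = pd.DataFrame({'categories': ['bats', 'balls', 'paddles'],
                       'skus': [50, 5000, 32],
                       'sales': [500, 700, 90]})

# One scalar computed from the whole frame
ratio = testdf['sales'].sum() / testdf['skus'].sum()

# In place: the scalar is broadcast to every row, no list multiplication needed
testdf['avg_sales_per_sku'] = ratio

# Not in place: assign() returns a copy with the extra column.
# 'source' and 'catalog' are hypothetical names used for illustration.
with_const = testdf.assign(source='catalog')

print(testdf)
print(with_const)
```

The in-place form is the shortest; `assign` may be preferable in a method chain, where the original frame should not be modified.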

Geeklhem answered Sep 19 '22 02:09



It seems confusing to me to mix the per-category average with the aggregate average under one name. You could keep both, in separately named columns:

testdf['avg_sales_per_sku'] = testdf.sales / testdf.skus
testdf['avg_agg_sales_per_agg_sku'] = testdf.sales.sum() / float(testdf.skus.sum())  # float() forces true division on Python 2; not needed on Python 3

>>> testdf
  categories  sales  skus  avg_sales_per_sku  avg_agg_sales_per_agg_sku
0       bats    500    50            10.0000                   0.253837
1      balls    700  5000             0.1400                   0.253837
2    paddles     90    32             2.8125                   0.253837
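
To illustrate why the two quantities deserve separate names, here is a small sketch (Python 3, same data as the question) comparing the aggregate ratio with the plain mean of the per-category ratios. The variable names are illustrative only.

```python
import pandas as pd

testdf = pd.DataFrame({'categories': ['bats', 'balls', 'paddles'],
                       'skus': [50, 5000, 32],
                       'sales': [500, 700, 90]})

# Row-wise ratio: one value per category (a Series)
per_category = testdf['sales'] / testdf['skus']

# Ratio of totals: a single scalar for the whole frame
aggregate = testdf['sales'].sum() / testdf['skus'].sum()

# Unweighted mean of the row-wise ratios, which is a different number
mean_of_ratios = per_category.mean()

print(per_category.tolist())
print(round(aggregate, 6))       # 0.253837, matching the table above
print(round(mean_of_ratios, 4))  # 4.3175
```

The aggregate ratio is dominated by the category with the most SKUs, while the unweighted mean treats every category equally, so the two can differ by an order of magnitude.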
Alexander answered Sep 18 '22 02:09
