When I need to add a constant column to an existing DataFrame, I currently do the following. It does not seem very elegant to me, in particular the part where I multiply a one-element list by the length of the DataFrame. Is there a better way of doing this?
import pandas as pd
testdf = pd.DataFrame({'categories': ['bats', 'balls', 'paddles'],
                       'skus': [50, 5000, 32],
                       'sales': [500, 700, 90]})
testdf['avg_sales_per_sku'] = [testdf.sales.sum() / testdf.skus.sum()] * len(testdf)
Add a DataFrame column with a constant value using the '+' operator. The '+' operator adds a constant number to each element of a DataFrame column, and the result can be assigned to a new column. The same approach also appends a constant string to each element of a string column.
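A minimal sketch of the '+' approach, assuming a small made-up frame; the names `df`, `sales_plus_10`, and `categories_tagged` are illustrative and not from the original post:

```python
import pandas as pd

# Hypothetical example data, mirroring the question's columns.
df = pd.DataFrame({'categories': ['bats', 'balls', 'paddles'],
                   'sales': [500, 700, 90]})

# Adding a scalar to a numeric column is applied element-wise.
df['sales_plus_10'] = df['sales'] + 10

# The same operator concatenates a constant string onto each string element.
df['categories_tagged'] = df['categories'] + '_sport'
```

Note that this produces a column derived from an existing one, which is not the same thing as a column holding a single constant value.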
To add a new column with a constant value, use square brackets (the index operator) and assign that value.
You can use the assign() function to add a new column to the end of a pandas DataFrame: df = df.assign(col_name=[value1, value2, value3, ...])
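assign() also accepts a single scalar, which pandas broadcasts to every row, so it can cover the constant-column case from the question. A short sketch, where the variable names `ratio` and `result` are my own:

```python
import pandas as pd

testdf = pd.DataFrame({'categories': ['bats', 'balls', 'paddles'],
                       'skus': [50, 5000, 32],
                       'sales': [500, 700, 90]})

# Aggregate ratio: total sales divided by total SKUs (a single number).
ratio = testdf['sales'].sum() / testdf['skus'].sum()

# assign() returns a new DataFrame; the scalar fills every row of the new column.
result = testdf.assign(avg_sales_per_sku=ratio)
```

Because assign() returns a copy, `testdf` itself is left unchanged, which is convenient in method chains.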
You can fill the column implicitly by assigning a single number:
testdf['avg_sales_per_sku'] = testdf.sales.sum() / testdf.skus.sum()
From the documentation:
When inserting a scalar value, it will naturally be propagated to fill the column
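As a quick check, the following sketch compares the scalar assignment with the list-multiplication version from the question; the names `ratio`, `scalar_way`, and `list_way` are illustrative:

```python
import pandas as pd

testdf = pd.DataFrame({'categories': ['bats', 'balls', 'paddles'],
                       'skus': [50, 5000, 32],
                       'sales': [500, 700, 90]})

ratio = testdf.sales.sum() / testdf.skus.sum()

# Scalar assignment: pandas propagates the value to every row.
scalar_way = testdf.copy()
scalar_way['avg_sales_per_sku'] = ratio

# Original approach: build a list with one entry per row.
list_way = testdf.copy()
list_way['avg_sales_per_sku'] = [ratio] * len(testdf)

same = scalar_way.equals(list_way)
```

Here `same` should be True, so the explicit multiplication by len(testdf) adds nothing.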
It seems confusing to me to give the aggregate average a name that suggests a per-category average. You could compute both, under distinct names:
testdf['avg_sales_per_sku'] = testdf.sales / testdf.skus
testdf['avg_agg_sales_per_agg_sku'] = testdf.sales.sum() / float(testdf.skus.sum())  # float() forces true division on Python 2
>>> testdf
  categories  sales  skus  avg_sales_per_sku  avg_agg_sales_per_agg_sku
0       bats    500    50            10.0000                   0.253837
1      balls    700  5000             0.1400                   0.253837
2    paddles     90    32             2.8125                   0.253837