I have a row in a pandas DataFrame that contains the sales rate of my items.
A look at my data:
block_combine
Out[78]:
END_MONTH 1 2 3 4 5
Total Listings 168 219 185 89 112
Total Sales 85 85 84 41 46
I can easily calculate the sales % by doing the following:
block_combine.loc["Total Sales Rate"] = block_combine.iloc[1, :] / block_combine.iloc[0, :]
block_combine
Out[79]:
END_MONTH 1 2 3 4 5
Total Listings 168.000000 219.000000 185.000000 89.000000 112.000000
Total Sales 85.000000 85.000000 84.000000 41.000000 46.000000
Total Sales Rate 0.505952 0.388128 0.454054 0.460674 0.410714
Now I am trying to change the "Total Sales Rate" row to whole-number percentages. I can do this when the values are in a column, but I run into issues when I work with rows.
Here is what I attempted:
block_combine.loc["Total Sales Rate"] = pd.Series(["{0:.0f}%".format(val * 100) for val in block_combine.loc["Total Sales Rate"]])
block_combine
Out[81]:
END_MONTH 1 2 3 4 5
Total Listings 168 219 185 89 112.0
Total Sales 85 85 84 41 46.0
Total Sales Rate 39% 45% 46% 41% NaN
The values are off: they are shifted one column to the left. The sales rate given for month 1 is actually the sales rate for month 2 (39%)!
Convert numeric to percentage string: to convert the values back to percentage strings, use Python's string format syntax '{:.2%}'.format, which adds the '%' sign back. Then use map() to apply the formatting to every row of the 'median_listing_price_yy' column.
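A minimal sketch of that formatting step. The sample values are made up for illustration, and it assumes the map() in question is pandas' Series.map rather than the built-in map:

```python
import pandas as pd

# Hypothetical data: year-over-year changes stored as fractions.
prices = pd.DataFrame({"median_listing_price_yy": [0.0512, -0.0234, 0.1]})

# Series.map calls the bound str.format method once per element.
formatted = prices["median_listing_price_yy"].map("{:.2%}".format)
print(formatted.tolist())  # ['5.12%', '-2.34%', '10.00%']
```

The result is a Series of strings, so it is only suitable for display.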
You can calculate each value's percentage of a group total with groupby() and the transform() method. transform() applies a function to each group and returns a result with the same index as the original data, so each value can be divided directly by its group's total.
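As a rough illustration of the groupby() plus transform() approach, here is a sketch on made-up data. The column names region and units are hypothetical:

```python
import pandas as pd

# Hypothetical sales data with two groups.
sales = pd.DataFrame({
    "region": ["east", "east", "west", "west"],
    "units": [30, 70, 20, 60],
})

# transform("sum") returns one value per original row: the total of that row's group.
group_total = sales.groupby("region")["units"].transform("sum")

# Dividing row by row gives each row's share of its own group's total.
sales["share_of_region"] = sales["units"] / group_total
print(sales)
```

Because transform() keeps the original index, the division lines up without any merge or reindexing.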
format({'var1': "{:.2f}", 'var2': "{:.2f}", 'var3': "{:.2%}"})
Use pct_change() to calculate percentage change in pandas. Its periods parameter, which defaults to 1, specifies the number of periods to shift when calculating the percent change.
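For illustration, a short sketch of pct_change() applied to the Total Sales figures from the question, using both the default periods=1 and periods=2:

```python
import pandas as pd

# Total Sales per month, taken from the question's data.
monthly_sales = pd.Series([85, 85, 84, 41, 46], index=range(1, 6))

# Default periods=1: change relative to the previous month.
# The first entry is NaN because there is nothing to compare against.
change_1 = monthly_sales.pct_change()

# periods=2: change relative to the value two months earlier.
change_2 = monthly_sales.pct_change(periods=2)

print(change_1)
print(change_2)
```

The results are fractions (for example, 0.0 between months 1 and 2), not formatted percentage strings.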
You could use .apply('{:.0%}'.format):
import pandas as pd
df = pd.DataFrame([(168,219,185,89,112), (85,85,84,41,46)],
                  index=['Total Listings', 'Total Sales'], columns=list(range(1,6)))
df.loc['Total Sales Rate'] = ((df.loc['Total Sales']/df.loc['Total Listings'])
                              .apply('{:.0%}'.format))
print(df)
yields
1 2 3 4 5
Total Listings 168 219 185 89 112
Total Sales 85 85 84 41 46
Total Sales Rate 51% 39% 45% 46% 41%
Notice that the Python str.format method has a built-in % format which multiplies the number by 100 and displays in fixed ('f') format, followed by a percent sign.
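A quick way to see what the % format type does, using the month 1 figures from the question (the variable names are only for illustration):

```python
rate = 85 / 168  # month 1: Total Sales / Total Listings

whole = "{:.0%}".format(rate)           # multiplies by 100, zero decimals
one_decimal = "{:.1%}".format(rate)     # same, with one decimal place
manual = "{:.0f}%".format(rate * 100)   # the manual equivalent from the question

print(whole, one_decimal, manual)  # 51% 50.6% 51%
```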
It is important to be aware that Pandas DataFrame columns must have a single dtype. Changing one value to a string forces the entire column to change its dtype to the generic object dtype. Thus the int64s or int32s in the Total Listings and Total Sales rows get recast as plain Python ints. This prevents Pandas from taking advantage of fast NumPy-based numerical operations which only work on native NumPy dtypes (like int64 or float64 -- not object).
So while the above code achieves the desired look, it isn't advisable to use this if further computation is to be done on the DataFrame. Instead, only convert to strings at the end if you need to do so for presentation.
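One possible way to follow that advice is to keep the numeric data untouched and build a separate copy for display. This is only a sketch; the names counts, rate and report are made up for the example:

```python
import pandas as pd

counts = pd.DataFrame([(168, 219, 185, 89, 112), (85, 85, 84, 41, 46)],
                      index=["Total Listings", "Total Sales"],
                      columns=list(range(1, 6)))

# Keep the rate as a numeric Series so it stays usable for further computation.
rate = counts.loc["Total Sales"] / counts.loc["Total Listings"]

# Build a presentation-only copy; the object dtype is acceptable here
# because no arithmetic will be done on it.
report = counts.astype(object)
report.loc["Total Sales Rate"] = rate.map("{:.0%}".format)

print(counts.dtypes.unique())  # still integer dtypes
print(report)
```

The original counts frame and the rate Series keep their native NumPy dtypes, and only report holds strings.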
Or, alternatively, transpose your DataFrame so the Total Sales Rate strings are in a column, not a row:
import pandas as pd
df = pd.DataFrame([(168,219,185,89,112), (85,85,84,41,46)],
                  index=['Total Listings', 'Total Sales'], columns=list(range(1,6))).T
df['Total Sales Rate'] = ((df['Total Sales']/df['Total Listings'])
                          .apply('{:.0%}'.format))
print(df)
yields
Total Listings Total Sales Total Sales Rate
1 168 85 51%
2 219 85 39%
3 185 84 45%
4 89 41 46%
5 112 46 41%
The statement
block_combine.loc["Total Sales Rate"] = pd.Series(["{0:.0f}%".format(val * 100) for val in block_combine.loc["Total Sales Rate"]])
shifted the values to the left by one column because the new Series has an index that starts at 0, not 1. Pandas aligns the index of the Series on the right with the index of block_combine.loc["Total Sales Rate"] before assigning values to block_combine.loc["Total Sales Rate"]. The Series values labeled 1 through 4 therefore land in columns 1 through 4, and column 5, which has no matching label, becomes NaN.
Thus, you could alternatively have used:
block_combine.loc["Total Sales Rate"] = pd.Series(["{0:.0f}%".format(val * 100)
                                                   for val in block_combine.loc["Total Sales Rate"]],
                                                  index=block_combine.columns)
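The label alignment behind the shift can be reproduced in isolation. The sketch below uses reindex() as a stand-in for the alignment that happens during assignment, which is an assumption made to keep the example small; the rates are the ones from the question's output:

```python
import pandas as pd

# Sales rates from the question, labeled by month 1..5.
rates = pd.Series([0.505952, 0.388128, 0.454054, 0.460674, 0.410714],
                  index=range(1, 6))
strings = ["{:.0%}".format(v) for v in rates]

default_index = pd.Series(strings)                      # labels 0..4
matching_index = pd.Series(strings, index=rates.index)  # labels 1..5

# Aligning on labels 1..5: label 0 is dropped and label 5 has no match.
shifted = default_index.reindex(rates.index)
correct = matching_index.reindex(rates.index)

print(shifted.tolist())  # ['39%', '45%', '46%', '41%', nan]
print(correct.tolist())  # ['51%', '39%', '45%', '46%', '41%']
```

Passing index=block_combine.columns, as in the corrected assignment, gives the new Series matching labels, so no value is dropped or moved.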