I have a .csv file with many rows and 3 columns: Date, Rep, and Sales. I would like to use Python to generate a new array that groups the data by Date and, within each date, lists the Reps ordered by Sales from highest to lowest. As an example, my input data looks like this:
salesData = [[201703,'Bob',3000], [201703,'Sarah',6000], [201703,'Jim',9000],
[201704,'Bob',8000], [201704,'Sarah',7000], [201704,'Jim',12000],
[201705,'Bob',15000], [201705,'Sarah',14000], [201705,'Jim',8000],
[201706,'Bob',10000], [201706,'Sarah',18000]]
My desired output would look like this:
sortedData = [[201703, 'Jim', 'Sarah', 'Bob'], [201704, 'Jim', 'Bob', 'Sarah'],
[201705, 'Bob', 'Sarah', 'Jim'], [201706, 'Sarah', 'Bob']]
I am new to Python, but I have searched quite a bit for a solution with no success. Most of my search results lead me to believe there may be an easy way to do this using pandas (which I have not used) or numpy (which I have used).
Any suggestions would be greatly appreciated. I am using Python 3.6.
Use pandas: group the rows by date, sort each group by sales in descending order, and collect the rep names.
import pandas as pd
salesData = [[201703, 'Bob', 3000], [201703, 'Sarah', 6000], [201703, 'Jim', 9000],
[201704, 'Bob', 8000], [201704, 'Sarah', 7000], [201704, 'Jim', 12000],
[201705, 'Bob', 15000], [201705, 'Sarah', 14000], [201705, 'Jim', 8000],
[201706, 'Bob', 10000], [201706, 'Sarah', 18000]]
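# Columns are unnamed, so they get integer labels: 0 = Date, 1 = Rep, 2 = Sales
sales_df = pd.DataFrame(salesData)
result = []
# groupby(0) yields one (date, sub-frame) pair per distinct date, in ascending date order
for name, group in sales_df.groupby(0):
    # Highest sales first
    sorted_df = group.sort_values(2, ascending=False)
    result.append([name] + list(sorted_df[1]))
print(result)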
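For comparison, the same grouping can be done with only the standard library. This is a minimal sketch rather than part of the answer above: the function name `group_reps_by_date` is made up for illustration, and it assumes every row has the shape `[date, rep, sales]` with numeric sales, as in the question.

```python
from collections import defaultdict


def group_reps_by_date(rows):
    """Return [[date, rep, rep, ...], ...] with reps ordered by sales, highest first.

    Hypothetical helper; assumes each row is [date, rep, sales].
    """
    # Collect (sales, rep) pairs under their date.
    by_date = defaultdict(list)
    for date, rep, sales in rows:
        by_date[date].append((sales, rep))

    result = []
    for date in sorted(by_date):
        # Sort on sales only, descending; sorted() is stable, so reps with
        # equal sales keep the order in which they appeared in the input.
        ranked = sorted(by_date[date], key=lambda pair: pair[0], reverse=True)
        result.append([date] + [rep for _, rep in ranked])
    return result


salesData = [[201703, 'Bob', 3000], [201703, 'Sarah', 6000], [201703, 'Jim', 9000],
             [201704, 'Bob', 8000], [201704, 'Sarah', 7000], [201704, 'Jim', 12000],
             [201705, 'Bob', 15000], [201705, 'Sarah', 14000], [201705, 'Jim', 8000],
             [201706, 'Bob', 10000], [201706, 'Sarah', 18000]]

sortedData = group_reps_by_date(salesData)
print(sortedData)
```

If the rows come from `csv.reader`, every field arrives as a string, so the Sales column should be converted with `int()` (or `float()`) before sorting. Otherwise the values are compared as text and '9000' would rank above '12000'.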