Say I have this data:
project:  group:  sum:
A         John    12
A         Sam     10
B         Sun     4
B         Toy     5
B         Joy     7
C         Sam     11
The data is in frame_main. I wanted to sum by project and group, so I did:
result_main = pd.concat(frame_main).groupby(["project","group"]).sum()
It basically does what I wanted, which is summing the third column, grouped by the first:
project:  group:  sum:
A         John    12
          Sam     10
B         Sun     4
          Toy     5
          Joy     7
C         Sam     11
But when I try to print it with the following:
print(tabulate(result_main, headers="keys", tablefmt='psql'))
It prints like this:
+---------------------------+-----------------+
|                           |            sum: |
|---------------------------+-----------------|
| ('A', 'John')             |              12 |
| ('A', 'Sam')              |              10 |
| ('B', 'Sun')              |               4 |
| ('B', 'Toy')              |               5 |
| ('B', 'Joy')              |               7 |
| ('C', 'Sam')              |              11 |
+---------------------------+-----------------+
How can I print it so that it looks like the output above? I need three columns, grouped by the first.
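For context, the tuples most likely appear because grouping by two keys produces a MultiIndex, and tabulate renders each row label as a single value. The sketch below rebuilds the sample data in a frame called df (an illustrative name, and it assumes the column names include the trailing colon shown in the printed headers) to show the index type and how reset_index() turns the two levels back into ordinary columns.

```python
import pandas as pd

# Sample data copied from the question; the frame name `df` is illustrative.
df = pd.DataFrame({
    "project:": ["A", "A", "B", "B", "B", "C"],
    "group:": ["John", "Sam", "Sun", "Toy", "Joy", "Sam"],
    "sum:": [12, 10, 4, 5, 7, 11],
})

# sort=False keeps groups in first-appearance order, as shown in the question.
grouped = df.groupby(["project:", "group:"], sort=False).sum()

# The two grouping keys now form a MultiIndex, so each row label is a tuple.
print(type(grouped.index).__name__)
print(list(grouped.index)[:2])

# reset_index() moves both index levels back into ordinary columns.
flat = grouped.reset_index()
print(list(flat.columns))
```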
If you have a DataFrame and want to select specific rows or columns from it, you can use square brackets or the loc and iloc indexers.
To get the column names of a pandas DataFrame named df, use print(df.columns).
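Much like @Craig, we can mask the duplicate values in the 'project:' column. Here df_sum is the grouped result (result_main in the question); note that to_markdown requires the tabulate package.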
df_sum = df_sum.reset_index()
df_sum['project:'] = df_sum['project:'].mask(df_sum['project:'].duplicated(),'')
print(df_sum.set_index('project:').to_markdown(tablefmt='psql'))
Output:
+------------+----------+--------+
| project:   | group:   |   sum: |
|------------+----------+--------|
| A          | John     |     12 |
|            | Sam      |     10 |
| B          | Sun      |      4 |
|            | Toy      |      5 |
|            | Joy      |      7 |
| C          | Sam      |     11 |
+------------+----------+--------+