Say I have this data:
project:  group:  sum:
A         John    12
A         Sam     10
B         Sun     4
B         Toy     5
B         Joy     7
C         Sam     11
The data is in the data set frame_main. I wanted to sum by project, so I did:
result_main = pd.concat(frame_main).groupby(["project","group"]).sum()
This basically does what I wanted, which is summing the third column, grouped by the first:
project:  group:  sum:
A         John    12
          Sam     10
B         Sun     4
          Toy     5
          Joy     7
C         Sam     11
But when I try to print it using the following:
print(tabulate(result_main, headers="keys", tablefmt='psql'))
It prints like this:
+---------------------------+-----------------+
|                           |   sum:          |
|---------------------------+-----------------|
| ('A', 'John')             |             12  |
| ('A', 'Sam')              |             10  |
| ('B', 'Sun')              |             4   |
| ('B', 'Toy')              |             5   |
| ('B', 'Joy')              |             7   |
| ('C', 'Sam')              |             11  |
How can I print it so that it looks like the output above? I need 3 columns, grouped by the first.
If you have a DataFrame and would like to access or select specific rows or columns from it, you can use square brackets or other indexers such as loc and iloc.
To get the column names of a pandas DataFrame, you can run print(df.columns), assuming your DataFrame is named "df".
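A minimal sketch of these selection options, assuming a small DataFrame named df built from the first three rows of the question's sample data (the variable names by_brackets, by_label, and by_position are illustrative only):

```python
import pandas as pd

# Hypothetical sample frame: the first three rows of the question's data.
df = pd.DataFrame({
    "project:": ["A", "A", "B"],
    "group:": ["John", "Sam", "Sun"],
    "sum:": [12, 10, 4],
})

# Column labels of the frame.
print(df.columns)

# Square brackets with a list select columns by name.
by_brackets = df[["project:", "sum:"]]

# loc selects by label: rows via a boolean mask, one column by name.
by_label = df.loc[df["project:"] == "A", "sum:"]

# iloc selects by integer position: first two rows, first two columns.
by_position = df.iloc[0:2, 0:2]
```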
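Much like @Craig, we can mask the duplicate values in the 'project:' column.
# Move the index levels back into regular columns
df_sum = df_sum.reset_index()
# Replace every repeated project label with an empty string
df_sum['project:'] = df_sum['project:'].mask(df_sum['project:'].duplicated(),'')
# to_markdown needs the tabulate package to be installed
print(df_sum.set_index('project:').to_markdown(tablefmt='psql'))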
Output:
+------------+----------+--------+
| project:   | group:   |   sum: |
|------------+----------+--------|
| A          | John     |     12 |
|            | Sam      |     10 |
| B          | Sun      |      4 |
|            | Toy      |      5 |
|            | Joy      |      7 |
| C          | Sam      |     11 |
+------------+----------+--------+
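The snippet above never shows how df_sum is built, so here is a self-contained sketch of the whole flow. It assumes df_sum is the grouped result from the question and that the column names carry the trailing colon shown in the tables. The names frame and flat are illustrative only. sort=False is an added choice to keep the rows in their original order.

```python
import pandas as pd

# Sample data copied from the question (assumed column names with trailing colons).
frame = pd.DataFrame({
    "project:": ["A", "A", "B", "B", "B", "C"],
    "group:": ["John", "Sam", "Sun", "Toy", "Joy", "Sam"],
    "sum:": [12, 10, 4, 5, 7, 11],
})

# Grouping by two keys yields a MultiIndex, which tabulate renders as tuples.
# sort=False keeps groups in order of first appearance.
df_sum = frame.groupby(["project:", "group:"], sort=False).sum()

# Turn the MultiIndex levels back into ordinary columns.
flat = df_sum.reset_index()

# Blank out repeated project labels so only the first row of each group shows one.
flat["project:"] = flat["project:"].mask(flat["project:"].duplicated(), "")

try:
    # to_markdown relies on the optional tabulate package.
    print(flat.to_markdown(index=False, tablefmt="psql"))
except ImportError:
    # Fallback without tabulate: plain-text table, no index column.
    print(flat.to_string(index=False))
```

Passing index=False here is an alternative to set_index('project:'): both keep the default integer index out of the printed table.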