Making a group in dataframe in pandas

I have a list such as

groups = [['Group1', 'A', 'B'], ['Group2', 'C', 'D']]

and a dataframe such as

A 100
B 200
C 300
D 400

I want to make a group sum from the list above to become:

Group 1 300
Group 2 700

How can I do this using python pandas? Needless to say I am a newbie in pandas. Thanks.

How do you create a DataFrame group?

A DataFrame may be grouped by a combination of columns and index levels by specifying the column names as strings and the index levels as pd.Grouper objects. For example, df can be grouped by its second index level and the A column.
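As a rough sketch of that kind of grouping: the frame below, its index level names (first, second) and its values are made up for illustration. pd.Grouper(level=1) picks the second index level and the string "A" picks the column.

```python
import pandas as pd

# Hypothetical two-level index; only the second level is used for grouping
index = pd.MultiIndex.from_arrays(
    [
        ["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"],
        ["one", "two", "one", "two", "one", "two", "one", "two"],
    ],
    names=["first", "second"],
)
df = pd.DataFrame({"A": [1, 1, 1, 1, 2, 2, 3, 3], "B": range(8)}, index=index)

# Group by the second index level together with column A, then sum column B
result = df.groupby([pd.Grouper(level=1), "A"]).sum()
print(result)
```

The result is indexed by the pair (second, A), so each row holds the sum of B for one combination of index label and column value.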

How do I group a pandas DataFrame by multiple columns?

In real projects you often need to group by more than one column. You can do so by passing a list of column names to DataFrame.groupby().
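A minimal sketch of grouping by a list of columns. The sales frame and its column names (region, product, amount) are invented for the example.

```python
import pandas as pd

# Hypothetical data: one row per sale
sales = pd.DataFrame({
    "region": ["North", "North", "South", "South", "South"],
    "product": ["x", "y", "x", "x", "y"],
    "amount": [10, 20, 30, 40, 50],
})

# Passing a list of column names groups by every distinct combination
totals = sales.groupby(["region", "product"])["amount"].sum()
print(totals)
```

The result is a Series with a two-level index, one level per grouping column. Calling reset_index() on it turns those levels back into ordinary columns.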

What is groupby() in the pandas library?

pandas groupby splits the data into groups according to some criteria, such as the categories in a column, and applies a function to each group. This makes aggregation efficient. The split is usually done along the rows.
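The split, apply and combine steps can be sketched as follows. The frame and its column names (category, value) are assumptions made for the example.

```python
import pandas as pd

# Hypothetical data with two categories
df = pd.DataFrame({"category": ["a", "b", "a", "b"], "value": [1, 2, 3, 4]})

# Split rows by category, apply two aggregations, combine into one table
summary = df.groupby("category")["value"].agg(["sum", "mean"])
print(summary)
```

Each row of summary corresponds to one category, and each column to one of the requested aggregations.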

How do I group specific rows in pandas?

You can collect DataFrame rows into a list per group: call DataFrame.groupby() on the column of interest, select the column whose values you want, and then use apply(list) to get the list for every group.
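A short sketch of collecting values into one list per group. The column names (group, item) are made up for illustration.

```python
import pandas as pd

# Hypothetical data: each item belongs to one group
df = pd.DataFrame({"group": ["g1", "g1", "g2"], "item": ["A", "B", "C"]})

# Select the column after grouping, then turn each group's values into a list
lists = df.groupby("group")["item"].apply(list)
print(lists)
```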

You need to create a dict from the lists that maps each label to its group name, then group by that dict and aggregate with sum:

import pandas as pd

df = pd.DataFrame({'a': ['A', 'B', 'C', 'D'], 'b': [100, 200, 300, 400]})
print(df)
   a    b
0  A  100
1  B  200
2  C  300
3  D  400

groups = [['Group1', 'A', 'B'], ['Group2', 'C', 'D']]

# Map each label to its group name, see http://stackoverflow.com/q/43227103/2901002
d = {k: row[0] for row in groups for k in row[1:]}
print(d)
{'A': 'Group1', 'B': 'Group1', 'C': 'Group2', 'D': 'Group2'}

# Move the labels into the index so the dict keys are matched against it
print(df.set_index('a').groupby(d).sum())
          b
Group1  300
Group2  700

The solution can be modified slightly so that only column b is aggregated with sum. A final reset_index converts the index back into a column.

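# Variant 1: group the indexed frame by a Series built from the dict
df1 = df.set_index('a').groupby(pd.Series(d, name='a'))['b'].sum().reset_index()
print(df1)
        a    b
0  Group1  300
1  Group2  700

# Variant 2: map column a to group names and group by the mapped values
df2 = df.groupby(df['a'].map(d))['b'].sum().reset_index()
print(df2)
        a    b
0  Group1  300
1  Group2  700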
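One thing to watch with the map approach is a label that appears in no group: map returns NaN for it, and groupby drops NaN keys by default. The sketch below assumes an extra row E and a placeholder name 'Ungrouped', neither of which is in the question.

```python
import pandas as pd

# Same data as above plus a hypothetical label 'E' that belongs to no group
df = pd.DataFrame({'a': ['A', 'B', 'C', 'D', 'E'],
                   'b': [100, 200, 300, 400, 500]})
groups = [['Group1', 'A', 'B'], ['Group2', 'C', 'D']]
d = {k: row[0] for row in groups for k in row[1:]}

# 'E' maps to NaN, so its row silently disappears from the sums
dropped = df.groupby(df['a'].map(d))['b'].sum()

# Filling the NaN with a placeholder name keeps the row visible
kept = df.groupby(df['a'].map(d).fillna('Ungrouped'))['b'].sum()
print(kept)
```

Whether unmatched rows should be dropped or collected under a placeholder depends on the data, so this is only one possible choice.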

Another option, though @jezrael's way seems better:

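import pandas as pd

groups = [['Group1', 'A', 'B'], ['Group2', 'C', 'D']]

# Reshape the group list into long form: one row per (group, label) pair
df0 = pd.melt(pd.DataFrame(groups).set_index(0).T)
# The example data; originally read with pd.read_clipboard(header=None)
df1 = pd.DataFrame([['A', 100], ['B', 200], ['C', 300], ['D', 400]])

# Attach the group name to each label, then keep only group and value
df = df1.merge(df0, left_on=0, right_on='value')[['0_y', 1]]
df.columns = ['Group', 'Value']

print(df.groupby('Group').sum())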


        Value
Group        
Group1    300
Group2    700

Tags: python, pandas, dataframe

Caglar

2 Answers

jezrael

su79eu7k
