How do I sum values in a column that match a given condition using pandas?

Tags: python, pandas, dataframe, data-analysis

Suppose I have a DataFrame with two columns like so:

a  b
1  5
1  7
2  3
1  3
2  5

I want to sum up the values for b where a = 1, for example. This would give me 5 + 7 + 3 = 15.

How do I do this in pandas?

589

asked Jan 30 '15 12:01

adijo

1 Answer

The essential idea here is to select the data you want to sum, and then sum it. This selection can be done in several different ways, a few of which are shown below.

Boolean indexing

Arguably the most common way to select the values is to use Boolean indexing.

With this method, you find out where column 'a' is equal to 1 and then sum the corresponding rows of column 'b'. You can use loc to handle the indexing of rows and columns:

>>> df.loc[df['a'] == 1, 'b'].sum()
15
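To make the steps explicit, here is a minimal, self-contained sketch that rebuilds the question's data and separates the mask from the sum. The variable names `mask` and `total` are only illustrative.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({"a": [1, 1, 2, 1, 2], "b": [5, 7, 3, 3, 5]})

# Boolean mask: True for the rows where column 'a' equals 1.
mask = df["a"] == 1

# Pick column 'b' for the masked rows only, then sum those values.
total = df.loc[mask, "b"].sum()
print(total)  # 15
```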

Boolean indexing can be extended to other columns. For example, if df also contained a column 'c' and we wanted to sum the rows in 'b' where 'a' was 1 and 'c' was 2, we'd write:

df.loc[(df['a'] == 1) & (df['c'] == 2), 'b'].sum()
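As a rough illustration, the sketch below adds a column 'c' to the question's data. The values in 'c' are made up for this example and do not come from the question. Each comparison is wrapped in parentheses because `&` binds more tightly than `==` in Python.

```python
import pandas as pd

# Question's data plus a hypothetical column 'c' (values invented for illustration).
df = pd.DataFrame({
    "a": [1, 1, 2, 1, 2],
    "b": [5, 7, 3, 3, 5],
    "c": [2, 1, 2, 2, 2],
})

# Combine the two conditions element-wise; parentheses are required around each.
both = (df["a"] == 1) & (df["c"] == 2)

# Rows 0 and 3 satisfy both conditions, so the sum is 5 + 3.
total_ac = df.loc[both, "b"].sum()
print(total_ac)  # 8
```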

Query

Another way to select the data is to use query to filter the rows you're interested in, select column 'b' and then sum:

>>> df.query("a == 1")['b'].sum()
15

Again, the method can be extended to make more complicated selections of the data:

df.query("a == 1 and c == 2")['b'].sum()

Note this is a little more concise than the Boolean indexing approach.
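If the value to match is held in a Python variable, `query` can refer to it with an `@` prefix. The sketch below assumes a local variable named `target`, which is only an illustrative name.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({"a": [1, 1, 2, 1, 2], "b": [5, 7, 3, 3, 5]})

# Hypothetical variable holding the value to match in column 'a'.
target = 1

# '@target' tells query() to look up the Python variable rather than a column.
total_q = df.query("a == @target")["b"].sum()
print(total_q)  # 15
```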

Groupby

An alternative approach is to use groupby to split the DataFrame into parts according to the value in column 'a'. You can then sum each part and pull out the value that the 1s added up to:

>>> df.groupby('a')['b'].sum()[1]
15

This approach is likely to be slower than using Boolean indexing, but it is useful if you want to check the sums for other values in column a:

>>> df.groupby('a')['b'].sum()
a
1    15
2     8
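One possible way to work with the grouped result is to keep it as a Series and look values up by label. In this sketch, `.loc[1]` makes the label-based lookup explicit, and `Series.get` supplies a default for a group that does not exist. The fallback value of 0 is an assumption for this example.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({"a": [1, 1, 2, 1, 2], "b": [5, 7, 3, 3, 5]})

# One sum of 'b' per distinct value of 'a'; the result is indexed by 'a'.
sums = df.groupby("a")["b"].sum()

# Label-based lookup of the group where a == 1.
total_1 = sums.loc[1]
print(total_1)  # 15

# A value of 'a' that never occurs: fall back to 0 instead of raising KeyError.
total_missing = sums.get(3, 0)
print(total_missing)  # 0
```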

122

answered Oct 16 '22 02:10

Alex Riley
