Python sum values of list of dictionaries if two other key value pairs match

Tags:

python

I have a list of dictionaries of the following form:

lst = [{"Name":'Nick','Hour':0,'Value':2.75},
       {"Name":'Sam','Hour':1,'Value':7.0},
       {"Name":'Nick','Hour':0,'Value':2.21},
       {'Name':'Val',"Hour":1,'Value':10.1},
       {'Name':'Nick','Hour':1,'Value':2.1},  
       {'Name':'Val',"Hour":1,'Value':11},]

I want to sum all values for a given name and a given hour. For example, if Name == Nick and Hour == 0, the result should be the sum of all values that meet both conditions: 2.75 + 2.21 for the list above.

I have already tried the following, but it does not take both conditions into account.

import collections

finalList = collections.defaultdict(float)
for info in lst:
    finalList[info['Name']] += info['Value']
finalList = [{'Name': c, 'Value': finalList[c]} for c in finalList]

This sums all the values for a particular Name without checking whether the Hour is the same. How can I incorporate that condition into my code as well?

My expected output:

finalList = [{"Name":'Nick','Hour':0,'Value':4.96},
       {"Name":'Sam','Hour':1,'Value':7.0},
       {'Name':'Val',"Hour":1,'Value':21.1},
       {'Name':'Nick','Hour':1,'Value':2.1}...]


asked Aug 09 '16 04:08

Blabber

1 Answer

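Consider using the pandas module; it is very convenient for data sets like this. Note that the session below uses only the first five records from the question, so Val sums to 10.1 rather than 21.1: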

import pandas as pd

In [109]: lst
Out[109]:
[{'Hour': 0, 'Name': 'Nick', 'Value': 2.75},
 {'Hour': 1, 'Name': 'Sam', 'Value': 7.0},
 {'Hour': 0, 'Name': 'Nick', 'Value': 2.21},
 {'Hour': 1, 'Name': 'Val', 'Value': 10.1},
 {'Hour': 1, 'Name': 'Nick', 'Value': 2.1}]

In [110]: df = pd.DataFrame(lst)

In [111]: df
Out[111]:
   Hour  Name  Value
0     0  Nick   2.75
1     1   Sam   7.00
2     0  Nick   2.21
3     1   Val  10.10
4     1  Nick   2.10

In [123]: df.groupby(['Name','Hour']).sum().reset_index()
Out[123]:
   Name  Hour  Value
0  Nick     0   4.96
1  Nick     1   2.10
2   Sam     1   7.00
3   Val     1  10.10

To export the grouped result to CSV:

df.groupby(['Name','Hour']).sum().reset_index().to_csv('/path/to/file.csv', index=False)

Resulting file:

Name,Hour,Value
Nick,0,4.96
Nick,1,2.1
Sam,1,7.0
Val,1,10.1

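If you want the result as a list of dictionaries (the short form 'r' is no longer accepted by recent pandas versions, so spell out 'records'):

In [125]: df.groupby(['Name','Hour']).sum().reset_index().to_dict('records')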
Out[125]:
[{'Hour': 0, 'Name': 'Nick', 'Value': 4.96},
 {'Hour': 1, 'Name': 'Nick', 'Value': 2.1},
 {'Hour': 1, 'Name': 'Sam', 'Value': 7.0},
 {'Hour': 1, 'Name': 'Val', 'Value': 10.1}]
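
The interactive steps above can also be collected into one script. The following is a minimal sketch, assuming the full six-record list from the question (so Val includes the 11 entry); the names summed and records are illustrative and not part of the original answer.

```python
import pandas as pd

# Full list from the question, including the second 'Val' record.
lst = [
    {"Name": "Nick", "Hour": 0, "Value": 2.75},
    {"Name": "Sam", "Hour": 1, "Value": 7.0},
    {"Name": "Nick", "Hour": 0, "Value": 2.21},
    {"Name": "Val", "Hour": 1, "Value": 10.1},
    {"Name": "Nick", "Hour": 1, "Value": 2.1},
    {"Name": "Val", "Hour": 1, "Value": 11},
]

df = pd.DataFrame(lst)

# as_index=False keeps Name and Hour as ordinary columns,
# which plays the same role as calling reset_index() afterwards.
summed = df.groupby(["Name", "Hour"], as_index=False)["Value"].sum()

# One dictionary per (Name, Hour) group.
records = summed.to_dict("records")
print(records)
```

groupby sorts the group keys by default, so the records come back ordered by Name and then Hour rather than in the order of first appearance.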

pandas also makes related queries easy, such as summing the rows that match both conditions or computing several statistics per name:

In [112]: df.loc[(df.Name == 'Nick') & (df.Hour == 0), 'Value'].sum()
Out[112]: 4.96


In [121]: df.groupby('Name')['Value'].agg(['sum','mean'])
Out[121]:
        sum       mean
Name
Nick   7.06   2.353333
Sam    7.00   7.000000
Val   10.10  10.100000
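
If pandas is not an option, the defaultdict attempt from the question can probably be adapted by keying the totals on the (Name, Hour) pair instead of Name alone. This is a sketch rather than part of the original answer; totals and final_list are illustrative names.

```python
from collections import defaultdict

lst = [
    {"Name": "Nick", "Hour": 0, "Value": 2.75},
    {"Name": "Sam", "Hour": 1, "Value": 7.0},
    {"Name": "Nick", "Hour": 0, "Value": 2.21},
    {"Name": "Val", "Hour": 1, "Value": 10.1},
    {"Name": "Nick", "Hour": 1, "Value": 2.1},
    {"Name": "Val", "Hour": 1, "Value": 11},
]

# Use the (Name, Hour) tuple as the dictionary key so that records
# are only merged when both fields match.
totals = defaultdict(float)
for info in lst:
    totals[(info["Name"], info["Hour"])] += info["Value"]

# Unpack each key back into separate 'Name' and 'Hour' fields.
final_list = [
    {"Name": name, "Hour": hour, "Value": value}
    for (name, hour), value in totals.items()
]
print(final_list)
```

Because dictionaries preserve insertion order in Python 3.7 and later, the groups appear in the order each (Name, Hour) pair is first seen. The sums are ordinary floating-point additions, so round the values if exact decimal output matters.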

answered Nov 14 '22 22:11

MaxU - stop WAR against UA
