I'm trying to use the Python profiler to speed up my code. I've been able to identify the specific function where nearly all of the time is spent, but I can't figure out where in that function the time is being spent.
Below I have the profile output, which shows that "appendBallot" is the primary culprit and consumes nearly 116 seconds. Further below, I have the code for "appendBallot".
I cannot figure out from the profile output, which part of "appendBallot" I need to optimize as the next highest time entry is less than a second. I'm sure many of you could tell me just from my code, but I'd like to understand how to get that information from the profile output. Any help would be greatly appreciated.
Profile output:
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 116.168 116.168 <string>:1(<module>)
1 0.001 0.001 116.168 116.168 {execfile}
1 0.003 0.003 116.167 116.167 foo.py:1(<module>)
1 0.000 0.000 116.139 116.139 ballots.py:330(loadKnown)
1 0.000 0.000 116.109 116.109 plugins.py:148(load)
1 0.196 0.196 116.108 116.108 BltBallotLoader.py:37(loadFile)
100000 114.937 0.001 115.912 0.001 ballots.py:133(appendBallot)
100000 0.480 0.000 0.790 0.000 ballots.py:117(newBallot)
316668 0.227 0.000 0.310 0.000 ballots.py:107(getNumCandidates)
417310/417273 0.111 0.000 0.111 0.000 {len}
200510 0.071 0.000 0.071 0.000 {method 'append' of 'list' objects}
99996 0.045 0.000 0.045 0.000 {method 'add' of 'set' objects}
100000 0.042 0.000 0.042 0.000 {method 'has_key' of 'dict' objects}
1 0.000 0.000 0.030 0.030 plugins.py:202(getLoaderPluginClasses)
1 0.000 0.000 0.030 0.030 plugins.py:179(getPluginClasses)
1 0.000 0.000 0.030 0.030 plugins.py:205(getLoaderPluginClass)
3 0.016 0.005 0.029 0.010 {__import__}
1 0.022 0.022 0.025 0.025 ballots.py:1(<module>)
1 0.010 0.010 0.013 0.013 BltBallotLoader.py:1(<module>)
7 0.000 0.000 0.003 0.000 re.py:227(_compile)
Code:
def appendBallot(self, ballot, ballotID=None):
"Append a ballot to this Ballots object."
# String representation of ballot for determining whether ballot is unique
ballotString = str(list(ballot))
# Ballot as the appropriate array to conserve memory
ballot = self.newBallot(ballot)
# Assign a ballot ID if one has not been given
if ballotID is None:
ballotID = len(self.ballotIDs)
assert(ballotID not in self.ballotIDs)
self.ballotIDs.append(ballotID)
# Check to see if we have seen this ballot before
if self.uniqueBallotsLookup.has_key(ballotString):
i = self.uniqueBallotsLookup[ballotString]
self.uniqueBallotIDs[i].add(ballotID)
else:
i = len(self.uniqueBallots)
self.uniqueBallotsLookup[ballotString] = i
self.uniqueBallots.append(ballot)
self.uniqueBallotIDs.append(set([ballotID]))
self.ballotOrder.append(i)
Using cProfile: Python includes a built in module called cProfile which is used to measure the execution time of a program. cProfiler module provides all information about how long the program is executing and how many times the function get called in a program.
Profiling is a technique to figure out how time is spent in a program. With these statistics, we can find the “hot spot” of a program and think about ways of improvement. Sometimes, a hot spot in an unexpected location may hint at a bug in the program as well.
The syntax is cProfile. run(statement, filename=None, sort=-1) . You can pass python code or a function name that you want to profile as a string to the statement argument. If you want to save the output in a file, it can be passed to the filename argument.
Yeah I came across that same problem as well.
The only way I know to work around this is to wrap your large function into several smaller function calls. This will allow the profiler to take into account each of the smaller function calls.
Interesting enough, the process of doing this (for me, anyway) made it obvious where the inefficiencies were, so I didn't even have to run the profiler.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With