Edit 2: It was suggested that this is a copy of a similar question. I'd disagree since my question focuses on speed, while the other question asks what is more "readable" or "better" (without defining better). While the questions are similar, there is a big difference in the discussion/answers given.
EDIT: I realise from the questions that I could have been clearer. Sorry for the code typos; yes, it should use the proper Python operator for addition (+=).
Regarding the input data, I just chose a list of random numbers since that's a common sample. In my case I'm using a dict where I expect a lot of KeyErrors: probably 95% of the keys will not exist, and the few that do exist will contain clusters of data.
I'm interested in a general discussion though, regardless of the input data set, but of course samples with running times are interesting.
My standard approach would be, like in so many other posts, to write something like:
import random

numbers = [random.randint(0, 9) for _ in range(100)]  # 100 random numbers
d = {}
for x in numbers:
    if x in d:
        d[x] += 1
    else:
        d[x] = 1
But I just came to think that this might be faster, since we don't have to check whether the dictionary contains the key. We just assume it does, and if it doesn't, we handle that. Is there any difference, or is Python smarter than I am?
numbers = [random.randint(0, 9) for _ in range(100)]  # 100 random numbers
d = {}
for x in numbers:
    try:
        d[x] += 1
    except KeyError:
        d[x] = 1
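For what it's worth, here is a rough way to time the two variants against each other (just a sketch: the count_if/count_try names, the data, and the repeat count are arbitrary choices of mine, and results will vary with how often keys repeat):

import random
import timeit

numbers = [random.randint(0, 9) for _ in range(100)]  # few distinct keys, so most lookups hit

def count_if(data):
    # LBYL: test membership before every update
    d = {}
    for x in data:
        if x in d:
            d[x] += 1
        else:
            d[x] = 1
    return d

def count_try(data):
    # EAFP: assume the key exists and catch the miss
    d = {}
    for x in data:
        try:
            d[x] += 1
        except KeyError:
            d[x] = 1
    return d

print("if/else:   ", timeit.timeit(lambda: count_if(numbers), number=10000))
print("try/except:", timeit.timeit(lambda: count_try(numbers), number=10000))

With only ten distinct keys, almost every lookup after the first few hits an existing key, which should favour the try/except version, since no exception is actually raised.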
The same approach applies to indexing into a list or array: out-of-bounds indexes, negative indexes, and so on. An example of what I mean is sketched below.
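For example, for list indexing (just an illustration, not a benchmark):

values = [10, 20, 30]
i = 5

# LBYL: check the index before using it
if 0 <= i < len(values):
    v = values[i]
else:
    v = None

# EAFP: just index and handle the failure
# (note: negative indexes are legal in Python, so the two variants
# are not exactly equivalent for i < 0)
try:
    v = values[i]
except IndexError:
    v = None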
There is another point when it comes to coding style. Common Python style favours EAFP ("easier to ask forgiveness than permission"), which assumes the existence of valid keys and catches exceptions if the assumption proves false.
Because of this common coding style I've always used the try/except approach and was sure that it is faster than the LBYL style ("look before you leap"). As I learned from the answers here, it definitely depends. As long as you can expect the key to exist most of the time, I would go for the try/except approach.
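To illustrate why it depends, one can time a lookup that always misses against the plain membership test; raising and catching the exception is the expensive part, which is why a ~95% miss rate (as in my case) favours the if/in check. Again just a sketch, with an arbitrary number of repetitions:

import timeit

d = {}

# A lookup that always misses, handled by catching the exception (EAFP)
miss_eafp = timeit.timeit("""
try:
    d['missing']
except KeyError:
    pass
""", globals={"d": d}, number=1000000)

# The same miss handled with a membership test (LBYL): no exception is raised
miss_lbyl = timeit.timeit("""
if 'missing' in d:
    d['missing']
""", globals={"d": d}, number=1000000)

print("miss via try/except:", miss_eafp)
print("miss via 'in' test: ", miss_lbyl)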