Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to safely count dictionary keys in python [duplicate]

Tags:

python

django

I'm writing a django application where I will get a dictionary from the user that can be of variable size. I want to have a limit for how big the dictionary can be, i.e. how many (key, value) pairs it can hold. I want it to be no bigger than 200. I suspect that if I do:

if len(user_dict)>200:
    raise ValidationError("dict has too many (key, value) pairs")

python will have to count the whole dict. If the dict is huge, because of a malicious user, this will eat unnecessary processing power. Or does the dict keep track of how many objects it holds, meaning len(user_dict) is a simple lookup operation? What is the best way to solve this problem?

I was thinking something like:

i=0
for key in user_dict.keys():
    i += 1
    if i>200:
        raise ValidationError("dict has too many (key, value) pairs")
like image 503
Sahand Avatar asked Jan 04 '18 13:01

Sahand


People also ask

Can dictionary keys have duplicates Python?

Dictionaries in Python First, a given key can appear in a dictionary only once. Duplicate keys are not allowed.

What will happen if a dictionary has duplicate keys?

Dictionary literals are constructed in the order given in the source. This means that if a key is duplicated the second key-value pair will overwrite the first as a dictionary can only have one value per key.

How many identical keys can a dictionary have in Python?

Answer. No, each key in a dictionary should be unique. You can't have two keys with the same value. Attempting to use the same key again will just overwrite the previous value stored.


1 Answers

Or does the dict keep track of how many objects it holds, meaning len(user_dict) is a simple lookup operation?

A dictionary - given a serious interpreter implementation like CPython - indeed keeps track of the number of key-value pairs that are stored in the dictionary. So if user_dict is indeed a dictionary, then len(user_dict) works in O(1) and is very fast. The fact that it works in constant time also means that it makes no (theoretical) difference whether we calculate the len(..) of a dict object with 100k items, or none at all.

No iteration is necessary to count the number of objects. For instance the CPython source code for the dict class has:

static Py_ssize_t
dict_length(PyDictObject *mp)
{
    return mp->ma_used;
}

So it returns the ma_used field of the dictionary object (which is thus a field containing the number of items in the dictionary).

This is also described in this file:

Dictionaries: dict and defaultdict
                               Complexity
Operation     | Example      | Class         | Notes
--------------+--------------+---------------+-------------------------------
Index         | d[k]         | O(1)      |
Store         | d[k] = v     | O(1)      |
Length        | len(d)       | O(1)      |
Delete        | del d[k]     | O(1)      |
get/setdefault| d.method     | O(1)      |
Pop           | d.pop(k)     | O(1)      |
Pop item      | d.popitem()  | O(1)      |
Clear         | d.clear()    | O(1)      | similar to s = {} or = dict()
View          | d.keys()     | O(1)      | same for d.values()

Construction  | dict(...)    | O(len(...))   | depends # (key,value) 2-tuples

Iteration     | for k in d:  | O(N)          | all forms: keys, values, items
like image 193
Willem Van Onsem Avatar answered Nov 14 '22 21:11

Willem Van Onsem