NaNs as key in dictionaries

Can anyone explain the following behaviour to me?

>>> import numpy as np
>>> {np.nan: 5}[np.nan]
5
>>> {np.float64(np.nan): 5}[np.float64(np.nan)]
KeyError: nan

Why does it work in the first case, but not in the second? Additionally, I found that the following DOES work:

>>> a = np.float64(np.nan)
>>> {a: 5}[a]
5
hamogu asked Jun 22 '11 14:06


1 Answer

The problem here is that NaN is not equal to itself, as defined by the IEEE 754 standard for floating-point arithmetic:

>>> float("nan") == float("nan")
False
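Because == can never confirm that a value is NaN, a dedicated test is needed. A minimal sketch using the standard library's math.isnan (the variable names are only illustrative):

```python
import math

x = float("nan")

# Comparing NaN with itself is always False under IEEE 754 rules.
equal_to_itself = (x == x)

# math.isnan is the reliable way to detect a NaN float.
detected = math.isnan(x)

print(equal_to_itself, detected)
```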

When a dictionary looks up a key, it roughly does this:

  1. Compute the hash of the key to be looked up.

  2. For each key in the dict with the same hash, check if it matches the key to be looked up. This check consists of

    a. Checking for object identity: If the key in the dictionary and the key to be looked up are the same object as indicated by the is operator, the key was found.

    b. If the first check failed, check for equality using the == operator (that is, the key's __eq__ method).
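The matching rule in step 2 can be mimicked in plain Python. This is only a sketch: Probe and keys_match are hypothetical names introduced for illustration, and the real lookup lives in the interpreter's C code. Probe never compares equal to anything, much like NaN, and forces every instance into the same hash so that the comparison step is actually reached.

```python
class Probe:
    """Hypothetical key type that, like NaN, never compares equal."""

    def __hash__(self):
        return 0  # identical hash for all instances, so lookups must compare keys

    def __eq__(self, other):
        return False  # never equal, not even to itself


def keys_match(stored, wanted):
    # Illustrative version of step 2: identity first, equality second.
    return stored is wanted or stored == wanted


p = Probe()
q = Probe()
table = {p: "found"}

same_object_hit = p in table      # identity check succeeds
distinct_object_hit = q in table  # identity fails, equality fails

print(same_object_hit, distinct_object_hit)
```

The built-in dict agrees with keys_match here: the very same object is found even though it does not equal itself, while a distinct object with the same hash is not.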

The first example succeeds because np.nan and np.nan are the same object, so it does not matter that they do not compare equal:

>>> np.nan is np.nan
True

In the second case, np.float64(np.nan) and np.float64(np.nan) are not the same object -- the two constructor calls create two distinct objects:

>>> np.float64(np.nan) is np.float64(np.nan)
False

Since the objects also do not compare equal, the dictionary concludes that the key is not present and raises a KeyError.
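The same contrast can be reproduced without NumPy. In the sketch below, math.nan plays the role of the shared np.nan object and each float("nan") call plays the role of a fresh np.float64(np.nan); the variable names are only illustrative.

```python
import math

# Same object on both sides: the identity check finds the key.
shared = math.nan
lookup_same = {shared: 5}[shared]

# Two separate NaN objects: identity fails, equality fails, KeyError.
try:
    {float("nan"): 5}[float("nan")]
    lookup_distinct = "found"
except KeyError:
    lookup_distinct = "KeyError"

print(lookup_same, lookup_distinct)
```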

You can even store two distinct NaN objects as separate keys in the same dictionary:

>>> a = float("nan")
>>> b = float("nan")
>>> {a: 1, b: 2}
{nan: 1, nan: 2}

In conclusion, it is saner to avoid NaN as a dictionary key.
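If NaN values cannot be kept out of the keys entirely, one possible workaround is to map every NaN to a single shared stand-in before it touches the dictionary. The sketch below is an assumption for illustration, not part of the original answer; NAN_SENTINEL and normalize_key are hypothetical names.

```python
import math

NAN_SENTINEL = object()  # hypothetical stand-in representing "any NaN"


def normalize_key(key):
    """Map every float NaN to one shared sentinel; leave other keys alone."""
    if isinstance(key, float) and math.isnan(key):
        return NAN_SENTINEL
    return key


counts = {}
for value in [1.0, float("nan"), float("nan"), 1.0]:
    k = normalize_key(value)
    counts[k] = counts.get(k, 0) + 1

print(len(counts))
```

All NaN values now land on one key, because the sentinel is a single object that is found by identity.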

Sven Marnach answered Nov 12 '22 08:11