Consider the numpy.array i:
import numpy as np
import pandas as pd

i = np.empty((1,), dtype=object)
i[0] = [1, 2]
i
array([list([1, 2])], dtype=object)
Example 1: index
df = pd.DataFrame([1], index=i)
df
        0
[1, 2]  1
Example 2: columns
But using the same array as the columns:
df = pd.DataFrame([1], columns=i)
leads to this error when I display it:
df
TypeError: unhashable type: 'list'
However, df.T works!?
Question
Why is it necessary for index values to be hashable in a column context but not in an index context? And why only when it's displayed?
This comes from how pandas internally builds the string representation of the DataFrame object. The key difference between column labels and index labels here is that each column determines how its values are formatted in that representation (a column could hold floats, ints, etc.), while index labels play no such role.
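The error happens because pandas stores a separate formatter object for each column in a dictionary, and this object is retrieved using the column name. Dictionary keys must be hashable, so looking up a list label raises the TypeError. The lookup only runs when the string representation is built, which is why constructing the DataFrame succeeds and only displaying it fails. It also explains why df.T works: after transposing, the list is an index label, and index labels are not used as keys in this lookup. Specifically, the line that triggers the error is https://github.com/pandas-dev/pandas/blob/d1accd032b648c9affd6dce1f81feb9c99422483/pandas/io/formats/format.py#L420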
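To see the mechanism without pandas, here is a minimal sketch of a per-column formatter lookup. The names formatters and get_formatter are hypothetical stand-ins for illustration, not pandas' actual code. The only assumption is the one described above: the formatter is fetched from a dictionary keyed by the column label.

```python
# Hypothetical stand-in for a per-column formatter lookup.
# This is an illustration only, not pandas' real implementation.
formatters = {}  # maps column label -> formatting function


def get_formatter(column_label):
    # dict.get hashes the key before looking it up, so an
    # unhashable label fails here even though the dict is empty.
    return formatters.get(column_label, None)


# A hashable label is fine: no formatter registered, so we get None.
print(get_formatter("a"))

# A list label cannot be hashed, so the lookup itself raises TypeError.
try:
    get_formatter([1, 2])
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

Nothing here touches an index, which matches the observation that the same list is harmless as an index label and only causes trouble once it is a column label being displayed.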