
Can I change the way keys are compared in a Python dict? I want to use the operator 'is' instead of ==

Let's say I have two objects of the same class: objA and objB. Their relationship is the following:

(objA == objB)    # True
(objA is objB)    # False

If I use both objects as keys in a Python dict, then they will be considered as the same key, and overwrite each other. Is there a way to override the dict comparator to use the is comparison instead of == so that the two objects will be seen as different keys in the dict?

Maybe I can override the equals method in the class or something? To be more specific, I am talking about two Tag objects from the BeautifulSoup4 library.

Here's a more specific example of what I am talking about:

from bs4 import BeautifulSoup

HTML_string = "<html><h1>some_header</h1><h1>some_header</h1></html>"

HTML_soup = BeautifulSoup(HTML_string, 'lxml')

first_h1 = HTML_soup.find_all('h1')[0]      #first_h1 = <h1>some_header</h1>
second_h1 = HTML_soup.find_all('h1')[1]     #second_h1 = <h1>some_header</h1>

print(first_h1 == second_h1)        # this prints True
print(first_h1 is second_h1)        # this prints False

my_dict = {}
my_dict[first_h1] = 1
my_dict[second_h1] = 1

print(len(my_dict))                 # my dict has only 1 entry!

# I want to have 2 entries in my_dict: one for key 'first_h1', one for key 'second_h1'.

David Simka asked Jun 16 '17 at 18:06



2 Answers

first_h1 and second_h1 are Tag class instances. When you do my_dict[first_h1] or my_dict[second_h1], string representations of the tags are used for hashing. The problem is, both of these Tag instances have the same string representations:

<h1>some_header</h1>

This is because the Tag class defines the __hash__() magic method as follows:

def __hash__(self):
    return str(self).__hash__()
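
To see why the two tags collapse into a single key, here is a minimal sketch. It uses a hypothetical stand-in class, FakeTag (not part of bs4), which hashes and compares by its markup string, imitating the behaviour shown for Tag in the question. A dict treats two keys as the same entry when their hashes match and they compare equal, so the second assignment overwrites the first.

```python
class FakeTag:
    """Hypothetical stand-in for bs4's Tag: equality and hash follow the markup."""

    def __init__(self, markup):
        self.markup = markup

    def __str__(self):
        return self.markup

    def __eq__(self, other):
        return isinstance(other, FakeTag) and str(self) == str(other)

    def __hash__(self):
        return hash(str(self))


first = FakeTag("<h1>some_header</h1>")
second = FakeTag("<h1>some_header</h1>")

d = {}
d[first] = 1
d[second] = 2  # same hash and first == second, so this overwrites the entry

collapsed_len = len(d)      # only one key survives
collapsed_value = d[first]  # and it now holds the second value
```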

One workaround would be to use the id() values as hashes, but that would mean redefining the Tag class inside BeautifulSoup itself. You can work around that problem by making your own custom "tag wrapper":

class TagWrapper:
    def __init__(self, tag):
        self.tag = tag

    def __hash__(self):
        return id(self.tag)

    def __str__(self):
        return str(self.tag)

    def __repr__(self):
        return str(self.tag)

Then, you'll be able to do:

In [1]: from bs4 import BeautifulSoup
   ...: 

In [2]: class TagWrapper:
   ...:     def __init__(self, tag):
   ...:         self.tag = tag
   ...: 
   ...:     def __hash__(self):
   ...:         return id(self.tag)
   ...: 
   ...:     def __str__(self):
   ...:         return str(self.tag)
   ...: 
   ...:     def __repr__(self):
   ...:         return str(self.tag)
   ...:     

In [3]: HTML_string = "<html><h1>some_header</h1><h1>some_header</h1></html>"
   ...: 
   ...: HTML_soup = BeautifulSoup(HTML_string, 'lxml')
   ...: 

In [4]: first_h1 = HTML_soup.find_all('h1')[0]      #first_h1 = <h1>some_header</h1>
   ...: second_h1 = HTML_soup.find_all('h1')[1]     #second_h1 = <h1>some_header</h1>
   ...: 

In [5]: my_dict = {}
   ...: my_dict[TagWrapper(first_h1)] = 1
   ...: my_dict[TagWrapper(second_h1)] = 1
   ...: 
   ...: print(my_dict)
   ...: 
{<h1>some_header</h1>: 1, <h1>some_header</h1>: 1}

It is, though, neither pretty nor very convenient to use. I would revisit your initial problem and check whether you actually need to put tags into a dictionary.
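
One caveat with TagWrapper as written: it defines __hash__ but no __eq__, so two wrappers around the same tag do not compare equal, and looking up my_dict[TagWrapper(first_h1)] with a fresh wrapper raises KeyError. A possible refinement, sketched below, adds an identity-based __eq__. The names IdentityKey and FakeTag are hypothetical and used only for illustration; FakeTag merely imitates the Tag behaviour from the question so the example runs without bs4.

```python
class FakeTag:
    """Hypothetical stand-in for bs4's Tag (equal markup means equal objects)."""

    def __init__(self, markup):
        self.markup = markup

    def __eq__(self, other):
        return isinstance(other, FakeTag) and self.markup == other.markup

    def __hash__(self):
        return hash(self.markup)


class IdentityKey:
    """Wraps any object so that dict lookups compare with `is` instead of `==`."""

    def __init__(self, obj):
        self.obj = obj

    def __hash__(self):
        return id(self.obj)

    def __eq__(self, other):
        # Two wrappers are equal only when they wrap the very same object.
        return isinstance(other, IdentityKey) and self.obj is other.obj


first = FakeTag("<h1>some_header</h1>")
second = FakeTag("<h1>some_header</h1>")

my_dict = {IdentityKey(first): "first", IdentityKey(second): "second"}

entries = len(my_dict)
found = my_dict[IdentityKey(first)]  # a fresh wrapper still finds the entry
```

Because each wrapper holds a reference to its object, the object stays alive for as long as the key is in the dict, so its id() cannot be reused by another object in the meantime.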

You can also monkey-patch bs4 using Python's introspection powers, like it was done here, but that enters rather dangerous territory.

alecxe answered Sep 27 '22 at 22:09


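It seems you want to override the == operator. One option is to build a new class that implements == as an identity check, together with a matching __hash__ so that its instances remain usable as dict keys:

def __eq__(self, obj):
    return self is obj

def __hash__(self):
    return id(self)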
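
Defining __eq__ alone is not enough for dict keys in Python 3: a class that overrides __eq__ without defining __hash__ has its __hash__ set to None and becomes unhashable. The sketch below illustrates this with two hypothetical classes, EqOnly and EqAndHash, which are not taken from the answer or from bs4.

```python
class EqOnly:
    """Overrides == only; Python 3 then makes instances unhashable."""

    def __eq__(self, other):
        return self is other


class EqAndHash:
    """Overrides == and supplies an identity-based hash to match."""

    def __eq__(self, other):
        return self is other

    def __hash__(self):
        return id(self)


try:
    {EqOnly(): 1}
    eq_only_hashable = True
except TypeError:
    # Raised because EqOnly.__hash__ is None.
    eq_only_hashable = False

a, b = EqAndHash(), EqAndHash()
d = {a: 1, b: 2}  # two distinct objects, so two distinct keys
```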

Gefen Morami answered Sep 27 '22 at 22:09