
Can I change the way keys are compared in a Python dict? I want to use the operator 'is' instead of ==


Let's say I have two objects of the same class, objA and objB. Their relationship is the following:

(objA == objB)    # True
(objA is objB)    # False

If I use both objects as keys in a Python dict, then they will be considered as the same key, and overwrite each other. Is there a way to override the dict comparator to use the is comparison instead of == so that the two objects will be seen as different keys in the dict?

Maybe I can override the equals method in the class or something? To be more specific, I am talking about two Tag objects from the BeautifulSoup4 library.

Here's a more specific example of what I am talking about:

from bs4 import BeautifulSoup

HTML_string = "<html><h1>some_header</h1><h1>some_header</h1></html>"

HTML_soup = BeautifulSoup(HTML_string, 'lxml')

first_h1 = HTML_soup.find_all('h1')[0]      #first_h1 = <h1>some_header</h1>
second_h1 = HTML_soup.find_all('h1')[1]     #second_h1 = <h1>some_header</h1>

print(first_h1 == second_h1)        # this prints True
print(first_h1 is second_h1)        # this prints False

my_dict = {}
my_dict[first_h1] = 1
my_dict[second_h1] = 1

print(len(my_dict))                 # my dict has only 1 entry!

# I want to have 2 entries in my_dict: one for key 'first_h1', one for key 'second_h1'.


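first_h1 and second_h1 are instances of the Tag class. A dict treats two keys as the same key when their hashes match and they compare equal with ==. When you do my_dict[first_h1] or my_dict[second_h1], the string representations of the tags are used for hashing, and the two tags also compare equal, as your first print shows. The problem is that both Tag instances have the same string representation: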

<h1>some_header</h1>

The hashes match because the Tag class defines the __hash__() magic method as follows:

def __hash__(self):
    return str(self).__hash__()
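To see the effect without installing bs4, here is a minimal sketch using a hypothetical stand-in class, FakeTag, that hashes and compares by its string form. It is only an illustration of the behaviour described above, not the library's actual implementation:

```python
class FakeTag:
    """Hypothetical stand-in for a bs4 Tag: hash and equality follow the markup."""

    def __init__(self, markup):
        self.markup = markup

    def __str__(self):
        return self.markup

    def __eq__(self, other):
        return isinstance(other, FakeTag) and str(self) == str(other)

    def __hash__(self):
        # Same idea as the __hash__ quoted above: hash the string form
        return hash(str(self))


first = FakeTag("<h1>some_header</h1>")
second = FakeTag("<h1>some_header</h1>")

d = {}
d[first] = 1
d[second] = 2  # same hash and ==, so this overwrites the entry for `first`

collision_len = len(d)                      # only one entry survives
kept_key_is_first = next(iter(d)) is first  # the dict keeps the original key object
```

Both conditions have to hold for the keys to collide: a matching hash alone only puts the two objects in the same bucket, and the == check then decides whether they count as one key.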

One workaround would be to use the id() values as hashes, but that would mean redefining the Tag class inside BeautifulSoup itself. You can work around that problem by making your own custom "tag wrapper":

class TagWrapper:
    def __init__(self, tag):
        self.tag = tag

    def __eq__(self, other):
        # Equal only when both wrappers hold the very same Tag object, so a
        # fresh TagWrapper(tag) can still look up an existing entry
        return isinstance(other, TagWrapper) and self.tag is other.tag

    def __hash__(self):
        return id(self.tag)

    def __str__(self):
        return str(self.tag)

    def __repr__(self):
        return str(self.tag)

Then, you'll be able to do:

In [1]: from bs4 import BeautifulSoup
   ...: 

In [2]: class TagWrapper:
   ...:     def __init__(self, tag):
   ...:         self.tag = tag
   ...: 
   ...:     def __eq__(self, other):
   ...:         return isinstance(other, TagWrapper) and self.tag is other.tag
   ...: 
   ...:     def __hash__(self):
   ...:         return id(self.tag)
   ...: 
   ...:     def __str__(self):
   ...:         return str(self.tag)
   ...: 
   ...:     def __repr__(self):
   ...:         return str(self.tag)
   ...:     

In [3]: HTML_string = "<html><h1>some_header</h1><h1>some_header</h1></html>"
   ...: 
   ...: HTML_soup = BeautifulSoup(HTML_string, 'lxml')
   ...: 

In [4]: first_h1 = HTML_soup.find_all('h1')[0]      #first_h1 = <h1>some_header</h1>
   ...: second_h1 = HTML_soup.find_all('h1')[1]     #second_h1 = <h1>some_header</h1>
   ...: 

In [5]: my_dict = {}
   ...: my_dict[TagWrapper(first_h1)] = 1
   ...: my_dict[TagWrapper(second_h1)] = 1
   ...: 
   ...: print(my_dict)
   ...: 
{<h1>some_header</h1>: 1, <h1>some_header</h1>: 1}

It is, though, not pretty and not very convenient to use. I would revisit your original problem and check whether you actually need to put tags into a dictionary.

You can also monkey-patch bs4 using Python's introspection powers, as was done here, but that is rather dangerous territory.
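If the tags really do need to act as keys, another option that avoids touching bs4 is a small mapping that files its entries under id(key). The sketch below is an assumption-laden illustration: IdentityDict is a hypothetical name, not something from bs4 or the standard library, and plain lists stand in for the tags.

```python
from collections.abc import MutableMapping


class IdentityDict(MutableMapping):
    """Hypothetical mapping that tells keys apart with `is` rather than `==`."""

    def __init__(self):
        # id(key) -> (key, value). Storing the key keeps it alive, so its
        # id() cannot be reused by another object while the entry exists.
        self._entries = {}

    def __getitem__(self, key):
        return self._entries[id(key)][1]

    def __setitem__(self, key, value):
        self._entries[id(key)] = (key, value)

    def __delitem__(self, key):
        del self._entries[id(key)]

    def __iter__(self):
        return (key for key, _ in self._entries.values())

    def __len__(self):
        return len(self._entries)


a = [1, 2]
b = [1, 2]  # a == b, but a is not b

d = IdentityDict()
d[a] = "first"
d[b] = "second"
```

Because the mapping never calls hash() or == on the keys, it works the same way for objects with value-based equality, such as the two h1 tags, and even for unhashable objects like the lists used here.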

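It seems you want to override the == operator. One option is to build a new class and implement __eq__ so that it compares by identity. Defining __eq__ alone makes the class unhashable in Python 3, so define __hash__ as well:

def __eq__(self, obj):
    return self is obj

def __hash__(self):
    return id(self)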
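As a rough sketch of how that looks on a complete class, the example below uses two hypothetical classes, Point and IdentityPoint, in place of the bs4 types. For Tag objects themselves this only helps if you control how the instances are created, which BeautifulSoup normally does for you.

```python
class Point:
    """Hypothetical value-like class: equal coordinates mean equal objects."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        return hash((self.x, self.y))


class IdentityPoint(Point):
    """Same data, but compared and hashed by identity."""

    def __eq__(self, other):
        return self is other

    # Overriding __eq__ resets __hash__ to None, so restore the identity hash
    __hash__ = object.__hash__


value_keys = {Point(1, 2): "a", Point(1, 2): "b"}  # equal keys collapse into one

p, q = IdentityPoint(1, 2), IdentityPoint(1, 2)
identity_keys = {p: "p", q: "q"}                   # distinct objects stay distinct
```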

Tags: python, dictionary, beautifulsoup, equals

David Simka

2 Answers: alecxe, Gefen Morami
