Creating a list of dictionaries results in a list of copies of the same dictionary

I want to get all the iframes from a webpage.

Code:

    site = "http://" + url
    f = urllib2.urlopen(site)
    web_content = f.read()

    soup = BeautifulSoup(web_content)
    info = {}
    content = []
    for iframe in soup.find_all('iframe'):
        info['src'] = iframe.get('src')
        info['height'] = iframe.get('height')
        info['width'] = iframe.get('width')
        content.append(info)
        print(info)

    pprint(content)

result of print(info):

    {'src': u'abc.com', 'width': u'0', 'height': u'0'}
    {'src': u'xyz.com', 'width': u'0', 'height': u'0'}
    {'src': u'http://www.detik.com', 'width': u'1000', 'height': u'600'}

result of pprint(content):

    [{'height': u'600', 'src': u'http://www.detik.com', 'width': u'1000'},
     {'height': u'600', 'src': u'http://www.detik.com', 'width': u'1000'},
     {'height': u'600', 'src': u'http://www.detik.com', 'width': u'1000'}]

Why is the value of content not right? It's supposed to be the same as the values shown when I print(info).

l1th1um asked Jul 15 '12 14:07



2 Answers

You are not creating a separate dictionary for each iframe. You keep modifying the same dictionary over and over, and you keep appending additional references to that one dictionary to your list.

Remember, when you do something like content.append(info), you aren't making a copy of the data, you are simply appending a reference to the data.
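
A minimal sketch of that point, written in Python 3 syntax (the question uses Python 2, but the behavior is the same). The names info and content mirror the question; the values are made up for illustration:

```python
info = {}
content = []

info['src'] = 'abc.com'
content.append(info)        # stores a reference to info, not a copy

info['src'] = 'xyz.com'     # mutates the one dictionary both entries refer to
content.append(info)        # a second reference to the same dictionary

print(content[0] is info)   # True: the list element is the very same object
print(content)              # both entries show the latest value
```

Because both list entries are the same object, the earlier value 'abc.com' is no longer visible anywhere once it has been overwritten.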

You need to create a new dictionary for each iframe.

    for iframe in soup.find_all('iframe'):
        info = {}
        ...

Even better, you don't need to create an empty dictionary first. Just create it all at once:

    for iframe in soup.find_all('iframe'):
        info = {
            "src": iframe.get('src'),
            "height": iframe.get('height'),
            "width": iframe.get('width'),
        }
        content.append(info)

There are other ways to accomplish this, such as iterating over a list of attributes, or using list or dictionary comprehensions, but it's hard to improve upon the clarity of the above code.
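
For comparison, here is a rough sketch of the comprehension approach, in Python 3 syntax. To keep it runnable without fetching a page, iframes is a hypothetical stand-in list of plain dictionaries; in the real code it would be soup.find_all('iframe'), whose tags also support .get(). The attrs tuple is likewise just an illustrative name.

```python
# Stand-in for soup.find_all('iframe'); plain dicts support .get() like tags do.
iframes = [
    {'src': 'abc.com', 'height': '0', 'width': '0'},
    {'src': 'http://www.detik.com', 'height': '600', 'width': '1000'},
]

attrs = ('src', 'height', 'width')

# The inner dict comprehension builds a brand-new dictionary per iframe,
# so no two list entries share the same object.
content = [{name: iframe.get(name) for name in attrs} for iframe in iframes]

print(content)
```

This is more compact, but whether it reads better than the explicit loop is a matter of taste.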

Bryan Oakley answered Oct 14 '22 13:10


You have misunderstood the Python list object. It is similar to a C array of pointers: it does not actually "copy" the object you append to it. Instead, it just stores a "pointer" to that object.

Try the following code:

    >>> d={}
    >>> dlist=[]
    >>> for i in xrange(0,3):
            d['data']=i
            dlist.append(d)
            print(d)

    {'data': 0}
    {'data': 1}
    {'data': 2}
    >>> print(dlist)
    [{'data': 2}, {'data': 2}, {'data': 2}]

So why is print(dlist) not the same as print(d)?

The following code shows you the reason:

    >>> for i in dlist:
            print "the list item point to object:", id(i)

    the list item point to object: 47472232
    the list item point to object: 47472232
    the list item point to object: 47472232

So you can see that all the items in dlist are actually pointing to the same dict object.

The real answer to this question is to append a copy of the target item, using d.copy().

    >>> dlist=[]
    >>> for i in xrange(0,3):
            d['data']=i
            dlist.append(d.copy())
            print(d)

    {'data': 0}
    {'data': 1}
    {'data': 2}
    >>> print dlist
    [{'data': 0}, {'data': 1}, {'data': 2}]

Try the id() trick again, and you can see that the list items now point to completely different objects.

    >>> for i in dlist:
            print "the list item points to object:", id(i)

    the list item points to object: 33861576
    the list item points to object: 47472520
    the list item points to object: 47458120
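
The same comparison can be sketched in Python 3 syntax, where range and the print() function replace xrange and the print statement. Counting distinct id() values is just one illustrative way to check; the names shared and copies are not from the session above.

```python
d = {}
shared = []   # every element will be the same dict object
copies = []   # every element will be an independent shallow copy

for i in range(3):
    d['data'] = i
    shared.append(d)
    copies.append(d.copy())

# Number of distinct objects held by each list.
print(len({id(item) for item in shared}))   # 1
print(len({id(item) for item in copies}))   # 3
print(copies)                               # [{'data': 0}, {'data': 1}, {'data': 2}]
```

Note that d.copy() makes a shallow copy, so any nested mutable values would still be shared between the copies; copy.deepcopy from the standard library covers that case.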

Wang answered Oct 14 '22 14:10