I am working with the web scraping framework Scrapy, and I am wondering how to iterate over all of the scraped items, which seem to be in a dictionary, and strip the whitespace from each one.
Here is the code I have been playing with in my item pipeline:
    for info in item:
        info[info].lstrip()
But this code does not work, because I cannot select items individually. So I tried to do this:
    for key, value in item.items():
        value[1].lstrip()
This second method works to a degree, but the problem is that I have no idea how then to loop over all of the values.
I know this is probably such an easy fix, but I cannot seem to find it.
To remove a key from a dictionary in Python, use the pop() method or the “del” keyword.
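As a small illustration of the two options, here is a sketch using a made-up dictionary `d` (the keys and values are placeholders, not taken from the question):

```python
# Hypothetical sample data, used only for illustration.
d = {"a": 1, "b": 2, "c": 3}

# pop() removes the key and returns its value.
removed = d.pop("a")

# With a default, pop() does not raise KeyError for a missing key.
missing = d.pop("zzz", None)

# del removes the key without returning anything; it raises KeyError if the key is absent.
del d["b"]

print(d)
```

pop() is the handier choice when you still need the removed value or when the key may not exist.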
Let's see how to remove spaces from dictionary keys in Python. Method #1: using the translate() function. Here we visit each key one by one and delete its spaces. The translation table passed to translate() maps 32 to None, where 32 is the ASCII value of the space character ' ', so every space is removed.
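A minimal sketch of the translate() approach on Python 3, assuming a sample dictionary `d` whose keys contain spaces (the data is invented for the example):

```python
# Hypothetical sample data with spaces in the keys.
d = {"first name": "Ada", "last name": "Lovelace"}

# str.translate() takes a table mapping code points to replacements;
# mapping 32 (the space character) to None deletes every space.
no_space_keys = {key.translate({32: None}): value for key, value in d.items()}

print(no_space_keys)  # {'firstname': 'Ada', 'lastname': 'Lovelace'}
```

Note that this builds a new dictionary rather than renaming keys in place, which avoids modifying a dictionary while iterating over it.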
First, you need to convert the dictionary keys to a list using list(dict.keys()). During each iteration, you can check whether the value of a key is equal to the desired value. If it is, you can issue the del statement to delete the key.
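The steps above could look roughly like this; `d` and `unwanted` are illustrative names, not from the original text:

```python
# Hypothetical sample data.
d = {"a": 1, "b": 2, "c": 1}
unwanted = 1

# Iterate over a copy of the keys so the dictionary can be changed safely inside the loop.
for key in list(d.keys()):
    if d[key] == unwanted:
        del d[key]

print(d)  # {'b': 2}
```

Copying the keys first matters: deleting from a dictionary while iterating over it directly raises a RuntimeError.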
Python dictionary clear() method: the clear() method removes all the elements from a dictionary.
Use a dictionary comprehension (available in Python >= 2.7):
    clean_d = {k: v.strip() for k, v in d.iteritems()}
Python 3.X:
    clean_d = {k: v.strip() for k, v in d.items()}
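In a Scrapy pipeline the field values are often lists of strings rather than single strings (the `value[1]` in the question suggests as much), in which case calling strip() on the value itself fails. Here is a minimal sketch for Python 3 that handles both cases. The names `strip_values` and `StripWhitespacePipeline` are made up for this example; the only Scrapy convention relied on is that a pipeline exposes `process_item(item, spider)` and that the item supports dict-style access.

```python
def strip_values(item):
    """Return a plain dict with whitespace stripped from every string value."""
    cleaned = {}
    for key, value in item.items():
        if isinstance(value, str):
            cleaned[key] = value.strip()
        elif isinstance(value, (list, tuple)):
            # Strip each string element; leave non-string elements untouched.
            cleaned[key] = [v.strip() if isinstance(v, str) else v for v in value]
        else:
            # Numbers, None, and other types are passed through unchanged.
            cleaned[key] = value
    return cleaned


class StripWhitespacePipeline:
    """Hypothetical item pipeline that cleans every field of each item."""

    def process_item(self, item, spider):
        # Write the cleaned values back so later pipelines see the stripped item.
        for key, value in strip_values(item).items():
            item[key] = value
        return item


# Usage with a plain dict standing in for a scraped item.
sample = {"title": "  Hello ", "tags": [" a", "b  "], "price": 3}
result = StripWhitespacePipeline().process_item(sample, spider=None)
print(result)  # {'title': 'Hello', 'tags': ['a', 'b'], 'price': 3}
```

The key point for the original question is that strip() returns a new string, so the result has to be assigned back to the item; calling it without using the return value changes nothing.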
Not a direct answer to the question, but I would suggest you look at Item Loaders and input/output processors. A lot of your cleanup can be taken care of there.
An example which strips each entry would be:
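    from scrapy.loader import ItemLoader
    from itemloaders.processors import MapCompose  # scrapy.loader.processors in older Scrapy releases

    class StrippingItemLoader(ItemLoader):  # renamed so it does not shadow the base class
        default_output_processor = MapCompose(unicode.strip)  # use str.strip on Python 3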