I am trying to iterate through a list of nested JSON objects (returned from the Twitter REST API via tweepy.api.search) and delete certain objects. I want to specify which dictionary keys to keep rather than which to delete, because different tweets have different keys. They all share some keys, such as "text" and "created_at", but other keys appear only in certain tweets.
I am running into two problems.
1) I cannot delete a dictionary item while iterating through the dictionary
2) Many of the dictionary objects contain nested lists and dictionaries which I am having trouble accessing
A small portion of the JSON file I'm iterating through:
{
    "statuses": [
        {
            "contributors": null,
            "coordinates": null,
            "created_at": "Thu Nov 12 01:28:07 +0000 2015",
            "entities": {
                "hashtags": [],
                "symbols": [],
                "urls": [
                    {
                        "display_url": "twitter.com/thehill/status\u2026",
                        "expanded_url": "https://twitter.com/thehill/status/664581138975989761",
                        "indices": [
                            139,
                            140
                        ],
                        "url": "https://t.co/9zfkg2FixZ"
                    }
                ],
                "user_mentions": [
                    {
                        "id": 2517854953,
                        "id_str": "2517854953",
                        "indices": [
                            3,
                            19
                        ],
                        "name": "It'sAlwaysPolitical",
                        "screen_name": "politicspodcast"
                    }
                ]
            },
            "favorite_count": 0,
            "favorited": false,
            "geo": null
        }
    ]
}
Each item in the list "statuses" is one tweet, and there are 100 tweets returned per call.
List of items that I want to keep:
keepers_list = [tweetlist["statuses"][i]["coordinates"],
                tweetlist["statuses"][i]["created_at"],
                tweetlist["statuses"][i]["entities"]["urls"]
                ]
I am trying to do:
for item in tweetlist:
    if item not in keepers_list:
        del item
I have tried this exact code and more variations and alternative methods than I can recall, but I cannot make it work. I have looked at numerous Stack Exchange posts on this topic, but have not been able to adapt any of them to my purpose.
I have tried using
for key in dict.iterkeys(): ...
for value in dict.itervalues(): ...
for key, value in dict.iteritems():
but I cannot make any of them work for what I want to do.
Any help, or just a push in the right direction would be greatly appreciated.
Never delete items from a list while iterating over it. You can either:
Make a copy of the list to iterate over:
for item in tweetlist[:]:
    ...
Save your desired results in another list:
keep = []
for item in tweetlist:
    if item in keepers_list:
        keep.append(item)
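Applied to a single tweet dictionary rather than a list, the copy approach might look like the sketch below. It assumes the keepers are expressed as top-level key names, not values; the function name prune_in_place and the sample tweet are made up for illustration.

```python
def prune_in_place(tweet, keep_keys):
    """Delete every top-level key of `tweet` that is not in `keep_keys`."""
    # list(tweet) takes a snapshot of the keys, so deleting from the
    # dictionary inside the loop does not disturb the iteration.
    for key in list(tweet):
        if key not in keep_keys:
            del tweet[key]
    return tweet


# Hypothetical, trimmed-down tweet for demonstration only.
tweet = {
    "coordinates": None,
    "created_at": "Thu Nov 12 01:28:07 +0000 2015",
    "favorite_count": 0,
    "geo": None,
}
prune_in_place(tweet, {"coordinates", "created_at"})
```

Iterating over the dictionary directly and deleting inside the loop raises a RuntimeError, which is why the snapshot is needed.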
My general rule of thumb in Python is that if I find myself writing a loop, I look for a different approach first. In this case, that means a dictionary comprehension built from the original entry, with keepers_list holding the names of the keys to keep:
keep = {key: tweetlist[key] for key in tweetlist if key in keepers_list}
Unless the original dataset is so large that it has to be processed in place, a comprehension is generally fast and, if relatively short, self-documenting enough to be easily understood.
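For the nested case in the question (keeping "entities" -> "urls" but not the rest of "entities"), a comprehension over plain key names is not quite enough. One possible sketch, assuming the keepers are written as tuples of keys describing a path into each tweet: KEEP_PATHS, slim, and the sample data are hypothetical names and values, not part of tweepy.

```python
# Each tuple is a path of keys into a tweet; these names are assumptions
# chosen to match the three items listed in the question.
KEEP_PATHS = [("coordinates",), ("created_at",), ("entities", "urls")]


def slim(tweet, paths=KEEP_PATHS):
    """Return a new dict holding only the values found at `paths`."""
    slimmed = {}
    for path in paths:
        value = tweet
        try:
            for key in path:
                value = value[key]
        except (KeyError, TypeError):
            # This tweet lacks the key (or a parent is null): skip it.
            continue
        # Rebuild the nesting in the result, then store the value.
        target = slimmed
        for key in path[:-1]:
            target = target.setdefault(key, {})
        target[path[-1]] = value
    return slimmed


# Made-up sample shaped like the search response in the question.
tweetlist = {"statuses": [
    {"coordinates": None,
     "created_at": "Thu Nov 12 01:28:07 +0000 2015",
     "entities": {"hashtags": [], "urls": [{"url": "https://t.co/9zfkg2FixZ"}]},
     "favorite_count": 0},
    {"created_at": "Thu Nov 12 01:28:07 +0000 2015", "geo": None},
]}

# One slimmed dict per tweet; the original response is left untouched.
slimmed_statuses = [slim(t) for t in tweetlist["statuses"]]
```

Because this builds new dictionaries instead of deleting from the originals, the "cannot delete while iterating" problem never comes up, and tweets that are missing a key are handled without special cases.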