Using Python and the Twitter API to get tweet objects.
I have a file (tweetfile, a .txt file on my computer) containing tweets, and I'm trying to loop through the objects to get the text. I checked the tweet object with tweetObj.keys() to see its keys, and 'text' is there; however, when I try to get the text with tweetObj['text'], I get KeyError: 'text'.
Code:
    for line in tweetfile:
        tweetObj = json.loads(line)
        keys = tweetObj.keys()
        print keys
        tweet = tweetObj['text']
        print tweet
Below is the output:
[u'contributors', u'truncated', u'text', u'in_reply_to_status_id', u'id', u'favorite_count', u'source', u'retweeted', u'coordinates', u'entities', u'in_reply_to_screen_name', u'id_str', u'retweet_count', u'in_reply_to_user_id', u'favorited', u'user', u'geo', u'in_reply_to_user_id_str', u'possibly_sensitive', u'lang', u'created_at', u'filter_level', u'in_reply_to_status_id_str', u'place']
@awe5sauce my dad was like "so u wanna be in a relationship with a 'big dumb idiot'" nd i was like yah shes the bae u feel lmao
[u'delete']
Traceback (most recent call last):
  File "C:\apps\droid\a1\tweets.py", line 34, in <module>
    main()
  File "C:\apps\droid\a1\tweets.py", line 28, in main
    tweet = tweetObj['text']
KeyError: 'text'
I'm not sure how to approach this, since it does print one tweet. Why does this happen when the key exists and returns a value for some lines but not for all of them, and how can I change the code so that I get the value for every line that has that key?
The loop creates one dictionary per line, and your output shows two of them. The first one has a 'text' key. The second one has only a 'delete' key and no 'text' key, hence the KeyError. A record like that is most likely a deletion notice from the stream rather than a tweet.
Change it to:
    for line in tweetfile:
        tweetObj = json.loads(line)
        keys = tweetObj.keys()
        print keys
        if 'text' in tweetObj:
            print tweetObj['text']
        else:
            print 'This does not have a text entry'
Just so you know, if you are only interested in the lines containing 'text', you may want to use
[ json.loads(l)['text'] for l in tweetfile if 'text' in json.loads(l) ]
or
'\n'.join([ json.loads(l)['text'] for l in tweetfile if 'text' in json.loads(l) ])
or, more simply, if a None placeholder for each line without a 'text' key is acceptable:
[ json.loads(l).get('text') for l in tweetfile]
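The filtering one-liners above call json.loads twice per line. A minimal sketch of a variant that parses each line once and keeps only the records with a 'text' key is below. The helper name iter_tweet_texts and the sample lines are made up for illustration and are not part of the original code; the sample stands in for the contents of tweetfile.

```python
import json


def iter_tweet_texts(lines):
    """Yield the 'text' value of every JSON line that has one.

    Hypothetical helper: 'lines' can be an open file or any iterable of strings.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines, which json.loads would reject
        record = json.loads(line)  # parse each line exactly once
        if isinstance(record, dict) and 'text' in record:
            yield record['text']


# Made-up sample lines standing in for the contents of tweetfile.
sample_lines = [
    '{"id": 1, "text": "first tweet"}',
    '{"delete": {"status": {"id": 2}}}',
    '',
    '{"id": 3, "text": "second tweet"}',
]

texts = list(iter_tweet_texts(sample_lines))
print(texts)
```

Passing the open file object instead of sample_lines should work the same way, since the helper only iterates over lines. Records without 'text', such as the 'delete' one, are skipped silently rather than raising KeyError.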