Stripping Line Breaks in Tweets via Tweepy

Question

I'm looking to pull data from the Twitter API and create a pipe-separated file that I can process further. My code currently looks like this:

auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_KEY, ACCESS_SECRET)
api = tweepy.API(auth)

out_file = "tweets.txt"

tweets = api.search(q='foo')
o = open(out_file, 'a')

for tweet in tweets:
        id = str(tweet.id)
        user = tweet.user.screen_name
        post = tweet.text
        post = post.encode('ascii', 'ignore')
        post = post.strip('|') # so pipes in tweets don't create unwanted separators
        post = post.strip('\n')
        record = id + "|" + user + "|" + post
        print>>o, record

I run into a problem when a user's tweet includes line breaks, which makes the output look like this:

473565810326601730|usera|this is a tweet 
473565810325865901|userb|some other example 
406478015419876422|userc|line 
separated 
tweet
431658790543289758|userd|one more tweet

I want to strip out the line breaks on the third tweet. I've tried post.strip(' ') and post.strip('0x0D 0x0A') in addition to the above but none seem to work. Any ideas?

Juan E. · Accepted Answer

That is because strip returns "a copy of the string with leading and trailing characters removed", so it never touches a pipe or line break in the middle of the tweet.
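To see the difference, here is a small sketch comparing the two methods on a string with embedded line breaks. The sample text is made up for illustration and is not taken from the Twitter API:

```python
# Hypothetical sample text with line breaks in the middle and at the end.
post = "line\nseparated\ntweet\n"

# strip() only removes the characters from both ends of the string.
stripped = post.strip('\n')

# replace() substitutes every occurrence, wherever it appears.
replaced = post.replace('\n', ' ')

print(repr(stripped))
print(repr(replaced))
```

The stripped version still contains the two inner line breaks, which is why the record keeps spilling onto several lines.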

You should use replace for the new line and for the pipe:

post = post.replace('|', ' ')
post = post.replace('\n', ' ')
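Tweets may also contain carriage returns (the 0x0D 0x0A pair mentioned in the question), which a replace on '\n' alone would leave behind. One possible way to cover those cases is sketched below. The helper name clean_field is hypothetical and not part of Tweepy, and the sample values are illustrative:

```python
def clean_field(text):
    """Return text with pipes and line breaks replaced by single spaces.

    clean_field is a hypothetical helper name, not part of Tweepy.
    """
    # Replace the field separator so it cannot split the record.
    text = text.replace('|', ' ')
    # split() with no arguments splits on any run of whitespace,
    # including '\r', '\n' and '\r\n'; join() puts one space between the pieces.
    return ' '.join(text.split())


# Illustrative values standing in for tweet.id, tweet.user.screen_name and tweet.text.
record = "|".join(["406478015419876422", "userc",
                   clean_field("line\r\nseparated\ntweet")])
print(record)
```

Note that this approach also collapses tabs and repeated spaces and trims both ends of the text. If the original spacing matters, chaining replace calls for '\r' and '\n' would be the more conservative choice.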

Tags: python, twitter, tweepy

Kevin
