I'm learning to work with APIs and figured I'd make a Reddit bot. I'm trying to adapt some code from a different script, which used requests to fetch the data, converted the response to JSON, loaded it into a pandas DataFrame, and then wrote a CSV.
I'd like to do about the same here, but I don't know how to get the Reddit data into the DataFrame. What I've tried below throws an error.
#!/usr/bin/python
import praw
import pandas as pd
reddit = praw.Reddit('my_bot')
subreddit = reddit.subreddit("askreddit")
for submission in subreddit.hot(limit=5):
    print("Title: ", submission.title)
    print("Score: ", submission.score)
    print("Link: ", submission.url)
    print("---------------------------------\n")
csv_file = f"/home/robothead/scripts/python/reddit/reddit-data.csv"
# start with empty dataframe
df = pd.DataFrame()
#j_data = subreddit.json()
#parse_data = j_data['data']
# append to the dataframe
#df = df.append(pd.DataFrame.from_dict(pd.json_normalize(parse_data), orient='columns'))
# append to the dataframe
df = df.append(pd.DataFrame.from_dict(pd(submission), orient='columns'))
# write the whole CSV at once
df.to_csv(csv_file, index=False, encoding='utf-8')
Error:
Traceback (most recent call last):
  File "bot.py", line 21, in <module>
    df = df.append(pd.DataFrame.from_dict(pd(submission), orient='columns'))
TypeError: 'module' object is not callable
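The error comes from pd(submission): pd is the pandas module itself, and a module can't be called like a function. This is how I've done it in the past: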
df = pd.DataFrame([ vars(post) for post in subreddit.hot(limit=5) ])
vars converts a praw Submission to a dict, and the pandas DataFrame constructor accepts a list of dictionaries. This works well when the dicts share the same keys, which is the case here. Note that you get a giant DataFrame with ALL the columns, some of which hold praw objects (that you can work with!). You'll probably want to pare it down to just the columns you want before writing to a file.
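To try the idea without Reddit credentials, here is a minimal sketch. FakePost is a hypothetical stand-in class (it is not part of PRAW) whose attributes loosely mimic a submission; the titles, scores, and URLs are made-up sample values. It only illustrates that vars() yields one dict per object and that a list of such dicts can be passed to the DataFrame constructor.

```python
import pandas as pd


class FakePost:
    """Hypothetical stand-in for a praw Submission (illustration only)."""

    def __init__(self, title, score, url, author):
        self.title = title
        self.score = score
        self.url = url
        self.author = author  # extra attribute that gets dropped below


posts = [
    FakePost("First question", 120, "https://example.com/1", "alice"),
    FakePost("Second question", 45, "https://example.com/2", "bob"),
]

# vars(obj) returns the instance's attribute dict; each dict becomes one row.
rows = [vars(post) for post in posts]
df = pd.DataFrame(rows)

# Keep only the columns of interest before writing anything out.
slim = df[["title", "score", "url"]]
print(slim)
```

With real submissions the unfiltered frame would have many more columns, which is why the column selection step matters.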
Edit:
Just so there's no confusion, here is the full script example:
#!/usr/bin/python
import praw
import pandas as pd
reddit = praw.Reddit('my_bot')
subreddit = reddit.subreddit("askreddit")
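csv_file = "/home/robothead/scripts/python/reddit/reddit-data.csv"
# vars() turns each submission into a dict; the DataFrame takes a list of them
df = pd.DataFrame([vars(post) for post in subreddit.hot(limit=5)])
# keep only the columns of interest
df = df[["title", "score", "url"]]
df.to_csv(csv_file, index=False, encoding='utf-8')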
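If you'd rather not pull in every attribute first, an alternative sketch (not what the script above does) is to build each row from just the fields you need. This also avoids DataFrame.append, which was deprecated in pandas 1.4 and removed in pandas 2.0. posts_to_frame is a hypothetical helper name, and the SimpleNamespace objects are stand-ins for real submissions with made-up values.

```python
import io
from types import SimpleNamespace

import pandas as pd


def posts_to_frame(posts):
    """Hypothetical helper: one row per post, holding only the wanted fields."""
    rows = [
        {"title": p.title, "score": p.score, "url": p.url}
        for p in posts
    ]
    return pd.DataFrame(rows, columns=["title", "score", "url"])


# Stand-ins for submissions; with PRAW you would pass subreddit.hot(limit=5).
fake_posts = [
    SimpleNamespace(title="A", score=10, url="https://example.com/a"),
    SimpleNamespace(title="B", score=20, url="https://example.com/b"),
]

frame = posts_to_frame(fake_posts)

# Written to an in-memory buffer here; pass a file path to write to disk.
buffer = io.StringIO()
frame.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```

Collecting the rows in a list and constructing the DataFrame once is generally cheaper than growing a DataFrame row by row.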