TL;DR If columns in a Pandas DataFrame themselves contain JSON documents (dicts), how can they be worked with in a Pandas-like fashion?
Currently I'm directly dumping json/dictionary results from a Twitter library (twython) into a Mongo collection (called users here).
from twython import Twython
from pymongo import MongoClient

tw = Twython(...<auth>...)

# Using mongo as object storage
client = MongoClient()
db = client.twitter
user_coll = db.users

user_batch = ...  # collection of user ids
user_dict_batch = tw.lookup_user(user_id=user_batch)
for user_dict in user_dict_batch:
    # Only insert users we haven't already stored
    if user_coll.find_one({"id": user_dict["id"]}) is None:
        user_coll.insert(user_dict)
After populating this database I read the documents into Pandas:
# Pull straight from mongo to pandas
cursor = user_coll.find()
df = pandas.DataFrame(list(cursor))
Which works like magic.
I'd like to be able to mangle the 'status' field Pandas style (directly accessing attributes). Is there a way?
EDIT: Something like df['status:text']. Status has fields like 'text', 'created_at'. One option could be flattening/normalizing this json field like this pull request Wes McKinney was working on.
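As a sketch of what that access pattern can look like today: if the 'status' column holds dicts, individual fields can be pulled out element-wise with apply. The DataFrame below is a hypothetical stand-in for the Mongo-backed one (the 'text' and 'created_at' keys match the Twitter status object described above).

```python
import pandas as pd

# Hypothetical stand-in for the Mongo-backed DataFrame:
# a 'status' column whose cells are dicts
df = pd.DataFrame({
    "id": [1, 2],
    "status": [
        {"text": "hello", "created_at": "2013-01-01"},
        {"text": "world", "created_at": "2013-01-02"},
    ],
})

# Extract one nested field element-wise into a flat column
df["status_text"] = df["status"].apply(lambda s: s["text"])
print(df["status_text"].tolist())  # ['hello', 'world']
```

This gives a plain column you can filter and group on, though it only extracts one field at a time.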
Pandas also has built-in functions for importing JSON directly: use pd.read_json() to load simple JSON and pd.json_normalize() to flatten nested JSON.
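A minimal sketch of json_normalize on records shaped like the Twitter user documents above (the sample data is made up; pd.json_normalize is top-level in pandas 1.0+, earlier versions expose it as pandas.io.json.json_normalize):

```python
import pandas as pd

# Hypothetical records with a nested 'status' dict, as returned by the API
records = [
    {"id": 1, "status": {"text": "hello", "created_at": "2013-01-01"}},
    {"id": 2, "status": {"text": "world", "created_at": "2013-01-02"}},
]

# json_normalize flattens nested dicts into dotted column names,
# e.g. 'status.text' and 'status.created_at'
flat = pd.json_normalize(records)
print(sorted(flat.columns))
```

The nested fields become ordinary columns, so flat["status.text"] works directly, which is essentially the df['status:text'] access the question asks for.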
One solution is just to smash it with the Series constructor:
In [1]: df = pd.DataFrame([[1, {'a': 2}], [2, {'a': 1, 'b': 3}]])
In [2]: df
Out[2]:
0 1
0 1 {u'a': 2}
1 2 {u'a': 1, u'b': 3}
In [3]: df[1].apply(pd.Series)
Out[3]:
a b
0 2 NaN
1 1 3
In some cases you'll want to concat this to the DataFrame in place of the dict row:
In [4]: dict_col = df.pop(1) # here 1 is the column name
In [5]: pd.concat([df, dict_col.apply(pd.Series)], axis=1)
Out[5]:
0 a b
0 1 2 NaN
1 2 1 3
If the nesting goes deeper, you can do this a few times...