
How to read a json-dictionary type file with pandas?

I have a long JSON file like this: http://pastebin.com/gzhHEYGy

I would like to load it into a pandas DataFrame in order to work with it, so following the documentation I did this:

import pandas as pd

df = pd.read_json('/user/file.json')
print(df)

I got this traceback:

  File "/Users/user/PycharmProjects/PAN-pruebas/json_2_dataframe.py", line 6, in <module>
    df = pd.read_json('/Users/user/Downloads/54db3923f033e1dd6a82222aa2604ab9.json')
  File "/usr/local/lib/python2.7/site-packages/pandas/io/json.py", line 198, in read_json
    date_unit).parse()
  File "/usr/local/lib/python2.7/site-packages/pandas/io/json.py", line 266, in parse
    self._parse_no_numpy()
  File "/usr/local/lib/python2.7/site-packages/pandas/io/json.py", line 483, in _parse_no_numpy
    loads(json, precise_float=self.precise_float), dtype=None)
  File "/usr/local/lib/python2.7/site-packages/pandas/core/frame.py", line 203, in __init__
    mgr = self._init_dict(data, index, columns, dtype=dtype)
  File "/usr/local/lib/python2.7/site-packages/pandas/core/frame.py", line 327, in _init_dict
    dtype=dtype)
  File "/usr/local/lib/python2.7/site-packages/pandas/core/frame.py", line 4620, in _arrays_to_mgr
    index = extract_index(arrays)
  File "/usr/local/lib/python2.7/site-packages/pandas/core/frame.py", line 4668, in extract_index
    raise ValueError('arrays must all be same length')
ValueError: arrays must all be same length

Then from a previous question I found that I need to do something like this:

d = dict( A = np.array([1,2]), B = np.array([1,2,3,4]) )

But I don't understand how to obtain the contents as numpy arrays. How can I preserve the length of the arrays in a big file like this? Thanks in advance.
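
For reference, the error can be reproduced with the toy dict above: building a DataFrame from columns of unequal length fails. The sketch below is a minimal illustration, assuming that padding the shorter column with NaN is acceptable. It wraps each array in a `pd.Series`; the names `direct_ok` and `padded` are made up for this example and are not from the original file.

```python
import numpy as np
import pandas as pd

# Toy data from the snippet above: two columns of unequal length.
d = dict(A=np.array([1, 2]), B=np.array([1, 2, 3, 4]))

# Building the frame directly fails because the arrays differ in length.
try:
    pd.DataFrame(d)
    direct_ok = True
except ValueError:
    direct_ok = False

# Wrapping each array in a Series lets pandas align the columns on the index
# and pad the shorter one with NaN (an assumption: padding is acceptable here).
padded = pd.DataFrame({key: pd.Series(values) for key, values in d.items()})
print(padded)
```

The original arrays are left untouched, so their lengths are preserved; only the DataFrame view is padded.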

asked Feb 06 '15 19:02 by skwoi



2 Answers

The pandas module, not the json module, should be the answer: pandas itself has read_json capabilities, and the root of the problem is most likely that you did not read the JSON in the correct orientation. You must pass the same orient parameter that was used when the JSON was created in the first place.

For example:

df_json = df.to_json(orient='split')

and then:

read_to_json = pd.read_json(df_json, orient='split')
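
A self-contained sketch of that round trip is shown here, assuming a small made-up frame in place of the real data (the labels `row_a`, `row_b` and the values are hypothetical). Recent pandas versions warn when a literal JSON string is passed to `read_json`, so the string is wrapped in `StringIO`.

```python
from io import StringIO

import pandas as pd

# Hypothetical small frame standing in for the asker's data.
df = pd.DataFrame(
    {"age": ["25-34", "18-24"], "gender": ["FEMALE", "MALE"]},
    index=["row_a", "row_b"],
)

# Serialise with orient='split': the JSON object has 'columns', 'index'
# and 'data' keys.
df_json = df.to_json(orient="split")

# Read it back with the same orient. StringIO avoids the deprecation warning
# that recent pandas versions emit for literal JSON strings.
restored = pd.read_json(StringIO(df_json), orient="split")
print(restored)
```

For a file on disk, the same `orient='split'` argument can be passed together with the file path instead of a `StringIO` object.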
answered Sep 28 '22 10:09 by Vaidøtas I.


The read_json method doesn't work here because the JSON file is not in the format it expects by default. Since we can easily load JSON as a dict, let's try it this way:

import pandas as pd
import json
import os

os.chdir('/Users/nicolas/Downloads')

# Reading the json as a dict
with open('json_example.json') as json_data:
    data = json.load(json_data)

# Use the from_dict constructor. Note that the 'orient' parameter is not left
# at its default value (otherwise it raises the same error you got before).
# Transpose the resulting frame and set the 'index' column as its index.
df = pd.DataFrame.from_dict(data, orient='index').T.set_index('index')
print(df)

Output:

                                                                 data columns
index                                                                        
311210177061863424  [25-34\n, FEMALE, @bikewa absolutely the best....     age
310912785183813632  [25-34\n, FEMALE, Photo: I love the Burke-Gilm...  gender
311290293871849472  [25-34\n, FEMALE, Photo: Inhaled! #fitfoodie h...    text
309386414548717569  [25-34\n, FEMALE, Facebook Is Making The Most ...    None
312327801187495936  [25-34\n, FEMALE, Still upset about this &gt;&...    None
312249421079400449  [25-34\n, FEMALE, @JoeM_PM_UK @JonAntoine I've...    None
308692673194246145  [25-34\n, FEMALE, @Social_Freedom_ actually, t...    None
308995226633129984  [25-34\n, FEMALE, @seattleweekly that's more t...    None
308660851219501056  [25-34\n, FEMALE, @adamholdenbache I noticed 1...    None
308658690528014337  [25-34\n, FEMALE, @CEM_Social I am waiting pat...    None
309719798001070080  [25-34\n, FEMALE, Going to be watching Faceboo...    None
312349448049152002  [25-34\n, FEMALE, @anikamarketer I applied for...    None
312325152698404864  [25-34\n, FEMALE, @_chrisrojas_ wow, that's so...    None
310546490844135425  [25-34\n, FEMALE, Photo: Feeling like a bit of...    None
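
In the output above, each cell of the `data` column still holds a whole list, and the column names end up in a `columns` column. The sketch below uses a hypothetical miniature of the file (made-up ids and texts, same three keys) to reproduce that result. It then shows one possible alternative, not part of the answer above: passing the three pieces straight to the `DataFrame` constructor so that each field gets its own column.

```python
import pandas as pd

# Hypothetical miniature of the file: 'columns', 'index' and 'data' keys
# whose lists have different lengths (3, 4 and 4 here).
data = {
    "columns": ["age", "gender", "text"],
    "index": [101, 102, 103, 104],
    "data": [
        ["25-34", "FEMALE", "first tweet"],
        ["25-34", "FEMALE", "second tweet"],
        ["25-34", "FEMALE", "third tweet"],
        ["25-34", "FEMALE", "fourth tweet"],
    ],
}

# The approach from the answer: each key becomes a row, the frame is
# transposed, and the 'index' column is used as the index.
nested = pd.DataFrame.from_dict(data, orient="index").T.set_index("index")

# Alternative sketch: hand the three pieces to the constructor directly,
# so 'age', 'gender' and 'text' become separate columns.
flat = pd.DataFrame(data["data"], index=data["index"], columns=data["columns"])
print(flat)
```

The second form assumes every row in `data` has exactly as many entries as there are names in `columns`.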
answered Sep 28 '22 10:09 by knightofni