
Pandas.read_json(JSON_URL)

I am using pandas to get data from an API. The API returns data in JSON format. However, the JSON has some values that I don't want in the DataFrame. Because of these values, I am not able to assign an index to the DataFrame. The format is as follows.

{
"Success": true,
"message": "",
"result": [{"id": 12312312, "TimeStamp": "2017-10-04T17:39:53.92", "Quantity": 3.03046306}, {"id": 2342344, "TimeStamp": "2017-10-04T17:39:53.92", "Quantity": 3.03046306}]
}

I am only interested in the "result" part. One way to do this is to fetch the JSON with requests.get(request_URL), extract the "result" part, and convert it into a DataFrame. A second way is to import the data with pandas.read_json(JSON_URL), convert the returned DataFrame back to JSON, extract the "result" part, and convert that into a DataFrame.

Is there any other way to do this? What is the best approach and why? Thanks.

asked Oct 05 '17 04:10 by Sarfraz



2 Answers

Use json_normalize on the parsed response (d is the dict obtained from the JSON, as in the loading code further down):

import pandas as pd

df = pd.json_normalize(d['result'])
print(df)

   Quantity               TimeStamp        id
0  3.030463  2017-10-04T17:39:53.92  12312312
1  3.030463  2017-10-04T17:39:53.92   2342344
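
For anyone who wants to try this without calling the API, here is a minimal sketch that uses the sample payload from the question, written as strictly valid JSON. The variable names raw and d are just illustrative choices, and column order may differ between pandas versions.

```python
import json

import pandas as pd

# Sample payload from the question, written as strictly valid JSON.
raw = """
{
  "Success": true,
  "message": "",
  "result": [
    {"id": 12312312, "TimeStamp": "2017-10-04T17:39:53.92", "Quantity": 3.03046306},
    {"id": 2342344, "TimeStamp": "2017-10-04T17:39:53.92", "Quantity": 3.03046306}
  ]
}
"""

d = json.loads(raw)                  # plain dict with Success / message / result
df = pd.json_normalize(d["result"])  # one row per record in the "result" list
print(df)
```

Only the list under "result" is handed to pandas, so the "Success" and "message" fields never reach the DataFrame.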

Passing the list straight to the DataFrame constructor also works here:

df = pd.DataFrame(d['result'])
print (df)
   Quantity               TimeStamp        id
0  3.030463  2017-10-04T17:39:53.92  12312312
1  3.030463  2017-10-04T17:39:53.92   2342344
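
For flat records like these, both calls give the same table. They should only differ when a record contains nested objects. A small sketch of that difference follows; the "Price" sub-object is invented purely for illustration and is not part of the API response in the question.

```python
import pandas as pd

# Hypothetical nested records; the "Price" sub-object is made up for this example.
records = [
    {"id": 1, "Price": {"bid": 0.0027, "ask": 0.0028}},
    {"id": 2, "Price": {"bid": 0.0026, "ask": 0.0029}},
]

flat = pd.json_normalize(records)  # nested keys become dotted column names
plain = pd.DataFrame(records)      # nested dicts stay as objects in a single column

print(list(flat.columns))
print(list(plain.columns))
```

So for the payload in the question either approach is fine, and json_normalize is mostly worth it if the records ever become nested.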

For a DatetimeIndex, convert the column with to_datetime and then call set_index:

df['TimeStamp'] = pd.to_datetime(df['TimeStamp'])
df = df.set_index('TimeStamp')
print (df)

                         Quantity        id
TimeStamp                                  
2017-10-04 17:39:53.920  3.030463  12312312
2017-10-04 17:39:53.920  3.030463   2342344
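
As a rough illustration of what the DatetimeIndex gives you, the sketch below rebuilds the two sample rows from the question and selects by date string. It also checks whether the index is unique, since both sample rows share the same timestamp.

```python
import pandas as pd

# The two sample records from the question.
records = [
    {"id": 12312312, "TimeStamp": "2017-10-04T17:39:53.92", "Quantity": 3.03046306},
    {"id": 2342344, "TimeStamp": "2017-10-04T17:39:53.92", "Quantity": 3.03046306},
]

df = pd.DataFrame(records)
df["TimeStamp"] = pd.to_datetime(df["TimeStamp"])
df = df.set_index("TimeStamp").sort_index()

# Partial string indexing: select every row that falls on the given day.
same_day = df.loc["2017-10-04"]

# Both sample rows carry the same timestamp, so the index has duplicates.
print(df.index.is_unique)
print(len(same_day))
```

If you need a unique index, id may be a better candidate than TimeStamp for this data.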

EDIT:

Solution that also loads the data from the URL:

from urllib.request import urlopen
import json
import pandas as pd

response = urlopen("https://bittrex.com/api/v1.1/public/getmarkethistory?market=BTC-ETC")
json_data = response.read().decode('utf-8', 'replace')

d = json.loads(json_data)
df = pd.json_normalize(d['result'])
df['TimeStamp'] = pd.to_datetime(df['TimeStamp'])
df = df.set_index('TimeStamp')

print (df.head())
                          Quantity     Total  
TimeStamp                                     
2017-10-05 06:05:06.510   3.579201  0.010000  
2017-10-05 06:04:34.060  45.614760  0.127444  
2017-10-05 06:04:34.060   5.649898  0.015785  
2017-10-05 06:04:34.060   1.769847  0.004945  
2017-10-05 06:02:25.063   0.250000  0.000698  

Another solution, reading the URL directly with read_json:

df = pd.read_json('https://bittrex.com/api/v1.1/public/getmarkethistory?market=BTC-ETC')
df = pd.DataFrame(df['result'].values.tolist())
df['TimeStamp'] = pd.to_datetime(df['TimeStamp'])
df = df.set_index('TimeStamp')
print (df.head())

                          Quantity     Total  
TimeStamp                                     
2017-10-05 06:11:25.100   5.620957  0.015704  
2017-10-05 06:11:11.427  22.853546  0.063851  
2017-10-05 06:10:30.600   6.999213  0.019555  
2017-10-05 06:10:29.163  20.000000  0.055878  
2017-10-05 06:10:29.163   0.806039  0.002252  
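
To see why the extra DataFrame(...) step is needed, here is a sketch that feeds the sample payload from the question to read_json through StringIO instead of a URL, so it does not depend on the API being reachable. As far as I can tell, read_json builds one row per entry of "result" and repeats the scalar fields on every row, leaving the records as dicts in the "result" column.

```python
from io import StringIO

import pandas as pd

# Sample payload from the question, written as strictly valid JSON.
payload = """
{
  "Success": true,
  "message": "",
  "result": [
    {"id": 12312312, "TimeStamp": "2017-10-04T17:39:53.92", "Quantity": 3.03046306},
    {"id": 2342344, "TimeStamp": "2017-10-04T17:39:53.92", "Quantity": 3.03046306}
  ]
}
"""

raw = pd.read_json(StringIO(payload))  # columns: Success, message, result
print(raw.columns.tolist())

# Each cell of "result" is still a dict, so expand the column into a real table.
df = pd.DataFrame(raw["result"].tolist())
df["TimeStamp"] = pd.to_datetime(df["TimeStamp"])
df = df.set_index("TimeStamp")
print(df)
```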
answered Oct 29 '22 12:10 by jezrael


Another solution, based on jezrael's, using requests:

import requests
import pandas as pd

d = requests.get("https://bittrex.com/api/v1.1/public/getmarkethistory?market=BTC-ETC").json()
df = pd.DataFrame.from_dict(d['result'])
df['TimeStamp'] = pd.to_datetime(df['TimeStamp'])
df = df.set_index('TimeStamp')

df
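
If you want this a bit more defensive, one possible sketch is below. The helper names frame_from_payload and fetch_frame are hypothetical, and I am assuming that a false "Success" flag means the "message" field explains the failure, which the question does not confirm.

```python
import pandas as pd


def frame_from_payload(payload):
    """Build a TimeStamp-indexed DataFrame from an already-decoded response dict."""
    # Assumption: a false "Success" flag means "message" describes the problem.
    if not payload.get("Success"):
        raise ValueError(f"API reported failure: {payload.get('message')!r}")
    df = pd.DataFrame(payload["result"])
    df["TimeStamp"] = pd.to_datetime(df["TimeStamp"])
    return df.set_index("TimeStamp")


def fetch_frame(url, timeout=10):
    """Download the JSON payload with requests and convert it."""
    import requests  # imported lazily so frame_from_payload works without requests

    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # raise on HTTP errors instead of parsing an error body
    return frame_from_payload(response.json())
```

Usage would then be df = fetch_frame(url) with the same URL as above. Keeping the parsing in its own function also makes it easy to test against a saved payload.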
answered Oct 29 '22 10:10 by Anton vBR