
Convert JSON data from Request into Pandas DataFrame

I'm trying to scrape some data from a web page and put it into a pandas DataFrame. I have tried and read many things, but I cannot get the result I want: a DataFrame with all the data in separate columns and rows. Below is my code.

import requests
import json
import pandas as pd
from pandas.io.json import json_normalize

r = requests.get('http://www.starcapital.de/test/Res_Stockmarketvaluation_FundamentalKZ_Tbl.php')

a = json.loads(r.text)

res = json_normalize(a)
##print(res)

df = pd.DataFrame(res)
print(df)

##df = pd.read_json(a)
##print(df)

pd.read_json(a) doesn't seem to work in any way. Could someone give it a try?

Thanks for all the help in advance.

Best regards, David

DavidV asked Feb 28 '17 21:02


4 Answers

Or, more simply:

import requests
import pandas as pd

r = requests.get('http://www.starcapital.de/test/Res_Stockmarketvaluation_FundamentalKZ_Tbl.php')

j = r.json()

df = pd.DataFrame.from_dict(j)
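
As a rough offline illustration of what `from_dict` does with already-parsed JSON, here is a minimal sketch. The dictionaries `by_column` and `by_row` are made-up stand-ins, not the real response from the URL above, and the sketch assumes the parsed JSON is a flat mapping; a more deeply nested payload would need reshaping first.

```python
import pandas as pd

# Hypothetical column-oriented data: each key maps to a list of equal length.
by_column = {"Country": ["Russia", "China"], "PE": [9.1, 7.2]}
df_cols = pd.DataFrame.from_dict(by_column)

# Hypothetical row-oriented data: each key maps to one record.
# orient="index" uses the outer keys as row labels instead of column names.
by_row = {"Russia": {"PE": 9.1, "PB": 1.0}, "China": {"PE": 7.2, "PB": 0.9}}
df_rows = pd.DataFrame.from_dict(by_row, orient="index")

print(df_cols)
print(df_rows)
```

With the default `orient="columns"`, the keys become column names; with `orient="index"`, they become the row labels.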
Justin Eyster answered Nov 03 '22 23:11


You can do it this way:

import requests
import pandas as pd

r = requests.get('http://www.starcapital.de/test/Res_Stockmarketvaluation_FundamentalKZ_Tbl.php')

j = r.json()

df = pd.DataFrame([[d['v'] for d in x['c']] for x in j['rows']],
                  columns=[d['label'] for d in j['cols']])

Result:

In [217]: df
Out[217]:
                   Country  Weight  CAPE    PE    PC   PB   PS   DY  RS 26W  RS 52W  Score
0                   Russia     1.1   5.9   9.1   5.1  1.0  0.9  3.7    1.22    1.35    1.0
1                    China     1.1  12.8   7.2   4.5  0.9  0.6  4.2    1.05    1.13    2.0
2                    Italy     1.0  12.7  31.5   5.7  1.2  0.6  3.3    1.13    1.11    3.0
3                  Austria     0.2  14.3  21.7   7.3  1.1  0.7  2.5    1.10    1.15    4.0
4                   Norway     0.4  12.8  32.4   7.4  1.6  1.2  4.0    1.10    1.17    5.0
5                  Hungary     0.0  12.5  49.8   7.5  1.4  0.7  2.3    1.12    1.19    6.0
6                    Spain     1.2  11.7  24.7   7.0  1.4  1.2  3.7    1.08    1.11    7.0
7                    Czech     0.0   8.9  13.6   6.1  1.3  1.0  6.7    1.03    1.05    8.0
8                   Brazil     1.3   9.8  42.1   7.4  1.6  1.2  3.0    1.06    1.24    9.0
9                 Portugal     0.1  11.3  29.0   4.8  1.5  0.7  3.9    1.05    1.06   10.0
..                     ...     ...   ...   ...   ...  ...  ...  ...     ...     ...    ...
42        EMERGING MARKETS    13.5  14.0  16.0   8.8  1.6  1.3  2.9    1.04    1.11    NaN
43        DEVELOPED EUROPE    22.4  16.6  26.5   9.9  1.8  1.1  3.2    1.06    1.08    NaN
44         EMERGING EUROPE     1.7   8.6  10.9   5.8  1.1  0.8  3.4    1.13    1.20    NaN
45        EMERGING AMERICA     3.0  15.2  30.1   9.4  1.9  1.2  2.4    1.03    1.11    NaN
46  DEVELOPED ASIA-PACIFIC    17.7   NaN  17.7   8.8  1.3  0.9  2.5    1.03    1.09    NaN
47   EMERGING ASIA-PACIFIC     6.9  14.9  15.1   9.1  1.8  1.4  2.7    1.01    1.08    NaN
48         EMERGING AFRICA     0.8   NaN  16.5  10.6  2.0  1.4  3.8    1.06    1.12    NaN
49             MIDDLE EAST     1.3   NaN  13.7  11.8  1.5  1.8  3.9    1.06    1.10    NaN
50                    BRIC     5.9  11.8  14.6   7.4  1.4  1.2  2.7    1.06    1.16    NaN
51     OTHER EMERGING MKT.     2.5   NaN  17.7  12.9  1.8  1.5  3.1    1.16    1.20    NaN

[52 rows x 11 columns]
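
To try the same comprehension without hitting the network, here is a small sketch on a hand-written payload. The dictionary `j` below is hypothetical: it only mimics the keys the code above indexes into (`cols`/`label` and `rows`/`c`/`v`), and its values are a tiny invented subset, not the real response.

```python
import pandas as pd

# Hypothetical payload with the same shape the answer's code relies on:
# j["cols"] is a list of {"label": ...} and
# j["rows"] is a list of {"c": [{"v": ...}, ...]}.
j = {
    "cols": [{"label": "Country"}, {"label": "PE"}, {"label": "Score"}],
    "rows": [
        {"c": [{"v": "Russia"}, {"v": 9.1}, {"v": 1}]},
        {"c": [{"v": "BRIC"}, {"v": 14.6}, {"v": None}]},  # None marks a missing value
    ],
}

# Column names come from the "label" of each column description.
columns = [col["label"] for col in j["cols"]]
# Each row is the list of "v" values of its cells, in column order.
records = [[cell["v"] for cell in row["c"]] for row in j["rows"]]

df = pd.DataFrame(records, columns=columns)
print(df)
```

Splitting the comprehension into `columns` and `records` is only for readability; it builds the same DataFrame as the one-expression version.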
MaxU - stop WAR against UA answered Nov 03 '22 22:11


One step simpler than Justin's (already helpful) response: put .json() at the end of the r = requests.get line.

import requests
import pandas as pd

r = requests.get('http://www.starcapital.de/test/Res_Stockmarketvaluation_FundamentalKZ_Tbl.php').json()

df = pd.DataFrame.from_dict(r)
Jason answered Nov 03 '22 23:11


You may also want pd.json_normalize when your data is nested rather than in the flat shape that from_dict() expects.

For example:

import pandas as pd

data = [
    {
        "id": 1,
        "name": "Cole Volk",
        "fitness": {"height": 130, "weight": 60},
    },
    {"name": "Mark Reg", "fitness": {"height": 130, "weight": 60}},
    {
        "id": 2,
        "name": "Faye Raker",
        "fitness": {"height": 130, "weight": 60},
    },
]
pd.json_normalize(data, max_level=1)
    id        name  fitness.height  fitness.weight
0  1.0   Cole Volk             130              60
1  NaN    Mark Reg             130              60
2  2.0  Faye Raker             130              60
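
When the records you want are themselves a list nested inside each item, `pd.json_normalize` also accepts `record_path` and `meta`. The sketch below uses invented data (the `region`, `countries`, `name`, and `pe` keys are hypothetical) just to show the idea.

```python
import pandas as pd

# Hypothetical nested records: each region holds a list of country entries.
data = [
    {
        "region": "Europe",
        "countries": [{"name": "Italy", "pe": 31.5}, {"name": "Spain", "pe": 24.7}],
    },
    {"region": "Asia", "countries": [{"name": "China", "pe": 7.2}]},
]

# record_path selects the nested list to expand into one row per entry;
# meta copies the listed parent-level fields onto each expanded row.
df = pd.json_normalize(data, record_path="countries", meta=["region"])
print(df)
```

Each country becomes its own row, with the parent `region` repeated alongside it.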
Rob Rose answered Nov 03 '22 23:11