
output a dataframe to a json array

I was wondering if there was a more efficient way to do the following operation.

# transforms the datetime index (int64 nanoseconds) into an epoch timestamp in milliseconds
t = df.index.values.astype(np.int64) // 10**6

return jsonify(np.c_[t, df.open, df.high, df.low, df.close, df.volume].tolist())

where df is a DataFrame whose index is a date and which has at least (but not only) the following columns: open, high, low, close, volume. I then output the newly created array as JSON with Flask's jsonify. The code above works, but it looks pretty inefficient to me. Any idea how to make it nicer or more efficient?

asked Jul 31 '16 20:07 by John



1 Answer

You can use the to_json() method:

In [88]: import pandas_datareader.data as web

In [89]: apl = web.get_data_yahoo('AAPL', '2016-07-05', '2016-07-07')

In [90]: apl
Out[90]:
                 Open       High        Low      Close    Volume  Adj Close
Date
2016-07-05  95.389999  95.400002  94.459999  94.989998  27705200  94.989998
2016-07-06  94.599998  95.660004  94.370003  95.529999  30949100  95.529999
2016-07-07  95.699997  96.500000  95.620003  95.940002  25139600  95.940002

I'll use json.dumps(..., indent=2) to make the output more readable:

In [91]: import json

orient='index':

In [98]: print(json.dumps(json.loads(apl.to_json(orient='index')), indent=2))
{
  "1467849600000": {
    "Close": 95.940002,
    "High": 96.5,
    "Open": 95.699997,
    "Adj Close": 95.940002,
    "Volume": 25139600,
    "Low": 95.620003
  },
  "1467676800000": {
    "Close": 94.989998,
    "High": 95.400002,
    "Open": 95.389999,
    "Adj Close": 94.989998,
    "Volume": 27705200,
    "Low": 94.459999
  },
  "1467763200000": {
    "Close": 95.529999,
    "High": 95.660004,
    "Open": 94.599998,
    "Adj Close": 95.529999,
    "Volume": 30949100,
    "Low": 94.370003
  }
}

orient='records' (reset index in order to make column Date visible):

In [99]: print(json.dumps(json.loads(apl.reset_index().to_json(orient='records')), indent=2))
[
  {
    "Close": 94.989998,
    "High": 95.400002,
    "Open": 95.389999,
    "Adj Close": 94.989998,
    "Volume": 27705200,
    "Date": 1467676800000,
    "Low": 94.459999
  },
  {
    "Close": 95.529999,
    "High": 95.660004,
    "Open": 94.599998,
    "Adj Close": 95.529999,
    "Volume": 30949100,
    "Date": 1467763200000,
    "Low": 94.370003
  },
  {
    "Close": 95.940002,
    "High": 96.5,
    "Open": 95.699997,
    "Adj Close": 95.940002,
    "Volume": 25139600,
    "Date": 1467849600000,
    "Low": 95.620003
  }
]

You can make use of the following to_json() parameters:

date_format : {‘epoch’, ‘iso’}

Type of date conversion. epoch = epoch milliseconds, iso = ISO8601, default is epoch.

date_unit : string, default ‘ms’ (milliseconds)

The time unit to encode to, governs timestamp and ISO8601 precision. One of ‘s’, ‘ms’, ‘us’, ‘ns’ for second, millisecond, microsecond, and nanosecond respectively.

orient : string

The format of the JSON string

  • split : dict like {index -> [index], columns -> [columns], data -> [values]}
  • records : list like [{column -> value}, ... , {column -> value}]
  • index : dict like {index -> {column -> value}}
  • columns : dict like {column -> {index -> value}}
  • values : just the values array
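
For the array-of-rows shape asked about in the question, orient='values' combined with date_unit may be all that is needed. Below is a minimal sketch, not a definitive implementation. It assumes lowercase column names as in the question; the small hand-made frame, the index name "date" and the sample numbers are illustrative assumptions standing in for the real data.

```python
import json

import pandas as pd

# Hypothetical sample data; the real df would come from the asker's source.
df = pd.DataFrame(
    {
        "open": [95.39, 94.60],
        "high": [95.40, 95.66],
        "low": [94.46, 94.37],
        "close": [94.99, 95.53],
        "volume": [27705200, 30949100],
        "adj_close": [94.99, 95.53],  # extra column that should not be exported
    },
    index=pd.to_datetime(["2016-07-05", "2016-07-06"]),
)
df.index.name = "date"  # assumed name, so reset_index() yields a "date" column

# Pick the columns in the order the client expects, timestamp first.
cols = ["date", "open", "high", "low", "close", "volume"]

# orient="values" emits only the nested array of rows;
# date_unit="s" encodes the dates as epoch seconds instead of the default ms.
payload = df.reset_index()[cols].to_json(orient="values", date_unit="s")

# Parse it back only to inspect the structure.
rows = json.loads(payload)
print(rows)
```

Since payload is already a JSON string, passing it through jsonify would encode it a second time; in Flask it can probably be returned directly in a Response with mimetype='application/json' instead.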
answered Sep 21 '22 13:09 by MaxU - stop WAR against UA