
How can I efficiently move from a pandas DataFrame to JSON?

I've started using pandas to do some aggregation by date. My goal is to count all of the instances of a measurement that occur on a particular day, and to then represent this in D3. To illustrate my workflow, I have a queryset (from Django) that looks like this:

queryset = [
    {'created': "05-16-13", 'counter': 1, 'id': 13},
    {'created': "05-16-13", 'counter': 1, 'id': 34},
    {'created': "05-17-13", 'counter': 1, 'id': 12},
    {'created': "05-16-13", 'counter': 1, 'id': 7},
    {'created': "05-18-13", 'counter': 1, 'id': 6},
]

I make a dataframe in pandas and aggregate the measure 'counter' by the day created:

import pandas as pd
queryset_df = pd.DataFrame.from_records(queryset).set_index('id')
aggregated_df = queryset_df.groupby('created').sum()

This gives me a dataframe like this:

          counter
created          
05-16-13        3
05-17-13        1
05-18-13        1

As I'm using D3, I thought that a JSON object would be the most useful. Using the pandas to_json() method, I convert my dataframe like this:

aggregated_df.to_json()

giving me the following JSON object:

{"counter":{"05-16-13":3,"05-17-13":1,"05-18-13":1}}
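
For reference, the output above matches the default orient='columns' layout that to_json() uses for a DataFrame. The following is a minimal sketch, not a definitive recipe, of how a few other orient values reshape the same aggregated frame; the variable names by_column, by_index and records are illustrative assumptions, and json.loads is used only to make the structures easy to inspect.

```python
import json
import pandas as pd

queryset = [
    {'created': "05-16-13", 'counter': 1, 'id': 13},
    {'created': "05-16-13", 'counter': 1, 'id': 34},
    {'created': "05-17-13", 'counter': 1, 'id': 12},
    {'created': "05-16-13", 'counter': 1, 'id': 7},
    {'created': "05-18-13", 'counter': 1, 'id': 6},
]

# Same aggregation as in the question: sum 'counter' per 'created' day.
aggregated_df = (
    pd.DataFrame.from_records(queryset)
    .set_index('id')
    .groupby('created')
    .sum()
)

# Default layout: {column -> {index -> value}}
by_column = json.loads(aggregated_df.to_json())

# orient='index': {index -> {column -> value}}
by_index = json.loads(aggregated_df.to_json(orient='index'))

# orient='records': a list of {column -> value}; the index (the date) is dropped
records = json.loads(aggregated_df.to_json(orient='records'))

print(by_column)
print(by_index)
print(records)
```

Note that orient='records' discards the index, so the date only survives in that layout if it is first turned back into a column.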

This is not exactly what I want, as I would like to be able to access both the date and the measurement. Is there a way to export the data so that I end up with something like this?

data = {"c1":{"date":"05-16-13", "counter":3},"c2":{"date":"05-17-13", "counter":1}, "c3":{"date":"05-18-13", "counter":1}}

I thought that if I could structure this differently on the Python side, it would reduce the amount of data formatting I would need to do on the JS side, as I planned to load the data with something like this:

  x.domain(d3.extent(data, function(d) { return d.date; }));
  y.domain(d3.extent(data, function(d) { return d.counter; }));

I'm very open to suggestions of better workflows overall, as this is something I will need to do frequently, but I am unsure of the best way to handle the connection between D3 and pandas. (I have looked at several packages that combine Python and D3 directly, but they are not what I am looking for, as they seem to focus on static chart generation rather than producing an SVG.)

djq asked Oct 06 '13 22:10



1 Answer

Turn your date index back into an ordinary data column with reset_index, and then generate your JSON object by passing the orient='index' argument to to_json:

In [11]: aggregated_df.reset_index().to_json(orient='index')
Out[11]: '{"0":{"created":"05-16-13","counter":3},"1":{"created":"05-17-13","counter":1},"2":{"created":"05-18-13","counter":1}}'
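
If the field should be called date rather than created, or if the D3 code is meant to iterate over an array (as the d3.extent calls in the question appear to assume), the same idea can be extended. The following is a minimal sketch under those assumptions; flat_df, keyed_df and the c1, c2, ... labels are illustrative names taken from the question's desired output, not part of the pandas API.

```python
import json
import pandas as pd

queryset = [
    {'created': "05-16-13", 'counter': 1, 'id': 13},
    {'created': "05-16-13", 'counter': 1, 'id': 34},
    {'created': "05-17-13", 'counter': 1, 'id': 12},
    {'created': "05-16-13", 'counter': 1, 'id': 7},
    {'created': "05-18-13", 'counter': 1, 'id': 6},
]
aggregated_df = (
    pd.DataFrame.from_records(queryset)
    .set_index('id')
    .groupby('created')
    .sum()
)

# Move the 'created' index back into a column and rename it to 'date'
# so that it matches the d.date accessor used on the JS side.
flat_df = aggregated_df.reset_index().rename(columns={'created': 'date'})

# Option 1: a JSON array of objects, one per day.
records_json = flat_df.to_json(orient='records')

# Option 2: the keyed shape from the question ("c1", "c2", ...).
keyed_df = flat_df.copy()
keyed_df.index = [f"c{i}" for i in range(1, len(keyed_df) + 1)]
keyed_json = keyed_df.to_json(orient='index')

print(records_json)
print(keyed_json)
```

The array form is usually the more convenient one to pass to array-based helpers, while the keyed form reproduces the structure asked for in the question.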
Zeugma answered Oct 20 '22 22:10
