pandas

Question

I have this data:

data = pd.DataFrame.from_dict([r for r in response])
print(data)

   _id  total
0  213      1
1  194      3
2  205    156
...

Now, if I call:

data.hist()

I get two separate histograms, one for each column. This is not what I want. I want a single histogram built from both columns, where one column is interpreted as the value and the other as the number of occurrences of that value. How can I generate such a histogram?

I tried:

data.hist(column="_id", by="total")

But this generates even more (empty) histograms, along with an error message.

Ami Tavory · Accepted Answer

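You can always drop down to the lower-level matplotlib.pyplot.hist and pass the counts as weights:

import numpy as np
import pandas as pd
from matplotlib.pyplot import hist

# Example data: 100 random values, each with its own random weight
df = pd.DataFrame({
    '_id': np.random.randn(100),
    'total': 100 * np.random.rand(100)
})
# Each _id adds its 'total' (instead of 1) to the bin it falls in
hist(df._id, weights=df.total)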
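
To see what the weights argument does with the data from the question, here is a minimal sketch that uses numpy.histogram, which takes the same bins and weights arguments. The three rows are the ones printed in the question; the choice of bins=3 is only an assumption for illustration.

```python
import numpy as np
import pandas as pd

# The three rows shown in the question: each _id with its occurrence count.
data = pd.DataFrame({"_id": [213, 194, 205], "total": [1, 3, 156]})

# Weighted binning: each row adds its `total` to its bin instead of adding 1.
# bins=3 is an arbitrary choice for this small example.
counts, edges = np.histogram(data["_id"], bins=3, weights=data["total"])

for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left:.1f} to {right:.1f}: {count}")
```

The bin heights add up to the sum of the total column, so calling hist(data._id, bins=3, weights=data.total) should draw bars with these same heights.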


dermen · Answer

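Since you already have the bin frequencies computed (the total column), just use pandas.DataFrame.plot to draw them as bars. Note that kind='hist' would bin the values of total itself rather than use them as heights, so kind='bar' is what you want here:

data.plot(x='_id', y='total', kind='bar')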
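
As a sanity check that the total column really holds per-value frequencies, here is a small sketch that expands the table back into one entry per occurrence and counts them again. The three rows are taken from the question, and the names expanded and recovered are made up for this example.

```python
import pandas as pd

# The three rows shown in the question.
data = pd.DataFrame({"_id": [213, 194, 205], "total": [1, 3, 156]})

# Repeat each _id `total` times, giving one entry per occurrence.
expanded = data["_id"].repeat(data["total"].to_numpy())

# Counting the expanded values recovers the original totals.
recovered = expanded.value_counts().sort_index()
print(recovered)
```

Calling expanded.hist() then behaves like an ordinary histogram of raw observations. This is only practical when the totals are non-negative integers and not too large, since the expanded series has one row per occurrence.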

pandas - histogram from two columns?

Tags:

python

plot

histogram

mnowotka
