How to make matplotlib/pandas bar chart look like hist chart?

Plotting Differences between `bar` and `hist`

Given some data in a pandas.Series, rv, there is a difference between:

1. Calling hist directly on the data to plot
2. Calculating the histogram results (with numpy.histogram), then plotting with bar

Example Data Generation

%matplotlib inline

import numpy as np
import pandas as pd
import scipy.stats as stats
import matplotlib
matplotlib.rcParams['figure.figsize'] = (12.0, 8.0)
matplotlib.style.use('ggplot')

# Setup size and distribution
size = 50000
distribution = stats.norm()

# Create random data
rv = pd.Series(distribution.rvs(size=size))
# Get sane start and end points of distribution
start = distribution.ppf(0.01)
end = distribution.ppf(0.99)

# Build PDF and turn into pandas Series
x = np.linspace(start, end, size)
y = distribution.pdf(x)
pdf = pd.Series(y, x)

# Get histogram of random data (density=True replaces the removed normed=True)
y, x = np.histogram(rv, bins=50, density=True)
# Convert the 51 bin edges to 50 bin centers
x = (x[:-1] + x[1:]) / 2.0
hist = pd.Series(y, x)

`hist()` Plotting

ax = pdf.plot(lw=2, label='PDF', legend=True)
rv.plot(kind='hist', bins=50, density=True, alpha=0.5, label='Random Samples', legend=True, ax=ax)

[Figure: hist plotting]

`bar()` Plotting

ax = pdf.plot(lw=2, label='PDF', legend=True)
hist.plot(kind='bar', alpha=0.5, label='Random Samples', legend=True, ax=ax)

[Figure: bar plotting]

How can the `bar` plot be made to look like the `hist` plot?

The use case for this is needing to save only the histogrammed data to use and plot later (it is typically smaller in size than the original data).
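
As a rough illustration of that use case, the sketch below stores only the bin centers and densities and reads them back later. The in-memory buffer stands in for a file on disk, and the names `samples`, `buffer`, and `restored` are made up for this example; the data is regenerated with NumPy so the snippet runs on its own.

```python
import io

import numpy as np
import pandas as pd

# Stand-in for the question's random samples (assumption: standard normal)
rng = np.random.default_rng(0)
samples = rng.normal(size=50000)

# Histogram once, keeping the bin centers as the index
density, edges = np.histogram(samples, bins=50, density=True)
centers = (edges[:-1] + edges[1:]) / 2.0
hist = pd.Series(density, index=centers, name="density")

# Save only the 50 histogram rows; a real script would pass a file path here
buffer = io.StringIO()
hist.to_csv(buffer, index_label="bin_center")

# Later: read the small histogram back instead of the 50000 raw samples
buffer.seek(0)
restored = pd.read_csv(buffer, index_col="bin_center")["density"]
```

The restored Series has the same shape as `hist` in the question, so it can be handed to whichever plotting call is chosen.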


asked May 31 '16 14:05

tmthydvnprt

2 Answers

Bar plotting differences

Obtaining a bar plot that looks like the hist plot requires overriding some of bar's default behavior.

1. Force bar to use the actual x data for the plotting range by passing both x (hist.index) and y (hist.values). By default, pandas's bar plots the y data against an arbitrary integer range and uses the x data only as tick labels.
2. Set the width parameter to the actual step size of the x data (the default is 0.8).
3. Set the align parameter to 'center'.
4. Manually set the legend.

These changes need to be made via matplotlib's bar() called on the axis (ax) instead of pandas's bar() called on the data (hist).

Example Plotting

%matplotlib inline

import numpy as np
import pandas as pd
import scipy.stats as stats
import matplotlib
matplotlib.rcParams['figure.figsize'] = (12.0, 8.0)
matplotlib.style.use('ggplot')

# Setup size and distribution
size = 50000
distribution = stats.norm()

# Create random data
rv = pd.Series(distribution.rvs(size=size))
# Get sane start and end points of distribution
start = distribution.ppf(0.01)
end = distribution.ppf(0.99)

# Build PDF and turn into pandas Series
x = np.linspace(start, end, size)
y = distribution.pdf(x)
pdf = pd.Series(y, x)

# Get histogram of random data (density=True replaces the removed normed=True)
y, x = np.histogram(rv, bins=50, density=True)
# Convert the 51 bin edges to 50 bin centers
x = (x[:-1] + x[1:]) / 2.0
hist = pd.Series(y, x)

# Plot previously histogrammed data
ax = pdf.plot(lw=2, label='PDF', legend=True)
# Bar width is the spacing between adjacent bin centers
w = abs(hist.index[1] - hist.index[0])
ax.bar(hist.index, hist.values, width=w, alpha=0.5, align='center')
ax.legend(['PDF', 'Random Samples'])

[Figure: histogrammed plot]
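
If the bin edges are stored instead of the centers, the same plot can be sketched without computing a single step size. This is an alternative to the code above, not part of the original answer: it assumes the `edges` array returned by `numpy.histogram` was kept, and uses `align='edge'` with per-bin widths from `numpy.diff`, which should also cover unequal bin widths. The `Agg` backend line is only there so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Stand-in for the random samples (assumption: standard normal)
rng = np.random.default_rng(0)
samples = rng.normal(size=50000)

# Keep the edges themselves rather than converting them to centers
density, edges = np.histogram(samples, bins=50, density=True)
widths = np.diff(edges)

fig, ax = plt.subplots()
# Each bar starts at its left edge and spans exactly one bin
bars = ax.bar(edges[:-1], density, width=widths, align="edge",
              alpha=0.5, label="Random Samples")
ax.legend()
```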


answered Sep 28 '22 00:09

tmthydvnprt

Another, simpler solution is to create fake samples that reproduce the same histogram and then simply use hist().

That is, after retrieving bins and counts from the stored data, do:

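import numpy as np
import matplotlib.pyplot as plt

# counts must hold raw integer counts per bin, not density values
samples = []
for i in range(len(counts)):
    a, b = bins[i], bins[i+1]
    # Spread counts[i] points uniformly across the bin [a, b)
    samples.append(a + (b-a)*np.random.rand(int(counts[i])))
fake = np.concatenate(samples)

plt.hist(fake, bins=bins)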
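
A variation that avoids generating samples at all, offered here as a sketch rather than as part of the answer above: pass one point per bin to `hist()` and let the stored counts act as `weights`. The `bins` and `counts` values below are made-up stand-ins for the stored data, and the `Agg` backend line is only there so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical stored histogram with 4 bins
bins = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
counts = np.array([3, 7, 5, 1])

# One point at the center of each bin, weighted by that bin's count
centers = (bins[:-1] + bins[1:]) / 2.0
fig, ax = plt.subplots()
heights, out_edges, patches = ax.hist(centers, bins=bins, weights=counts)
```

Because the weights are not required to be integers, this form should also work when only density values were saved.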

answered Sep 28 '22 01:09

Gregor Mitscha-Baude
