I get vertical stripes between the bins when creating a histogram with matplotlib 2.0.2 (Python 2.7, Windows 7, 64-bit). They are visible in both the PDF and the PNG output. I am using pgf with LaTeX to create a PDF that I will include via \includegraphics in a pdflatex document. The PNG is just a quick check.
This was not the case in Matplotlib 1.5.3. How do I get rid of these white lines separating individual bins?
Things tried:
Code to produce the image
import matplotlib as mpl
mpl.use('pgf')
pgf_with_latex = {  # set up matplotlib to use LaTeX for output
    "pgf.texsystem": "pdflatex",  # change this if using xelatex or lualatex
    "text.usetex": True,  # use LaTeX to write all text
    "font.family": "serif",
    "font.serif": [],  # blank entries should cause plots to inherit fonts from the document
    "font.sans-serif": [],
    "font.monospace": [],
    "axes.labelsize": 10,  # LaTeX default is 10pt font
    "font.size": 8,
    "legend.fontsize": 7,  # make the legend/label fonts a little smaller
    "xtick.labelsize": 7,
    "ytick.labelsize": 7,
    "pgf.preamble": [
        r"\usepackage[utf8x]{inputenc}",  # use UTF-8 input encoding
        r"\usepackage[T1]{fontenc}",  # plots will be generated using this preamble
        r"\usepackage{siunitx}",
        r"\DeclareSIUnit[number-unit-product = {}] ",
        r"\LSB{LSB}",
    ]
}
mpl.rcParams.update(pgf_with_latex)
import matplotlib.pyplot as pl
import numpy as np
fig=pl.figure(figsize=(3,2))
ax1 = fig.add_subplot(111)
dat=np.random.normal(-120-60,40,200000).astype(int)
bins=np.arange(int(np.amin(dat))-.5,127.5,2)
ax1.hist(dat, bins = bins, stacked = True)
ax1.set_title("\\emph{(a)} minimal example")
ax1.set_yscale("log", nonposy="clip")
ax1.set_ylim(0.8, 20000)
ax1.set_xlim(None, 130)
ax1.set_ylabel("frequency")
ax1.set_xlabel("data")
ax1.set_xticks([-300,-200, -127,0,127])
fig.tight_layout(h_pad=1,w_pad=0.2)
pl.savefig('test.png', bbox_inches='tight',dpi=600)
pl.savefig('test.pdf', bbox_inches='tight',dpi=600)
Output of the above code:
As @unutbu pointed out in his (unfortunately now deleted) answer, not using the pgf backend will actually produce the expected plot.
Removing the line
mpl.use('pgf')
will give
If for some reason the use of the pgf backend cannot be avoided, a workaround may be to use a step function to plot the histogram. Remove ax1.hist(...) from the code and replace it with
hist, ex = np.histogram(dat, bins = bins)
ax1.fill_between(bins[:-1], hist, lw=0.0, step="post")
gives
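One caveat with the snippet above: with step="post", each y value is held until the next x value, so passing bins[:-1] leaves the last bin without any width. A minimal sketch that passes all bin edges and repeats the final count could look like the following. The Agg backend, the fixed random seed and the in-memory buffer are assumptions made so the sketch runs without a LaTeX installation. Whether the result is stripe-free under the pgf backend on a given setup still needs to be checked.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: Agg is used only so this runs without LaTeX
import matplotlib.pyplot as plt
import numpy as np

# Same kind of data as in the question, with a fixed seed for repeatability.
rng = np.random.RandomState(0)
dat = rng.normal(-180, 40, 200000).astype(int)
bins = np.arange(int(np.amin(dat)) - 0.5, 127.5, 2)

# counts has one entry per bin, edges has one more entry than counts.
counts, edges = np.histogram(dat, bins=bins)

# step="post" holds y[i] from x[i] to x[i+1]; repeating the last count
# gives the final bin its full width instead of dropping it.
y = np.append(counts, counts[-1])

fig, ax = plt.subplots(figsize=(3, 2))
ax.fill_between(edges, y, lw=0.0, step="post")  # one filled area, no per-bar edges
ax.set_yscale("log")
ax.set_ylim(0.8, 20000)

# Render to memory instead of a file on disk.
buf = io.BytesIO()
fig.savefig(buf, format="png", dpi=150)
plt.close(fig)
```

Because the histogram is drawn as a single filled area rather than as one rectangle per bin, there are no adjacent patches between which a gap could be rendered.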