I have a set of numbers that I'd like to plot on a histogram.
Say:
import numpy as np
import matplotlib.pyplot as plt
my_numbers = np.random.normal(size = 1000)
plt.hist(my_numbers)
If I want to control the size and range of the bins I could do this:
plt.hist(my_numbers, bins=np.arange(-4,4.5,0.5))
Now, if I want to plot a histogram in Altair the code below will do, but how do I control the size and range of the bins in Altair?
import pandas as pd
import altair as alt
my_numbers_df = pd.DataFrame.from_dict({'Integers': my_numbers})
alt.Chart(my_numbers_df).mark_bar().encode(
alt.X("Integers", bin = True),
y = 'count()',
)
I have searched Altair's docs, but all the explanations and sample charts I could find just use bin = True with no further modification.
Appreciate any pointers :)
Calculate the number of bins by taking the square root of the number of data points and rounding up. Calculate the bin width by dividing the range (USL - LSL for specification limits, or max - min of the data) by the number of bins.
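The square-root rule above can be sketched in a few lines of NumPy. The helper name sqrt_rule_bins and the seeded sample are illustrative assumptions, not part of any library, and the range used here is simply max - min of the data:

```python
import math

import numpy as np


def sqrt_rule_bins(values):
    """Return (number of bins, bin width) using the square-root rule."""
    values = np.asarray(values, dtype=float)
    # Number of bins: square root of the sample size, rounded up.
    n_bins = math.ceil(math.sqrt(values.size))
    # Bin width: data range divided by the number of bins.
    width = (values.max() - values.min()) / n_bins
    return n_bins, width


# Hypothetical sample, seeded so the run is repeatable.
rng = np.random.default_rng(0)
sample = rng.normal(size=1000)

n_bins, width = sqrt_rule_bins(sample)
print(n_bins, width)
```

For 1000 points this gives 32 bins, since the square root of 1000 is about 31.6. The resulting width could then be passed as the step in a binning call.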
The wider the bins, the fewer bars you will have. Bins that are too wide can hide important details of the distribution, while bins that are too narrow add noise that can obscure its overall shape.
The bars of a histogram are called bins. The height of each bin shows how many values from the data fall into that range. The width of each bin is (max value of data - min value of data) / total number of bins.
Most histograms use equal bin widths, but unequal widths are also possible (see the 'Variable bin widths' section of Histogram). One strategy for unequal widths is to size the bins so that each contains approximately the same number of values.
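One way to get bins holding roughly equal numbers of values is to place the edges at evenly spaced quantiles of the data. This is only a sketch: the helper name equal_count_edges, the choice of 10 bins, and the seeded sample are all assumptions made for illustration.

```python
import numpy as np


def equal_count_edges(values, n_bins):
    """Return n_bins + 1 edges placed at evenly spaced quantiles."""
    values = np.asarray(values, dtype=float)
    # Quantiles at 0, 1/n_bins, 2/n_bins, ..., 1 split the sorted data
    # into groups of approximately equal size.
    return np.quantile(values, np.linspace(0.0, 1.0, n_bins + 1))


# Hypothetical sample, seeded so the run is repeatable.
rng = np.random.default_rng(0)
sample = rng.normal(size=1000)

edges = equal_count_edges(sample, 10)
counts, _ = np.histogram(sample, bins=edges)
print(edges)
print(counts)
```

The bins near the center of a normal sample come out narrow and the ones in the tails wide, so the bar heights of such a histogram should be read as densities rather than raw counts.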
As demonstrated briefly in the Bin transforms section of the documentation, you can pass an alt.Bin() instance to fine-tune the binning parameters.
The equivalent of your matplotlib histogram would be something like this:
alt.Chart(my_numbers_df).mark_bar().encode(
alt.X("Integers", bin=alt.Bin(extent=[-4, 4], step=0.5)),
y='count()',
)
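As a quick sanity check on those numbers, the edges that matplotlib receives from np.arange(-4, 4.5, 0.5) can be inspected directly. The sketch below uses only NumPy and assumes that Altair's extent and step produce the same edges, which is worth confirming against the rendered chart:

```python
import numpy as np

# Bin edges used in the matplotlib call from the question.
edges = np.arange(-4, 4.5, 0.5)

# Each pair of adjacent edges forms one bin.
n_bins = len(edges) - 1
widths = np.diff(edges)

print(edges[0], edges[-1])  # first and last edge: the extent
print(n_bins)               # number of bins
print(widths[0])            # common bin width: the step
```

The first and last edges correspond to extent=[-4, 4], and the common spacing corresponds to step=0.5.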