Altair bar chart with bars of variable width?

Tags: python, plot, charts, bar-chart, altair

I'm trying to use Altair in Python to make a bar chart where the bars have varying width depending on the data in a column of the source dataframe. The ultimate goal is to get a chart like this one:

[Image: a bar chart with bars of variable width]

The height of each bar corresponds to the marginal cost of an energy technology (given as a column in the source dataframe). The bar width corresponds to the capacity of that energy technology (also given as a column in the source dataframe). Colors are ordinal data, also from the source dataframe. The bars are sorted in increasing order of marginal cost. (A plot like this is called a "generation stack" in the energy industry.) This is easy to achieve in matplotlib, as shown in the code below:

import matplotlib.pyplot as plt 

# Make fake dataset
height = [3, 12, 5, 18, 45]
bars = ('A', 'B', 'C', 'D', 'E')

# Choose the width of each bar and its position along the x-axis
width = [0.1, 0.2, 3, 1.5, 0.3]
x_pos = [0, 0.3, 2, 4.5, 5.5]

# Make the plot
plt.bar(x_pos, height, width=width)
plt.xticks(x_pos, bars)
plt.show()

(code from https://python-graph-gallery.com/5-control-width-and-space-in-barplots/)

But is there a way to do this with Altair? I want to use Altair so that I keep its other features, such as tooltips and selectors/bindings, because I have a lot of other data I want to show alongside the bar chart.

The first 20 rows of my source data look like this:

[Image: first 20 rows of the source data]

(The data does not exactly match the chart shown above.)

asked Jan 06 '20 08:01

aska

1 Answer

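In Altair, the way to do this is to use the rect mark and construct your bars explicitly. Each bar spans from x0 to x1 on the x-axis, where x1 is the cumulative capacity up to and including that bar and x0 is the previous bar's x1. Here is an example that mimics your data: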

import altair as alt
import pandas as pd
import numpy as np

np.random.seed(0)

df = pd.DataFrame({
    'MarginalCost': 100 * np.random.rand(30),
    'Capacity': 10 * np.random.rand(30),
    'Technology': np.random.choice(['SOLAR', 'THERMAL', 'WIND', 'GAS'], 30)
})

df = df.sort_values('MarginalCost')
df['x1'] = df['Capacity'].cumsum()
df['x0'] = df['x1'].shift(fill_value=0)

alt.Chart(df).mark_rect().encode(
    x=alt.X('x0:Q', title='Capacity'),
    x2='x1',
    y=alt.Y('MarginalCost:Q', title='Marginal Cost'),
    color='Technology:N',
    tooltip=["Technology", "Capacity", "MarginalCost"]
)

[Image: the chart produced by the code above]
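
To see what the cumsum/shift step produces, here is a minimal sketch on a three-row frame. The frame name `toy` and its values are made up for illustration and are not from the question; only pandas is needed.

```python
import pandas as pd

# Made-up, hand-checkable data (illustrative only, not from the question).
toy = pd.DataFrame({
    'Technology': ['GAS', 'WIND', 'SOLAR'],
    'MarginalCost': [50.0, 5.0, 10.0],
    'Capacity': [4.0, 2.0, 3.0],
})

# Sort by cost so the cheapest technology becomes the leftmost bar.
toy = toy.sort_values('MarginalCost').reset_index(drop=True)

# Right edge of each bar: running total of capacity.
toy['x1'] = toy['Capacity'].cumsum()
# Left edge: the previous bar's right edge; the first bar starts at 0.
toy['x0'] = toy['x1'].shift(fill_value=0)

print(toy[['Technology', 'Capacity', 'x0', 'x1']])
```

After sorting, the bars run WIND, SOLAR, GAS with edges 0 to 2, 2 to 5, and 5 to 9, so each bar's width (x1 - x0) equals its Capacity and the bars sit side by side without gaps.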

To get the same result without preprocessing the data, you can use Altair's transform syntax:

# Reuses the imports from the previous example (altair as alt, pandas as pd, numpy as np)
df = pd.DataFrame({
    'MarginalCost': 100 * np.random.rand(30),
    'Capacity': 10 * np.random.rand(30),
    'Technology': np.random.choice(['SOLAR', 'THERMAL', 'WIND', 'GAS'], 30)
})

alt.Chart(df).transform_window(
    x1='sum(Capacity)',
    sort=[alt.SortField('MarginalCost')]
).transform_calculate(
    x0='datum.x1 - datum.Capacity'
).mark_rect().encode(
    x=alt.X('x0:Q', title='Capacity'),
    x2='x1',
    y=alt.Y('MarginalCost:Q', title='Marginal Cost'),
    color='Technology:N',
    tooltip=["Technology", "Capacity", "MarginalCost"]
)
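
The transform version computes the same edges in a different order: the window transform yields the running sum x1 over rows sorted by MarginalCost, and the calculate transform subtracts each row's own Capacity to get x0. Below is a plain-Python sketch of that logic, not Altair's implementation; the helper name `bar_edges` and the sample rows are hypothetical.

```python
from itertools import accumulate

def bar_edges(rows):
    """Mimic the two transforms on a list of dicts (hypothetical helper)."""
    # sort=[alt.SortField('MarginalCost')]: order rows by ascending cost.
    ordered = sorted(rows, key=lambda r: r['MarginalCost'])
    # x1='sum(Capacity)': running total up to and including the current row.
    running = accumulate(r['Capacity'] for r in ordered)
    # x0='datum.x1 - datum.Capacity': left edge from the row's own capacity.
    return [
        {**r, 'x1': x1, 'x0': x1 - r['Capacity']}
        for r, x1 in zip(ordered, running)
    ]

# Made-up rows for illustration.
rows = [
    {'Technology': 'GAS', 'MarginalCost': 50, 'Capacity': 4},
    {'Technology': 'WIND', 'MarginalCost': 5, 'Capacity': 2},
    {'Technology': 'SOLAR', 'MarginalCost': 10, 'Capacity': 3},
]
edges = bar_edges(rows)
for e in edges:
    print(e['Technology'], e['x0'], e['x1'])
```

Because x0 is derived from the row's own Capacity rather than from the previous row, no shift step is needed, which is why the chart can be built directly from the unsorted dataframe.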

answered Oct 14 '22 06:10

jakevdp
