altair: Access rSquared-value in a regression

I am following the example at https://altair-viz.github.io/user_guide/transform/regression.html to plot a regression trendline in Altair.

import altair as alt
import pandas as pd
import numpy as np

np.random.seed(42)
x = np.linspace(0, 10)
y = x - 5 + np.random.randn(len(x))

df = pd.DataFrame({'x': x, 'y': y})

chart = alt.Chart(df).mark_point().encode(
    x='x',
    y='y'
)

chart + chart.transform_regression('x', 'y').mark_line()


Additionally, I want to add the rSquared value to the chart as text. How can I access the value? According to the documentation, it should be something like:

chart + chart.transform_regression('x', 'y', params=True).mark_text()
Simon asked Jan 01 '23 07:01


2 Answers

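With params=True, transform_regression returns the model parameters (a coef array and an rSquared value) instead of the trend line points. When using mark_text(), you'll need to specify the x and y location (or encoding) along with the field holding the text value you want to show: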

import altair as alt
import pandas as pd
import numpy as np

np.random.seed(42)
x = np.linspace(0, 10)
y = x - 5 + np.random.randn(len(x))

df = pd.DataFrame({'x': x, 'y': y})

chart = alt.Chart(df).mark_point().encode(
    x='x',
    y='y'
)
line = chart.transform_regression('x', 'y').mark_line()

params = alt.Chart(df).transform_regression(
    'x', 'y', params=True
).mark_text(align='left').encode(
    x=alt.value(20),  # pixels from left
    y=alt.value(20),  # pixels from top
    text='rSquared:N'
)

chart + line + params


jakevdp answered Jan 02 '23 21:01


If you are also interested in accessing the regression parameters in tabulated form, you can use the experimental extract_data function in the altair_transform package.

import altair as alt
import pandas as pd
import numpy as np
import altair_transform

np.random.seed(42)
x = np.linspace(0, 10)
y = x - 5 + np.random.randn(len(x))

df = pd.DataFrame({'x': x, 'y': y})
chart = alt.Chart(df).mark_point().encode(
    x='x',
    y='y'
)

b = chart.transform_regression('x', 'y', params=True).mark_line()
print(altair_transform.extract_data(b))
#                                        coef  rSquared
# 0  [-4.935556907797029, 0.9420166005081777]  0.903174
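
As a sanity check on these numbers, the same fit can be reproduced outside Altair. The following is a minimal sketch, assuming the default linear regression corresponds to an ordinary least-squares line and that rSquared is the usual coefficient of determination (1 - SS_res / SS_tot); it uses only NumPy, and the variable names are my own.

```python
import numpy as np

np.random.seed(42)
x = np.linspace(0, 10)
y = x - 5 + np.random.randn(len(x))

# Least-squares fit of a degree-1 polynomial: y ~ intercept + slope * x.
# np.polyfit returns coefficients from the highest degree down.
slope, intercept = np.polyfit(x, y, 1)

# Coefficient of determination: 1 - (residual sum of squares / total sum of squares)
residuals = y - (intercept + slope * x)
ss_res = np.sum(residuals ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot

# Printed in the same order as the coef column above: [intercept, slope]
print(f"coef = [{intercept}, {slope}], rSquared = {r_squared:.6f}")
```

If the assumptions hold, the printed values should agree with the coef and rSquared columns returned by extract_data, up to floating-point rounding.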

This is a cross-post of an issue I created on the Altair GitHub repository. Hopefully, someone else finds this useful.

dubbbdan answered Jan 02 '23 20:01