I really like the parallel coordinates plot available in Plotly, but I just ran into an issue I could use help with.
As you can see in the example below, applying a log10 transform makes it easier to distinguish the smaller values. However, by transforming the data we lose the ability to read the original values off the axis. I would prefer to log-scale the axis instead of the data, but I couldn't find a way to do this.
I did find something related to "axis styling" in the GitHub issue https://github.com/plotly/plotly.js/issues/1071#issuecomment-264860379 but not a solution to this problem.
I would appreciate any ideas or pointers.
library(plotly)
# Setting up some data that span a wide range.
df <- read.csv("https://raw.githubusercontent.com/bcdunbar/datasets/master/iris.csv")
df$sepal_width[1] = 50
df$sepal_width_log10 = log10(df$sepal_width)
p <- df %>%
  plot_ly(type = 'parcoords',
          line = list(color = ~species_id,
                      colorscale = list(c(0,'red'),c(0.5,'green'),c(1,'blue'))),
          dimensions = list(
            list(range = c(~min(sepal_width),~max(sepal_width)),
                 label = 'Sepal Width', values = ~sepal_width),
            list(range = c(~min(sepal_width_log10),~max(sepal_width_log10)),
                 tickformat='.2f',
                 label = 'log10(Sepal Width)', values = ~sepal_width_log10),
            list(range = c(4,8),
                 constraintrange = c(5,6),
                 label = 'Sepal Length', values = ~sepal_length))
  )
p
Since log projection is not supported (yet), creating the tick labels manually seems to be a valid workaround.
# Let's create the axis text manually and map the log10 transform
# back to the original scale.
my_tickvals = seq(min(df$sepal_width_log10), max(df$sepal_width_log10), length.out=8)
my_ticktext = signif(10 ^ my_tickvals, digits = 2)
library(plotly)
# Setting up some data that span a wide range.
df <- read.csv("https://raw.githubusercontent.com/bcdunbar/datasets/master/iris.csv")
df$sepal_width[1] = 50
df$sepal_width_log10 = log10(df$sepal_width)
# Let's create the axis text manually and map the log10 transform back to the original scale.
my_tickvals = seq(min(df$sepal_width_log10), max(df$sepal_width_log10), length.out=8)
my_ticktext = signif(10 ^ my_tickvals, digits = 2)
p <- df %>%
  plot_ly(type = 'parcoords',
          line = list(color = ~species_id,
                      colorscale = list(c(0,'red'),c(0.5,'green'),c(1,'blue'))),
          dimensions = list(
            list(range = c(~min(sepal_width),~max(sepal_width)),
                 label = 'Sepal Width', values = ~sepal_width),
            list(range = c(~min(sepal_width_log10),~max(sepal_width_log10)),
                 tickformat='.2f',
                 label = 'log10(Sepal Width)', values = ~sepal_width_log10),
            list(range = c(~min(sepal_width_log10),~max(sepal_width_log10)),
                 tickvals = my_tickvals,
                 ticktext = my_ticktext,
                 label = 'Sepal Width (log10 axis)', values = ~sepal_width_log10),
            list(range = c(4,8),
                 constraintrange = c(5,6),
                 label = 'Sepal Length', values = ~sepal_length))
  )
p
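One thing to be aware of with the ticks above: they are evenly spaced in log space and the labels are then rounded with signif(), so a label may not sit exactly at the value it displays. A possible variation, shown here only as a rough sketch, is to pick round values on the original scale first and then place them at their log10 positions. The helper name log_ticks and its mantissas argument are made up for this illustration and are not part of plotly; the sketch also assumes all values are strictly positive.

```r
# Hypothetical helper (not part of plotly): choose "nice" tick values
# (1, 2, 5 times a power of 10) on the original scale and return their
# positions on the log10 axis together with the labels to display.
log_ticks <- function(x, mantissas = c(1, 2, 5)) {
  stopifnot(all(x > 0))  # log10 is only defined for positive values
  lo <- floor(log10(min(x)))
  hi <- ceiling(log10(max(x)))
  # All mantissa * 10^k candidates covering the data range
  candidates <- sort(as.vector(outer(mantissas, 10 ^ (lo:hi))))
  keep <- candidates[candidates >= min(x) & candidates <= max(x)]
  list(tickvals = log10(keep),          # where the ticks go (log10 units)
       ticktext = as.character(keep))   # what the ticks say (original units)
}

# Example with the range used above (sepal width from 2 to 50)
ticks <- log_ticks(c(2, 50))
ticks$ticktext
```

ticks$tickvals and ticks$ticktext could then be passed as tickvals and ticktext in place of my_tickvals and my_ticktext. Because the labels are computed before the log transform, each label matches its position exactly.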
The underlying plotly.js parcoords trace doesn't support log projection (scales, axes) at the moment, though, as you mention, it comes up from time to time and we plan to add this functionality. In the meantime, one option is to take the logarithm of the data ahead of time, with the big drawback that the axis ticks will show log values, which needs explanation and adds to the cognitive burden.