Sometimes I need to draw categorical values on a regular grid to show how they cover a certain area. In principle, the plot() function is a good fit for this, but the marker size has to be tuned by hand each time to create the illusion of a solid cover. When the extent of the image changes, the old size no longer fits and has to be adjusted again. Is there a technique to set this size automatically?
using Plots
using CategoricalArrays
a = [1, 2, 3, 1, 2, 3, 1, 2, 3]
b = [1, 1, 1, 2, 2, 2, 3, 3, 3]
c = CategoricalArray(["X", "X", "Y", "Z", "Y", "Y", "Z", "Y", "Z"])
plot(a, b, group = c, seriestype = :scatter, aspect_ratio = 1, markersize = 90,
     markershape = :square, markerstrokewidth = 0.0, xlim = (0.5, 3.5), ylim = (0.5, 3.5))
The result is fine in every respect, except that the size of the cells has to be re-tuned each time so that they neither overlap nor leave gaps.
As an alternative, I considered heatmap(), but it behaves oddly with categorical data: it maps the categories onto a scale of its own with a continuous color gradient. I haven't come across any examples where heatmap() produces a map with a clean legend like the one plot() gives, so I'm not sure that heatmap() is the right way to go here.
a = b = [1, 2, 3]
c = CategoricalArray(["X" "X" "Y"; "Z" "Y" "Y"; "Z" "Y" "Z"])
heatmap(a, b, c)
Maybe there is still some way to automatically set the size of the cells of plot()?
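There are various ways to create such a plot within Plots.jl. Perhaps the most direct interpretation of what you want is to draw the cells as shapes, whose size is given in data coordinates and therefore scales with the plot. For that approach, you also need to know how to put several unconnected shapes into the same series so that each category gets a single legend entry. A solution based on shapes could look like this: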
a = [1, 2, 3, 1, 2, 3, 1, 2, 3]
b = [1, 1, 1, 2, 2, 2, 3, 3, 3]
c = CategoricalArray(["X", "X", "Y", "Z", "Y", "Y", "Z", "Y", "Z"])
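# Collect the grid coordinates belonging to each category
groups = Dict(cat => NTuple{2,Int}[] for cat in levels(c))
for (ca, cb, cat) in zip(a, b, c)
    push!(groups[cat], (ca, cb))
end
w = 1  # cell width in data units
# For each category, stack the square outlines of all its cells;
# the NaN row after each square separates it from the next one
shapes = map(collect(groups)) do (cat, vals)
    cat => mapreduce(vcat, vals) do (ca, cb)
        [ca cb] .+ [-.5 -.5; .5 -.5; .5 .5; -.5 .5; -.5 -.5; NaN NaN]*w
    end
end
p = plot(aspect_ratio=1)
# One :shape series per category, sorted by category name
for (cat, s) in sort(shapes; by=x->x[1])
    plot!(s[:,1], s[:,2], label=cat, seriestype=:shape, linewidth=0)
end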
Most of the code simply rearranges the data so that we get a Vector of Pairs mapping each categorical value to a matrix that lists all of its vertices, with a NaN row separating one square from the next. For "X" it looks like this:
"X" =>
12×2 Matrix{Float64}:
0.5 0.5
1.5 0.5
1.5 1.5
0.5 1.5
0.5 0.5
NaN NaN
1.5 0.5
2.5 0.5
2.5 1.5
1.5 1.5
1.5 0.5
NaN NaN
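The vertex construction can be isolated into a small helper, which makes it easier to check and to reuse with a different cell width. This is only a sketch: cell_vertices and group_vertices are hypothetical names, not part of Plots.jl, and the code assumes square cells centred on the grid points. It needs no packages and reproduces the matrix shown above for "X".

```julia
# Hypothetical helper (not part of Plots.jl): closed outline of one square
# cell of side `w` centred on (x, y), followed by a NaN separator row.
function cell_vertices(x, y; w = 1)
    offsets = [-0.5 -0.5; 0.5 -0.5; 0.5 0.5; -0.5 0.5; -0.5 -0.5; NaN NaN]
    return [x y] .+ offsets .* w
end

# Stack the outlines of several cells into a single two-column matrix,
# suitable for one :shape series per category.
group_vertices(cells; w = 1) =
    mapreduce(c -> cell_vertices(c...; w = w), vcat, cells)

x_cells = [(1, 1), (2, 1)]   # grid positions of category "X" in the example
m = group_vertices(x_cells)  # 12×2 matrix: two squares, each ending in NaN NaN
```

Because w is expressed in data units rather than in pixels, the cells keep touching when the axis limits or the figure size change. If the grid spacing is not 1, passing the spacing as w should give the same gap-free cover.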
A perhaps slightly simpler solution would be to "trick" Plots to display what we want using a heatmap, like this:
a = b = [1, 2, 3]
c = CategoricalArray(["X" "X" "Y"; "Z" "Y" "Y"; "Z" "Y" "Z"])
pal = palette(:default)
p = plot(aspect_ratio=1, size=(400,400))
heatmap!(a, b, c, c=pal, colorbar=false, clims=(1,length(pal)))
for cat in sort(collect(Set(c)))
    plot!(
        [], [], seriestype=:shape,
        label=cat, color=pal[levelcode(cat)]
    )
end
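The reason this works is that every category has an integer level code, and fixing clims to (1, length(pal)) lines those codes up with the palette entries used for the empty legend series. The following package-free sketch shows the mapping, assuming the levels are the sorted unique labels (the default ordering for strings); the names labels, lv, code and codes are illustrative only.

```julia
# Plain-Julia illustration of the level codes behind the heatmap trick.
labels = ["X" "X" "Y"; "Z" "Y" "Y"; "Z" "Y" "Z"]

lv = sort(unique(labels))                       # assumed level order: X, Y, Z
code = Dict(l => i for (i, l) in enumerate(lv)) # label => integer code
codes = [code[v] for v in labels]               # same shape as `labels`
```

Each entry of codes is the index of the palette color that the corresponding cell and its legend entry share.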