seaborn displot defining the column order with a vector string

I would like to adjust the order in which the columns appear on my chart. According to the documentation, this can be done by passing a vector of strings to col_order.

I'm not sure how to create a vector string.

I've tried the code below, but it didn't work: the columns are not in the correct order. I would like the column order to be:

--> primary school, some high school, completed high school, some uni, completed uni

Below is the code I have, and my attempt at creating a vector string.

If anyone could tell me what I'm doing wrong, that would be great.

The data is from https://archive.ics.uci.edu/ml/datasets/adult

myVector = (['primary school', 'some high school',' completed high school' 'some uni', 'uni'])
figure15 = sns.displot(x='education', hue='class-label', data=df, palette='PuBuGn', multiple='stack', aspect=2, height=8, col_order='myVector')
sns.set_context('poster', font_scale = 1 )
plt.title ("Income by education", fontsize=35)
plt.ylabel ("Income",fontsize=30)
plt.xlabel ("Education", fontsize=30)
plt.tick_params(labelsize=18)
Amaranth asked Mar 22 '26 04:03

1 Answer

col_order sets the order of the subplots. Figure-level functions such as sns.displot() create multiple subplots when the col= parameter is used. To change the ordering of the x-axis, many Seaborn functions accept an order= parameter. But sns.displot, which here builds on sns.histplot, doesn't support order=. Instead, you can make the 'education' column categorical, setting an explicit category order.
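
As a minimal sketch of what "categorical with an explicit order" means, the snippet below uses a small made-up DataFrame rather than the Adult dataset; the names toy and order are hypothetical and chosen only for illustration. It shows that the category order is fixed by the list you pass, and that values not present in that list silently become NaN, so the strings have to match the data exactly.

```python
import pandas as pd

# Hypothetical toy data standing in for the 'education' column.
toy = pd.DataFrame({"education": ["Bachelors", "HS-grad", "Masters", "HS-grad", "11th"]})

# Desired order of the categories (an assumption for this sketch).
order = ["11th", "HS-grad", "Bachelors", "Masters"]

# Convert the column to a categorical with an explicit category order.
toy["education"] = pd.Categorical(toy["education"], categories=order)

# The category order is now fixed, independent of the order of appearance.
category_order = list(toy["education"].cat.categories)
print(category_order)

# Sorting follows the category order rather than alphabetical order.
sorted_values = toy["education"].sort_values().tolist()
print(sorted_values)

# Values missing from `categories` become NaN (note the lowercase 'hs-grad').
unmatched = pd.Categorical(["hs-grad"], categories=order)
print(unmatched.isna().tolist())
```

If some bars unexpectedly disappear from the plot, checking for NaN values in the converted column is a reasonable first step.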

Here is an example, using the dataset from the given link:

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

df = pd.read_csv('adult.data', header=None, index_col=False,
                 skipinitialspace=True,
                 names=['age', 'workclass', 'fnlwgt', 'education', 'education-num', 'marital-status', 'occupation',
                        'relationship', 'race', 'sex', 'capital-gain', 'capital-loss', 'hours-per-week',
                        'native-country'])

myVector = ['Preschool', '1st-4th', '5th-6th', '7th-8th', '9th', '10th', '11th', '12th', 'Some-college', 'Prof-school',
            'Assoc-acdm', 'Assoc-voc', 'HS-grad', 'Bachelors', 'Masters', 'Doctorate']

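# Make 'education' categorical with an explicit category order;
# the histogram then follows this order on the x-axis.
df['education'] = pd.Categorical(df['education'], categories=myVector)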

sns.set_context('poster', font_scale=1)
displot_facetgrid = sns.displot(x='education', hue='workclass', data=df, palette='PuBuGn',
                                multiple='stack', aspect=2, height=8)
for ax in displot_facetgrid.axes.flat:
    plt.setp(ax.get_xticklabels(), rotation=30)
    ax.tick_params(labelsize=10)
    ax.set_xlabel("Education", fontsize=12)
plt.show()
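
Separately from the col_order question, the list in the original post has two plain-Python pitfalls that are worth checking. The sketch below copies that list as written (the variable names my_vector and col_order_as_written are only illustrative) and shows that the missing comma merges two entries into one string, and that quoting the variable name passes the text 'myVector' instead of the list.

```python
# The list from the question, copied as written (note the missing comma).
my_vector = ['primary school', 'some high school', ' completed high school' 'some uni', 'uni']

# Adjacent string literals are concatenated, so there are 4 items, not 5.
item_count = len(my_vector)
print(item_count)

# The third item is the two labels glued together, with a leading space.
merged_item = my_vector[2]
print(repr(merged_item))

# Quoting the name passes the literal text 'myVector', not the list itself.
col_order_as_written = 'myVector'
same_object = col_order_as_written == my_vector
print(same_object)
```

The labels also need to match the values actually stored in the 'education' column, which in this dataset look like 'HS-grad' and 'Bachelors' rather than 'some high school' or 'uni'.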

JohanC answered Mar 23 '26 17:03
