Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to change seaborn violinplot legend labels?

I'm using seaborn to make a violinplot, which uses hues to identify who survived and who didn't. This is given by the column 'DEATH_EVENT', where 0 means the person survived and 1 means they didn't. The only issue I'm having is that I can't figure out how to set labels for this hue legend. As seen below, 'DEATH_EVENT' presents 0 and 1, but I want to change this into 'Survived' and 'Not survived'.

Current plot image.

Current code:

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib as mpl
sns.set()
plt.style.use('seaborn')
data = pd.read_csv('heart_failure_clinical_records_dataset.csv')

g = sns.violinplot(data=data, x='smoking', y='age', hue='DEATH_EVENT')
g.set_xticklabels(['No smoking', 'Smoking'])

I tried to use: g.legend(labels=['Survived', 'Not survived']), but it returns it without the colors, instead a thin and thick line for some reason.

 (image).

I'm aware I could just use:

data['DEATH_EVENT'].replace({0:'Survived', 1:'Not survived'}, inplace=True)

but I wanted to see if there was another way. I'm still a rookie, so I'm guessing that there's a reason why the CSV's author made it so that it uses integers to describe plenty of things. Ex: if someone smokes or not, sex, diabetic or not, etc. Maybe it runs faster?

like image 675
tetris1 Avatar asked Oct 22 '25 12:10

tetris1


1 Answers

Controlling Seaborn legends is still somewhat tricky (some extensions to matplotlib's API would be helpful). In this case, you could grab the handles from the just-created legend and reuse them for a new legend:

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

data = pd.DataFrame({"smoking": np.random.randint(0, 2, 200),
                     "survived": np.random.randint(0, 2, 200),
                     "age": np.random.normal(60, 10, 200),
                     "DEATH_EVENT": np.random.randint(0, 2, 200)})
ax = sns.violinplot(data=data, x='smoking', y='age', hue='DEATH_EVENT')
ax.set_xticklabels(['No smoking', 'Smoking'])
ax.legend(handles=ax.legend_.legendHandles, labels=['Survived', 'Not survived'])

violinplot with changed legend

Here is an approach to make the change via the dataframe without changing the original dataframe. To avoid accessing ax.legend_ alltogether (to remove the legend title), a trick is to rename the column to a blank string (and use that blank string for hue). If the dataframe isn't super long (i.e. not having millions of rows), the speed and memory overhead are quite modest.

names = {0: 'Survived', 1: 'Not survived'}
ax = sns.violinplot(data=data.replace({'DEATH_EVENT': names}).rename(columns={'DEATH_EVENT': ''}),
                    x='smoking', y='age', hue='')
like image 55
JohanC Avatar answered Oct 24 '25 01:10

JohanC



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!