
Avoid plotting missing values in Seaborn

Problem: I have time-series data covering several days, and I use the sns.FacetGrid function of the Seaborn Python library to plot it in facets. In several cases I found that this function draws a continuous line across consecutive missing values (NaN values) between two readings, whereas matplotlib shows missing values as a gap, which makes sense. A demo example:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
# create timeseries data for 3 days such that day two contains NaN values
time_duration1 = pd.date_range('1/1/2018', periods=24,freq='H')
data1 = np.random.randn(len(time_duration1))
ds1 = pd.Series(data=data1,index=time_duration1)
time_duration2 = pd.date_range('1/2/2018',periods=24,freq='H')
data2 = [float('nan')]*len(time_duration2)
ds2 = pd.Series(data=data2,index=time_duration2)
time_duration3 = pd.date_range('1/3/2018', periods=24,freq='H')
data3 = np.random.randn(len(time_duration3))
ds3 = pd.Series(data=data3,index=time_duration3)
# combine all three days series and then convert series into pandas dataframe
DS = pd.concat([ds1,ds2,ds3])
DF = DS.to_frame()
DF.plot()

This produces the following plot: [image]

The matplotlib plot above shows the missing values as a gap. Now let us prepare the same data for seaborn:

DF['col'] = np.ones(DF.shape[0])  # dummy column, but required for facets
DF['timestamp'] = DF.index
DF.columns = ['data_val', 'col', 'timestamp']
# seaborn 0.9 renamed the `size` argument to `height`; use size=2.5 on older versions
g = sns.FacetGrid(DF, col='col', col_wrap=1, height=2.5)
g.map_dataframe(plt.plot, 'timestamp', 'data_val')


See how the seaborn plot draws a line across the missing data. How can I force seaborn not to connect NaN values with such a line?

Note: This is a dummy example, and I need facet grid in any case to plot my data.

Haroon Rashid asked Jan 03 '23 02:01


1 Answer

By default, FacetGrid removes NaN values from the data. The reason is that some functions inside seaborn would not work properly with NaNs (especially some of the statistical functions, I'd say).
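To see why removing NaN rows turns the gap into a line, here is a minimal matplotlib-only sketch (no seaborn involved). The variable names and the 72-hour layout are my own, chosen to mirror the question's data; it is an illustration of the effect, not of seaborn's internals.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Three days of hourly data; the middle day is entirely NaN.
idx = pd.date_range("2018-01-01", periods=72, freq=pd.Timedelta(hours=1))
values = np.random.randn(72)
values[24:48] = np.nan
df = pd.DataFrame({"timestamp": idx, "data_val": values})

fig, (ax_keep, ax_drop) = plt.subplots(2, 1, sharex=True)

# NaNs kept: matplotlib breaks the line at the missing day.
(line_keep,) = ax_keep.plot(df["timestamp"], df["data_val"])

# NaN rows dropped first: the last point of day 1 and the first point of
# day 3 become neighbours, so one straight segment bridges the gap.
dropped = df.dropna()
(line_drop,) = ax_drop.plot(dropped["timestamp"], dropped["data_val"])

fig.savefig("nan_gap_vs_bridge.png")
```

The top panel shows the gap, the bottom panel shows the bridging line, which is the same picture as in the question.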

In order to keep the nan values in the data, use the dropna=False argument to FacetGrid:

g = sns.FacetGrid(DF,... , dropna=False)
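Spelled out against the question's data, that could look like the sketch below. It assumes a seaborn version that accepts height (0.9 or later; older versions use size), and it builds the frame directly instead of concatenating three series. As far as I know the default for dropna has not been the same in every seaborn release, so passing it explicitly is the safer choice.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Three days of hourly data; the middle day is entirely NaN.
idx = pd.date_range("2018-01-01", periods=72, freq=pd.Timedelta(hours=1))
values = np.random.randn(72)
values[24:48] = np.nan
DF = pd.DataFrame({"data_val": values, "col": 1.0, "timestamp": idx})

# dropna=False keeps the NaN rows, so plt.plot can leave a gap for them.
g = sns.FacetGrid(DF, col="col", col_wrap=1, height=2.5, dropna=False)
g.map_dataframe(plt.plot, "timestamp", "data_val")

# Sanity check: the plotted line still carries the 24 missing readings.
line = g.axes.flat[0].get_lines()[0]
n_missing = int(np.isnan(np.asarray(line.get_ydata(), dtype=float)).sum())
print(n_missing)
```

If the count printed at the end is 0, the NaN rows were dropped before plotting and the line will be drawn straight across the gap.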
ImportanceOfBeingErnest answered Jan 13 '23 05:01