Seaborn timeseries plot with multiple series

I'm trying to make a time series plot with seaborn from a dataframe that has multiple series.

From this post, "seaborn time series from pandas dataframe", I gather that tsplot isn't going to work, as it is meant to plot uncertainty.

So is there another Seaborn method that is meant for line charts with multiple series?

My dataframe looks like this:

print(df.info())
print(df.describe())
print(df.values)
print(df.index)

output:

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 253 entries, 2013-01-03 to 2014-01-03
Data columns (total 5 columns):
Equity(24 [AAPL])      253 non-null float64
Equity(3766 [IBM])     253 non-null float64
Equity(5061 [MSFT])    253 non-null float64
Equity(6683 [SBUX])    253 non-null float64
Equity(8554 [SPY])     253 non-null float64
dtypes: float64(5)
memory usage: 11.9 KB
None
       Equity(24 [AAPL])  Equity(3766 [IBM])  Equity(5061 [MSFT])  \
count         253.000000          253.000000           253.000000   
mean           67.560593          194.075383            32.547436   
std             6.435356           11.175226             3.457613   
min            55.811000          172.820000            26.480000   
25%            62.538000          184.690000            28.680000   
50%            65.877000          193.880000            33.030000   
75%            72.299000          203.490000            34.990000   
max            81.463000          215.780000            38.970000   

       Equity(6683 [SBUX])  Equity(8554 [SPY])  
count           253.000000          253.000000  
mean             33.773277          164.690180  
std               4.597291           10.038221  
min              26.610000          145.540000  
25%              29.085000          156.130000  
50%              33.650000          165.310000  
75%              38.280000          170.310000  
max              40.995000          184.560000  
[[  77.484  195.24    27.28    27.685  145.77 ]
 [  75.289  193.989   26.76    27.85   146.38 ]
 [  74.854  193.2     26.71    27.875  145.965]
 ..., 
 [  80.167  187.51    37.43    39.195  184.56 ]
 [  79.034  185.52    37.145   38.595  182.95 ]
 [  77.284  186.66    36.92    38.475  182.8  ]]
DatetimeIndex(['2013-01-03', '2013-01-04', '2013-01-07', '2013-01-08',
               '2013-01-09', '2013-01-10', '2013-01-11', '2013-01-14',
               '2013-01-15', '2013-01-16', 
               ...
               '2013-12-19', '2013-12-20', '2013-12-23', '2013-12-24',
               '2013-12-26', '2013-12-27', '2013-12-30', '2013-12-31',
               '2014-01-02', '2014-01-03'],
              dtype='datetime64[ns]', length=253, freq=None, tz='UTC')

This works (but I want to get my hands dirty with Seaborn):

df.plot()

Output:

[image: plot produced by df.plot()]

Thank you for your time!

Update1:

df.to_dict() returned: https://gist.github.com/anonymous/2bdc1ce0f9d0b6ccd6675ab4f7313a5f

Update2:

Using @knagaev's sample code, I've narrowed it down to this difference:

current dataframe (output of print(current_df)):

                           Equity(24 [AAPL])  Equity(3766 [IBM])  \
2013-01-03 00:00:00+00:00             77.484            195.2400   
2013-01-04 00:00:00+00:00             75.289            193.9890   
2013-01-07 00:00:00+00:00             74.854            193.2000   
2013-01-08 00:00:00+00:00             75.029            192.8200   
2013-01-09 00:00:00+00:00             73.873            192.3800

desired dataframe (output of print(desired_df)):

           Date Company       Kind            Price
0    2014-01-02     IBM       Open       187.210007
1    2014-01-02     IBM       High       187.399994
2    2014-01-02     IBM        Low       185.199997
3    2014-01-02     IBM      Close       185.529999
4    2014-01-02     IBM     Volume   4546500.000000
5    2014-01-02     IBM  Adj Close       171.971090
6    2014-01-02    MSFT       Open        37.349998
7    2014-01-02    MSFT       High        37.400002
8    2014-01-02    MSFT        Low        37.099998
9    2014-01-02    MSFT      Close        37.160000
10   2014-01-02    MSFT     Volume  30632200.000000
11   2014-01-02    MSFT  Adj Close        34.960000
12   2014-01-02    ORCL       Open        37.779999
13   2014-01-02    ORCL       High        38.029999
14   2014-01-02    ORCL        Low        37.549999
15   2014-01-02    ORCL      Close        37.840000
16   2014-01-02    ORCL     Volume  18162100.000000

What's the best way to reorganize the current_df to desired_df?

Update 3: I finally got it working with the help of @knagaev.

I had to add a dummy column as well as finesse the index:

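import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

fig, ax = plt.subplots()  # any matplotlib Axes works here
df['Datetime'] = df.index  # expose the DatetimeIndex as a regular column
melted_df = pd.melt(df, id_vars='Datetime', var_name='Security', value_name='Price')
melted_df['Dummy'] = 0  # constant value: every row belongs to the same "unit"

sns.tsplot(melted_df, time='Datetime', unit='Dummy', condition='Security', value='Price', ax=ax)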

to produce: [image: resulting tsplot line chart]
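The reshape step in Update 3 can be checked on a small stand-in frame. This is a minimal sketch, assuming shortened column names (AAPL, IBM) in place of the Equity(...) labels; the values are the first three rows printed for current_df, and the name long_df is only illustrative.

```python
import pandas as pd

# Tiny stand-in for current_df: tz-aware DatetimeIndex, one column per security.
# Column names are shortened (assumption); values come from the printed rows.
idx = pd.to_datetime(["2013-01-03", "2013-01-04", "2013-01-07"]).tz_localize("UTC")
current_df = pd.DataFrame(
    {"AAPL": [77.484, 75.289, 74.854], "IBM": [195.24, 193.989, 193.2]},
    index=idx,
)

# Move the index into a "Datetime" column, then melt the remaining
# columns into (Security, Price) rows.
long_df = (
    current_df.rename_axis("Datetime")
    .reset_index()
    .melt(id_vars="Datetime", var_name="Security", value_name="Price")
)
print(long_df)
```

Each input column becomes a block of rows, so two securities over three dates give six rows with the columns Datetime, Security and Price.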

772

asked May 11 '16 16:05

Zhao Li

1 Answer

You can get your hands dirty with tsplot.

It will draw your line charts with standard errors ("statistical additions").

I tried to simulate your dataset. Here are the results:

import pandas as pd
import pandas.io.data as web  # old pandas only; DataReader now lives in the separate pandas-datareader package
from datetime import datetime
import seaborn as sns

stocks = ['ORCL', 'TSLA', 'IBM', 'YELP', 'MSFT']
start = datetime(2014, 1, 1)
end = datetime(2014, 3, 28)
f = web.DataReader(stocks, 'yahoo', start, end)

# Flatten to long form: one row per (Date, Company, Kind) with its Price
df = pd.DataFrame(f.to_frame().stack()).reset_index()
df.columns = ['Date', 'Company', 'Kind', 'Price']

sns.tsplot(df, time='Date', unit='Kind', condition='Company', value='Price')

By the way, this sample is purely illustrative. According to the documentation, the "unit" parameter is the "Field in the data DataFrame identifying the sampling unit (e.g. subject, neuron, etc.). The error representation will collapse over units at each time/condition observation." So I used the 'Kind' field only for illustrative purposes.
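What "collapse over units" amounts to can be sketched in plain pandas. This is only a conceptual illustration on made-up data, not how tsplot computes its error band internally; the column names and values are hypothetical.

```python
import pandas as pd

# Hypothetical long-form data: two units (e.g. subjects) observed at two times
df = pd.DataFrame({
    "time":      [0, 0, 1, 1],
    "condition": ["A", "A", "A", "A"],
    "unit":      ["s1", "s2", "s1", "s2"],
    "value":     [1.0, 3.0, 2.0, 6.0],
})

# Collapse over units: one central value and one spread
# per (time, condition) pair
summary = df.groupby(["time", "condition"])["value"].agg(["mean", "sem"])
print(summary)
```

With real sampling units the spread is meaningful. Using 'Kind' as the unit mixes Open, High, Low, Close and Volume into one average, which is why that choice is only for illustration.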

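OK, I made an example for your dataframe. It has a dummy field for "noise cleaning" :) With a constant unit there is only one observation per date and company, so there is nothing to average over and the lines are drawn without a visible error band.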

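import pandas as pd
import pandas.io.data as web  # old pandas only; DataReader now lives in the separate pandas-datareader package
from datetime import datetime
import seaborn as sns

stocks = ['ORCL', 'TSLA', 'IBM', 'YELP', 'MSFT']
start = datetime(2010, 1, 1)
end = datetime(2015, 12, 31)
f = web.DataReader(stocks, 'yahoo', start, end)

# Flatten to long form: one row per (Date, Company, Kind) with its Price
df = pd.DataFrame(f.to_frame().stack()).reset_index()
df.columns = ['Date', 'Company', 'Kind', 'Price']

# Keep a single price kind and add a constant unit column
df_open = df[df['Kind'] == 'Open'].copy()
df_open['Dummy'] = 0

sns.tsplot(df_open, time='Date', unit='Dummy', condition='Company', value='Price')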

P.S. Thanks to @VanPeer: you can now use seaborn.lineplot for this problem.

164

answered Oct 05 '22 01:10

knagaev

Tags: python, pandas, dataframe, plot, seaborn
