Plotly: How to remove empty dates from x axis?

I have a DataFrame:

   Date        Category    Sum
0  2019-06-03    "25M"      34
1  2019-06-03    "25M"      60
2  2019-06-03    "50M"      23
3  2019-06-04    "25M"      67
4  2019-06-05    "50M"     -90
5  2019-06-05    "50M"     100
6  2019-06-06    "100M"     6
7  2019-06-07    "25M"     -100
8  2019-06-08    "100M"     67
9  2019-06-09    "25M"      450
10 2019-06-10    "50M"      600
11 2019-06-11    "25M"      -9
12 2019-07-12    "50M"      45
13 2019-07-13    "50M"      67
14 2019-07-14    "100M"    130
15 2019-07-14    "50M"      45
16 2019-07-15    "100M"    100
17 2019-07-16    "25M"     -90
18 2019-07-17    "25M"     700
19 2019-07-18    "25M"     -9

I want to create a Plotly graph showing the total of "Sum" for each "Category" on every date, but I want to remove dates that don't have any data.

Code

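import pandas as pd
import plotly.express as px

# The dates look like 2019-06-03, so the format needs the dashes
df["Date"]=pd.to_datetime(df["Date"], format="%Y-%m-%d")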
df=df.sort_values(["Date","Category","Sum"],ascending=False)
df=round(df.groupby(["Date","Category"]).agg({"Sum":"sum"}).reset_index(),1)


fig = px.bar(df, x=df["Date"] , y='Sum',barmode="group",color="Category") 
fig.update_xaxes(
    rangeslider_visible=True,
    rangeselector=dict(
        buttons=list([
            dict(count=1, label="day", step="day", stepmode="todate"),
            dict(count=24, label="monthly", step="month", stepmode="todate"),
            dict(count=1, label="year", step="year", stepmode="todate"),
            dict(step="all")
        ])
    )
)


fig.show()

The resulting graph leaves gaps on the x axis for dates that have no data. I want to remove those empty dates from the Plotly graph.

Amit asked May 19 '20 15:05



3 Answers

I had the same problem with my graph. Add the following to the layout:

xaxis=dict(type="category")

Note: I used import plotly.graph_objs as go, not import plotly.express as px.

This worked for me. I hope it helps you too.

Pracheta answered Nov 14 '22 22:11


This problem arises because plotly interprets your 'Date' values as dates and draws a continuous axis between the oldest and newest timestamps, so dates with no associated data show up as gaps. One solution is to build a complete list of dates between the first and last date in your date column, pick out the dates that have no observations, and store them in a variable named dt_breaks. You can then pass those dates to:

fig.update_xaxes(
    rangebreaks=[dict(values=dt_breaks)] # hide dates with no values
)
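One possible way to compute dt_breaks as described above is a set difference between the full calendar range and the observed dates. This is only a sketch using pandas index operations, not necessarily how the answer builds it; the observed dates are reconstructed from the question's data (3 to 11 June and 12 to 18 July 2019), and the names observed and all_days are illustrative.

```python
import pandas as pd

# Dates that actually appear in the question's data
observed = pd.date_range("2019-06-03", "2019-06-11").union(
    pd.date_range("2019-07-12", "2019-07-18")
)

# Every calendar day between the earliest and latest observation
all_days = pd.date_range(start=observed.min(), end=observed.max(), freq="D")

# Days in the full range that never appear in the data, as ISO date strings
dt_breaks = all_days.difference(observed).strftime("%Y-%m-%d").tolist()

print(len(dt_breaks), dt_breaks[0], dt_breaks[-1])  # 30 2019-06-12 2019-07-11
```

The resulting list can be passed as rangebreaks=[dict(values=dt_breaks)], exactly as in the snippet above.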

This drops those dates from your visualization while keeping the x-values formatted as dates, so you can still subset the data using your buttons.

Without rangebreaks=[dict(values=dt_breaks)], the same visualization shows the gaps you already know from your original figure.

To make this work as simply as possible, I sorted the date column using df=df.sort_values(["Date","Category","Sum"],ascending=True) instead of df=df.sort_values(["Date","Category","Sum"],ascending=False) as in your original code snippet, so that df['Date'].iloc[0] and df['Date'].iloc[-1] are the earliest and latest dates.

Complete code:

import pandas as pd
import plotly.express as px

df = pd.DataFrame({'Date': {0: '2019-06-03',
                          1: '2019-06-03',
                          2: '2019-06-03',
                          3: '2019-06-04',
                          4: '2019-06-05',
                          5: '2019-06-05',
                          6: '2019-06-06',
                          7: '2019-06-07',
                          8: '2019-06-08',
                          9: '2019-06-09',
                          10: '2019-06-10',
                          11: '2019-06-11',
                          12: '2019-07-12',
                          13: '2019-07-13',
                          14: '2019-07-14',
                          15: '2019-07-14',
                          16: '2019-07-15',
                          17: '2019-07-16',
                          18: '2019-07-17',
                          19: '2019-07-18'},
                         'Category': {0: '"25M"',
                          1: '"25M"',
                          2: '"50M"',
                          3: '"25M"',
                          4: '"50M"',
                          5: '"50M"',
                          6: '"100M"',
                          7: '"25M"',
                          8: '"100M"',
                          9: '"25M"',
                          10: '"50M"',
                          11: '"25M"',
                          12: '"50M"',
                          13: '"50M"',
                          14: '"100M"',
                          15: '"50M"',
                          16: '"100M"',
                          17: '"25M"',
                          18: '"25M"',
                          19: '"25M"'},
                         'Sum': {0: 34,
                          1: 60,
                          2: 23,
                          3: 67,
                          4: -90,
                          5: 100,
                          6: 6,
                          7: -100,
                          8: 67,
                          9: 450,
                          10: 600,
                          11: -9,
                          12: 45,
                          13: 67,
                          14: 130,
                          15: 45,
                          16: 100,
                          17: -90,
                          18: 700,
                          19: -9}})

df["Date"]=pd.to_datetime(df["Date"], format=("%Y-%m-%d"))
df=df.sort_values(["Date","Category","Sum"],ascending=True)
df=round(df.groupby(["Date","Category"]).agg({"Sum":"sum"}).reset_index(),1)



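# All calendar days between the first and last observed date
dt_all = pd.date_range(start=df['Date'].iloc[0],end=df['Date'].iloc[-1])
# Observed dates as strings (a set keeps the membership test fast)
dt_obs = {d.strftime("%Y-%m-%d") for d in df['Date']}
# Days in the full range that have no observations
dt_breaks = [d for d in dt_all.strftime("%Y-%m-%d").tolist() if d not in dt_obs]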

df=df.set_index('Date')

#fig = px.bar(df, x=df.index.strftime("%Y/%m/%d") , y='Sum',barmode="group",color="Category") 
fig = px.bar(df, x=df.index , y='Sum',barmode="group",color="Category")

fig.update_xaxes(
    rangebreaks=[dict(values=dt_breaks)] # hide dates with no values
)


fig.update_xaxes(
    rangeslider_visible=True,
    rangeselector=dict(
        buttons=list([
            dict(count=1, label="day", step="day", stepmode="todate"),
            dict(count=24, label="monthly", step="month", stepmode="todate"),
            dict(count=1, label="year", step="year", stepmode="todate"),
            dict(step="all")
        ])
    )
)


fig.show()

vestland answered Nov 14 '22 23:11


In case someone is here working with stock data, below is the code to hide non-trading hours and weekends with rangebreaks.

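    import plotly.graph_objects as go

    # Assumes df holds OHLC data ('date', 'Open', 'High', 'Low', 'Close') and symbol is the ticker string
    fig = go.Figure(data=[go.Candlestick(x=df['date'], open=df['Open'], high=df['High'], low=df['Low'], close=df['Close'])])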
    fig.update_xaxes(
        rangeslider_visible=True,
        rangebreaks=[
            # NOTE: Below values are bound (not single values), ie. hide x to y
            dict(bounds=["sat", "mon"]),  # hide weekends, eg. hide sat to before mon
            dict(bounds=[16, 9.5], pattern="hour"),  # hide hours outside of 9.30am-4pm
            # dict(values=["2020-12-25", "2021-01-01"])  # hide holidays (Christmas and New Year's, etc)
        ]
    )
    fig.update_layout(
        title='Stock Analysis',
        yaxis_title=f'{symbol} Stock'
    )

    fig.show()

See Plotly's documentation for more on rangebreaks.
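The commented-out holidays line above needs a list of dates. For daily data, one way to derive it rather than hard-code it is to look for weekdays that have no rows. The sketch below assumes daily bars; build_rangebreaks is a hypothetical helper written for this illustration, not part of Plotly or pandas, and the four sample dates are made up around Christmas 2020.

```python
import pandas as pd

def build_rangebreaks(dates):
    """Hypothetical helper: weekend bounds plus any weekdays with no rows."""
    observed = pd.DatetimeIndex(dates).normalize().unique()
    # Monday-to-Friday days spanned by the data
    weekdays = pd.bdate_range(observed.min(), observed.max())
    # Weekdays with no data, e.g. market holidays
    missing = weekdays.difference(observed).strftime("%Y-%m-%d").tolist()
    breaks = [dict(bounds=["sat", "mon"])]  # hide weekends
    if missing:
        breaks.append(dict(values=missing))  # hide the missing weekdays
    return breaks

# Illustrative daily dates: no row for Friday 25 Dec 2020 or the weekend after it
breaks = build_rangebreaks(["2020-12-23", "2020-12-24", "2020-12-28", "2020-12-29"])
print(breaks)
```

The returned list could then be passed as fig.update_xaxes(rangebreaks=breaks). For intraday data, the hour-pattern entry from the answer would still need to be added separately.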

Mega J answered Nov 14 '22 22:11