I want to create Gantt charts in Python using create_gantt. However, I run into problems when tasks have overlapping start and end dates.
E.g.:
import plotly.offline
import plotly.figure_factory as ff
df = [dict(Task="Milestone A", Start='2017-01-01', Finish='2017-02-02', Resource='Jack'),
      dict(Task="Milestone B", Start='2018-01-01', Finish='2018-02-02', Resource='Jack'),
      dict(Task="Milestone A", Start='2017-01-17', Finish='2017-04-28', Resource='Joe'),
      dict(Task="Milestone B", Start='2017-03-17', Finish='2017-04-28', Resource='Joe'),
      dict(Task="Milestone A", Start='2017-01-14', Finish='2017-03-14', Resource='John'),
      dict(Task="Milestone B", Start='2018-01-14', Finish='2018-03-14', Resource='John')]
colors = {'Jack': 'rgb(220, 0, 0)',
          'Joe': (1, 0.9, 0.16),
          'John': 'rgb(0, 255, 100)'}
fig = ff.create_gantt(df, colors=colors, index_col='Resource', show_colorbar=True, group_tasks=True)
plotly.offline.plot(fig, filename='gantt-group-tasks-together')
When I run this code, the bars for Joe, Jack and John overlap for Milestone A and Milestone B, as in the figure (bad output). I would like to see three bars for Milestone A, one each for Joe, John and Jack, clustered together but not overlapping (good output).
How do I achieve this?
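One possible workaround, offered as a sketch rather than a confirmed answer: with group_tasks=True, create_gantt puts all entries that share a Task value on the same row, so giving every (task, resource) pair its own label should give each one its own row. The "Task (Resource)" label format and the helper names split_rows_by_resource and build_gantt are illustrative assumptions, not part of plotly.

```python
# Sketch: one row per (task, resource) pair so bars for the same milestone
# do not overlap. Label format and function names are illustrative choices.

def split_rows_by_resource(tasks):
    """Return copies of the task dicts relabeled 'Task (Resource)', sorted by label."""
    relabeled = [
        dict(task, Task="{} ({})".format(task["Task"], task["Resource"]))
        for task in tasks
    ]
    # Sorting by the new label keeps all rows of one milestone next to each other.
    return sorted(relabeled, key=lambda task: task["Task"])


def build_gantt(tasks, colors):
    """Build the figure. plotly is imported lazily so the helper above works without it."""
    import plotly.figure_factory as ff

    return ff.create_gantt(
        split_rows_by_resource(tasks),
        colors=colors,
        index_col="Resource",
        show_colorbar=True,
        group_tasks=True,
    )


# Same data as in the question.
df = [dict(Task="Milestone A", Start='2017-01-01', Finish='2017-02-02', Resource='Jack'),
      dict(Task="Milestone B", Start='2018-01-01', Finish='2018-02-02', Resource='Jack'),
      dict(Task="Milestone A", Start='2017-01-17', Finish='2017-04-28', Resource='Joe'),
      dict(Task="Milestone B", Start='2017-03-17', Finish='2017-04-28', Resource='Joe'),
      dict(Task="Milestone A", Start='2017-01-14', Finish='2017-03-14', Resource='John'),
      dict(Task="Milestone B", Start='2018-01-14', Finish='2018-03-14', Resource='John')]

rows = split_rows_by_resource(df)
labels = [row["Task"] for row in rows]
```

Calling build_gantt(df, colors) with the colors dictionary from the question and passing the result to plotly.offline.plot should then draw six separate rows, three per milestone. The original df is left unchanged because each dict is copied before relabeling.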
In this article, we will discuss how to plot a Gantt chart in Python using Matplotlib. A Gantt chart is a graphical depiction of a project schedule or a task schedule (in operating systems). It is a type of bar chart that shows the start and finish dates of the elements of a project, which may include resources or deadlines.
Matplotlib is a Python library for creating charts. Matplotlib can be used in Python scripts, the Python and IPython shell, the Jupyter notebook, web application servers, and four graphical user interface toolkits.
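As a rough illustration of the idea, a Gantt chart in Matplotlib can be drawn as horizontal bars whose left edge is the task start and whose width is the task duration. The sketch below is a minimal version under stated assumptions: the task names and dates are made up, and to_bars and draw_gantt are hypothetical helper names, not Matplotlib functions.

```python
from datetime import date

# Illustrative tasks (names and dates are made up for this sketch).
tasks = [
    ("Design", date(2024, 1, 1), date(2024, 1, 10)),
    ("Build", date(2024, 1, 8), date(2024, 1, 25)),
    ("Test", date(2024, 1, 20), date(2024, 2, 1)),
]


def to_bars(tasks):
    """Convert (name, start, finish) into (name, offset_days, duration_days)."""
    project_start = min(start for _, start, _ in tasks)
    return [
        (name, (start - project_start).days, (finish - start).days)
        for name, start, finish in tasks
    ]


def draw_gantt(tasks):
    """Draw one horizontal bar per task. matplotlib is imported lazily."""
    import matplotlib.pyplot as plt

    bars = to_bars(tasks)
    names = [name for name, _, _ in bars]
    offsets = [offset for _, offset, _ in bars]
    durations = [duration for _, _, duration in bars]

    fig, ax = plt.subplots()
    # left= shifts each bar to its start; the width is the duration in days.
    ax.barh(names, durations, left=offsets)
    ax.invert_yaxis()  # first task at the top
    ax.set_xlabel("Days since project start")
    ax.grid(axis="x")
    return fig


bars = to_bars(tasks)
```

Calling draw_gantt(tasks) returns the figure, which can then be shown with plt.show() or saved with fig.savefig(...). Measuring time in days since the first start date keeps the arithmetic simple. Real dates on the x-axis would need Matplotlib's date handling instead.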
You can customize the Gantt type chart we created by adding gridlines, labels, changing the bar color, and more. To add elements to the chart, click the chart area, and on the Chart Design tab, select Add Chart Element. To select a layout, click Quick Layout.
How about altair-viz (docs)?
import pandas as pd
import numpy as np
import altair as alt
# alt.renderers.enable('notebook') # if in jupyter
df = pd.read_csv("tasks.csv")
df["Start date"] = pd.to_datetime(df["Start date"])
df["End date"] = pd.to_datetime(df["End date"])
chart = alt.Chart(df.drop(columns="Resources")).mark_bar().encode(
    x='Start date',
    x2='End date',
    y=alt.Y('Task Name', 
            sort=list(df.sort_values(["End date", "Start date"])
                                    ["Task Name"])), # Custom sorting
)
chart

Example df:
   Task Name                            Start date           End date             Resources
-  -----------------------------------  -------------------  -------------------  ---------------------------------
0  Data type handling / Auto Inference  2019-07-01 00:00:00  2019-07-31 00:00:00  Backend
1  Sklearn & other models               2019-07-01 00:00:00  2019-07-31 00:00:00  Models
2  Maps / Geoplotting                   2019-07-01 00:00:00  2019-07-31 00:00:00  Backend, Graphical User Interface
3  Optimize Dockerfile                  2019-07-01 00:00:00  2019-07-31 00:00:00  CI/CD
4  Chapter 2: Compare competitors       2019-07-08 00:00:00  2019-10-21 00:00:00  Writing
-  -----------------------------------  -------------------  -------------------  ---------------------------------
Edit: I also found a way to add text and make each bar look as if it has a progress bar. It works by creating a second series whose bars have a length equal to the original duration * progress, and appending it to the original dataframe.
# Use the progress to find how much of the bars should be filled
# (i.e. another end date)
df["progress date"] =  (df["End date"] - df["Start date"]) * df["Progress %"] / 100 + df["Start date"]
# Concatenate the two 
newdf = np.concatenate([df[["Task Name", "Start date", "End date", "Progress %"]].values,  
                        df[["Task Name", "Start date", "progress date", "Progress %"]].values])
newdf = pd.DataFrame(newdf, columns=["Task Name", "Start date", "End date", "Progress %"])
# Reconvert back to datetime
newdf["Start date"] = pd.to_datetime(newdf["Start date"])
newdf["End date"] = pd.to_datetime(newdf["End date"])
# This is the indicator variable (duration vs progress) where the grouping takes place
newdf["progress_"] = np.concatenate([np.ones(len(newdf)//2), np.zeros(len(newdf)//2), ])
# color for first half, color for second half
range_ = ['#1f77b4', '#5fa0d4',]
# The stacked bar chart will be our "gantt with progress"
chart = alt.Chart(newdf).mark_bar().encode(
    x=alt.X('Start date', stack=None),
    x2='End date',
    y=alt.Y('Task Name', sort=list(df.sort_values(["End date",
                                                      "Start date"])["Task Name"])*2),
    color=alt.Color('progress_', scale=alt.Scale(range=range_), legend=None)
) 
# Create appropriate labels
newdf["text%"] = newdf["Progress %"].astype(str) + " %"
# And now add those as text in the graph
text = alt.Chart(newdf).mark_text(align='left', baseline='middle', dx=5, color="white",  fontWeight="bold").encode(
    y=alt.Y('Task Name', sort=list(df.sort_values(["End date",
                                                      "Start date"])["Task Name"])*2),
    x=alt.X('Start date'),
    text='text%',
)
# Plot the graph
alt.layer(chart, text)
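The "progress date" computed above is a linear interpolation between the start and end dates. A minimal worked example in plain Python, mirroring the pandas expression, may make the arithmetic easier to check. The function name progress_date and the progress values 50 and 20 are assumptions for illustration. The dates are taken from the example df.

```python
from datetime import datetime


def progress_date(start, end, progress_pct):
    """Return the date at which progress_pct percent of the task duration has elapsed."""
    return start + (end - start) * progress_pct / 100


# "Optimize Dockerfile": 30 days long, assumed 50 % done -> 15 days after the start.
halfway = progress_date(datetime(2019, 7, 1), datetime(2019, 7, 31), 50)

# "Chapter 2: Compare competitors": 105 days long, assumed 20 % done -> 21 days in.
fifth = progress_date(datetime(2019, 7, 8), datetime(2019, 10, 21), 20)

print(halfway.date(), fifth.date())
```

The lighter and darker bars in the chart then span start to end date and start to progress date respectively, which is what makes the overlay read as a progress bar.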
Result:
