
How to create a historical timeline with Python

I've seen a few answers on here that helped a bit, but my dataset is larger than the ones in those questions. To give a sense of what I'm working with, here's a link to the full dataset. I've included a picture of one attempted solution, which was found at this link: Example Picture.

There are two issues: (1) the result is difficult to read, and (2) I don't know how to flatten it out so that it looks like a traditional timeline. The problem becomes more apparent when I work with larger segments, such as this one, which is basically unreadable. Here's the code I used to produce both plots (I only modified the included code to change which section of the overall dataset was used).

import matplotlib.pyplot as plt

# Xia is a pandas DataFrame holding the Xia-dynasty rows of the dataset
event = Xia['EnglishName']
begin = Xia['Start']
end = Xia['Finish']
length = Xia['Length']

plt.figure(figsize=(12, 6))
# y=range(...) draws each bar on its own row
plt.barh(range(len(begin)), end - begin, height=0.3, left=begin)
plt.tick_params(axis='both', which='major', labelsize=15)
plt.tick_params(axis='both', which='minor', labelsize=20)
plt.title('Xia Dynasty', fontsize=25)
plt.xlabel('Year', fontsize=20)
plt.yticks(range(len(begin)), [""] * len(begin))  # blank row labels
plt.xlim(-2250, -1750)
plt.ylim(-1, 18)
# Center each name above its bar
for i in range(len(begin)):
    plt.text(begin.iloc[i] + length.iloc[i] / 2, i + 0.25, event.iloc[i],
             ha='center', fontsize=12)

This code semi-works, but I'd prefer if the bars were either closer together or differently colored and all on the same y-value. I appreciate any and all help. I've been trying to figure this out for about two weeks now and am hitting a brick wall.

Baxter asked Jan 27 '23 22:01


2 Answers

I don't know whether you have already resolved this, but from your code and your requirements (and also borrowing from Evgeny's code), the only reason the horizontal bars sit on different levels is that you passed a range as the y argument of matplotlib's barh, whose signature is matplotlib.pyplot.barh(y, width, height=0.8, left=None, *, align='center', **kwargs). As a result, each successive bar is drawn on its own level.
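To make the point about the y argument concrete, here is a minimal sketch with three made-up reigns (the years are invented, not taken from the dataset). It draws the same bars twice, once with y as a range and once with y as the scalar 0, and reads back where each bar ended up. The helper bar_centers is a hypothetical name used only for this illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Made-up reigns (start year, end year); not taken from the real dataset.
begin = [-2200, -2150, -2100]
end = [-2150, -2100, -2050]
widths = [e - b for b, e in zip(begin, end)]

fig, (ax_rows, ax_flat) = plt.subplots(2, 1, figsize=(8, 4), sharex=True)

# y given as a range: bar i is drawn on row i, producing a staircase.
rows = ax_rows.barh(range(len(begin)), widths, height=0.3, left=begin)

# y given as the scalar 0: every bar shares one row, like a classic timeline.
flat = ax_flat.barh(0, widths, height=0.3, left=begin, edgecolor="black")


def bar_centers(bars):
    """Return the vertical center of each bar rectangle (hypothetical helper)."""
    return [round(b.get_y() + b.get_height() / 2, 6) for b in bars]


print(bar_centers(rows))  # one distinct row per bar
print(bar_centers(flat))  # all bars on the same row
```

With the range, the centers come out as one row per bar; with the scalar, they all collapse onto row 0, which is the flattening the question asks for.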

So, I took the liberty of downloading your dataset and playing around with the code a little bit.

I created a dataframe from the Google dataset and assigned each Dynasty (Dynasty_col column) and Age (Age_col column) a matplotlib CSS color (this is not necessary, but I find it easier to manage for visualisation): [image: the dataframe with the added color columns]

Then, to replicate your Xia Dynasty representation, I just created a subset: [image: the Xia subset]
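Since the screenshot does not show how the color columns were built, here is one possible way to do it, as a sketch only. The rows, the ruler names, the specific color choices and the existence of an Age column in the raw data are all assumptions made for illustration; only the names Dynasty, Dynasty_col and Age_col come from the answer.

```python
import pandas as pd
from matplotlib.colors import is_color_like

# Illustrative rows only; the real dataset has many more entries and columns.
df = pd.DataFrame({
    "EnglishName": ["Ruler A", "Ruler B", "Ruler C"],
    "Dynasty": ["Xia", "Xia", "Shang"],
    "Age": ["Pre-Imperial", "Pre-Imperial", "Pre-Imperial"],
})

# Hypothetical color choices; any named matplotlib/CSS color would do.
dynasty_colors = {"Xia": "tomato", "Shang": "steelblue"}
age_colors = {"Pre-Imperial": "red", "Imperial": "blue"}

# Look up one color per row from the category it belongs to.
df["Dynasty_col"] = df["Dynasty"].map(dynasty_colors)
df["Age_col"] = df["Age"].map(age_colors)

# Fail early if any dynasty was left without a valid matplotlib color.
assert df["Dynasty_col"].map(is_color_like).all()

print(df[["EnglishName", "Dynasty", "Dynasty_col", "Age_col"]])
```

Keeping the colors in a column means a subset of the dataframe carries its colors with it, so the same column can be passed straight to the color argument of barh.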

Following that I kept mostly to what your/Evgeny's code already shows with a few minor changes:

event = data_set_xia['EnglishName']
begin = data_set_xia['Start']
end = data_set_xia['Finish']
length = data_set_xia['Length']

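Here I defined a height level for each label, which is later joined to its bar by a vertical line (you can lengthen or shorten the array [-2, 2, -1, 1] to get different levels of labelling):

import numpy as np

# Repeat the pattern often enough to cover every event, then trim to length
levels = np.tile([-2, 2, -1, 1],
                 int(np.ceil(len(begin) / 4)))[:len(begin)]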
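For anyone unsure what the np.tile expression produces, here is a small stand-alone sketch. The function name label_levels and its parameters are hypothetical; it only wraps the same tile-then-trim idea so the pattern length is not hard-coded.

```python
import numpy as np


def label_levels(n_events, pattern=(-2, 2, -1, 1)):
    """Repeat `pattern` until it covers n_events, then trim to that length."""
    repeats = int(np.ceil(n_events / len(pattern)))
    return np.tile(pattern, repeats)[:n_events]


print(label_levels(6).tolist())   # [-2, 2, -1, 1, -2, 2]
print(len(label_levels(18)))      # 18
```

Neighbouring events get different heights, alternating above and below the bar, which is what keeps adjacent labels from overlapping.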

import matplotlib.pyplot as plt
plt.style.use('ggplot')
plt.figure(figsize=(12,6))

Here I put all of the bars on the same y value (the scalar 0). The rest of the call has been modified to color each bar by its dynasty and to give it a black edge color.

# y=0 puts every bar on the same row; bars are colored per dynasty
plt.barh(0, end - begin, color=data_set_xia.loc[:, "Dynasty_col"],
         height=0.3, left=begin, edgecolor="black")
plt.tick_params(axis='both', which='major', labelsize=15)
plt.tick_params(axis='both', which='minor', labelsize=20)
plt.title('Xia Dynasty', fontsize=25)
plt.xlabel('Year', fontsize=20)
# plt.yticks(range(len(begin)), "")
ax = plt.gca()
ax.yaxis.set_visible(False)  # the y axis carries no meaning here
plt.xlim(-2250, -1700)
plt.ylim(-5, 5)

I then drew a vertical line from each bar to its label height in levels, and placed each label just beyond the end of its line.

plt.vlines(begin + length / 2, 0, levels, color="tab:red")
# Loop over every event instead of a hard-coded count
for i in range(len(begin)):
    plt.text(begin.iloc[i] + length.iloc[i] / 2,
             levels[i] * 1.3, event.iloc[i],
             ha='center', fontsize=12)

plt.tight_layout()
plt.show()

This resulted in the following graph for the Xia dynasty: [image]

And using a bigger subset, I could generate these other graphs too: [image] and [image]

Obviously, the more entries there are, the busier and more cluttered the graphs become, and they start looking a bit ugly, but they are still legible. Also, the code is not "perfect"; I would clean it up a bit and change some options, such as the color argument in barh, but it works for now.

As an alternate representation, here is code that staggers the dynasties on separate rows over time, since some of the dynasties overlap with each other:

event = data_set_adj['EnglishName']
begin = data_set_adj['Start']
end = data_set_adj['Finish']
length = data_set_adj['Length']
dynasty = data_set_adj['Dynasty']
dynasty_col = data_set_adj['Dynasty_col']

dict_dynasty = dict(zip(dynasty.unique(), range(0,4*len(dynasty.unique()),4)))
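The dict_dynasty line gives each dynasty its own baseline, spaced 4 units apart. A tiny sketch with made-up rows shows the mapping it builds; only the column idea comes from the answer, while the values and the name row_spacing are assumptions for illustration.

```python
import pandas as pd

# Illustrative values; one entry per event, naming the dynasty it belongs to.
dynasty = pd.Series(["Song", "Song", "Yuan", "Ming", "Ming"])

row_spacing = 4  # vertical gap between dynasty rows, as in the answer
names = dynasty.unique()  # order of first appearance is preserved
dict_dynasty = dict(zip(names, range(0, row_spacing * len(names), row_spacing)))

print(dict_dynasty)  # {'Song': 0, 'Yuan': 4, 'Ming': 8}

# Each event looks up its row, so overlapping dynasties never share a baseline.
baselines = dynasty.map(dict_dynasty).tolist()
print(baselines)     # [0, 0, 4, 8, 8]
```

The spacing has to be larger than the tallest label offset in levels, otherwise labels from one row run into the bars of the next.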

levels = np.tile([-1.2,1.2, -0.8, 0.8, -0.4, 0.4],
                 int(np.ceil(len(begin)/6)))[:len(begin)]

import matplotlib.pyplot as plt
plt.style.use('ggplot')
plt.figure(figsize=(20,10))

for x in range(len(dynasty)):
    y = dict_dynasty[dynasty.iloc[x]]         # baseline row of this dynasty
    mid = begin.iloc[x] + length.iloc[x] / 2  # horizontal center of the bar
    plt.vlines(mid, y, y + levels[x], color="tab:red")
    plt.barh(y, end.iloc[x] - begin.iloc[x], color=dynasty_col.iloc[x],
             height=0.3, left=begin.iloc[x], edgecolor="black", alpha=0.5)
    # Alternate the label offset so neighbouring names collide less often
    scale = 1.6 if x % 2 == 0 else 1.25
    plt.text(mid, y + scale * levels[x], event.iloc[x],
             ha='center', fontsize=8)
plt.tick_params(axis='both', which='major', labelsize=15)
plt.tick_params(axis='both', which='minor', labelsize=20)
plt.title('Chinese Dynasties', fontsize = '25')
plt.xlabel('Year', fontsize = '20')
ax = plt.gca()
ax.axes.yaxis.set_visible(False)
plt.xlim(900, 1915)
plt.ylim(-4,28)


plt.tight_layout()
plt.show()

This last part was done hastily, so the code is not the neatest, but the only thing I changed was the y value passed to barh, which now depends on the dynasty of each entry in the subset I am considering. I also modified the levels and the font size for readability; you can play around with the numbers and the code to get a representation you like.

This results in the following representation: [image]

Also, as I added the Age_col column, you could categorise the whole thing as Pre-Imperial and Imperial (red or blue). I didn't attach any graphs with that for now, but that works if you add a patch of that colour with a different "zorder" around the dynasties.
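One way to add such a background patch, sketched under assumptions: ax.axvspan is my choice here and not necessarily what was meant above, and the era boundaries, colors and the single gray bar are placeholders for illustration rather than values from the dataset.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(10, 3))

# Placeholder era boundaries and colors, for illustration only.
eras = [
    ("Pre-Imperial", -2100, -221, "red"),
    ("Imperial", -221, 1912, "blue"),
]

spans = []
for name, start, finish, color in eras:
    # zorder=0 keeps the shaded band behind the bars drawn afterwards.
    spans.append(ax.axvspan(start, finish, color=color, alpha=0.1,
                            zorder=0, label=name))

# One made-up bar drawn on top of the bands (higher zorder than the spans).
bars = ax.barh(0, 300, left=-500, height=0.3, color="gray",
               edgecolor="black", zorder=2)

ax.set_xlim(-2200, 2000)
ax.set_ylim(-1, 1)
ax.yaxis.set_visible(False)
ax.legend(loc="upper left")
```

A low alpha keeps the band from competing with the dynasty colors, and the legend entries come from the label passed to each span.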

For zoomable and pannable graphs, I guess bokeh or a similar plotting library would be better; that way you can keep the plot uncluttered and focus on the parts of interest.

ramanunni.pm answered Jan 30 '23 14:01


I did some similar charting for a little sitcom succession diagram. The code is a bit naive (it is on GitHub), but on encountering your question I was surprised that this is still a problem for people doing similar visualisations. I was hoping there might be a specialised library for historical charts.

[image: sitcom succession diagram]

Evgeny answered Jan 30 '23 13:01