 

matplotlib color line by "value" [duplicate]

Various versions of this question have been asked before, and I'm not sure if I'm supposed to ask my question on one of the threads or start a new thread. Here goes:

I have a pandas dataframe with one column (e.g. speed) that I'm trying to plot, and another column (e.g. active) which is, for now, true/false. I'd like to color the line plot depending on the value of active.

This thread seems to be the "right" solution, but I'm having an issue: "seaborn or matplotlib line chart, line color depending on variable". The OP and I are trying to achieve the same thing:

[Image: contiguous multi-colored line]

Here's a broken plot/reproducer:

import pandas as pd
import matplotlib.pyplot as plt

Values=[3,4,6, 6,5,4, 3,2,3, 4,5,6]
Colors=['red','red', 'red', 'blue','blue','blue', 'red', 'red', 'red', 'blue', 'blue', 'blue']
myf = pd.DataFrame({'speed': Values, 'colors': Colors})

grouped = myf.groupby('colors')
fig, ax = plt.subplots(1)

for key, group in grouped:
   group.plot(ax=ax, y="speed", label=key, color=key)

The resulting plot has two issues: the line is not "connected" where the color changes, and each color is drawn as a single line, so runs of the same color are joined "across" the gap between them:

[Image: lines are non-contiguous]

What I want to see is the change from red to blue and back look like it's all one contiguous line.

"Color line by third variable - Python" seems to do the right thing, but I am not dealing with "linear" color data. I am basically assigning a set of line colors in a column. I could easily set the values of the color column to numeric values:

Colors=['1','1', '1', '2','2'...]

if that makes generating the desired plot easier.

There is a comment in the first thread:

You could do it if you'll duplicate points when color changed, I've modified answer for that

But I basically copied and pasted the answer, so I'm not sure that comment is entirely accurate.

Asked by Erik Jacobs, Dec 07 '17 04:12


2 Answers

I took a crack at it. Following the comments in the other question that you linked led me to this. I did have to get down to matplotlib and couldn't do it in pandas itself. Once I converted the dataframe into lists, it's pretty much the same code as the one from the mpl page.

I create the dataframe similar to yours:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

vals=[3,4,6, 6,5,4, 3,2,3, 4,5,6]
colors=['red' if x < 5 else 'blue' for x in vals]
df = pd.DataFrame({'speed': vals, 'danger': colors})

Convert the values and the index into lists:

x = df.index.tolist()
y = df['speed'].tolist()
z = np.array(y)  # the values that will drive the color of each segment

Break down the vals and index into points and then create line segments out of them.

points = np.array([x, y]).T.reshape(-1, 1, 2)
segments = np.concatenate([points[:-1], points[1:]], axis=1)
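To make the reshaping above less opaque, here is a small sketch on a shortened, made-up input (the four-point x and y below are illustrative only). It shows that N points become N-1 segments, each holding a start and an end (x, y) pair:

```python
import numpy as np

# Illustrative four-point input; not the full data from the question
x = [0, 1, 2, 3]
y = [3, 4, 6, 6]

# Shape (N, 1, 2): one (x, y) pair per point
points = np.array([x, y]).T.reshape(-1, 1, 2)

# Pair each point with its successor: shape (N-1, 2, 2)
segments = np.concatenate([points[:-1], points[1:]], axis=1)

print(points.shape)          # (4, 1, 2)
print(segments.shape)        # (3, 2, 2)
print(segments[0].tolist())  # [[0, 3], [1, 4]]
```

Because consecutive segments share an end point, the collection is drawn as one unbroken line.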

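Create the colormap based on the criteria used while creating the dataframe. In my case, a speed less than 5 is red and the rest is blue, so the boundary between the two bins has to sit at 5.

from matplotlib.colors import ListedColormap, BoundaryNorm

cmap = ListedColormap(['r', 'b'])
# Bins are [0, 5) -> 'r' and [5, 10) -> 'b', matching the x < 5 rule above
norm = BoundaryNorm([0, 5, 10], cmap.N)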
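As a quick check of how BoundaryNorm bins values, the sketch below assumes boundaries of [0, 5, 10] (chosen to match the "less than 5 is red" rule) and prints which colormap index a few sample speeds land in. The sample speeds are made up for illustration:

```python
import numpy as np
from matplotlib.colors import BoundaryNorm, ListedColormap

cmap = ListedColormap(['r', 'b'])
# Assumed boundaries: [0, 5) is the first bin, [5, 10) the second
norm = BoundaryNorm([0, 5, 10], cmap.N)

speeds = np.array([3, 4, 5, 6])
# Calling the norm returns the colormap index for each value
bins = np.asarray(norm(speeds))

print(bins.tolist())  # [0, 0, 1, 1]: 3 and 4 are red, 5 and 6 are blue
```

A value sitting exactly on a boundary goes into the bin that starts there, which is why the boundary has to be placed at the threshold itself.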

Create the line segments and assign the colors accordingly

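from matplotlib.collections import LineCollection

lc = LineCollection(segments, cmap=cmap, norm=norm)
lc.set_array(z[:-1])  # one value per segment: the speed at its starting point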

Plot!

fig = plt.figure()
plt.gca().add_collection(lc)
plt.xlim(min(x), max(x))
plt.ylim(0, 10)
plt.show()

Here is the output:

[Image: output plot]

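Note: In the current code, each line segment takes its color from the value at its starting point, so a color change shows up on the segment after the point where the value crosses the threshold. But hopefully, this gives you an idea.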
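Since the question already stores a color name per row rather than "linear" color data, the colormap and norm can probably be skipped. The following is a minimal sketch, not a tested drop-in: it assumes the color strings live in the 'danger' column as above and passes one color per segment through LineCollection's colors argument:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection

vals = [3, 4, 6, 6, 5, 4, 3, 2, 3, 4, 5, 6]
colors = ['red' if v < 5 else 'blue' for v in vals]
df = pd.DataFrame({'speed': vals, 'danger': colors})

x = df.index.to_numpy()
y = df['speed'].to_numpy()
points = np.column_stack([x, y]).reshape(-1, 1, 2)
segments = np.concatenate([points[:-1], points[1:]], axis=1)

# One color per segment, taken from the row at the segment's starting point
segment_colors = df['danger'].iloc[:-1].tolist()

lc = LineCollection(segments, colors=segment_colors)

fig, ax = plt.subplots()
ax.add_collection(lc)
ax.set_xlim(x.min(), x.max())
ax.set_ylim(0, 10)
```

Call plt.show() afterwards to display the figure. Taking the color from the end point instead would only mean using iloc[1:] for segment_colors.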

I'm still new to answering questions here. Let me know if I need to add/remove some details. Thanks!

Answered by vk_, Sep 26 '22 22:09


Setup

import pandas as pd
import matplotlib.pyplot as plt
# IPython/Jupyter magic; drop this line when running as a plain script
%matplotlib inline

Values=[3,4,6, 6,5,4, 3,2,3, 4,5,6]
Colors=['red','red', 'red', 'blue','blue','blue', 'red', 'red', 'red', 'blue', 'blue', 'blue']
myf = pd.DataFrame({'speed': Values, 'colors': Colors})

Solution

1. Detect color change-points and label subgroups of contiguous colors, based on Pandas "diff()" with string

myf['change'] = myf.colors.ne(myf.colors.shift().bfill()).astype(int)
myf['subgroup'] = myf['change'].cumsum()

myf
   colors  speed  change  subgroup
0     red      3       0         0
1     red      4       0         0
2     red      6       0         0
3    blue      6       1         1
4    blue      5       0         1
5    blue      4       0         1
6     red      3       1         2
7     red      2       0         2
8     red      3       0         2
9    blue      4       1         3
10   blue      5       0         3
11   blue      6       0         3
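The run-labeling in step 1 can be checked in isolation. This sketch uses a standalone Series (the name colors is just for illustration) and shows that comparing each value with its predecessor, then taking a cumulative sum, gives every contiguous run its own id:

```python
import pandas as pd

colors = pd.Series(['red'] * 3 + ['blue'] * 3 + ['red'] * 3 + ['blue'] * 3)

# 1 wherever the color differs from the previous row; bfill() fills the
# NaN that shift() leaves in the first row, so row 0 is not flagged
change = colors.ne(colors.shift().bfill()).astype(int)

# The running total of change flags is the id of each contiguous run
subgroup = change.cumsum()

print(subgroup.tolist())
# [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3]
print(colors.groupby(subgroup).first().tolist())
# ['red', 'blue', 'red', 'blue']
```

Grouping by subgroup rather than by colors is what keeps the two red runs from being drawn as a single line.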

2. Create gaps in the index in which to fit duplicated rows between color subgroups

myf.index += myf['subgroup'].values

myf
   colors  speed  change  subgroup
0     red      3       0         0
1     red      4       0         0
2     red      6       0         0
4    blue      6       1         1  # index is now 4; 3 is missing
5    blue      5       0         1
6    blue      4       0         1
8     red      3       1         2  # index is now 8; 7 is missing
9     red      2       0         2
10    red      3       0         2
12   blue      4       1         3  # index is now 12; 11 is missing
13   blue      5       0         3
14   blue      6       0         3

3. Save the indexes of each subgroup's first row

first_i_of_each_group = myf[myf['change'] == 1].index

first_i_of_each_group
Int64Index([4, 8, 12], dtype='int64')

4. Copy each group's first row to the previous group's last row

for i in first_i_of_each_group:
    # Copy next group's first row to current group's last row
    myf.loc[i-1] = myf.loc[i]
    # But make this new row part of the current group
    myf.loc[i-1, 'subgroup'] = myf.loc[i-2, 'subgroup']
# Don't need the change col anymore
myf.drop('change', axis=1, inplace=True)
myf.sort_index(inplace=True)
# Create duplicate indexes at each subgroup border to ensure the plot is continuous.
myf.index -= myf['subgroup'].values

myf
   colors  speed  subgroup
0     red      3         0
1     red      4         0
2     red      6         0
3    blue      6         0  # this and next row both have index = 3
3    blue      6         1  # subgroup 1 picks up where subgroup 0 left off
4    blue      5         1
5    blue      4         1
6     red      3         1
6     red      3         2
7     red      2         2
8     red      3         2
9    blue      4         2
9    blue      4         3
10   blue      5         3
11   blue      6         3

5. Plot

fig, ax = plt.subplots()
for k, g in myf.groupby('subgroup'):
    # legend=False avoids creating a legend that would only be removed again
    g.plot(ax=ax, y='speed', color=g['colors'].values[0], marker='o', legend=False)

[Image: plot output]
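For comparison, the same "duplicate the boundary point" idea can be sketched without rewriting the index. This is an alternative to steps 2 to 4, not the code above: it assumes the default integer index (0, 1, 2, ...) and extends each run by one row so that it ends on the first point of the next run:

```python
import pandas as pd
import matplotlib.pyplot as plt

values = [3, 4, 6, 6, 5, 4, 3, 2, 3, 4, 5, 6]
colors = ['red'] * 3 + ['blue'] * 3 + ['red'] * 3 + ['blue'] * 3
df = pd.DataFrame({'speed': values, 'colors': colors})

# Label contiguous runs of the same color (ids start at 1 here)
run_id = df['colors'].ne(df['colors'].shift()).cumsum()

fig, ax = plt.subplots()
for _, run in df.groupby(run_id):
    start = run.index[0]
    # Extend the run by one row, without running past the last row
    stop = min(run.index[-1] + 1, df.index[-1])
    piece = df.loc[start:stop, 'speed']  # .loc slices include both ends
    ax.plot(piece.index, piece.to_numpy(),
            color=run['colors'].iloc[0], marker='o')
```

Call plt.show() to display the result. As in the answer above, the segment that bridges two runs takes the color of the earlier run.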

Answered by Peter Leimbigler, Sep 22 '22 22:09