 

Calculating Dynamic Time Warping Distance in a Pandas Data Frame

I want to calculate Dynamic Time Warping (DTW) distances between the rows of a dataframe. The result must be a new dataframe (a distance matrix) containing the pairwise DTW distance for every pair of rows.

For Euclidean Distance I use the following code:

from scipy.spatial.distance import pdist, squareform
euclidean_dist = squareform(pdist(sample_dataframe,'euclidean'))

I need similar code for DTW.

Thanks in advance.

venom asked Dec 28 '15 22:12



1 Answer

There are various ways one might do that. I'll leave two options below.

In case one wants to know the difference between the Euclidean distance and DTW, this is a good resource.


Option 1

Using fastdtw.

Install it with

pip install fastdtw

Then use it as follows:

import numpy as np
from scipy.spatial.distance import euclidean
from fastdtw import fastdtw

x = np.array([[1,1], [2,2], [3,3], [4,4], [5,5]])
y = np.array([[2,2], [3,3], [4,4]])
distance, path = fastdtw(x, y, dist=euclidean)
print(distance)

Option 2 (Source)

import numpy as np

def dtw(s, t):
    n, m = len(s), len(t)
    # Accumulated-cost matrix with an extra border row and column of infinities
    dtw_matrix = np.full((n+1, m+1), np.inf)
    dtw_matrix[0, 0] = 0

    for i in range(1, n+1):
        for j in range(1, m+1):
            cost = abs(s[i-1] - t[j-1])
            # cheapest of the three neighbouring cells (above, left, diagonal)
            last_min = np.min([dtw_matrix[i-1, j], dtw_matrix[i, j-1], dtw_matrix[i-1, j-1]])
            dtw_matrix[i, j] = cost + last_min
    # The DTW distance between s and t is the bottom-right entry, dtw_matrix[n, m]
    return dtw_matrix
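
To see what the matrix looks like, here is a small worked example. It is only a sketch: `dtw_cost_matrix` is a hypothetical name for a compact restatement of the same recurrence, and the two input sequences are made up for illustration.

```python
import numpy as np

def dtw_cost_matrix(s, t):
    """Accumulated-cost matrix for two 1-D sequences (same recurrence as Option 2)."""
    n, m = len(s), len(t)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(s[i - 1] - t[j - 1])
            # extend the cheapest of the three neighbouring alignments
            acc[i, j] = cost + min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    return acc

# Illustrative sequences of different lengths
s = [1, 2, 3]
t = [2, 2, 2, 3, 4]
matrix = dtw_cost_matrix(s, t)

# Drop the border row and column of infinities before printing
print(matrix[1:, 1:])
# [[1. 2. 3. 5. 8.]
#  [1. 1. 1. 2. 4.]
#  [2. 2. 2. 1. 2.]]

# The DTW distance is the bottom-right entry
print(matrix[-1, -1])  # 2.0
```

Each cell holds the cheapest cumulative cost of aligning the first i elements of s with the first j elements of t, so the bottom-right cell is the cost of aligning the two whole sequences.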

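
Neither option above builds the distance matrix the question asks for, but either can be applied to every pair of rows. The following is a minimal sketch using only NumPy. It assumes each row is a numeric 1-D series with no missing values, and `dtw_distance`, `pairwise_dtw` and the sample data are hypothetical names and values chosen for illustration.

```python
import numpy as np

def dtw_distance(s, t):
    """DTW distance between two 1-D sequences (bottom-right cell of the cost matrix)."""
    n, m = len(s), len(t)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(s[i - 1] - t[j - 1])
            acc[i, j] = cost + min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    return acc[n, m]

def pairwise_dtw(rows, dist=dtw_distance):
    """Symmetric matrix of dist(row_a, row_b) for every pair of rows."""
    rows = np.asarray(rows, dtype=float)
    k = len(rows)
    out = np.zeros((k, k))
    for a in range(k):
        # distances are symmetric, so compute only the upper triangle
        for b in range(a + 1, k):
            out[a, b] = out[b, a] = dist(rows[a], rows[b])
    return out

# Illustrative data: three series of length four, one per row
sample = np.array([[1, 2, 3, 4],
                   [1, 1, 2, 3],
                   [4, 3, 2, 1]], dtype=float)
dtw_dist = pairwise_dtw(sample)
print(dtw_dist)
```

With a dataframe, one could pass `sample_dataframe.to_numpy()` as `rows` and wrap the result with `pd.DataFrame(dtw_dist, index=sample_dataframe.index, columns=sample_dataframe.index)`. To use Option 1 instead, a callable such as `lambda a, b: fastdtw(a, b)[0]` can be passed as `dist`, since `fastdtw` returns a `(distance, path)` tuple. Note that fastdtw is an approximation, so its values may differ from the exact recurrence. The double loop costs one DTW computation per pair of rows, which can be slow for large dataframes.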

Gonçalo Peres answered Sep 20 '22 19:09
