Calculate speed from timestamped positions in Pandas.DataFrame

I am very new to Pandas but familiar with Numpy and Python.

Suppose I have a `pandas.DataFrame` of X, Y points (float64) indexed by time (datetime). How can I calculate speeds from it in a Pythonic way, given that I already know how to calculate Euclidean distances between points?

EDIT: I have just read the help on pandas.Series.diff(), but I'd still like to replace the subtraction used by diff() with another function, say `euclidean_distance()`. Is there a way to do that?

The DataFrame looks like this (index in the first column, positions in the second):

2009-08-07 16:16:44    [37.800185, -122.426361]
2009-08-07 16:16:48    [37.800214, -122.426153]
2009-08-07 16:16:49    [37.800222, -122.426118]
2009-08-07 16:16:52    [37.800197, -122.426072]
2009-08-07 16:17:32    [37.800214, -122.425903]
2009-08-07 16:17:34    [37.800236, -122.425826]
2009-08-07 16:17:40    [37.800282, -122.425534]
2009-08-07 16:17:44    [37.800307, -122.425315]
2009-08-07 16:17:46    [37.800324, -122.425207]
2009-08-07 16:17:47    [37.800331, -122.425153]
2009-08-07 16:17:49    [37.800343, -122.425047]
2009-08-07 16:17:50    [37.800355, -122.424994]
2009-08-07 16:17:51    [37.800362, -122.424942]
2009-08-07 16:17:54    [37.800378, -122.424796]
2009-08-07 16:17:56    [37.800357, -122.424764]

What I want is some way to get speeds from that, given that the speed of the first data sample will always be zero by definition (there is no known timedelta from a previous sample).

Thanks a lot!

heltonbiker asked Sep 07 '12 23:09



1 Answer

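Would something like this work? It stores the index as integer nanoseconds in a Time column, takes row-to-row differences with diff(), and divides the distance by the elapsed time in seconds. Note that the first row's speed comes out as NaN (0/0) rather than zero, so call fillna(0) on the Speed column if you need zero there.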

In [99]: df
Out[99]: 
                            X         Y
2009-08-07 00:00:00 -0.900602 -1.107547
2009-08-07 01:00:00  0.398914  1.545534
2009-08-07 02:00:00 -0.429100  2.052242
2009-08-07 03:00:00  0.857940 -0.348118
2009-08-07 04:00:00  0.394655 -1.578197
2009-08-07 05:00:00 -0.240995 -1.474097
2009-08-07 06:00:00  0.619148 -0.040635
2009-08-07 07:00:00 -1.403177 -0.187540
2009-08-07 08:00:00 -0.360626 -0.399728
2009-08-07 09:00:00  0.179741 -2.709712

In [100]: df['Time'] = df.index.asi8

In [101]: dist = df.diff().fillna(0.)

In [102]: dist['Dist'] = np.sqrt(dist.X**2 + dist.Y**2)

In [103]: dist['Speed'] = dist.Dist / (dist.Time / 1e9)

In [104]: dist
Out[104]: 
                            X         Y          Time      Dist     Speed
2009-08-07 00:00:00  0.000000  0.000000  0.000000e+00  0.000000       NaN
2009-08-07 01:00:00  1.299516  2.653081  3.600000e+12  2.954248  0.000821
2009-08-07 02:00:00 -0.828013  0.506708  3.600000e+12  0.970752  0.000270
2009-08-07 03:00:00  1.287040 -2.400360  3.600000e+12  2.723637  0.000757
2009-08-07 04:00:00 -0.463285 -1.230079  3.600000e+12  1.314430  0.000365
2009-08-07 05:00:00 -0.635650  0.104100  3.600000e+12  0.644118  0.000179
2009-08-07 06:00:00  0.860143  1.433462  3.600000e+12  1.671724  0.000464
2009-08-07 07:00:00 -2.022324 -0.146906  3.600000e+12  2.027653  0.000563
2009-08-07 08:00:00  1.042550 -0.212188  3.600000e+12  1.063924  0.000296
2009-08-07 09:00:00  0.540367 -2.309984  3.600000e+12  2.372345  0.000659
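
As for swapping the subtraction in diff() for your own function: one option is to pair each row with the previous one via shift() and pass both to the function. Below is a rough sketch of that idea, not a definitive implementation. It assumes separate X and Y columns as in my example (not the list-valued column from the question), and `euclidean_distance` and `add_speed` are hypothetical names standing in for your own code. It also fills the first sample's speed with zero, as you asked.

```python
import numpy as np
import pandas as pd


def euclidean_distance(x0, y0, x1, y1):
    """Hypothetical stand-in for your own (vectorized) distance function."""
    return np.sqrt((x1 - x0) ** 2 + (y1 - y0) ** 2)


def add_speed(df):
    """Return a copy of df with 'Dist' and 'Speed' columns added.

    Assumes df has float columns 'X' and 'Y' and a DatetimeIndex.
    """
    out = df.copy()
    # shift(1) lines each row up with the previous sample; the first row gets NaN.
    prev = out[["X", "Y"]].shift(1)
    out["Dist"] = euclidean_distance(prev["X"], prev["Y"], out["X"], out["Y"])
    # Elapsed time between consecutive index entries, in seconds.
    seconds = out.index.to_series().diff().dt.total_seconds()
    # The first sample has no predecessor, so its speed is defined as zero.
    out["Speed"] = (out["Dist"] / seconds).fillna(0.0)
    out["Dist"] = out["Dist"].fillna(0.0)
    return out


# Small made-up example with irregular time steps (4 s, then 1 s).
index = pd.to_datetime(
    ["2009-08-07 16:16:44", "2009-08-07 16:16:48", "2009-08-07 16:16:49"]
)
df = pd.DataFrame({"X": [0.0, 3.0, 3.0], "Y": [0.0, 4.0, 6.0]}, index=index)
result = add_speed(df)
print(result)
```

The speed is in whatever distance units your function returns, per second. If two samples share the same timestamp, the division yields inf (or zero after fillna when the distance is also zero), so you may want to handle duplicates first.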
Chang She answered Oct 19 '22 20:10
