I've got 2 questions on analyzing a GPS dataset.
1) Extracting trajectories
I have a huge database of recorded GPS coordinates of the form (latitude, longitude, date-time). Using the date-time values of consecutive records, I'm trying to extract all trajectories/paths followed by the person. For instance, say that from time M the (x,y) pairs change continuously up until time N. After N, the change in the (x,y) pairs decreases, at which point I conclude that the path taken from time M to N can be called a trajectory. Is that a decent approach to follow when extracting trajectories? Are there any well-known approaches/methods/algorithms you can suggest? Are there any data structures or formats you would suggest for maintaining those points efficiently? Perhaps, for each trajectory, figuring out the velocity and acceleration would be useful?
2) Mining the trajectories
Once I have all the trajectories followed/paths taken, how can I compare/cluster them? I would like to know whether the start or end points are similar, and then how the intermediate paths compare.
How do I compare two paths/routes and conclude whether they are similar or not? Furthermore, how do I cluster similar paths together?
I would highly appreciate it if you could point me to research or something similar on this matter.
The development will be in Python, but all kinds of library suggestions are welcome.
Thanks in advance.
Trajectory clustering aims at finding out trajectories that are of the same (or similar) pattern, or distinguishing some undesired behaviors (such as outliers). The activities of moving objects are often recorded as their trajectories.
The GPS trajectory of a moving object is a set of sampled positions with a time stamp and other related movement information (e.g., speed and moving direction).
Have a look at the work done at the Geography Department of the University of Zurich, especially by Patrick Laube and Somayeh Dodge.
Have a look at the paper "Individual Movements and Geographical Data Mining. Clustering Algorithms for Highlighting Hotspots in Personal Navigation Routes". It showcases the use of DBSCAN and Kernel Density Estimation methods on GPS data.
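To give a feel for how DBSCAN finds hotspots in GPS fixes, here is a minimal pure-Python sketch. It is not the code from the paper: the function names, the brute-force O(n^2) neighbour search, and the eps_m / min_pts values are my own assumptions for illustration. For a large database you would more likely use an existing library implementation with a spatial index.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, good enough for a sketch


def haversine_m(p, q):
    """Great-circle distance in metres between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def dbscan(points, eps_m, min_pts):
    """Return one cluster label per point; -1 marks noise.

    A point is a core point if at least min_pts points (itself included)
    lie within eps_m metres of it.
    """
    NOISE, UNSEEN = -1, None
    labels = [UNSEEN] * len(points)

    def neighbours(i):
        return [j for j, q in enumerate(points)
                if haversine_m(points[i], q) <= eps_m]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNSEEN:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but does not expand
            if labels[j] is not UNSEEN:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:  # j is a core point, keep expanding
                queue.extend(reachable)
    return labels


# Hypothetical sample: two tight groups of fixes and one isolated fix.
fixes = [(47.37690, 8.54170), (47.37691, 8.54171), (47.37692, 8.54169),
         (47.40000, 8.60000), (47.40001, 8.60001), (47.40002, 8.59999),
         (47.50000, 8.70000)]
print(dbscan(fixes, eps_m=50.0, min_pts=3))
```

Each label identifies one hotspot, and fixes labelled -1 belong to none. The right eps_m and min_pts depend on your sampling rate and GPS accuracy, so treat the values above as placeholders.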
Papers from Nokia's Mobile Data Challenge 2012 Workshop can also be helpful here, especially:
MobReduce: Reducing State Complexity of Mobility Traces, by Fabian Hartmann, Christoph P. Mayer and Ingmar Baumgart
A Trajectory Cleaning Framework for Trajectory Clustering, by Agzam Idrissov and Mario A. Nascimento, University of Alberta
1) Extracting trajectories
I think you are heading in the right direction. There will probably be some noise in the GPS data, as well as random wandering, so you should apply some smoothing (for example splines) to overcome it.
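As a rough sketch of the idea from the question (a trajectory is a run of fixes during which the position keeps changing), the code below smooths the fixes and then splits them wherever the speed between consecutive fixes drops below a threshold or the time gap gets too long. Note that I use a centred moving average as a simpler stand-in for spline smoothing, and that the function names and the min_speed, max_gap and window defaults are assumptions you would need to tune for your data.

```python
import math
from datetime import datetime, timedelta

EARTH_RADIUS_M = 6_371_000.0


def haversine_m(p, q):
    """Great-circle distance in metres; p and q start with (lat, lon) in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def smooth(records, window=3):
    """Centred moving average over lat/lon; timestamps are left unchanged."""
    half = window // 2
    out = []
    for i, (_, _, t) in enumerate(records):
        chunk = records[max(0, i - half): i + half + 1]
        lat = sum(r[0] for r in chunk) / len(chunk)
        lon = sum(r[1] for r in chunk) / len(chunk)
        out.append((lat, lon, t))
    return out


def split_trajectories(records, min_speed=0.5,
                       max_gap=timedelta(minutes=5), min_points=2):
    """Split time-ordered (lat, lon, datetime) records into moving segments.

    A segment is closed when the speed between two consecutive fixes falls
    below min_speed (m/s) or when the time gap between them exceeds max_gap.
    """
    trajectories, current = [], []
    for prev, cur in zip(records, records[1:]):
        dt = (cur[2] - prev[2]).total_seconds()
        moving = False
        if 0 < dt <= max_gap.total_seconds():
            moving = haversine_m(prev, cur) / dt >= min_speed
        if moving:
            if not current:
                current.append(prev)  # first fix of a new segment
            current.append(cur)
        else:
            if len(current) >= min_points:
                trajectories.append(current)
            current = []
    if len(current) >= min_points:
        trajectories.append(current)
    return trajectories


# Hypothetical sample: walk north, stand still, walk north again.
t0 = datetime(2020, 1, 1, 8, 0, 0)
lats = [47.0000, 47.0001, 47.0002, 47.0003, 47.0004,
        47.0004, 47.0004, 47.0004,
        47.0005, 47.0006, 47.0007]
records = [(lat, 8.0, t0 + timedelta(seconds=10 * i)) for i, lat in enumerate(lats)]
for trip in split_trajectories(records):
    print(trip[0][2], "->", trip[-1][2], len(trip), "fixes")
```

The per-step speed computed inside split_trajectories is also the velocity the question asks about, and differencing consecutive speeds over time would give a crude acceleration. Smoothing moves points that sit next to a stop, so check whether it works better for you before or after the split.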
2) Mining the trajectories
Is there any business meaning behind similar trajectories? (Knowing this will help you build a distance metric, and then you can use some of the Mahout clustering algorithms.)
1. I think the points where a person stopped are more interesting, so you can generate statistics on the popularity of places.
2. If you need route similarity in order to find different paths between the same start and end, you need to cluster the start and end locations first, and then cluster similar curves using a curve distance (maximum distance between the curves, integral distance, or another well-known functional metric).
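To make point 2 concrete, here is a small sketch that first checks whether two routes share start and end locations and then compares the curves. For the "maximum distance" I use the symmetric Hausdorff distance, and for the "integral distance" a discrete mean of nearest-point distances. Both are my own choices of metric, and the greedy grouping, the function names and the thresholds are illustrative assumptions rather than a recommended clustering algorithm.

```python
import math

EARTH_RADIUS_M = 6_371_000.0


def haversine_m(p, q):
    """Great-circle distance in metres between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def directed_distances(a, b):
    """For every point of route a, the distance to its nearest point on route b."""
    return [min(haversine_m(p, q) for q in b) for p in a]


def hausdorff_m(a, b):
    """Symmetric Hausdorff distance: the worst nearest-point distance either way."""
    return max(max(directed_distances(a, b)), max(directed_distances(b, a)))


def mean_distance_m(a, b):
    """Discrete stand-in for an integral distance: mean nearest-point distance."""
    d = directed_distances(a, b) + directed_distances(b, a)
    return sum(d) / len(d)


def same_endpoints(a, b, tol_m=200.0):
    """True if both the start points and the end points lie within tol_m metres."""
    return haversine_m(a[0], b[0]) <= tol_m and haversine_m(a[-1], b[-1]) <= tol_m


def group_routes(routes, endpoint_tol_m=200.0, max_hausdorff_m=50.0):
    """Greedy grouping: a route joins the first group whose representative
    (the group's first route) shares its endpoints and lies within
    max_hausdorff_m; otherwise it starts a new group."""
    groups = []  # each group is a list of indices into routes
    for i, route in enumerate(routes):
        for group in groups:
            rep = routes[group[0]]
            if (same_endpoints(route, rep, endpoint_tol_m)
                    and hausdorff_m(route, rep) <= max_hausdorff_m):
                group.append(i)
                break
        else:
            groups.append([i])
    return groups


# Hypothetical routes: b runs parallel to a a few metres away, c takes a detour.
route_a = [(47.000, 8.0000), (47.001, 8.0000), (47.002, 8.0000)]
route_b = [(47.000, 8.0001), (47.001, 8.0001), (47.002, 8.0001)]
route_c = [(47.000, 8.0000), (47.001, 8.0100), (47.002, 8.0000)]
print(group_routes([route_a, route_b, route_c]))
```

The Hausdorff distance ignores the order in which points are visited, so if direction of travel matters to you, an order-aware measure such as the Fréchet distance or dynamic time warping is worth considering instead.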