Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

clustering for trajectories

I have large amount of temporal lat/lon.

I'm trying to find k-clusters of trajectories from this data. What is the best approach for this?

Thanks.

Edit:

How should I generate the features for my data (lat/lon + time) in order to use kmeans / hierarchical clustering?

Edit:

Hopefully this will make it clearer

Here's an example of how my data look:

Trajectory 1:

lat1,lon1 at time1
lat2,lon2 at time2
...
lat55,lon55 at time55
Trajectory 2:

lat343,lon343 at time343
lat344,lon344 at time344
...
lat376,lon376 at time376

And on and on (couple more trajectories).

So say I have 200 of these trajectories, I want to cluster them into 2 groups. How should I approach this?

Should I use kmeans/HAC for this or should I look at another method?

Edit:

The goal of this is to classify the trajectories into k clusters which represent k different directions of the trajectories.

Simply, I am just trying to cluster the trajectories into groups of different directions. I am not worried about their distances similarities.

So say the end I want to find something like this:

Direction 1:
Trajectory 4
Trajectory 5
Trajectory 7
Direction 2:
Trajectory 44
Trajectory 2
Trajectory 27

...

Direction 10:
Trajectory 17
Trajectory 8

Note: The shapes of the trajectories are mostly lines (not straight-lines), some are looped.
Note: The lat/lon are super local to one region, so I can use a flat-earth approximation.

The directions are intended to be very coarse. How do I compute similarity between trajectories to cluster them to achieve this?

Edit:

Here is an illustration (to the best of my abilities):

Trajectories and End result

I want to separate the trajectories into the directions as such.

like image 739
kietdlam Avatar asked Feb 26 '13 09:02

kietdlam


People also ask

What is trajectory clustering?

Trajectory Clustering: a Non-Parametric Method for Grouping Gene Expression Time Courses, with Applications to Mammary Development - PMC.

What is heuristic clustering?

The new heuristic clustering algorithm is a process of searching for local density maximum (centroid) and is expressed as where is iteration steps, and initial sample . In the process of clustering, each sample is an initial centroid, and each centroid can also be neighbors of other centroids.

What are two main approaches of clustering?

There are two different types of clustering, which are hierarchical and non-hierarchical methods. Non-hierarchical Clustering In this method, the dataset containing N objects is divided into M clusters. In business intelligence, the most widely used non-hierarchical clustering technique is K-means.


2 Answers

K-means is designed around minimizing variance.

When you apply it to longitudinal data, you get some error unless you are always close to the equator and stay away from the 180 meridian. Because the earth is approximately a sphere surface, not an infinite euclidean vector space.

Try a distance or density based clustering algorithm instead that can use great-circle distance, for example. Hierarchical clustering may be a better choice than k-means, too.

Great-circle distance is just between two points. So the next thing for you to do is to figure out how to combine these distances and the temporal component into an appopriate similarity measure for your trajectories. This is quite usage dependant, and there is no universal solution that we could share with you. The better your similarity function, the better your clustering results!

like image 56
Has QUIT--Anony-Mousse Avatar answered Sep 19 '22 14:09

Has QUIT--Anony-Mousse


The way you describe the problem it sounds as if you can represent all trajectories as an angle relative to the equator. It then comes down to segmenting; this is not really clustering; see e.g. https://en.wikipedia.org/wiki/Jenks_natural_breaks_optimization. In your case the values would loop around, so it would be segmenting values on a circle (using degrees/angles) rather than on a straight line. Of course, if this describes your problem, it also provides a good way of visualising it.

like image 31
micans Avatar answered Sep 20 '22 14:09

micans