I am using C# and I have two list<AACoordinate>
where each element in these lists represents a 3D point in space by x,y and z.
class AACoordinate
{
public int ResiNumber { get; set; }
public double x { get; set; }
public double y { get; set; }
public double z { get; set; }
}
Each list can contain 2000 or more points and my aim is to compare each point of list1 to all the points of list2 and if the distance is smaller than a specific number I keep a record of it. at the moment I am using foreach to compare each element of list1 to all of list2. This is quite slow because of the number of points. Do you have any suggestion to make it fast?
my loop is:
foreach (var resiSet in Program.atomList1)
{
foreach (var res in Program.atomList2)
{
var dis = EuclideanDistance(resiSet, res);
if (dis < 5)
temp1.Add(resiSet.ResiNumber);
}
}
Thanks in advance for your help.
The math. dist() method returns the Euclidean distance between two points (p and q), where p and q are the coordinates of that point. Note: The two points (p and q) must be of the same dimensions.
Given two integer arrays arr1 and arr2 , and the integer d , return the distance value between the two arrays. The distance value is defined as the number of elements arr1[i] such that there is not any element arr2[j] where |arr1[i]-arr2[j]| <= d . Constraints: 1 <= arr1.
Maybe is a little complicated to implement, but I don't have any other ideas than this:
To lower down the computational complexity probably you have to use some data structure like KD-Tree or QuadTree.
You can use a KD-Tree to do nearest neighbor search, and this is what you need.
1) You build your kd-tree for the first list in O(n log n). This must be done in a single thread.
2) For each item in your second list, you do a lookup in the kd-tree for the nearest neighbor (the nearest point to the point you are looking for), in O(m log n). If the distance from current point to the nearest found point is less than your delta, you have it. If you want you can do this step using multiple threads.
So at the end the complexity of the algorithm will be O(max(n, m) * log n) where n is the number of items in the first list, m is the number of items in the second list.
For KD-Trees, see:
See http://home.wlu.edu/~levys/software/kd/ this seems a good implementation, in java and C#.
See http://www.codeproject.com/KB/architecture/KDTree.aspx
For quad trees, see:
See http://csharpquadtree.codeplex.com/
See http://www.codeproject.com/KB/recipes/QuadTree.aspx
And of course, look on Wikipedia what is a quadtree and a kd-tree
Consider that (2000 * log base 2(2000)) is about 21931.5
Instead 2000*2000 is 4000000, a big difference!
Using a parallel algorithm, if you have 4 processors, the normal O(n*n) algorithm will require 1000000 per processor, and I guess, it will be still too much if you need something fast or almost real time.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With