Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python list cross-identification (overlap?) by specific equation

I have a cross-identification problem on data with 2 axes like

A = array([['x0', 'y0', 'data0', 'data0'],
           ['x0', 'y0', 'data0', 'data0'],
           ['x0', 'y0', 'data0', 'data0']])

B = array([['x1', 'y1', 'data1', 'data1'],
           ['x1', 'y1', 'data1', 'data1'],
           ['x1', 'y1', 'data1', 'data1']])   

What I need is to find the rows of 2 lists that have the same position. The position needs to be discribed as their distance is close enough, which is:

distance = acos(cos(y0)*cos(y1)*cos(x0-x1)+sin(y0)*sin(y1))
if(distance < 0.001):
    position = True

Currently, I use a code like below:

from math import *
def distance(x1,y1,x2,y2):
    a = acos(cos(y1)*cos(y2)*cos(x1-x2)+sin(y1)*sin(y2))
    if(a < 0.001):
        return True
    else:
        return False
f = open('cross-identification')
for i in range(len(A[0])):
    for j in range(len(B[0])):
        if(distance(A[0][i],A[1][i],B[0][j],B[1][j])==True):
            print(A[0][i],A[1][i],A[2][i],B[2][j],A[3][i],B[3][j],file=f)
        else:continue

It's OK with a few rows, but the problem is that I have MASSIVE data and the speed is extremely slow. Are there any ways that can make it quicker?

BTW I have read this, close to what I want but I can't just change it. Maybe I can get some help from u?

like image 392
X.Yang Avatar asked May 20 '26 03:05

X.Yang


1 Answers

In order not only to avoid the expensive Haversine formula but also to open up the option of using KDTrees, I'd recommend translating to Euclidean coordinates and distances.

def to_eucl_coords(lat, lon):
    z = np.sin(lat)
    x = np.sin(lon)*np.cos(lat)
    y = np.cos(lon)*np.cos(lat)
    return x, y, z

def to_eucl_dist(sphdist):
    return 2*np.arcsin(sphdist/2)

KDTrees are easy to use, here is a skeleton which should get you started.

from scipy.spatial import cKDTree as KDTree

eucl_1 = np.c_[to_eucl_coords(lat1, lon1)]
eucl_2 = np.c_[to_eucl_coords(lat2, lon2)]
t1, t2 = KDTree(eucl_1), KDTree(eucl2)
neighbors = t1.query_ball_tree(t2, threshold)
like image 183
Paul Panzer Avatar answered May 21 '26 17:05

Paul Panzer



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!