Python list cross-identification (overlap?) by specific equation

Question

I have a cross-identification problem on data with 2 axes like

A = array([['x0', 'y0', 'data0', 'data0'],
           ['x0', 'y0', 'data0', 'data0'],
           ['x0', 'y0', 'data0', 'data0']])

B = array([['x1', 'y1', 'data1', 'data1'],
           ['x1', 'y1', 'data1', 'data1'],
           ['x1', 'y1', 'data1', 'data1']])

What I need is to find the rows of 2 lists that have the same position. The position needs to be discribed as their distance is close enough, which is:

distance = acos(cos(y0)*cos(y1)*cos(x0-x1)+sin(y0)*sin(y1))
if(distance < 0.001):
    position = True

Currently, I use a code like below:

from math import *
def distance(x1,y1,x2,y2):
    a = acos(cos(y1)*cos(y2)*cos(x1-x2)+sin(y1)*sin(y2))
    if(a < 0.001):
        return True
    else:
        return False
f = open('cross-identification')
for i in range(len(A[0])):
    for j in range(len(B[0])):
        if(distance(A[0][i],A[1][i],B[0][j],B[1][j])==True):
            print(A[0][i],A[1][i],A[2][i],B[2][j],A[3][i],B[3][j],file=f)
        else:continue

It's OK with a few rows, but the problem is that I have MASSIVE data and the speed is extremely slow. Are there any ways that can make it quicker?

BTW I have read this, close to what I want but I can't just change it. Maybe I can get some help from u?

Paul Panzer · Accepted Answer

In order not only to avoid the expensive Haversine formula but also to open up the option of using KDTrees, I'd recommend translating to Euclidean coordinates and distances.

def to_eucl_coords(lat, lon):
    z = np.sin(lat)
    x = np.sin(lon)*np.cos(lat)
    y = np.cos(lon)*np.cos(lat)
    return x, y, z

def to_eucl_dist(sphdist):
    return 2*np.arcsin(sphdist/2)

KDTrees are easy to use, here is a skeleton which should get you started.

from scipy.spatial import cKDTree as KDTree

eucl_1 = np.c_[to_eucl_coords(lat1, lon1)]
eucl_2 = np.c_[to_eucl_coords(lat2, lon2)]
t1, t2 = KDTree(eucl_1), KDTree(eucl2)
neighbors = t1.query_ball_tree(t2, threshold)

Python list cross-identification (overlap?) by specific equation

Tags:

python

match

numpy

overlap

X.Yang

1 Answers

Paul Panzer

Recent Activity

Donate For Us

Python list cross-identification (overlap?) by specific equation

Tags:

python

match

numpy

overlap

X.Yang

1 Answers

Paul Panzer

Related questions

Recent Activity

Donate For Us