
Quantify relative position of coordinates - python

I have a df of coordinates representing points at various time steps. I want to calculate the average position of these points relative to each other.

To achieve this, I'm aiming to calculate the offset between each point and every other point at each time step, and then average those offsets.

The following calculates the distance between each pair of points.

import pandas as pd
from scipy.spatial import distance
import itertools

df = pd.DataFrame({   
        'Time' : [1,1,1,2,2,2,3,3,3],             
        'id' : ['A','B','C','A','B','C','A','B','C'],                 
        'X' : [1.0,3.0,2.0,2.0,4.0,3.0,3.0,5.0,4.0],
        'Y' : [1.0,1.0,0.5,2.0,2.0,2.5,3.0,3.0,3.0],
    })

ids = list(df['id'])

# get the points
points = df[["X", "Y"]].values

# calculate distance of each point from every other point.
# row i contains the distances for point i.
# distances[i, j] contains distance of point i from point j.
distances = distance.cdist(points, points, "euclidean")
distances = distances.flatten()

# get the start and end points
cartesian = list(itertools.product(ids, ids))

data = dict(
            start_region = [x[0] for x in cartesian],
            end_region = [x[1] for x in cartesian],
            distance = distances
        )

df1 = pd.DataFrame(data)
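
Note that the snippet above passes all nine rows to cdist at once, so it also pairs points from different time steps, and a Euclidean distance carries no sign or direction. A minimal sketch of keeping the comparison inside each time step is shown below. It swaps cdist for plain NumPy broadcasting so it needs no SciPy, and the helper name `distances_within_time` is hypothetical, not part of the original post.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Time': [1, 1, 1, 2, 2, 2, 3, 3, 3],
    'id': ['A', 'B', 'C', 'A', 'B', 'C', 'A', 'B', 'C'],
    'X': [1.0, 3.0, 2.0, 2.0, 4.0, 3.0, 3.0, 5.0, 4.0],
    'Y': [1.0, 1.0, 0.5, 2.0, 2.0, 2.5, 3.0, 3.0, 3.0],
})

def distances_within_time(group):
    # Hypothetical helper: pairwise Euclidean distances for ONE time step.
    pts = group[['X', 'Y']].to_numpy()
    # diff[i, j] is the (X, Y) offset from point j to point i.
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    labels = group['id'].to_numpy()
    return pd.DataFrame(dist, index=labels, columns=labels)

# One id-by-id distance matrix per time step, keyed by Time.
per_time = {t: distances_within_time(g) for t, g in df.groupby('Time')}
```

Here `per_time[1].loc['A', 'B']` is the distance between A and B at Time 1 only. The signed X and Y offsets requested below are the intermediate `diff` array rather than the final distance.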

All I really need to output is:

   Time start_point end_point    X    Y
0     1           A         B  2.0  0.0
1     1           A         C  1.0 -0.5
2     1           B         C -1.0 -0.5
3     2           A         B  2.0  0.0
4     2           A         C  1.0  0.5
5     2           B         C -1.0  0.5
6     3           A         B  2.0  0.0
7     3           A         C  1.0  0.0
8     3           B         C -1.0  0.0

[Image not available: plot of the expected result]

So the average position of these points in relation to each other would be the green coordinates.

But if I average the dataset above it displays:

[Image not available: plot of the averaged dataset]

I understand why this happens: the average isn't taken relative to the other points.

asked Mar 01 '26 10:03 by jonboy


1 Answer

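Here's my take on it: group by Time, and within each group take the difference in X and Y (end point minus start point) for every pair of ids.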

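import itertools
import pandas as pd

def relative_dist(gp):
    # gp holds one time step, indexed by id; build every unordered pair of ids
    combs = list(itertools.combinations(gp.index, 2))
    # diff() over the two selected rows leaves NaN in the first row and
    # (end - start) in the second; dropna() keeps only the second
    df_gp = pd.concat([gp.loc[list(tup), :].diff() for tup in combs], keys=combs).dropna()

    return df_gp

# df is the DataFrame from the question
df_dist = (df.set_index('id').groupby('Time')[['X','Y']].apply(relative_dist)
             .droplevel('id').rename_axis(['Time','start_point','end_point'])
             .reset_index())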

Out[341]:
   Time start_point end_point    X    Y
0     1           A         B  2.0  0.0
1     1           A         C  1.0 -0.5
2     1           B         C -1.0 -0.5
3     2           A         B  2.0  0.0
4     2           A         C  1.0  0.5
5     2           B         C -1.0  0.5
6     3           A         B  2.0  0.0
7     3           A         C  1.0  0.0
8     3           B         C -1.0  0.0
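
For comparison, the same table can probably also be built with a self-merge on Time instead of a groupby/apply. This is only a sketch of an alternative, not the method used above. The names `pairs` and `offsets` are made up for illustration, and the filter assumes the ids sort in the order you want for start and end (here 'A' < 'B' < 'C').

```python
import pandas as pd

df = pd.DataFrame({
    'Time': [1, 1, 1, 2, 2, 2, 3, 3, 3],
    'id': ['A', 'B', 'C', 'A', 'B', 'C', 'A', 'B', 'C'],
    'X': [1.0, 3.0, 2.0, 2.0, 4.0, 3.0, 3.0, 5.0, 4.0],
    'Y': [1.0, 1.0, 0.5, 2.0, 2.0, 2.5, 3.0, 3.0, 3.0],
})

# Pair every row with every other row that shares the same Time.
pairs = df.merge(df, on='Time', suffixes=('_start', '_end'))

# Keep each unordered pair once (assumption: ids sort in the desired order).
pairs = pairs[pairs['id_start'] < pairs['id_end']]

# Signed offset of the end point relative to the start point.
offsets = pd.DataFrame({
    'Time': pairs['Time'],
    'start_point': pairs['id_start'],
    'end_point': pairs['id_end'],
    'X': pairs['X_end'] - pairs['X_start'],
    'Y': pairs['Y_end'] - pairs['Y_start'],
}).sort_values(['Time', 'start_point', 'end_point']).reset_index(drop=True)
```

The merge materialises every pair per time step before filtering, so it trades memory for avoiding a Python-level loop over groups.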

df_avg = df_dist.groupby(['start_point','end_point'], as_index=False)[['X','Y']].mean()

Out[347]:
  start_point end_point    X    Y
0           A         B  2.0  0.0
1           A         C  1.0  0.0
2           B         C -1.0  0.0
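
As a cross-check, the mean of the pairwise offsets should equal the offset between each id's mean position, because averaging is linear. This only holds if every id appears at every time step, which is true for the sample data. The sketch below is an illustration under that assumption, with `mean_pos` and `avg_offsets` as made-up names.

```python
import itertools
import pandas as pd

df = pd.DataFrame({
    'Time': [1, 1, 1, 2, 2, 2, 3, 3, 3],
    'id': ['A', 'B', 'C', 'A', 'B', 'C', 'A', 'B', 'C'],
    'X': [1.0, 3.0, 2.0, 2.0, 4.0, 3.0, 3.0, 5.0, 4.0],
    'Y': [1.0, 1.0, 0.5, 2.0, 2.0, 2.5, 3.0, 3.0, 3.0],
})

# Mean position of each id across all time steps.
mean_pos = df.groupby('id')[['X', 'Y']].mean()

# Offset between mean positions (end minus start) for every pair of ids.
rows = []
for start, end in itertools.combinations(mean_pos.index, 2):
    delta = mean_pos.loc[end] - mean_pos.loc[start]
    rows.append({'start_point': start, 'end_point': end,
                 'X': delta['X'], 'Y': delta['Y']})

avg_offsets = pd.DataFrame(rows)
```

If some ids are missing at some time steps, the two approaches can differ, and averaging the per-time offsets as in the answer is the safer choice.
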
answered Mar 03 '26 23:03 by Andy L.


