
Preventing multiple matches in list iteration

Tags: python, numpy

I am relatively new to Python, so I will do my best to explain what I am trying to do. I am iterating through two lists of stars (both of which contain rows of record arrays) and trying to match stars by their coordinates within a tolerance (in this case RA and Dec, which are both fields of the record arrays). However, multiple stars from one list appear to match the same star in the other, because both fall within the atol. Is there a way to prevent this? Here's what I have so far:

from __future__ import print_function
import numpy as np    

###importing data###
Astars = list()
for s in ApStars:###this is imported but not shown
    Astars.append(s)

wStars = list()
data1 = np.genfromtxt('6819.txt', dtype=None, delimiter=',', names=True)
for star in data1:
    wStars.append(star)

###beginning matching stars between the Astars and wStars###
list1 = list()
list2 = list()
for star,s in [(star,s) for star in wStars for s in Astars]:
    if np.logical_and(np.isclose(s["RA"],star["RA"], atol=0.000277778)==True , 
                      np.isclose(s["DEC"],star["DEC"],atol=0.000277778)==True): 
        if star not in list1:   
            list1.append(star) #matched wStars
        if s not in list2:
            list2.append(s) #matched Astars

I cannot decrease the atol, because a smaller value would be below the instrumental error. What happens is that multiple wStars match a single Astar. I just want one star matched to one star, if that is possible.

Any suggestions?

Thomas Grier asked Jul 29 '16 15:07


1 Answer

I would change your approach entirely to fit the fact that these are astronomical objects you are talking about. I will ignore the loading functionality and assume that you already have your input lists Astars and wStars.

We will find the closest star in wStars to each star in Astars using the dot product of their Cartesian unit vectors. That should help resolve any ambiguities about the best match.

import numpy as np

# 1 arcsecond, assuming RA and DEC are stored in degrees
TOL = 0.000277778

# Pre-process the data a little
def getCV(ra, de):
    # Cartesian unit vector; np.cos/np.sin need radians (degrees assumed here)
    ra, de = np.radians(ra), np.radians(de)
    return np.array([np.cos(de) * np.cos(ra),
                     np.cos(de) * np.sin(ra),
                     np.sin(de)])

# Rows of a record array cannot take a new 'CV' field, so the vectors
# are kept in lists parallel to Astars and wStars
aCV = [getCV(aStar['RA'], aStar['DEC']) for aStar in Astars]
wCV = [getCV(wStar['RA'], wStar['DEC']) for wStar in wStars]

# Construct lists of matching stars
aList = []
wList = []

# This is an extra list of lists of stars that are within tolerance but are
# not best matches. This list will contain empty sublists, but never None
wCandidates = []

for i, aStar in enumerate(Astars):
    matched = False  # becomes True once this aStar has a match
    for j, wStar in enumerate(wStars):
        # Use native short-circuiting, and don't explicitly test for `True`.
        # rtol=0 makes atol the only tolerance (isclose defaults to rtol=1e-5)
        if np.isclose(aStar["RA"], wStar["RA"], rtol=0, atol=TOL) and \
           np.isclose(aStar["DEC"], wStar["DEC"], rtol=0, atol=TOL):
            newDot = np.dot(aCV[i], wCV[j])
            if matched:
                # This star already has a match, possibly update it
                if newDot > bestDot:
                    bestDot = newDot
                    # Move the previous best match to list of candidates
                    wCandidates[-1].append(wList[-1])
                    wList[-1] = wStar
                else:
                    wCandidates[-1].append(wStar)
            else:
                # This star does not yet have a match
                matched = True
                bestDot = newDot
                aList.append(aStar)
                wList.append(wStar)
                wCandidates.append([])

The result is that the star at each index in wList is the best match for the star at the same index in aList. Some stars have no match at all, so not every star will appear in either list. Note that there may be some (very unlikely) cases where a star in aList is not the best match for the one in wList.

We find the smallest angular separation between two stars by computing Cartesian unit vectors from RA and Dec (x = cos(Dec) cos(RA), y = cos(Dec) sin(RA), z = sin(Dec)) and taking the dot product. The dot product of two unit vectors is the cosine of the angle between them, so the closer it is to one, the closer together the stars are. This should help resolve the ambiguities.
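As a rough illustration of that idea, here is a minimal sketch. The helper names unit_vector and separation_deg and the three positions are made up for the example, and it assumes coordinates in degrees.

```python
import numpy as np

def unit_vector(ra_deg, dec_deg):
    """Cartesian unit vector for a sky position given in degrees."""
    ra, dec = np.radians(ra_deg), np.radians(dec_deg)
    return np.array([np.cos(dec) * np.cos(ra),
                     np.cos(dec) * np.sin(ra),
                     np.sin(dec)])

def separation_deg(ra1, dec1, ra2, dec2):
    """Angle between two positions, in degrees, from the dot product."""
    dot = np.dot(unit_vector(ra1, dec1), unit_vector(ra2, dec2))
    # Clip guards against rounding pushing the value just outside [-1, 1]
    return np.degrees(np.arccos(np.clip(dot, -1.0, 1.0)))

# Hypothetical positions: one target and two candidates
target = (10.0, 20.0)
near = (10.0001, 20.0)
far = (10.0002, 20.0002)

dot_near = np.dot(unit_vector(*target), unit_vector(*near))
dot_far = np.dot(unit_vector(*target), unit_vector(*far))

# The nearer candidate has the larger dot product
print(dot_near > dot_far)
```

Comparing the dot products directly avoids calling arccos for every pair. The angle itself is only needed if you want to report the separation.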

I pre-computed the Cartesian vectors for the stars outside the main loop to avoid recomputing them for every pair of stars. The name 'CV' stands for Cartesian Vector. Change it as you see fit.

Final note: this method does not check whether a star from wStars is matched to more than one star in Astars. It only ensures that the best wStar is selected for each aStar.
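If a strict star-for-star result is needed, one possible approach (not part of the method above) is a greedy pass over all candidate pairs, closest first, that uses each star at most once. The sketch below is only that: match_one_to_one is a hypothetical helper, it takes plain arrays of RA and Dec in degrees, and it uses a circular tolerance on the angular separation rather than separate RA and DEC checks.

```python
import numpy as np

def match_one_to_one(ra_a, dec_a, ra_w, dec_w, tol_deg):
    """Greedy one-to-one match: closest pairs first, each star used once.

    Inputs are in degrees. Returns a list of (index_a, index_w) pairs.
    """
    def unit(ra, dec):
        ra, dec = np.radians(ra), np.radians(dec)
        return np.stack([np.cos(dec) * np.cos(ra),
                         np.cos(dec) * np.sin(ra),
                         np.sin(dec)], axis=-1)

    # dots[i, j] is the cosine of the angle between a-star i and w-star j
    va = unit(np.asarray(ra_a, float), np.asarray(dec_a, float))
    vw = unit(np.asarray(ra_w, float), np.asarray(dec_w, float))
    dots = va @ vw.T

    # Candidate pairs are those separated by at most tol_deg
    ia, iw = np.nonzero(dots >= np.cos(np.radians(tol_deg)))

    # Visit candidate pairs from closest (largest dot) to farthest
    order = np.argsort(-dots[ia, iw])
    used_a, used_w, pairs = set(), set(), []
    for k in order:
        i, j = int(ia[k]), int(iw[k])
        if i not in used_a and j not in used_w:
            used_a.add(i)
            used_w.add(j)
            pairs.append((i, j))
    return pairs

# Two a-stars compete for a single w-star; only the closer one gets it
pairs = match_one_to_one([10.0, 10.00025], [20.0, 20.0],
                         [10.0001], [20.0], tol_deg=0.000277778)
print(pairs)
```

A greedy pass is simple but not guaranteed to maximise the number of matched pairs. The full pairwise matrix also grows with the product of the two list sizes, so this is best suited to modest catalogues.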

UPDATE

I added a third list to the output, wCandidates, which lists all the wStars that were within tolerance of the corresponding aList element but were not chosen as the best match.
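One thing worth checking about "within tolerance": np.isclose tests |a - b| <= atol + rtol * |b|, and rtol defaults to 1e-5. For coordinates in degrees, the relative term can be much larger than a 1 arcsecond atol, which by itself would let several stars match. The values below are purely illustrative.

```python
import numpy as np

TOL = 0.000277778  # 1 arcsecond, assuming coordinates in degrees

# Illustrative values only: two RAs about 7 arcseconds apart
ra_a, ra_w = 295.0, 295.002

# Default rtol=1e-5 adds rtol * |ra_w| (about 0.00295 here) to the tolerance
loose = np.isclose(ra_a, ra_w, atol=TOL)

# With rtol=0 the test is simply |ra_a - ra_w| <= TOL
strict = np.isclose(ra_a, ra_w, rtol=0, atol=TOL)

print(loose, strict)  # prints: True False
```

Passing rtol=0 makes atol the only tolerance that applies.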

Mad Physicist answered Oct 04 '22 18:10