Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Recursively select (with limited depth) relationships in SQLAlchemy using ORM, declarative style, and Association objects

Given:

DIRECTIONS = db.Enum('N', 'NE', 'E', 'SE', 'S', 'SW', 'W', 'NW',
                     name='directions')


class Exit(BaseModel):
    __tablename__ = 'exits'
    src = db.Column(db.Integer, db.ForeignKey('room.id'), primary_key=True)
    dst = db.Column(db.Integer, db.ForeignKey('room.id'), primary_key=True)
    direction = db.Column(DIRECTIONS, primary_key=True)
    from_room = db.relationship('Room', foreign_keys=[dst],
                                backref=db.backref('exits',
                                                   lazy='dynamic'))
    to_room = db.relationship('Room', foreign_keys=[src]))

Using:

SqlAlchemy 0.9.8, PostgreSQL 9.3.5

How do I query to recursively select exits to some depth n given a starting room r?

Example (A Simple Map):

                           E                                               
                          /                                                
                B   C -- D                                                 
                 \ /                                                       
                  A                                                         
                  |                                                         
                  F                                                         
                 / \                                                        
                G   H                                                       
                     \                                                      
                      I

Supposing the relationships could be represented by the map above,

  • What query would give me all the rooms, starting with some arbitrary room?
  • How could I limit the above query to a certain number of 'hops' from the original room?
    • E.g., starting with A, select all rooms except E and I, by limiting depth to 2.

Of course, I can do this in Python:

rooms = [starting_room]                                                     
for exit in starting_room.exits.all():                                      
    if exit.to_room not in rooms:                                           
        add_to_map(exit)                                                    
        rooms.append(exit.to_room)                                          
for room in rooms[1:]:                                                      
    for exit in room.exits.all():                                           
        if exit.to_room not in rooms:                                                       add_to_map(exit)                                                
            rooms.append(exit.to_room)

But this is expensive and not suitable for more than 2 hops.

I have tried following the CTE examples in the SQLAlchemy docs, but it is difficult to grok how to make it work for an Association Object like I have, particularly as I have no experience working with CTEs in pure SQL.

My attempt:

>>> starting_exits = db.session.query(Exit).filter_by(from_room=starting_room).came='starting_exits', recursive=True)
>>> start = orm.aliased(starting_exits, name='st')
>>> exit = orm.aliased(Exit, name='e')
>>> cte = starting_exits.union_all(db.session.query(exit).filter(exit.src == start.c.dst))
>>> db.session.query(cte).all()

hangs indefinitely, even if I slice it (.all()[:5]). What should I be doing?

like image 731
Two-Bit Alchemist Avatar asked Dec 19 '22 07:12

Two-Bit Alchemist


1 Answers

I hope I did not overcomplicate your model, but in order to test the query (which follows below) I used the following model definition:

The MODEL:

class Room(Base):
    __tablename__ = 'room'
    id = Column(Integer, primary_key=True)
    name = Column(String)

    exits = association_proxy(
        'lnk_exits', 'to_room',
        # creator=lambda v: Exit(to_room=v),
        creator=lambda k, v: Exit(direction=k, to_room=v),
    )
    entries = association_proxy(
        'lnk_entries', 'from_room',
        # creator=lambda v: Exit(from_room=v),
        creator=lambda k, v: Exit(direction=k, from_room=v),
    )


class Exit(Base):
    __tablename__ = 'exits'
    src = Column(Integer, ForeignKey('room.id'), primary_key=True)
    dst = Column(Integer, ForeignKey('room.id'), primary_key=True)
    direction = Column(DIRECTIONS, primary_key=True)

    from_room = relationship(
        Room, foreign_keys=[dst],
        # backref='lnk_exits',
        backref=backref(
            "lnk_exits",
            collection_class=attribute_mapped_collection("direction"),
            cascade="all, delete-orphan",
        )
    )
    to_room = relationship(
        Room,
        foreign_keys=[src],
        # backref='lnk_entries',
        backref=backref(
            "lnk_entries",
            collection_class=attribute_mapped_collection("direction"),
            cascade="all, delete-orphan",
        )
    )

You really do not need to use the relationships the way I did, but I like the way I did it because it allows me to work on the relationships between the rooms as below:

# Insert test data
rooms = [Room(name=name) for name in 'ABCDEFGHI']
session.add_all(rooms)

A, B, C, D, E, F, G, H, I = rooms

A.entries = {'NW': B, 'NE': C, 'S': F}
B.entries = {'SE': A}
C.entries = {'E': D, 'SW': A}
D.entries = {'W': C, 'NE': E}
E.entries = {'SW': D}
F.entries = {'N': A, 'SW': G, 'SE': H}
G.entries = {'NE': F}
H.entries = {'NW': F, 'SE': I}

if True:  # add cycle, in which case we get duplicates in the results
    B.entries['E'] = C
    C.entries['W'] = B

session.commit()

You can read more on this in the Association Proxy section of the documentation.

Now the QUERY

Please note that in order to use the query below you do not need any of the association proxy and related stuff above. Current query hangs even with simple relationship A <--> B because you the CTE will navigate it back and forth indefinitely. So the trick is to add level information to the CTE so that you can limit your search on the level. The query below should get you started:

# parameters
start_id = session.query(Room).filter(Room.name == 'A').first().id
max_level = 2

# CTE definition
starting_exits = (session.query(Exit, literal(0).label("level"))
    .filter(Exit.src == start_id)
    .cte(name="starting_exits", recursive=True)
)
start = aliased(starting_exits, name="st")

exit = aliased(Exit, name="e")
joined = (session.query(exit, (start.c.level + 1).label("level"))
    .filter(exit.src == start.c.dst)
    # @note: below line will avoid simple cycles of 2, which does not always help, but should reduce the result-set significantly already
    .filter(exit.dst != start.c.src)
    .filter(start.c.level < max_level)
)

cte = start.union_all(joined)
for x in session.query(cte).order_by(cte.c.src, cte.c.dst, cte.c.level):
    print(x)

I assume you are only interested in second column (dst) of the resulting query in order to get id of the Rooms you can reach. Probably also column four (level) would be of interest in order to find the shortest path to that room. But you might still get multiple ways to the same target room, so please filter those out ex post.

Edit: A simple way to use cte in order to get the rooms (model instances) would be:

# get all Rooms (no duplicates)
s = session.query(cte.c.dst.distinct().label("room_id")).subquery(name="rooms")
q = session.query(Room).join(s, Room.id == s.c.room_id)
for r in q:
    print(r)

Edit 2: To get the exits as well as the rooms (model instances) in order to reconstruct the graph, simply do another join on the query above:

exits = (session.query(Exit, Room)
         .join(s, Exit.dst == s.c.room_id)
         .join(Room, Room.id == s.c.room_id))
for exit, room in exits:
    print exit, room
like image 88
van Avatar answered Dec 28 '22 10:12

van