
Fastest way to insert object if it doesn't exist with SQLAlchemy

So I'm quite new to SQLAlchemy.

I have a model Showing which has about 10,000 rows in the table. Here is the class:

class Showing(Base):
    __tablename__   = "showings"

    id              = Column(Integer, primary_key=True)
    time            = Column(DateTime)
    link            = Column(String)
    film_id         = Column(Integer, ForeignKey('films.id'))
    cinema_id       = Column(Integer, ForeignKey('cinemas.id'))

    def __eq__(self, other):
        if self.time == other.time and self.cinema == other.cinema and self.film == other.film:
            return True
        else:
            return False

Could anyone give me some guidance on the fastest way to insert a new showing if it doesn't already exist? I think it is slightly more complicated because a showing is only unique if its combination of time, cinema, and film is unique.

I currently have this code:

def AddShowings(self, showing_times, cinema, film):
    all_showings = self.session.query(Showing).options(joinedload(Showing.cinema), joinedload(Showing.film)).all()
    for showing_time in showing_times:
        tmp_showing = Showing(time=showing_time[0], film=film, cinema=cinema, link=showing_time[1])
        if tmp_showing not in all_showings:
            self.session.add(tmp_showing)
            self.session.commit()
            all_showings.append(tmp_showing)

which works, but seems to be very slow. Any help is much appreciated.

asked Sep 06 '12 09:09 by user1110718


1 Answer

If an object is unique based on a combination of columns, you can mark those columns as a composite primary key. Add the primary_key=True keyword parameter to each of these columns and drop your id column altogether:

class Showing(Base):
    __tablename__   = "showings"

    time            = Column(DateTime, primary_key=True)
    link            = Column(String)
    film_id         = Column(Integer, ForeignKey('films.id'), primary_key=True)
    cinema_id       = Column(Integer, ForeignKey('cinemas.id'), primary_key=True)

That way your database enforces the uniqueness itself (with no need for a separate incrementing column), and SQLAlchemy can tell from the primary key whether two instances of Showing refer to the same row.
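To make the effect of the composite key concrete, here is a minimal sketch against an in-memory SQLite database, assuming SQLAlchemy 1.4 or later. The Film and Cinema classes are hypothetical stand-ins for the real parent tables, which the question does not show, and the sample values are made up.

```python
from datetime import datetime

from sqlalchemy import Column, DateTime, ForeignKey, Integer, String, create_engine
from sqlalchemy.exc import IntegrityError
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Film(Base):  # hypothetical minimal parent table
    __tablename__ = "films"
    id = Column(Integer, primary_key=True)


class Cinema(Base):  # hypothetical minimal parent table
    __tablename__ = "cinemas"
    id = Column(Integer, primary_key=True)


class Showing(Base):
    __tablename__ = "showings"
    time = Column(DateTime, primary_key=True)
    link = Column(String)
    film_id = Column(Integer, ForeignKey("films.id"), primary_key=True)
    cinema_id = Column(Integer, ForeignKey("cinemas.id"), primary_key=True)


engine = create_engine("sqlite://")  # in-memory database, for illustration only
Base.metadata.create_all(engine)

when = datetime(2012, 9, 6, 20, 0)

with Session(engine) as session:
    session.add_all([Film(id=1), Cinema(id=1)])
    session.add(Showing(time=when, film_id=1, cinema_id=1, link="a"))
    session.commit()

    # Look the row up by its composite key, in column order:
    # (time, film_id, cinema_id).
    found = session.get(Showing, (when, 1, 1))
    found_link = found.link

# In a fresh session, inserting the same key again is rejected by the database.
with Session(engine) as session:
    session.add(Showing(time=when, film_id=1, cinema_id=1, link="b"))
    try:
        session.commit()
        duplicate_rejected = False
    except IntegrityError:
        session.rollback()
        duplicate_rejected = True
```

The second insert fails with an IntegrityError, so duplicates cannot slip in even if application code forgets to check for them.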

I believe you can then just merge your new Showing back into the session:

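def AddShowings(self, showing_times, cinema, film):
    for showing_time in showing_times:
        # Set the key columns directly so merge() can look the row up by primary key
        self.session.merge(
            Showing(time=showing_time[0], link=showing_time[1],
                    film_id=film.id, cinema_id=cinema.id)
        )
    # Persist all merged rows in a single transaction
    self.session.commit()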
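As a rough end-to-end check of the merge approach, the sketch below runs the same batch twice and counts the rows. It assumes SQLAlchemy 1.4 or later and an in-memory SQLite database. The add_showings function is a hypothetical standalone variant of the method above, the foreign keys are left out to keep the example self-contained, and the links are placeholder values.

```python
from datetime import datetime

from sqlalchemy import Column, DateTime, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Showing(Base):
    __tablename__ = "showings"
    time = Column(DateTime, primary_key=True)
    link = Column(String)
    # Assumption: foreign keys omitted so the sketch needs no parent tables.
    film_id = Column(Integer, primary_key=True)
    cinema_id = Column(Integer, primary_key=True)


def add_showings(session, showing_times, cinema_id, film_id):
    """Merge each (time, link) pair; existing keys are updated, new ones inserted."""
    for time, link in showing_times:
        session.merge(
            Showing(time=time, link=link, film_id=film_id, cinema_id=cinema_id)
        )
    session.commit()  # one commit for the whole batch


engine = create_engine("sqlite://")  # in-memory database, for illustration only
Base.metadata.create_all(engine)

times = [
    (datetime(2012, 9, 6, 18, 0), "http://example.com/a"),
    (datetime(2012, 9, 6, 21, 0), "http://example.com/b"),
]

with Session(engine) as session:
    add_showings(session, times, cinema_id=1, film_id=1)
    add_showings(session, times, cinema_id=1, film_id=1)  # same batch again
    row_count = session.query(Showing).count()
```

Running the batch a second time leaves the table at two rows. Note that merge() looks up each object by primary key, issuing a SELECT for any object not already loaded in the session, so this is simpler than loading every showing up front but still costs roughly one lookup per row.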
answered Oct 13 '22 01:10 by Martijn Pieters