When inserting multiple rows in a MySQL-DB via a SQLA-Expression-Language statement, f.e.
Foo.__table__.insert().execute([{'bar': 1}, {'bar': 2}, {'bar': 3}])
it´s extremly slow, when compared to the execution of a "raw" sql statement for the same task, i.e.
engine.execute("insert into foo (bar) values (1),(2),(3)")
What is the reason for this? Can´t SQLA generate a single bulk insert statement and therefore executes multiple inserts? Due to the speed limits of the orm, i need a fast way to add several thousand rows at once, but the SQLA-Expression-Language-Version is too slow. So, do i need to write the raw sql by myself? The documentation isn't too clear about this.
I ran a speed test with the ORM insert, the ORM with preassigned PK and the SQLA bulk insert (see SQLA bulk insert speed) like this (https://gist.github.com/3341940):
As you can see, there is practically no difference between the three versions. Only the execution of a raw string insert, where all the records are included in the raw sql statement is significantly faster. Thus, for fast inserts, SQLA seems sub-optimal.
It seems that the special INSERT with multiple values only became recently supported(0.8 unreleased), you can see the note at the bottom of this section regarding the difference between executemany(what execute with a list does) and a multiple-VALUES INSERT:
http://docs.sqlalchemy.org/ru/latest/core/expression_api.html#sqlalchemy.sql.expression.Insert.values
This should explain the performance difference you see. You could try installing the development version and repeating the tests with the altered calling syntax, mentioned in the link, to confirm.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With