I'm hearing different views on when one should use Pandas vs when to use SQL.
I tried to do the following in Pandas on 19,150,869 rows of data:
for idx, row in df.iterrows():
    tmp = int(int(row['M']) / PeriodGranularity) + 1
    row['TimeSlot'] = str(row["D"] + 1) + "-" + str(row["H"]) + "-" + str(tmp)
And found it was taking so long I had to abort after 20 minutes.
I performed the following in SQLite:
Select strftime('%w',PlayedTimestamp)+1 as D,strftime('%H',PlayedTimestamp) as H,strftime('%M',PlayedTimestamp) as M,cast(strftime('%M',PlayedTimestamp) / 15+1 as int) as TimeSlot from tblMain
and found it took only a few seconds ("19150869 rows returned in 2445ms").
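For reference, the query can be reproduced against an in-memory SQLite database from Python (the table name and column come from the question; the sample timestamp is made up):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE tblMain (PlayedTimestamp TEXT)")
# 2017-05-07 was a Sunday, so strftime('%w') yields '0' and D becomes 1
con.execute("INSERT INTO tblMain VALUES ('2017-05-07 09:32:00')")
rows = con.execute(
    "SELECT strftime('%w',PlayedTimestamp)+1 AS D,"
    " strftime('%H',PlayedTimestamp) AS H,"
    " strftime('%M',PlayedTimestamp) AS M,"
    " CAST(strftime('%M',PlayedTimestamp) / 15 + 1 AS int) AS TimeSlot"
    " FROM tblMain"
).fetchall()
```

Note that SQLite's integer division of the minute string by 15 buckets each hour into four time slots.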
Note: For the Pandas code I ran this in the step before it to get the data from the db:
sqlStr = "Select strftime('%w',PlayedTimestamp)+1 as D,strftime('%H',PlayedTimestamp) as H,strftime('%M',PlayedTimestamp) as M from tblMain"
df = pd.read_sql_query(sqlStr, con)
Is it my coding that's at fault here or is it generally accepted that for certain tasks SQL is a lot faster?
Although this main difference might suggest the two tools are separate, they overlap in many functions. For example, you can create new features from existing columns in pandas, often more easily and faster than in SQL.
Using the Python and SQL code seen below, I used the smaller dataset to first test the transformations. Python and SQL completed the task in 591 and 40.9 seconds respectively. This means that SQL was able to provide a speed-up of roughly 14.5X!
In pandas, dividing one feature by another is much easier than in SQL. The aforementioned code shows how to divide two columns and assign the result to a new column, applying the operation to the entire dataset at once.
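As a minimal illustration of that kind of column-wise feature creation (the column names here are made up, not from the original dataset):

```python
import pandas as pd

df = pd.DataFrame({"distance": [10.0, 30.0], "duration": [2.0, 5.0]})
# A new feature computed from two existing columns, over the whole frame at once
df["speed"] = df["distance"] / df["duration"]
```

The division is vectorized, so no explicit loop over rows is needed.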
Reading SQL queries into Pandas dataframes is a common task, and one that can be very slow. Depending on the database being used, this may be hard to get around, but for those of us using Postgres we can speed this up considerably using the COPY command.
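A sketch of that COPY-based approach, assuming psycopg2 as the driver (the function name and query are illustrative, not a fixed API):

```python
import io

import pandas as pd


def read_query_via_copy(conn, query):
    """Stream a query's result as CSV through Postgres COPY, then parse with pandas.

    Expects `conn` to be a psycopg2 connection (its cursors provide copy_expert).
    """
    buf = io.StringIO()
    copy_sql = "COPY ({}) TO STDOUT WITH CSV HEADER".format(query)
    with conn.cursor() as cur:
        cur.copy_expert(copy_sql, buf)  # server streams CSV straight into buf
    buf.seek(0)
    return pd.read_csv(buf)
```

This avoids the per-row overhead of the DB-API fetch path that pd.read_sql_query goes through.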
It seems you can use a vectorized solution (PeriodGranularity is some variable):

df['TimeSlot'] = (df["D"] + 1).astype(str) + "-" + \
                 df["H"].astype(str) + "-" + \
                 ((df['M'].astype(int) / PeriodGranularity).astype(int) + 1).astype(str)

And to convert a datetime to str, use strftime.

DataFrame.iterrows is really slow - check this.
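A runnable sketch of the vectorized version above, on a tiny made-up frame (the sample values and PeriodGranularity = 15 are assumptions):

```python
import pandas as pd

PeriodGranularity = 15
df = pd.DataFrame({"D": [0, 6], "H": ["09", "23"], "M": ["32", "07"]})

# Whole-column string concatenation; no Python-level loop over rows
df["TimeSlot"] = (df["D"] + 1).astype(str) + "-" + \
                 df["H"].astype(str) + "-" + \
                 ((df["M"].astype(int) / PeriodGranularity).astype(int) + 1).astype(str)
```

Each operation runs over the full column at once, which is why this scales to millions of rows where iterrows does not.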
First, some comparison of code for users coming from a SQL background. Comparing two technologies is really hard, and I am not sure a good answer is possible on SO (the reasons are too broad), but I found this.