I'm currently using MySQL and Python to scrape data from the web. Specifically, I am scraping table data and inserting it into my database. My current solution works, but I feel it is extremely inefficient and will most likely lock up my database if I don't rewrite the code. Here is what I currently use (partial code):
    itemBank = []
    for row in rows:
        itemBank.append((tempRow2,tempRow1,tempRow3,tempRow4)) #append data

    #itemBank List of dictionaries representing data from each row of the table.
    #i.e. ('Item_Name':"Tomatoes",'Item_Price':"10",'Item_In_Stock':"10",'Item_Max':"30")

    for item in itemBank:
        tempDict1 = item[0]
        tempDict2 = item[1]
        tempDict3 = item[2]
        tempDict4 = item[3]
        q = """ INSERT IGNORE INTO
             TABLE1
             (
                Item_Name,
                Item_Price,
                Item_In_Stock,
                Item_Max,
                Observation_Date
             ) VALUES (
                "{0}",
                "{1}",
                "{2}",
                "{3}",
                "{4}"
             )
            """.format(tempDict1['Item_Name'],tempDict2['Item_Price'],tempDict3['Item_In_Stock'],
                       tempDict4['Item_Max'],getTimeExtra)

        try:
            x.execute(q)
            conn.commit()
        except:
            conn.rollback()
Executing each row of the table is cumbersome. I've tried using executemany, but I can't seem to figure out how to access the values of the dictionaries correctly. So, how can I use executemany here to insert into the database given the structure of my data?
The executemany() method prepares a database operation (query or command) and executes it against all parameter sequences or mappings found in the sequence seq_of_params. Note that in Python, a tuple containing a single value must include a trailing comma.
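The point about single-value tuples is easy to trip over when a statement has only one placeholder. A minimal sketch, using made-up item names purely for illustration:

```python
# Each element of seq_of_params is one parameter set for the statement.
# With a single placeholder, every element still has to be a tuple.
single_column_rows = [("Tomatoes",), ("Onions",)]  # trailing comma makes a tuple

# Without the comma, the parentheses are only grouping and the value is a string.
not_a_tuple = ("Tomatoes")

print(type(single_column_rows[0]).__name__)
print(type(not_a_tuple).__name__)
```

Passing a list of plain strings instead of one-element tuples is a common cause of parameter-count errors with executemany.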
In one reported comparison, cursor.execute() takes 410 ms, whereas cursor.executemany() requires only 20 ms.
    itemBank = []
    for row in rows:
        itemBank.append((
            tempRow2['Item_Name'],
            tempRow1['Item_Price'],
            tempRow3['Item_In_Stock'],
            tempRow4['Item_Max'],
            getTimeExtra
            )) #append data

    q = """ insert ignore into TABLE1 (
            Item_Name, Item_Price, Item_In_Stock, Item_Max, Observation_Date )
            values (%s,%s,%s,%s,%s)
        """

    try:
        x.executemany(q, itemBank)
        conn.commit()
    except:
        conn.rollback()
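If itemBank has already been built as tuples of dictionaries, as in the question, it can be flattened into parameter tuples before calling executemany. The following is only a sketch: build_params and the sample values are hypothetical names introduced here, and it assumes each entry holds four dictionaries in the order name, price, stock, max.

```python
def build_params(item_bank, observed_at):
    """Flatten each 4-tuple of dicts into one tuple matching five %s placeholders."""
    params = []
    for name_d, price_d, stock_d, max_d in item_bank:
        params.append((
            name_d["Item_Name"],
            price_d["Item_Price"],
            stock_d["Item_In_Stock"],
            max_d["Item_Max"],
            observed_at,
        ))
    return params


# Hypothetical sample data shaped like the question's itemBank.
row = {"Item_Name": "Tomatoes", "Item_Price": "10",
       "Item_In_Stock": "10", "Item_Max": "30"}
item_bank = [(row, row, row, row)]
observed_at = "2020-01-01 00:00:00"  # stands in for getTimeExtra

params = build_params(item_bank, observed_at)
print(params)
```

The resulting list can then be passed as the second argument, for example x.executemany(q, params), with the same q and cursor x as above.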
Hope it will help you
For anyone who needs "insert or update" instead of "insert ignore", the query below can be used:
    q = """
        INSERT INTO TABLE1
            (Item_Name, Item_Price, Item_In_Stock, Item_Max, Observation_Date)
        values (%s,%s,%s,%s,%s)
        ON DUPLICATE KEY UPDATE
            Item_Name = VALUES(Item_Name),
            Item_Price = VALUES(Item_Price),
            Item_In_Stock = VALUES(Item_In_Stock),
            Item_Max = VALUES(Item_Max),
            Observation_Date = VALUES(Observation_Date)
        """
If Item_Name is the primary key of TABLE1, remove it from the update part as shown below, since the primary key does not need to be updated:
    q = """
        INSERT INTO TABLE1
            (Item_Name, Item_Price, Item_In_Stock, Item_Max, Observation_Date)
        values (%s,%s,%s,%s,%s)
        ON DUPLICATE KEY UPDATE
            Item_Price = VALUES(Item_Price),
            Item_In_Stock = VALUES(Item_In_Stock),
            Item_Max = VALUES(Item_Max),
            Observation_Date = VALUES(Observation_Date)
        """
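When the column list is long, the update clause can be generated rather than typed by hand. This is a rough sketch, not part of the original answer: build_upsert is a hypothetical helper, and it assumes the table and column names come from trusted code, since identifiers are interpolated directly into the SQL text.

```python
def build_upsert(table, columns, key_columns):
    """Build an INSERT ... ON DUPLICATE KEY UPDATE statement with %s placeholders.

    Columns listed in key_columns are left out of the update clause.
    Identifiers are inserted as-is, so they must never come from user input.
    """
    col_list = ", ".join(columns)
    placeholders = ", ".join(["%s"] * len(columns))
    updates = ",\n    ".join(
        f"{col} = VALUES({col})" for col in columns if col not in key_columns
    )
    return (
        f"INSERT INTO {table} ({col_list})\n"
        f"VALUES ({placeholders})\n"
        f"ON DUPLICATE KEY UPDATE\n    {updates}"
    )


columns = ["Item_Name", "Item_Price", "Item_In_Stock", "Item_Max", "Observation_Date"]
q = build_upsert("TABLE1", columns, key_columns={"Item_Name"})
print(q)
```

The generated string is meant to be used the same way as the hand-written query, with the row values still passed separately as parameters to executemany.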