My code gets stuck at the clean_up()
method in MyClass()
my_class.py:
import os
import pandas as pd
import psycopg2, pymysql, pyodbc
from db_credentials_dict import db_credentials
class MyClass():
def __init__(self, from_database, to_table_name, report_name):
...
def get_sql(self):
...
def get_connection(self):
...
def query_to_csv(self):
...
def csv_to_postgres(self):
...
def extract_and_load(self):
self.query_to_csv()
self.csv_to_postgres()
def get_final_sql(self):
...
def postgres_to_csv(self):
...
def clean_up(self):
print('\nTruncating {}...'.format(self.to_table_name), end='')
with self.postgres_connection.cursor() as cursor:
cursor.execute("SELECT NOT EXISTS (SELECT 1 FROM %s)" % self.to_table_name)
empty = cursor.fetchone()[0]
if not empty:
cursor.execute("TRUNCATE TABLE %s" % self.to_table_name)
self.postgres_connection.commit()
print('DONE')
print('Removing {}...'.format(self.filename), end='')
if os.path.exists(self.filepath):
os.remove(self.filepath)
print('DONE')
else:
print('{} does not exist'.format(self.filename))
main.py:
from my_class import MyClass
from time import sleep
bookings = MyClass(from_database='...',to_table_name='...',report_name='...')
bookings2 = MyClass(from_database='...',to_table_name='...',report_name='...')
channel = MyClass(from_database='...',to_table_name='...',report_name='...')
cost = MyClass(from_database='...',to_table_name='...',report_name='...')
tables = [bookings, bookings2, channel, cost]
for table in tables:
table.extract_and_load()
daily_report = MyClass(from_database='...',to_table_name='...',report_name='...')
daily_report.postgres_to_csv()
sleep(10)
for table in tables:
table.clean_up()
When I run main.py
it runs everything up until the final loop i.e. for table in tables: table.clean_up()
. It just gets stuck there, no errors or warnings.
When running that method its own, it works fine i.e. it truncates the postgres tables. Need help getting this to work and understanding why the final method doesn't execute when all others are executed.
Output of when running clean_up()
on its own:
Truncating bookings...DONE
Removing bookings.csv...DONE
Truncating bookings2...DONE
Removing bookings2.csv...DONE
Truncating channel...DONE
Removing channel.csv...DONE
Truncating cost...DONE
Removing cost.csv...DONE
What the overall code does:
You're probably thinking that this is a madness and I agree. I'm currently making due with what I've got and can't store data on my computer because it's company data (hence the truncating and removing of the data).
print('hello geek!') With the help of the Schedule module, we can make a python script that will be executed in every given particular time interval.with this function schedule.every (5).minutes.do (func) function will call every 5 minutes.
The default implementation of Python ‘ CPython ’ uses GIL (Global Interpreter Lock) to execute exactly one thread at the same time, even if run on a multi-core processor as GIL works only on one core regardless of the number of cores present in the machine.
Each core in the CPU has its own GIL, so a quad-core CPU will have 4 GILs running separately with its own interpreter. To make our python programs run parallel we use multithreading and multiprocessing.
Apart from profiling, you could also simply add a print statement somewhere to see when certain code is reached. For example, in stats_ord_secure, you could put a print at the beginning of the loop to get a feeling of how long each iteration takes: Doing that, you can quickly see that a single iteration takes a few seconds.
TRUNCATE
clause in Postgres requires ACCESS EXCLUSIVE
lock on the relation and also may fire some BEFORE TRUNCATE
triggers.
My guess is that problem is on the Postgres side, e.g. TRUNCATE
tries to acquire ACCESS EXCLUSIVE
lock on the relation, but it is already locked by someone else.
First, check your Postgres logs.
Next, check what your Postgres backends are doing during your hanged TRUNCATE
with:
SELECT * FROM pg_stat_activity;
Next, investigate locks with:
-- View with readable locks info and filtered out locks on system tables
CREATE VIEW active_locks AS
SELECT clock_timestamp(), pg_class.relname, pg_locks.locktype, pg_locks.database,
pg_locks.relation, pg_locks.page, pg_locks.tuple, pg_locks.virtualtransaction,
pg_locks.pid, pg_locks.mode, pg_locks.granted
FROM pg_locks JOIN pg_class ON pg_locks.relation = pg_class.oid
WHERE relname !~ '^pg_' and relname <> 'active_locks';
-- Now when we want to see locks just type
SELECT * FROM active_locks;
I would suggest you to do the desired clean process all in one transaction. To Prevent undesired states.
And check If table exists in the database using the information_schema
SELECT EXISTS (
SELECT 1
FROM information_schema.tables
WHERE table_schema = 'schema_name'
AND table_name = 'table_name'
);
Check If exists, create your truncate commands.
and execute the clean process all in one transaction.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With