Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Simple Python method taking forever to execute/run

My code gets stuck at the clean_up() method in MyClass()

my_class.py:

import os
import pandas as pd
import psycopg2, pymysql, pyodbc
from db_credentials_dict import db_credentials

class MyClass():

    def __init__(self, from_database, to_table_name, report_name):
        ...

    def get_sql(self):
        ...

    def get_connection(self):
        ...

    def query_to_csv(self):
        ...

    def csv_to_postgres(self):
        ...

    def extract_and_load(self):
        self.query_to_csv()
        self.csv_to_postgres()

    def get_final_sql(self):
        ...

    def postgres_to_csv(self):
        ...

    def clean_up(self):
        print('\nTruncating {}...'.format(self.to_table_name), end='')
        with self.postgres_connection.cursor() as cursor:
            cursor.execute("SELECT NOT EXISTS (SELECT 1 FROM %s)" % self.to_table_name)
            empty = cursor.fetchone()[0]
            if not empty:
                cursor.execute("TRUNCATE TABLE %s" % self.to_table_name)
        self.postgres_connection.commit()
        print('DONE')

        print('Removing {}...'.format(self.filename), end='')
        if os.path.exists(self.filepath):
            os.remove(self.filepath)
            print('DONE')
        else:
            print('{} does not exist'.format(self.filename))

main.py:

from my_class import MyClass
from time import sleep

bookings = MyClass(from_database='...',to_table_name='...',report_name='...')
bookings2 = MyClass(from_database='...',to_table_name='...',report_name='...')
channel = MyClass(from_database='...',to_table_name='...',report_name='...')
cost = MyClass(from_database='...',to_table_name='...',report_name='...')

tables = [bookings, bookings2, channel, cost]
for table in tables:
    table.extract_and_load()

daily_report = MyClass(from_database='...',to_table_name='...',report_name='...')
daily_report.postgres_to_csv()
sleep(10)

for table in tables:
    table.clean_up()

When I run main.py it runs everything up until the final loop i.e. for table in tables: table.clean_up(). It just gets stuck there, no errors or warnings.

When running that method its own, it works fine i.e. it truncates the postgres tables. Need help getting this to work and understanding why the final method doesn't execute when all others are executed.

Output of when running clean_up() on its own:

Truncating bookings...DONE
Removing bookings.csv...DONE

Truncating bookings2...DONE
Removing bookings2.csv...DONE

Truncating channel...DONE
Removing channel.csv...DONE

Truncating cost...DONE
Removing cost.csv...DONE

What the overall code does:

  1. Grabbing files that contain sql queries extracting data from different databases and executing those queries.
  2. Exporting them to csv
  3. Importing csv's to a postgres database
  4. Writing a postgres query that brings data together and then exports as a csv to be used in a BI tool for data visualisation
  5. Truncating the data in postgres and deleting the csv files in point 2

You're probably thinking that this is a madness and I agree. I'm currently making due with what I've got and can't store data on my computer because it's company data (hence the truncating and removing of the data).

like image 726
AK91 Avatar asked Dec 25 '19 13:12

AK91


People also ask

How to make a python script that will be executed every 5 minutes?

print('hello geek!') With the help of the Schedule module, we can make a python script that will be executed in every given particular time interval.with this function schedule.every (5).minutes.do (func) function will call every 5 minutes.

How many threads can CPython execute at the same time?

The default implementation of Python ‘ CPython ’ uses GIL (Global Interpreter Lock) to execute exactly one thread at the same time, even if run on a multi-core processor as GIL works only on one core regardless of the number of cores present in the machine.

How many Gils does it take to run a Python program?

Each core in the CPU has its own GIL, so a quad-core CPU will have 4 GILs running separately with its own interpreter. To make our python programs run parallel we use multithreading and multiprocessing.

How can I see how long each iteration takes in Python?

Apart from profiling, you could also simply add a print statement somewhere to see when certain code is reached. For example, in stats_ord_secure, you could put a print at the beginning of the loop to get a feeling of how long each iteration takes: Doing that, you can quickly see that a single iteration takes a few seconds.


2 Answers

TRUNCATE clause in Postgres requires ACCESS EXCLUSIVE lock on the relation and also may fire some BEFORE TRUNCATE triggers.

My guess is that problem is on the Postgres side, e.g. TRUNCATE tries to acquire ACCESS EXCLUSIVE lock on the relation, but it is already locked by someone else.

First, check your Postgres logs.

Next, check what your Postgres backends are doing during your hanged TRUNCATE with:

SELECT * FROM pg_stat_activity;

Next, investigate locks with:

-- View with readable locks info and filtered out locks on system tables
CREATE VIEW active_locks AS
SELECT clock_timestamp(), pg_class.relname, pg_locks.locktype, pg_locks.database,
       pg_locks.relation, pg_locks.page, pg_locks.tuple, pg_locks.virtualtransaction,
       pg_locks.pid, pg_locks.mode, pg_locks.granted
FROM pg_locks JOIN pg_class ON pg_locks.relation = pg_class.oid
WHERE relname !~ '^pg_' and relname <> 'active_locks';

-- Now when we want to see locks just type
SELECT * FROM active_locks;
like image 163
ololobus Avatar answered Oct 16 '22 22:10

ololobus


I would suggest you to do the desired clean process all in one transaction. To Prevent undesired states.

And check If table exists in the database using the information_schema

SELECT EXISTS (
   SELECT 1
   FROM   information_schema.tables 
   WHERE  table_schema = 'schema_name'
   AND    table_name = 'table_name'
   );

Check If exists, create your truncate commands.

and execute the clean process all in one transaction.

like image 21
Frederic Avatar answered Oct 16 '22 22:10

Frederic