Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

passing information from one script to another

Tags:

python

linux

unix

I have two python scripts, scriptA and scriptB, which run on Unix systems. scriptA takes 20s to run and generates a number X. scriptB needs X when it is run and takes around 500ms. I need to run scriptB everyday but scriptA only once every month. So I don't want to run scriptA from scriptB. I also don't want to manually edit scriptB each time I run scriptA. I thought of updating a file through scriptA but I'm not sure where such a file can be placed ideally so that scriptB can read it later; independent of the location of these two scripts. What is the best way of storing this value X in an Unix system so that it can be used later by scriptB?

like image 540
Sanket Avatar asked Jan 10 '23 07:01

Sanket


2 Answers

Many programs in Linux/Unix keep config in /etc/ and use subfolder in /var/ for other files.
But probably you could need root privilages.

If you run script in your home folder than you could create file ~/.scripB.rc or folder ~/.scriptB/ or ~/.config/scriptB/

See also on wikipedia Filesystem Hierarchy Standard

like image 171
furas Avatar answered Jan 12 '23 20:01

furas


It sounds like you want to serialize ScriptA's results, save it in a file or database somewhere, then have ScriptB read those results (possibly also modifying the file or updating the database entry to indicate that those results have now been processed).

To make that work you need for ScriptA and ScriptB to agree on the location and format of the data ... and you might want to implement some sort of locking to ensure that ScriptB doesn't end up with corrupted inputs if it happens to be run at the same time that ScriptA is writing or updating the data (and, conversely, that ScriptA doesn't corrupt the data store by writing thereto while ScriptB is accessing it).

Of course ScriptA and ScriptB could each have a filename or other data location hard-coded into their sources. However, that would violation the DRY Principle. So you might want them to share a configuration file. (Of course the configuration filename is also repeated in these sources ... or at least the import of the common bit of configuration code ... but the latter still ensures that an installation/configuration detail (location and, possibly, format, of the data store) is decoupled from the source code. Thus it can be changed (in the shared config) without affecting the rest of the code for either script.

As for precisely which type of file and serialization to use ... that's a different question.

These days, as strange as it may sound, I'd would suggest using SQLite3. It may seem like over-kill to use an SQL "database" for simply storing a single value. However, SQLite3 is included in the Python standard libraries, and it only needs a filename for configuration.

You could also use a pickle or JSON or even YAML (which would require a third party module) ... or even just text or some binary representation using something like struct. However, any of those will require that you parse your results and deal with any parsing or formatting errors. JSON would be the simplest option among these alternatives. Additionally you'd have to do your own file locking and handling if you wanted ScriptA and ScriptB (and, potentially, any other scripts you ever write for manipulating this particular data) to be robust against any chance of concurrent operations.

The advantage of SQLite3 is that it handles the parsing and decoding and the locking and concurrency for you. You create the table once (perhaps embedded in ScriptA as a rarely used "--initdb" option for occasions when you need to recreate the data store). Your code to read it might look as simple as:

#!/usr/bin/python
import sqlite3
db = sqlite3.connect('./foo.db')
cur = db.cursor()
results = cur.execute(SELECT value, MAX(date) FROM results').fetchone()[0]

... and writing a new value would look a bit like:

#!/usr/bin/python
# (Same import, db= and cur= from above)
with db:
    cur.execute('INSERT INTO results (value)  VALUES (?)', (myvalue,))

All of this assuming you had, at some time, initialized the data store (foo.db in this example) with something like:

#!/usr/bin/python
# (Same import, db= and cur= from above)
with db:
    cur.execute('CREATE TABLE IF NOT EXISTS results (value INTEGER NOT NULL, date TIMESTAMP DEFAULT current_timestamp)')

(Actually you could just execute that command every time if you wanted your scripts to recovery silently from cleaning out the old data).

This might seem like more code than a JSON file-based approach. However, SQLite3 is providing ACID(transactional) semantics as well as abstracting away the serialization and deserialization.

Also note that I'm glossing over a few details. My example above are actually creating a whole table of results, with timestamps for when they were written to your datastore. These would accumulate over time and, if you were using this approach, you'd periodically want to clean up your "results" table with a command like:

#!/usr/bin/python
# (Same import, db= and cur= from above)
with db:
    cur.execute('DELETE FROM results where date < ?', cur.execute('SELECT MAX(date) from results').fetchone())

Alternatively if you really never want to have access to your prior results that change from INSERT into UPDATE like so:

#!/usr/bin/python
# (Same import, db= and cur= from above)
with db:
    cur.execute(cur.execute('UPDATE results SET value=(?)', (mynewvalue,))

(Also note that the (mynewvalue,) is a single element tuple. The DBAPI requires that our parameters be wrapped in tuples which is easy to forget when you first start using it with single parameters such as this).

Obviously if you took this UPDATE only approach you could drop the 'date' column from the 'results' table and all those references to MAX(data) from the queries.

I chose use the slightly more complex schema in my early examples because they allow your scripts to be a bit more robust with very little additional complexity. You could then do other error checking, detecting missing values where ScriptB finds that ScriptA hasn't been run as intended, for example).

like image 36
Jim Dennis Avatar answered Jan 12 '23 21:01

Jim Dennis