
Application configuration files in Python

Tags: python, class

I am designing multi-tenant workload automation software (software that runs Jobs automatically). For this I am creating a default Job configuration class. Configurations specified in this class are applied to all types of Jobs by default.

Tenants (owners of Jobs) can opt to override these default configurations for their specific class of Jobs.

For example:

# Default configurations for all types of Jobs
class DefaultConfigurations:

    def __init__(self, job_definition):
        self.job_state_database = DEFAULT_DB
        self.job_definition_repository_type = DEFAULT_REPO
        # ....
        # There will be hundreds of configurations like this.

Now if a tenant wants to override the default application configuration for their specific type of Jobs, they can inherit from the DefaultConfigurations class and override only the configurations they need to change.

For example:

# These overridden configurations will be applied to all the HiveJobs.
class HiveJobs(DefaultConfigurations):

    def __init__(self, job_definition):
        # Set every default first, then override only what differs.
        super().__init__(job_definition)
        self.job_state_database = "sql"
        self.job_definition_repository_type = "svn"

# These overridden configurations will be applied to all the SparkJobs.
class SparkJobs(DefaultConfigurations):

    def __init__(self, job_definition):
        super().__init__(job_definition)
        self.job_state_database = "MongoDb"
        if job_definition.technology == "Old":
            self.job_state_database = "sql"

For all other types of jobs, default configurations will be used.
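A subclass that defines its own `__init__` only keeps the remaining defaults if it calls the parent initializer before overriding. A minimal sketch of that pattern follows; the placeholder values for `DEFAULT_DB` and `DEFAULT_REPO` and the extra `max_retries` setting are hypothetical and appear only for illustration.

```python
DEFAULT_DB = "default_db"      # hypothetical placeholder value
DEFAULT_REPO = "default_repo"  # hypothetical placeholder value


class DefaultConfigurations:
    def __init__(self, job_definition):
        self.job_state_database = DEFAULT_DB
        self.job_definition_repository_type = DEFAULT_REPO
        self.max_retries = 3  # hypothetical extra setting, for illustration


class HiveJobs(DefaultConfigurations):
    def __init__(self, job_definition):
        super().__init__(job_definition)  # populate every default first
        self.job_state_database = "sql"   # then override only what differs


hive_config = HiveJobs(job_definition=None)
```

Here `hive_config` carries the overridden database setting while the repository type and `max_retries` still come from the defaults.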

Individual jobs also have their own definitions, written in XML. Each job definition XML file also specifies the class of the job. For example, a Hive job specifies its class as "hive" in its definition.

Example of a job_definition file for one of the Hive jobs:

<job_definition>

    name hello_world_from_hive
    class hive
    command echo "hello world from Hive"

    cron_schedule 5 4 * * * 

</job_definition>

At runtime, the Job Executor checks the class of the Job specified in its definition file and picks the configuration class accordingly (for example, DefaultConfigurations, HiveJobs, or SparkJobs above).
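One possible way to sketch this lookup is a plain dictionary from the class name in the definition file to a configuration class, falling back to the defaults when nothing is registered. The names `CONFIG_REGISTRY` and `resolve_config_class` are hypothetical, as are the placeholder values for `DEFAULT_DB` and `DEFAULT_REPO`.

```python
# Hypothetical placeholder defaults; the question does not say what these are.
DEFAULT_DB = "default_db"
DEFAULT_REPO = "default_repo"


class DefaultConfigurations:
    def __init__(self, job_definition):
        self.job_state_database = DEFAULT_DB
        self.job_definition_repository_type = DEFAULT_REPO


class HiveJobs(DefaultConfigurations):
    def __init__(self, job_definition):
        super().__init__(job_definition)
        self.job_state_database = "sql"
        self.job_definition_repository_type = "svn"


# Maps the "class" value from a job definition to its configuration class.
CONFIG_REGISTRY = {"hive": HiveJobs}


def resolve_config_class(job_class):
    """Return the registered configuration class, or the defaults if none is registered."""
    return CONFIG_REGISTRY.get(job_class, DefaultConfigurations)
```

With this shape, a tenant adds a new job type by registering one more entry, and any unregistered class name silently gets `DefaultConfigurations`.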

The Job Executor constructs a job_definition object from the XML file and passes it to the corresponding configuration class to get the final configurations needed to execute the job. This allows some configurations to be added or removed based on runtime parameters. Note that configurations overridden in an individual Job definition file take precedence.
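The precedence rule can be sketched as a final step that copies job-level values over whatever the configuration class produced. This is only an illustration under assumptions: the XML layout with one child element per field, the `job_state_database` override element, and the helper names `parse_job_definition` and `apply_job_overrides` are all hypothetical.

```python
import xml.etree.ElementTree as ET
from types import SimpleNamespace

# Assumed file layout: one child element per field. The element names here
# are hypothetical; the question shows only field names and values.
SAMPLE_XML = """
<job_definition>
    <name>hello_world_from_hive</name>
    <class>hive</class>
    <job_state_database>postgres</job_state_database>
</job_definition>
"""


def parse_job_definition(xml_text):
    """Return the child elements of the definition as a dict of tag -> text."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "").strip() for child in root}


def apply_job_overrides(config, fields):
    """Let values from the job definition win over the class-level configuration."""
    for key, value in fields.items():
        if hasattr(config, key):  # only override settings the config already has
            setattr(config, key, value)
    return config


# Stand-in for the object a configuration class such as HiveJobs would produce.
config = SimpleNamespace(job_state_database="sql",
                         job_definition_repository_type="svn")
fields = parse_job_definition(SAMPLE_XML)
final = apply_job_overrides(config, fields)
```

After this step `final.job_state_database` holds the job-level value, while settings the job definition does not mention keep the value chosen by the configuration class.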

I am not sure whether this is the best way to write such configuration files in Python.

Lokesh Agrawal asked Oct 16 '22 11:10



1 Answer

Just parameterize __init__ to set the attribute values as needed.

class Configuration:
    def __init__(self, db, repo_type):
        self.job_state_database = db
        self.job_definition_repository_type = repo_type

d = Configuration(DEFAULT_DB, DEFAULT_REPO)
hj = Configuration("sql", "svn")

If you don't want users manually passing around database and repository types, define class methods to wrap them.

class Configuration:
    def __init__(self, db, repo_type):
        self.job_state_database = db
        self.job_definition_repository_type = repo_type

    @classmethod
    def default_configuration(cls):
        return cls(DEFAULT_DB, DEFAULT_REPO)

    @classmethod
    def hive_configuration(cls):
        return cls("sql", "svn")


d = Configuration.default_configuration()
hj = Configuration.hive_configuration()
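With hundreds of settings, positional `__init__` arguments may become unwieldy. One possible variation on the same idea, not part of the code above, is a dataclass whose fields carry the defaults, so each named configuration states only what differs. The placeholder values for `DEFAULT_DB` and `DEFAULT_REPO` are hypothetical.

```python
from dataclasses import dataclass, replace

DEFAULT_DB = "default_db"      # hypothetical placeholder value
DEFAULT_REPO = "default_repo"  # hypothetical placeholder value


@dataclass(frozen=True)
class Configuration:
    job_state_database: str = DEFAULT_DB
    job_definition_repository_type: str = DEFAULT_REPO

    @classmethod
    def default_configuration(cls):
        return cls()

    @classmethod
    def hive_configuration(cls):
        return cls(job_state_database="sql",
                   job_definition_repository_type="svn")


d = Configuration.default_configuration()
hj = Configuration.hive_configuration()
# replace() returns a copy with selected fields changed; hj itself is untouched.
hj_mongo = replace(hj, job_state_database="MongoDb")
```

`dataclasses.replace` also gives a natural hook for per-job overrides, since it builds a new configuration from an existing one plus a handful of changed fields.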

In neither case do I see any reason to define distinct types to reflect information that is stored in the attributes, unless the distinct types override various methods in such a way that you no longer need to store the database and repository type information explicitly. (We're getting into design issues that can't be judged based solely on the information in your question, though.)

class Configuration:
    def do_something(self):
        """Do stuff using the default database/repo"""

class HiveJob(Configuration):
    def do_something(self):
        """Do stuff using sql/svn instead"""
chepner answered Oct 21 '22 09:10
