Handling multiple variants of a marshmallow schema

I have a simple Flask-SQLAlchemy model, which I'm writing a REST API for:

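import datetime as dt

from sqlalchemy import Column, DateTime, ForeignKey, Integer, Unicode

# `db` (the Flask-SQLAlchemy instance) and `CRUDMixin` are defined elsewhere in the app
class Report(db.Model, CRUDMixin):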
    report_id = Column(Integer, primary_key=True)
    user_id = Column(Integer, ForeignKey('users.user_id'), index=True)
    report_hash = Column(Unicode, index=True, unique=True)
    created_at = Column(DateTime, nullable=False, default=dt.datetime.utcnow)
    uploaded_at = Column(DateTime, nullable=False, default=dt.datetime.utcnow)

Then I have the corresponding Marshmallow-SQLAlchemy schema:

from marshmallow_sqlalchemy import ModelSchema

class ReportSchema(ModelSchema):
    class Meta:
        model = Report

However, in my REST API, I need to be able to dump and load slightly different variants of this model:

- When dumping all the reports (e.g. GET /reports), I want to dump all the above fields.
- When dumping a single report (e.g. GET /reports/1), I want to dump all this data, and also all associated relations, such as the associated Sample objects from the sample table (one report has many Samples).
- When creating a new report (e.g. POST /reports), I want the user to provide all the report fields except report_id (which will be generated), report_hash and uploaded_at (which will be calculated on the spot), and I also want them to include all the associated Sample objects in their upload.

How can I reasonably maintain 3 (or more) versions of this schema? Should I:

1. Have 3 separate ModelSchema subclasses? e.g. AggregateReportSchema, SingleReportSchema, and UploadReportSchema?
2. Have one mega-ModelSchema that includes all fields I could ever want in this schema, and then I subtract fields from it on the fly using the exclude argument in the constructor? e.g. ReportSchema(exclude=[])?
3. Or should I use inheritance and define a class ReportBaseSchema(ModelSchema), and the other schemas subclass this to add additional fields (e.g. class UploadReportSchema(ReportBaseSchema))?
4. Something else?


asked Jul 29 '19 08:07

Migwell

1 Answer

Since asking this question, I've done a ton of work using Marshmallow, so hopefully I can explain somewhat.

My rule of thumb is this: do as much as you can with the schema constructor (option #2), and only resort to inheritance (option #3) if you absolutely have to. Never use option #1, because that will result in unnecessary, duplicated code.

The schema constructor approach is great because:

- You end up writing the least code.
- You never have to duplicate logic (e.g. validation).
- The only, exclude, partial and unknown arguments to the schema constructor give you more than enough power to customize the individual schemas (see the documentation).
- Schema subclasses can add extra settings to the schema constructor. For example, marshmallow-jsonapi adds include_data, which lets you control the amount of data you return for each related resource.

My original post is a situation where using the schema constructor is sufficient. You should first define a schema that includes every field you might need, including relationships that might be Nested fields. Then, whenever a view should leave out related resources or superfluous fields, construct the schema with an exclusion list in that view method, e.g. ReportSchema(exclude=['some', 'fields']).dump(report).

However, an example I've encountered where using inheritance was a better fit was when I modelled the arguments for certain graphs I was generating. Here, I wanted general arguments that would be passed into the underlying plotting library, but I wanted the child schemas to refine the schema and use more specific validations:

from marshmallow import Schema, fields as f


class PlotSchema(Schema):
    """
    Data that can be used to generate a plot
    """
    id = f.String(dump_only=True)
    type = f.String()
    x = f.List(f.Raw())
    y = f.List(f.Raw())
    text = f.List(f.Raw())
    hoverinfo = f.Str()


class TrendSchema(PlotSchema):
    """
    Data that can be used to generate a trend plot
    """
    x = f.List(f.DateTime())
    y = f.List(f.Number())


answered Oct 22 '22 10:10

Migwell

Tags: python, flask, sqlalchemy, marshmallow
