
Falcon framework - request data validation, serializer middleware (marshmallow)

Falcon doesn't provide any built-in mechanism for request data validation, but fortunately it's easy to add that functionality.

I will use marshmallow to validate the data that comes with a request. It lets us define a schema and use it to check whether the JSON data is correct.

If you don't know how Falcon middleware works, you can read about it in the documentation.

Let's start with a custom HTTPError class. Why do we need it? If we want to tell the client what is wrong with the data it sent, we have to return something like this:

{
    "title": "422 Unprocessable Entity",
    "errors": {
        "date_start": ["Missing data for required field."]
    }
}

The title is just the HTTP error description, but the errors come from marshmallow. Here is an example of how marshmallow validates data.

from marshmallow import fields, Schema, ValidationError


class UserSchema(Schema):
    name = fields.Str(required=True)


try:
    UserSchema(strict=True).load({})
except ValidationError as err:
    print(err.messages)

As you can see, I create a UserSchema class with a single name field, which is required. When I load data that lacks name, marshmallow raises ValidationError. (strict=True is marshmallow 2 syntax; marshmallow 3 removed the option and always raises on invalid data.)

Output from this script:

{'name': ['Missing data for required field.']}

We want to return this information to the user. The default HTTPError only accepts a string as the description, but we want to pass a dictionary, so we need to extend it a little.

import falcon


class HTTPError(falcon.HTTPError):
    """
    HTTPError that stores a dictionary of validation error messages.
    """

    def __init__(self, status, errors=None, *args, **kwargs):
        self.errors = errors
        super().__init__(status, *args, **kwargs)

    def to_dict(self, *args, **kwargs):
        """
        Override `falcon.HTTPError` to include error messages in responses.
        """

        ret = super().to_dict(*args, **kwargs)

        if self.errors is not None:
            ret['errors'] = self.errors

        return ret

We can use our new HTTPError in middleware to return validation errors. Let's write a middleware.

import falcon.status_codes as status

from marshmallow import ValidationError

from core.errors import HTTPError  # it's our new HTTPError


class SerializerMiddleware:

    def process_resource(self, req, resp, resource, params):
        req_data = req.context.get('request') or req.params

        try:
            serializer = resource.serializers[req.method.lower()]
        except (AttributeError, IndexError, KeyError):
            return
        else:
            try:
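                # marshmallow 2: load() returns a result object whose
                # .data attribute holds the deserialized values
                req.context['serializer'] = serializer().load(
                    data=req_data
                ).data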
            except ValidationError as err:
                raise HTTPError(status=status.HTTP_422, errors=err.messages)

This middleware lets us validate data from the query string (req.params) and from the body (req.context.get('request')). By default there is no request key in req.context; I used another middleware to read the JSON data sent by the user and store it there. It also allows a separate schema (validator) to be set for every HTTP method. Finally, it stores the deserialized data in the context so the API endpoint can read it. If the data is not correct, the API returns an HTTP 422 error with the validation messages returned by marshmallow. Let's write a simple example.
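Before that, here is a rough idea of what the JSON-reading middleware mentioned above could look like. This is only a sketch, not the middleware I actually used: the class name JSONBodyMiddleware is hypothetical, and the fake request at the bottom is a stand-in for falcon.Request with just the attributes the sketch touches (content_length, stream, context).

```python
import io
import json
from types import SimpleNamespace


class JSONBodyMiddleware:
    """Hypothetical middleware: parse a JSON body into req.context['request']."""

    def process_request(self, req, resp):
        # Skip requests without a body (e.g. most GET or DELETE calls).
        if not req.content_length:
            return

        raw = req.stream.read()
        if not raw:
            return

        # json.loads raises json.JSONDecodeError on malformed input; a real
        # application would probably turn that into an HTTP 400 response.
        req.context['request'] = json.loads(raw.decode('utf-8'))


# Stand-in for falcon.Request, just enough to try the middleware out.
body = b'{"title": "Dune"}'
fake_req = SimpleNamespace(
    content_length=len(body),
    stream=io.BytesIO(body),
    context={},
)
JSONBodyMiddleware().process_request(fake_req, resp=None)
print(fake_req.context['request'])
```

Because Falcon calls process_request before process_resource, a middleware like this fills req.context['request'] before SerializerMiddleware looks for it.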

First, register the middleware in the Falcon application.

import falcon

from core.middleware.serializers import SerializerMiddleware


app = falcon.API(middleware=[
    SerializerMiddleware(),
])

Next, create some simple marshmallow schemas.

from marshmallow import fields, Schema

class BookPostSchema(Schema):
    class Meta:
        strict = True

    title = fields.Str(required=True)

class BookDeleteSchema(Schema):
    class Meta:
        strict = True

    book_id = fields.Integer(required=True)

And a simple API endpoint.

from book.serializers import BookDeleteSchema, BookPostSchema


class BookAPI:
    serializers = {
        'post': BookPostSchema,
        'delete': BookDeleteSchema
    }

    def on_post(self, req, resp):
        serializer = req.context['serializer']
        # req.context['serializer'] contains data sent by user
        # for example: print(serializer['title'])

    def on_delete(self, req, resp):
        serializer = req.context['serializer']
        print(serializer['book_id'])

    def on_put(self, req, resp):
        # no schema for the put method == no data validation
        pass

As you can see, we can assign a different schema to every HTTP method in our endpoint. If the data is correct, it is accessible in req.context['serializer']; if not, the API returns an HTTP error.
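The per-method lookup that makes this work can be tried in isolation. The snippet below is a minimal sketch: find_serializer is a hypothetical helper that mirrors the try/except in SerializerMiddleware, and the two resource classes use plain strings as stand-ins for real marshmallow schema classes.

```python
class FakeBookAPI:
    # Stand-ins only; a real resource would map to marshmallow schemas.
    serializers = {
        'post': 'BookPostSchema',
        'delete': 'BookDeleteSchema',
    }


class FakePingAPI:
    # A resource without a serializers attribute at all.
    pass


def find_serializer(resource, http_method):
    """Return the schema registered for the method, or None to skip validation."""
    try:
        return resource.serializers[http_method.lower()]
    except (AttributeError, KeyError):
        # AttributeError: the resource defines no serializers.
        # KeyError: no schema is registered for this HTTP method.
        return None


print(find_serializer(FakeBookAPI(), 'POST'))
print(find_serializer(FakeBookAPI(), 'PUT'))
print(find_serializer(FakePingAPI(), 'GET'))
```

Returning None corresponds to the early return in the middleware: the request goes straight to the responder and nothing is stored in req.context['serializer'].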

Simple, right? ;-)