PROGRAMMING

Google App Engine - Creating and restoring a backup of Datastore

#python , #google app engine

Backups. Everyone should have them, but sometimes creating a backup can be a little... hard ;-) Surprised? Me too. Let’s start.

I’m working on a not-so-small social application. I can tell you that we have 5 separate GAE applications, but they are connected together (you know: a frontend application, an authorization and users API, the main content API, an API for mobile applications and tablets, etc.). But I don’t want to talk about that now.

I want to talk about backups. Lately I had a seemingly simple task: create a backup of all these applications and restore it on the testing environments. I thought it would be simple. I had seen the special tool for creating and restoring backups in Datastore Admin. Cool! A few clicks and it would be ready! But..

A few hours later...

Status of creating a backup: Failed
TransactionFailedError: The transaction could not be committed. Please try again.

And that was only the beginning of all the problems...

A lot of investigation followed: checking what could be wrong, trying to fix the problem. I found some strange, undocumented (but that’s normal in GAE...) errors.

ExistenceError: ApplicationError: 105

UnknownError: ApplicationError: 7

I had seen these errors before, along with other mysterious numbers like 6, but this time I had to fix them. After some time we found out that it is possible to create a backup using Datastore Admin, but we couldn’t select all Entity Kinds at once. It worked if we selected 3 or 4 Kinds. But that is not an efficient way, especially if you have a few applications with a few namespaces and, for example, 60 Entity Kinds in every application. I wrote to Google Support. The answer was not satisfactory. In short, it is a known issue with the currently experimental Backup feature: backups fail due to transaction collisions. The current workaround is to perform the backup on fewer kinds at a time. According to the response from Google Support, we should try to break it into smaller batches of ~5-10 Kinds; each batch can be started at the same time, as there is no contention between batches. They are working on the problem, and we can expect improvements in the “near future”.
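
Incidentally, the suggested workaround (batches of ~5-10 Kinds, started in parallel) is easy to sketch with a small helper. The batch size of 5 here is just the lower end of their suggestion:

```python
def chunk_kinds(kinds, batch_size=5):
    """Split a list of kind names into batches of at most batch_size."""
    return [kinds[i:i + batch_size] for i in range(0, len(kinds), batch_size)]
```

Each resulting batch could then be submitted as one backup.create request.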

I tried this. It seems that fewer Entity Kinds at once means fewer errors. I had to put this operation on a cron and be sure that it always creates a backup without any errors. At the moment our task uses a special queue for backups and creates a backup of every Entity Kind separately. It is not satisfying, it is not fast, it is not good for restoring, but it works. Without problems.
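
For completeness, putting it on a cron could look roughly like the entry below; the /tasks/backup path is only an assumed handler URL, not something from our code:

```yaml
cron:
- description: daily datastore backup
  url: /tasks/backup
  schedule: every day 03:00
```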

Below is a simple Backup class. I’ve removed all the code not related to backups, like sending emails or creating a zip file from the created files. It lets you set the bucket name on Google Cloud Storage where the backup should be saved, and the namespace from which the data should be backed up.

The operation is simple. We get a list of all Entity Kinds and, using /_ah/datastore_admin/backup.create, add a task for each of them to the special backup queue.

```python
import logging
from datetime import datetime

import cloudstorage as gcs

from google.appengine.ext import deferred
from google.appengine.ext.db.metadata import Kind
from google.appengine.api import namespace_manager
from google.appengine.api import taskqueue


class Backup(object):

    def __init__(self, bucket, namespace='namespace'):
        namespace_manager.set_namespace(namespace)

        date = datetime.today().strftime('%Y%m%d%H%M%S')
        self.namespace = namespace
        self.bucket_name = bucket
        self.bucket = '/{}'.format(self.bucket_name)
        self.backup_filename = '{}/{}_backup_{}.zip'.format(
            self.bucket, self.namespace, date
        )

        # URL of the built-in Datastore Admin backup handler.
        self.backup_create_url = (
            '/_ah/datastore_admin/backup.create?'
            'filesystem=gs'
            '&gs_bucket_name={}'
            '&name={}'
            '&kind={}'
            '&namespace={}'
        )

    def kind_list(self):
        # All Entity Kinds in the current namespace, skipping the
        # internal ones (their names start with an underscore).
        kinds = [
            kind.kind_name for kind in Kind.all()
            if not kind.kind_name.startswith('_')
        ]
        return kinds

    def create_backup(self, bucket, name, kind, namespace):
        url = self.backup_create_url.format(
            bucket, name, kind, namespace
        )

        # backup.create has to run on the built-in Datastore Admin
        # backend, hence the target below.
        taskqueue.add(
            url=url,
            target='ah-builtin-python-bundle',
            method='GET',
            queue_name='backups'
        )

        logging.info('Creating a backup. Kind: {} Namespace: {}'.format(
            kind, namespace
        ))

    def backup(self):
        kinds = self.kind_list()

        last = len(kinds) - 1

        # One task per Entity Kind; after the last one, schedule the
        # final step on the same queue.
        for index, kind in enumerate(kinds):
            self.create_backup(
                self.bucket_name, kind, kind, self.namespace
            )

            if index == last:
                deferred.defer(self.compress, _queue="backups")

    def compress(self):
        # The last operation. Send email? Zip all files?
        pass
```
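
As a side note, hard-coding the query string works, but the values are not escaped. A sketch of the same URL built with urlencode from the standard library, which handles the escaping for us:

```python
# Works on both the Python 2 GAE runtime and Python 3.
try:
    from urllib import urlencode          # Python 2
except ImportError:
    from urllib.parse import urlencode    # Python 3


def backup_create_url(bucket, name, kind, namespace):
    """Return the Datastore Admin backup.create URL for one kind."""
    # A list of pairs keeps the parameter order stable.
    params = urlencode([
        ('filesystem', 'gs'),
        ('gs_bucket_name', bucket),
        ('name', name),
        ('kind', kind),
        ('namespace', namespace),
    ])
    return '/_ah/datastore_admin/backup.create?' + params
```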

And of course our special queue. I’ve tried to increase bucket_size or rate, but after that some errors occurred, like error 7.

```yaml
queue:
- name: backups
  rate: 10/s
  bucket_size: 1
  max_concurrent_requests: 1
```

That’s all. Now you can write a handler to run this task, just like here:

```python
backup = Backup(namespace=namespace, bucket=bucket)
backup.backup()
```

After that you have to wait until the backup is finished, and you can pray that there are no 105 errors ;-)

Maybe in the next entry I will describe how to create a zip file with this backup.

I also had a lot of problems with creating and restoring backups with images. I had a lot of images in Blobstore and had to move them to Google Cloud Storage, but that is also a topic for a separate entry.

The last thing I wanted to mention is restoring this backup. Since we have every Entity Kind as a separate backup, we need to restore them manually in Datastore Admin: select every backup and restore it. It takes some time, but at the moment I don’t see a better way (read: one that works without problems).

I hope that Google will fix this as soon as possible. ;-)