Setting Up Multiple Databases With Django


Working with multiple databases in Django can seem a bit overwhelming at first but setting it up properly is straight forward and really effective.  I'm working on a project where I need to set up an external read-only database running MySQL.  I don't typically use MySQL and I wanted to isolate the data that was being imported into this database.  The database is going to be loaded from an external source, so I don't want any migrations to run on the database either.  There are a few different approaches you can take and this is mine.

Django Settings File

Setting multiple database up in Django's settings file is straightforward and easy.  All you need to do is add a second entry to your DATABASES setting.  For this project I'm leaving the default name, which is `default`, as I want this to be the main database for my project.  However, you can change this to anything you want if that's what's good for your project.

For the second database I added a second entry to the DATABASES settings dict, and gave it a name representing the database.  Below is a what my settings look like.  One other thing to note, I like using env vars when pulling the passwords for my database connections.  This way I don't ever accidently push any sensitive passwords into my repo.

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': os.getenv('DATABASE_NAME', 'application_database'),
        'USER': os.getenv('DATABASE_USER'),
        'PASSWORD': os.getenv('DATABASE_PASSWORD'),
        'HOST': os.getenv('DATABASE_HOST', '127.0.0.1'),
        'PORT': os.getenv('DATABASE_PORT', '5432'),
    },
'gcd': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': os.getenv('GCD_DATABASE_NAME', 'gcd'),
        'USER': os.getenv('GCD_DATABASE_USER'),
        'PASSWORD': os.getenv('GCD_DATABASE_PASSWORD'),
        'HOST': os.getenv('GCD_DATABASE_HOST'),
        'PORT': os.getenv('GCD_DATABASE_PORT', '3306'),
    }
}

As you can see, the settings are very straight forward.  We're setting our engine, name, user, password and host details.   The really nice thing here is that the Django ORM makes dealing with two different types of database very easy.  As you can see, my application database is using Postgres which is always my database of choice.  The second database is using MySQL which I wouldn't use if I had the choice.  All I need to do is set the different drivers and Django does the rest.

Router

Now that we have our settings configured, our next step is to create a database router.  Django uses database routers to determine how to route your database requests to the correct database.  When using a single database you don't have to make changes to the routing system as Django has this setup by default.  However, now that we have two different databases we need to make sure our second database requests go to the same place.  Also, for my case I want to ensure no one can write to the second database from the application and no migrations can run.

To do this, I've created a router class.   Django allows you to implement specific methods on the router to ensure specific actions happen to specific databases.  You don't have to implement every method on the router, Django will use the default router if you don't implement one of the methods.  For my case I need to implement db_for_read, db_for_write and allow_migration.  I've also created a set to hold the name of the apps whose models are read only.  Before implementing your own multiple database setup, I recommend you read Django's documentation on how multi-databases work, and how their routing system works.  You can read about in detail here.

For my case I've created a routers.py module in my project and implemented the router below..

class GCDRouter(object):
    read_only_app_labels = {'gcd'}

    def db_for_read(self, model, **hints):
        if model._meta.app_label in self.read_only_app_labels:
            return 'gcd'

        return 'default'

    def db_for_write(self, model, **hints):
        if model._meta.app_label in self.read_only_app_labels:
            return None

        return 'default'

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        return db == 'default'

As you can see, I have a set called read_only_app_labels that I use to keep track of apps that I want to be routed to the read only database.  In the db_for_read method, if the model's app_label is found in our read_only_app_labels we will return the 'gcd' database label.  This means any of the apps found in that set will use the 'gcd' database.  If the app_label is not found in the set, then we will use the 'default' database setting.  This is pretty straight forward and allows us to easily route requests for models found in specific apps.  I find this very clean and it's nice because you don't have to think about or write any logic to determine what database to read within your code.  It's all done in the router which is a great layer of abstraction.

The second method implemented is the db_for_write.  I'm using this method to return no database settings if the models app_label is found in our read_only_app_labels.  This will prevent Django from writing any data to this database.  This again helps us abstract this layer and doesn't need to be build in any of our logic.

Lastly, we're implementing the allow_migrate method which will return True only if the db name is our default database.  This prevents any migrations from running against the second database we've configured.

Finally, once you've configured your router you will need to register it with Django.  This is done in the settings file by adding your class and it's path to the DATABASE_ROUTERS setting.  You don't have to do anything with Django's default router as it's already registered and will be used if any of the methods are not found on the router you create.

DATABASE_ROUTERS = ['appname.path.to.router']

That's it.  That's how I've setup my Django project for multiple database support.  I think this is a great way to configure multiple database and in my opinion is the cleanest way to handle this in Django.  It's been working seamlessly for me and it feels like it's the perfect layer of abstraction.  None of the business logic or database queries need to know anything about how to route to the correct database and the restrictions are all in a single place.