How to Integrate Elasticsearch with DRF

In this article, you will learn how to create a simple search API using Elasticsearch and Django-Rest-Framework (DRF).

Elasticsearch is one of the most widely used search engines in the world.

Many companies, such as Uber and Slack, use it, but leveraging it in Django can be pretty confusing.

There are three different libraries that you have to use unless you want to implement everything from scratch.

The documentation is also pretty hard to understand, and there isn't much written about this topic.

That's why after days of struggling, I've decided to write this article that would act as a one-stop shop for integrating ElasticSearch with Django.

PS. You can find the project on GitHub below.

GitHub - TamerlanG/drf-elasticsearch-demo: Simple demo of integrating ElasticSearch with DRF

Prerequisites

  • Basics of Python
  • Basics of Django and Django-Rest-Framework (DRF)
  • Basics of ElasticSearch
  • Docker or a local installation of Elasticsearch

What's the project about?

To not waste any time, I already have a Django app ready with two models.

  • Articles
  • Category

This is the code for the models:

from django.db import models


class Category(models.Model):
    title = models.CharField(max_length=100)


class Article(models.Model):
    title = models.CharField(max_length=100)
    # related_name names the reverse accessor, so category.articles.all()
    # returns every article in a category.
    category = models.ForeignKey(
        Category, related_name='articles', on_delete=models.CASCADE
    )

Our goal for this tutorial is to:

  • Allow users to search by title using ElasticSearch.
  • Allow users to filter articles by category using Elasticsearch.
  • Create an auto-complete API for articles.

Without further ado, let's begin.

Running an Elasticsearch Instance Locally

Before we get into Django, we have to get an Elasticsearch instance running.

If you already have a local instance running, you can skip this part.

My preferred method is using Docker and Docker-Compose.

This is the Dockerfile that I use for Django:

# syntax=docker/dockerfile:1
# Dockerfile for the Django service (the syntax directive must stay on the first line)
FROM python:3
ENV PYTHONUNBUFFERED=1
WORKDIR /code
COPY requirements.txt /code/
RUN pip install -r requirements.txt
COPY . /code/

This is my docker-compose.yml file which has two services.

  • Web
  • Elasticsearch

Here's the code:

version: "3.9"

services:
  web:
    build: .
    command: python manage.py runserver 0.0.0.0:8000
    volumes:
      - .:/code
    ports:
      - "8000:8000"
    depends_on:
      - elasticsearch
    networks:
      - elastic
  elasticsearch:
    image: elasticsearch:7.14.0
    volumes:
      - ./data/elastic:/usr/share/elasticsearch/data
    environment:
      - discovery.type=single-node
    ports:
      - "9200:9200"
    networks:
      - elastic
networks:
  elastic:
    driver: bridge

Once you have these two files in your project's root directory, you can run the containers.

docker-compose up -d 

Now you should be able to open localhost:8000 and see the Django app running.
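
If you'd like to confirm that Elasticsearch itself is up as well, here is a minimal sketch that uses only Python's standard library. It assumes the port mapping from the docker-compose.yml above (localhost:9200) and an instance without authentication; the function names are made up for this example.

```python
import json
from urllib.request import urlopen


def cluster_health_url(host="localhost", port=9200):
    """Build the URL of Elasticsearch's cluster health endpoint."""
    return f"http://{host}:{port}/_cluster/health"


def fetch_cluster_status(host="localhost", port=9200, timeout=5):
    """Ask Elasticsearch for its cluster status ("green", "yellow" or "red")."""
    # This performs a real HTTP request, so the container must be running.
    with urlopen(cluster_health_url(host, port), timeout=timeout) as response:
        return json.load(response)["status"]
```

With the containers running, print(fetch_cluster_status()) prints whatever status Elasticsearch reports. If the call raises a connection error, the instance is most likely still starting up.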

Integrating Elasticsearch with Django

Integrating Elasticsearch with Django is pretty simple: you only need one library.

It's called django-elasticsearch-dsl, so let's install that.

pip install django-elasticsearch-dsl

Keep in mind that there are different versions of the library for different versions of ElasticSearch.

As of November 2021, these are the latest requirements:

  • For Elasticsearch 7.0 and later, use the major version 7 (7.x.y) of the library.
  • For Elasticsearch 6.0 and later, use the major version 6 (6.x.y) of the library.
  • For Elasticsearch 5.0 and later, use the major version 0.5 (0.5.x) of the library.

Once you have the library installed, make sure to add it to INSTALLED_APPS in settings.py:

# settings.py

INSTALLED_APPS = [
    # Other apps 
    'django_elasticsearch_dsl',
]

Finally, we simply have to tell django-elasticsearch-dsl where our Elasticsearch instance is located.

Add this to settings.py:

# settings.py

# other code....

ELASTICSEARCH_DSL = {
    'default': {
        'hosts': 'elasticsearch:9200'
    },
}

Here we are saying that Elasticsearch is located at elasticsearch:9200. This plays the same role as localhost:9200, but inside docker-compose we use the service name as the hostname when one service talks to another.
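
If you sometimes run manage.py directly on your machine (outside Docker), the service name won't resolve and you'd need localhost:9200 instead. One way to handle both cases, sketched below, is to read the host from an environment variable. ELASTICSEARCH_HOST and the helper function are hypothetical names chosen for this example, not something the library looks for on its own.

```python
import os


def elasticsearch_dsl_settings(environ=os.environ):
    """Build the ELASTICSEARCH_DSL dict from an environment mapping."""
    # ELASTICSEARCH_HOST is a made-up variable name for this sketch.
    # Without it we fall back to the docker-compose service name.
    host = environ.get("ELASTICSEARCH_HOST", "elasticsearch:9200")
    return {"default": {"hosts": host}}


# In settings.py you would assign the result to the real setting:
ELASTICSEARCH_DSL = elasticsearch_dsl_settings()
```

Running ELASTICSEARCH_HOST=localhost:9200 python manage.py runserver on the host would then point Django at the published port, while the container keeps using the default.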

At this point, everything should be working as usual.

Integrating Elasticsearch with DRF

Now that we have integrated ElasticSearch with Django, we have to integrate it with DRF.

For that, we use another library called django-elasticsearch-dsl-drf:

pip install django-elasticsearch-dsl-drf

We have to add it to INSTALLED_APPS in settings.py:

# settings.py

INSTALLED_APPS = [
    # Other apps 
    'django_elasticsearch_dsl',
    'django_elasticsearch_dsl_drf',
]

That's it, we have fully integrated Elasticsearch with DRF.

Let's create some APIs.

Indexing our Model into Elasticsearch

Elasticsearch is working, but it's empty.

We have to tell Django to copy our existing records into ElasticSearch.

ElasticSearch stores data in JSON documents.

This means that for every model that we want to add to ElasticSearch we have to create a document class for it.

This will be stored in a documents.py file in your app's folder.

Let's create a document for our article model:

# articles/documents.py

from django_elasticsearch_dsl import Document, fields
from django_elasticsearch_dsl.registries import registry

from articles.models import Article


@registry.register_document
class ArticleDocument(Document):
    title = fields.TextField(
        attr='title',
        fields={
            'raw': fields.TextField(),
            'suggest': fields.CompletionField(),
        }
    )
    category = fields.ObjectField(
        attr='category',
        properties={
            'id': fields.IntegerField(),
            'title': fields.TextField(
                attr='title',
                fields={
                    'raw': fields.KeywordField(),
                }
            )
        }
    )

    class Index:
        name = 'articles'

    class Django:
        model = Article

This may seem complicated but we simply registered a document called ArticleDocument that is linked to the model Article. We then specified the fields that we want to index in ElasticSearch.

The first field is the title, which is a TextField with two sub-fields.

  • raw – This is the normal ElasticSearch text field that we will use for search functionality.
  • suggest – This is a completion field that is used for auto-complete functionality.

The next field is the category which is a relation field, but ElasticSearch has no concept of relations.

That's why we use the object field to save the whole category object in Elasticsearch.

We specify that the category field is an object, then in the properties, we specify the fields of category which are id and title.
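
To picture what ends up in the index, here is a plain-Python illustration of the document shape. It uses dataclasses as stand-ins for the Django models and a hypothetical to_document helper; this is not the library's actual serialization code, only the structure that results from the fields declared above.

```python
from dataclasses import dataclass


@dataclass
class Category:
    """Stand-in for the Category model (illustration only)."""
    id: int
    title: str


@dataclass
class Article:
    """Stand-in for the Article model (illustration only)."""
    title: str
    category: Category


def to_document(article):
    """Embed the related category inside the article's JSON-like document."""
    return {
        "title": article.title,
        "category": {
            "id": article.category.id,
            "title": article.category.title,
        },
    }
```

The point is that the category is copied into every article document instead of being referenced by a key, which is why a change to a category means the affected articles have to be re-indexed.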

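You might have noticed that for the raw sub-field of the category title, we use a keyword field instead of a text field. The difference between the two is how they are analyzed in Elasticsearch: text fields are broken into tokens for full-text search, while keyword fields are stored as-is for exact matching, sorting, and aggregations. You can read more about it below.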

When to use the keyword type vs text datatype in Elasticsearch | ObjectRocket
Keyword and text types may both hold string data, but these datatypes differ in important ways. This tutorial will explain the difference between the keyword vs text datatype in Elasticsearch.
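
As a rough mental model, here is a toy sketch of how the two types treat the same string. The two functions are invented for this example and only approximate Elasticsearch's behaviour (its real analyzers do more, such as Unicode-aware word splitting), so treat it as an illustration and not as the actual analysis chain.

```python
import re


def analyze_as_text(value):
    """Toy version of a text field: lowercase, then split into word tokens."""
    return re.findall(r"\w+", value.lower())


def analyze_as_keyword(value):
    """Toy version of a keyword field: keep the whole value as one term."""
    return [value]


text_terms = analyze_as_text("Web Development")
keyword_terms = analyze_as_keyword("Web Development")
```

Here text_terms holds the two lowercase words separately, so a search for "development" can match, while keyword_terms holds the single original string, which suits exact filters such as matching a category by its full title.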

Finally, we specify the name of the index to be articles.

But before we move on we have to register our indexes.

So add this code to settings.py

# settings.py

ELASTICSEARCH_INDEX_NAMES = {
    'articles.article': 'articles',
}

The format I use is app.model: index.

But before we index our model, make sure to have some records in your database.

Once you have that done, run the command:

python manage.py search_index --rebuild

This will populate our articles index with data from our database.

Everything seems to work, or does it?

You might have noticed a problem: what happens when we add or remove records?

Will the indexes auto-update?

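For registered models, yes: by default django-elasticsearch-dsl re-indexes a record when it is saved or deleted (see its ELASTICSEARCH_DSL_AUTOSYNC setting). It's still useful to know how to do this explicitly with Django Signals, for example when you want custom logic around indexing.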

Signals offer hooks, code that runs after a specific event.

The hooks that we are interested in are post_save and post_delete.

So create a signals.py file in your app and add the code below. Make sure the module gets imported when the app starts (for example, from the ready() method of your AppConfig); otherwise the receivers are never connected.

# articles/signals.py
from django.db.models.signals import post_save, post_delete
from django.dispatch import receiver

from django_elasticsearch_dsl.registries import registry


@receiver(post_save)
def update_document(sender, **kwargs):
    """Re-index an article after it has been created or updated."""
    app_label = sender._meta.app_label
    model_name = sender._meta.model_name
    instance = kwargs['instance']

    # Only react to the Article model of the articles app.
    if app_label == 'articles' and model_name == 'article':
        registry.update(instance)


@receiver(post_delete)
def delete_document(sender, **kwargs):
    """Remove an article from the index after it has been deleted."""
    app_label = sender._meta.app_label
    model_name = sender._meta.model_name
    instance = kwargs['instance']

    if app_label == 'articles' and model_name == 'article':
        # raise_on_error=False: don't fail if the document is already gone.
        registry.delete(instance, raise_on_error=False)

This code updates the index whenever an article is saved or deleted.

It checks the app name and model name to make sure it runs only for the article model.

Serializing our Document

Now that we have indexed our model, we have to find a way to serialize it.

This is where django-elasticsearch-dsl-drf comes in.

The library provides us with a DocumentSerializer class that enables us to easily serialize documents.

Create a serializers.py file in your app and add:

from django_elasticsearch_dsl_drf.serializers import DocumentSerializer

from articles.documents import ArticleDocument

class ArticleDocumentSerializer(DocumentSerializer):
    class Meta:
        document = ArticleDocument

        fields = (
            'title',
            'category'
        )

It's very similar to your usual DRF serializer: you just give it a document and specify the fields that you want to serialize.

Creating our APIs

Finally, we have everything ready to create our APIs.

We are gonna take it step-by-step implementing things feature by feature.

Let's create a base document view, without any features.

In your views.py add:

from django_elasticsearch_dsl_drf.viewsets import DocumentViewSet
from articles.documents import ArticleDocument
from articles.serializers import ArticleDocumentSerializer

class ArticleDocumentView(DocumentViewSet):
    document = ArticleDocument
    serializer_class = ArticleDocumentSerializer

    filter_backends = []

This is a simple document view that only lists and retrieves our articles from our Elasticsearch index.

Once again, this is simple thanks to django-elasticsearch-dsl-drf.

Let's implement the features.

Search

To implement a simple search:

  1. Add the SearchFilterBackend to filter_backends
  2. Add search_fields which takes in a tuple of fields that you want to search by.

from django_elasticsearch_dsl_drf.filter_backends import SearchFilterBackend
from django_elasticsearch_dsl_drf.viewsets import DocumentViewSet

from articles.documents import ArticleDocument
from articles.serializers import ArticleDocumentSerializer


class ArticleDocumentView(DocumentViewSet):
    document = ArticleDocument
    serializer_class = ArticleDocumentSerializer

    filter_backends = [
        SearchFilterBackend,
    ]

    search_fields = ('title',)

Filter

To implement a simple filter:

  1. Add the FilteringFilterBackend to filter_backends
  2. Add filter_fields which takes in a dictionary of fields that you want to filter by.

from django_elasticsearch_dsl_drf.filter_backends import FilteringFilterBackend
from django_elasticsearch_dsl_drf.viewsets import DocumentViewSet

from articles.documents import ArticleDocument
from articles.serializers import ArticleDocumentSerializer


class ArticleDocumentView(DocumentViewSet):
    document = ArticleDocument
    serializer_class = ArticleDocumentSerializer

    filter_backends = [
        FilteringFilterBackend,
    ]

    filter_fields = {
        'category': 'category.id'
    }

You might be confused about this line:

'category': 'category.id'

This means the API accepts a query parameter called category that takes an id, and the filter is applied to the id field of the nested category object (category.id).
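
In Elasticsearch terms, a request like ?category=2 boils down to an exact-match filter on category.id. The sketch below builds that kind of query body as a plain dict so you can see the shape. The function name is made up, and the exact body that django-elasticsearch-dsl-drf sends may differ in its details.

```python
import json


def category_filter_query(category_id):
    """Build a bool/filter query that matches one category id exactly."""
    return {
        "query": {
            "bool": {
                # Filters only include or exclude documents; they don't
                # affect the relevance score.
                "filter": [
                    {"term": {"category.id": category_id}},
                ]
            }
        }
    }


query_json = json.dumps(category_filter_query(2), indent=2)
```

Printing query_json gives a body you could send to the articles index's _search endpoint to compare against what the API returns.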

Auto-Complete

To implement a simple auto-complete:

  1. Add the SuggesterFilterBackend to filter_backends
  2. Add suggester_fields which takes in a dictionary of fields that you want to auto-suggest by.

from django_elasticsearch_dsl_drf.constants import SUGGESTER_COMPLETION
from django_elasticsearch_dsl_drf.filter_backends import SuggesterFilterBackend
from django_elasticsearch_dsl_drf.viewsets import DocumentViewSet

from articles.documents import ArticleDocument
from articles.serializers import ArticleDocumentSerializer


class ArticleDocumentView(DocumentViewSet):
    document = ArticleDocument
    serializer_class = ArticleDocumentSerializer

    filter_backends = [
        SuggesterFilterBackend,
    ]

    suggester_fields = {
        'title': {
            'field': 'title.suggest',
            'suggesters': [
                SUGGESTER_COMPLETION,
            ],
        },
    }

Let's break down this piece of code:

suggester_fields = {
    'title': {
        'field': 'title.suggest',
        'suggesters': [
            SUGGESTER_COMPLETION,
        ],
    },
}

The first rule is that we MUST auto-complete on a completion field.

If you remember, in our document file we declared a "suggest" sub-field on the title field as a completion field.

Next, we declare which type of suggester to use. As of October 2021, Elasticsearch supports three types of suggesters:

  • Completion
  • Phrase
  • Term

You can read about the difference between them below.

Suggesters | Elasticsearch Guide [7.15] | Elastic

Anyway, for our use case the completion suggester is enough.
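
For reference, this is roughly what a completion suggester request looks like at the Elasticsearch level, written as a plain dict. The function and the suggestion name title_suggest are hypothetical labels for this sketch; the library builds its own request for you, so you normally never write this by hand.

```python
def completion_suggest_body(prefix, field="title.suggest", name="title_suggest"):
    """Build a search body that asks for completions of the given prefix."""
    return {
        "suggest": {
            # `name` is an arbitrary label; the response groups its
            # suggestions under the same key.
            name: {
                "prefix": prefix,
                "completion": {"field": field},
            }
        }
    }


body = completion_suggest_body("how")
```

The field has to point at a completion field, which is why the document declares title.suggest as a CompletionField.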

Final Version of View

from django_elasticsearch_dsl_drf.constants import SUGGESTER_COMPLETION
from django_elasticsearch_dsl_drf.filter_backends import SearchFilterBackend, FilteringFilterBackend, SuggesterFilterBackend
from django_elasticsearch_dsl_drf.viewsets import DocumentViewSet

from articles.documents import ArticleDocument
from articles.serializers import ArticleDocumentSerializer

class ArticleDocumentView(DocumentViewSet):
    document = ArticleDocument
    serializer_class = ArticleDocumentSerializer

    filter_backends = [
        FilteringFilterBackend,
        SearchFilterBackend,
        SuggesterFilterBackend
    ]

    search_fields = (
        'title',
    )

    filter_fields = {
        'category': 'category.id'
    }

    suggester_fields = {
        'title': {
            'field': 'title.suggest',
            'suggesters': [
                SUGGESTER_COMPLETION,
            ],
        },
    }

Don't forget to add it to urls.py:

# urls.py
from django.contrib import admin
from django.urls import path
from rest_framework import routers

from articles.views import ArticleDocumentView

router = routers.SimpleRouter(trailing_slash=False)

router.register(r'article-search', ArticleDocumentView, basename='article-search')

urlpatterns = [
    path('admin/', admin.site.urls),
]

urlpatterns += router.urls

APIs in Action

DocumentViewSet automatically creates several different APIs.

  • One for listing and retrieving records.
  • One for normal functionality (search, filter, etc...).
  • One for auto-complete functionality.
  • One for functional suggestions.

In our example, we only care about the normal and suggest functionalities.

List

URL: localhost:8000/article-search

Search

URL: localhost:8000/article-search?search=programming

If you want to search on multiple terms, you can do it using:

localhost:8000/article-search?search=term1&search=term2

You can read more about it below.

Filter usage examples — django-elasticsearch-dsl-drf 0.22.2 documentation
Search Documentation

Filter

URL: localhost:8000/article-search?category=2

There are many different ways to filter.

For example, if you want to filter by multiple categories, you can do:

localhost:8000/article-search?category=id1__id2__id3

You can read more about different filters below.

Filter usage examples — django-elasticsearch-dsl-drf 0.22.2 documentation
Filter Documentation

Auto-Suggest

URL: localhost:8000/article-search/suggest?title__completion=how
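
If you want to call these endpoints from a script, here is a small sketch that builds the same URLs with Python's standard library. The helper names and BASE_URL are assumptions for this example; the query parameters are the ones shown above.

```python
from urllib.parse import urlencode

# Assumes the dev server and route name used in this article.
BASE_URL = "http://localhost:8000/article-search"


def search_url(*terms):
    """Repeat the search parameter once per term."""
    return f"{BASE_URL}?{urlencode({'search': list(terms)}, doseq=True)}"


def filter_url(category_id):
    """Filter articles by a single category id."""
    return f"{BASE_URL}?{urlencode({'category': category_id})}"


def suggest_url(prefix):
    """Ask the suggest endpoint for title completions of a prefix."""
    return f"{BASE_URL}/suggest?{urlencode({'title__completion': prefix})}"
```

urlencode also takes care of escaping, so a search term with spaces or special characters still produces a valid URL.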

Conclusion

First of all, congratulations on making it this far.

To wrap it up, let's review what we learned today:

  • We created a local instance of ElasticSearch using Docker and Docker-Compose.
  • We connected Django with ElasticSearch using the django-elasticsearch-dsl library.
  • We indexed our models using documents.
  • We synced our models and indexes using signals.
  • We serialized our documents.
  • We created basic search, filter, and auto-complete functionalities.

If you have successfully implemented the APIs, push them to GitHub and tweet at me @tamerlan_dev.

Thanks for reading!
