How to Integrate Elasticsearch with DRF
In this article, you will learn how to create a simple search API using Elasticsearch and Django-Rest-Framework (DRF).
Elasticsearch is one of the most widely used search engines in the world.
Many companies, such as Uber and Slack, use it, but leveraging this technology in Django is pretty confusing.
There are three different libraries that you have to use unless you want to implement everything from scratch.
The documentation is also pretty hard to understand, and there isn't much written about this topic.
That's why, after days of struggling, I decided to write this article as a one-stop shop for integrating Elasticsearch with Django.
PS. You can find the project on GitHub below.
Prerequisites
- Basics of Python
- Basics of Django and Django-Rest-Framework (DRF)
- Basics of Elasticsearch
- Docker or a local installation of Elasticsearch
What's the project about?
To not waste any time, I already have a Django app ready with two models.
- Article
- Category
This is the code for the models:
from django.db import models


class Category(models.Model):
    title = models.CharField(max_length=100)


class Article(models.Model):
    title = models.CharField(max_length=100)
    category = models.ForeignKey(
        Category, related_name='category', on_delete=models.CASCADE
    )
Our goal for this tutorial is to:
- Allow users to search by title using Elasticsearch.
- Allow users to filter articles by category using Elasticsearch.
- Create an auto-complete API for articles.
Without further ado, let's begin.
Running an Elasticsearch Instance Locally
Before we get into Django, we have to get an Elasticsearch instance running.
If you already have a local instance running you can skip this part.
My preferred method is using Docker and Docker-Compose.
This is the Dockerfile that I use for Django:
# Dockerfile
# syntax=docker/dockerfile:1
FROM python:3
ENV PYTHONUNBUFFERED=1
WORKDIR /code
COPY requirements.txt /code/
RUN pip install -r requirements.txt
COPY . /code/
This is my docker-compose.yml file which has two services.
- Web
- Elasticsearch
Here's the code:
version: "3.9"

services:
  web:
    build: .
    command: python manage.py runserver 0.0.0.0:8000
    volumes:
      - .:/code
    ports:
      - "8000:8000"
    depends_on:
      - elasticsearch
    networks:
      - elastic

  elasticsearch:
    image: elasticsearch:7.14.0
    volumes:
      # The official image keeps its data under /usr/share/elasticsearch/data
      - ./data/elastic:/usr/share/elasticsearch/data
    environment:
      - discovery.type=single-node
    ports:
      - "9200:9200"
    networks:
      - elastic

networks:
  elastic:
    driver: bridge
Once you have these two files in your project's root directory, you can run the containers:
docker-compose up -d
Now you should be able to go to localhost:8000 and everything should work.
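Since localhost:8000 only tells you that Django is up, it can also help to confirm that Elasticsearch itself answers on port 9200. The following is a minimal sketch, assuming the port mapping from the compose file above and a node that accepts unauthenticated HTTP requests; parse_version and fetch_version are hypothetical helper names, not part of any library.

```python
import json
from urllib.request import urlopen


def parse_version(body: bytes) -> str:
    """Extract the version number from the JSON that Elasticsearch returns at its root URL."""
    info = json.loads(body)
    return info["version"]["number"]


def fetch_version(base_url: str = "http://localhost:9200") -> str:
    """Ask a running Elasticsearch node for its version (the container must be up)."""
    with urlopen(base_url, timeout=5) as response:
        return parse_version(response.read())


# Usage, once the containers are running:
#   print(fetch_version())
```

If the call fails with a connection error, the container may still be starting, so it is worth retrying after a few seconds.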
Integrating Elasticsearch with Django
Integrating Elasticsearch with Django is pretty simple: you only need one library.
It's called django-elasticsearch-dsl, so let's install that.
pip install django-elasticsearch-dsl
Keep in mind that there are different versions of the library for different versions of Elasticsearch.
As of November 2021, these are the latest requirements:
- For Elasticsearch 7.0 and later, use the major version 7 (7.x.y) of the library.
- For Elasticsearch 6.0 and later, use the major version 6 (6.x.y) of the library.
- For Elasticsearch 5.0 and later, use the major version 0.5 (0.5.x) of the library.
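Because the Docker image above is pinned to 7.14.0, the matching library series is 7.x. One way to make that explicit is to pin a version range in requirements.txt. The helper below is only an illustrative sketch (library_requirement is a made-up name) that turns the list above into pip version specifiers.

```python
def library_requirement(es_version: str) -> str:
    """Return a pip requirement for django-elasticsearch-dsl matching an Elasticsearch version."""
    major = int(es_version.split(".")[0])
    # Mapping taken from the compatibility list above (as of November 2021).
    requirements = {
        7: "django-elasticsearch-dsl>=7.0,<8.0",
        6: "django-elasticsearch-dsl>=6.0,<7.0",
        5: "django-elasticsearch-dsl>=0.5,<0.6",
    }
    if major not in requirements:
        raise ValueError(f"No library version listed for Elasticsearch {es_version}")
    return requirements[major]


print(library_requirement("7.14.0"))
```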
Once you have the library installed, make sure to add it to INSTALLED_APPS in settings.py:
# settings.py
INSTALLED_APPS = [
    # Other apps
    'django_elasticsearch_dsl',
]
Finally, we simply have to tell django-elasticsearch-dsl where our Elasticsearch instance is located.
Add this to settings.py:
# settings.py
# other code....
ELASTICSEARCH_DSL = {
    'default': {
        'hosts': 'elasticsearch:9200'
    },
}
Here we are saying that Elasticsearch is located at elasticsearch:9200. This is equivalent to localhost:9200, but we use the service name when communicating between two docker-compose services.
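If you sometimes run Django outside Docker (for example, manage.py on your host against the containerised Elasticsearch), the service name will not resolve and you would need localhost:9200 instead. Here is a minimal sketch of one way to support both, assuming a hypothetical ELASTICSEARCH_HOST environment variable that you define yourself:

```python
import os


def elasticsearch_dsl_settings(environ=os.environ) -> dict:
    """Build the ELASTICSEARCH_DSL setting, falling back to the compose service name."""
    # ELASTICSEARCH_HOST is a hypothetical variable name; the library does not read it itself.
    host = environ.get("ELASTICSEARCH_HOST", "elasticsearch:9200")
    return {"default": {"hosts": host}}


# In settings.py this would be used as:
ELASTICSEARCH_DSL = elasticsearch_dsl_settings()
```

With this in place, exporting ELASTICSEARCH_HOST=localhost:9200 on the host switches the target without touching the settings file.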
At this point, everything should be working as usual.
Integrating Elasticsearch with DRF
Now that we have integrated Elasticsearch with Django, we have to integrate it with DRF.
For that, we use another library called django-elasticsearch-dsl-drf:
pip install django-elasticsearch-dsl-drf
We have to add it to INSTALLED_APPS in settings.py:
# settings.py
INSTALLED_APPS = [
    # Other apps
    'django_elasticsearch_dsl',
    'django_elasticsearch_dsl_drf',
]
That's it, we have fully integrated Elasticsearch with DRF.
Let's create some APIs.
Indexing our Model into Elasticsearch
Elasticsearch is working, but it's empty.
We have to tell Django to copy our existing records into Elasticsearch.
Elasticsearch stores data as JSON documents.
This means that for every model that we want to add to Elasticsearch, we have to create a document class for it.
This will be stored in a documents.py file in your app's folder.
Let's create a document for our article model:
# articles/documents.py
from django_elasticsearch_dsl import Document, fields
from django_elasticsearch_dsl.registries import registry

from articles.models import Article


@registry.register_document
class ArticleDocument(Document):
    title = fields.TextField(
        attr='title',
        fields={
            'raw': fields.TextField(),
            'suggest': fields.CompletionField(),
        }
    )
    category = fields.ObjectField(
        attr='category',
        properties={
            'id': fields.IntegerField(),
            'title': fields.TextField(
                attr='title',
                fields={
                    'raw': fields.KeywordField(),
                }
            )
        }
    )

    class Index:
        name = 'articles'

    class Django:
        model = Article
This may seem complicated, but we simply registered a document called ArticleDocument that is linked to the model Article. We then specified the fields that we want to index in Elasticsearch.
The first field is the title, which is a TextField with two sub-fields:
- raw – This is the normal Elasticsearch text field that we will use for search functionality.
- suggest – This is a completion field that is used for auto-complete functionality.
The next field is the category, which is a relation field, but Elasticsearch has no concept of relations.
That's why we use the object field to save the whole category object in Elasticsearch.
We specify that the category field is an object, then in the properties, we specify the fields of category, which are id and title.
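To make the idea of embedding the category more concrete, here is a small plain-Python sketch of roughly what ends up in the index for one article. The dataclasses are stand-ins for the Django models, to_document is a hypothetical helper rather than anything the library exposes, and the sample values are made up.

```python
from dataclasses import dataclass


@dataclass
class Category:
    id: int
    title: str


@dataclass
class Article:
    title: str
    category: Category


def to_document(article: Article) -> dict:
    """Flatten an article and its related category into one JSON-like document."""
    return {
        "title": article.title,
        "category": {
            "id": article.category.id,
            "title": article.category.title,
        },
    }


article = Article(title="How to learn Django", category=Category(id=2, title="Programming"))
print(to_document(article))
```

The category is copied into every article document, which is why a change to a category only shows up in search results once the affected articles are re-indexed.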
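You might have noticed that for the raw sub-field of the category title, we use a keyword field instead of a text field. The difference between the two is how they are analyzed in Elasticsearch: a text field is broken into tokens for full-text search, while a keyword field is stored as a single exact value, which suits filtering and sorting. You can read more about it below.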
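As a rough illustration of that difference, the sketch below imitates the two behaviours in plain Python. It is a simplification: Elasticsearch's default analyzer follows Unicode text segmentation rules, so its real output can differ for punctuation and non-English text, and both function names are made up for this example.

```python
import re


def analyze_as_text(value: str) -> list[str]:
    """Rough stand-in for a text field: lowercase the value and split it into word tokens."""
    return re.findall(r"\w+", value.lower())


def analyze_as_keyword(value: str) -> list[str]:
    """A keyword field keeps the whole value as one unmodified token."""
    return [value]


title = "Web Development"
print(analyze_as_text(title))     # ['web', 'development']
print(analyze_as_keyword(title))  # ['Web Development']
```

A search for "web" would match the text version, while the keyword version only matches the exact string "Web Development".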
Finally, we specify the name of the index to be articles.
But before we move on we have to register our indexes.
So add this code to settings.py
But before we index our model, make sure to have some records in your database.
Once you have that done, run the command:
python manage.py search_index --rebuild
This will populate our articles index with data from our database.
Everything seems to work, or does it?
You might have noticed the problem: what if we add or remove records?
Will the indexes auto-update?
Unfortunately no, but we can easily automate that with Django signals.
Signals offer hooks: code that runs after a specific event.
The hooks that we are interested in are post_save and post_delete.
So create a signals.py file in your app and add this code:
# signals.py
from django.db.models.signals import post_save, post_delete
from django.dispatch import receiver
from django_elasticsearch_dsl.registries import registry


@receiver(post_save)
def update_document(sender, **kwargs):
    """Re-index an article after it has been created or updated."""
    app_label = sender._meta.app_label
    model_name = sender._meta.model_name
    instance = kwargs['instance']

    # Only react to the Article model of the articles app.
    if app_label == 'articles' and model_name == 'article':
        registry.update(instance)


@receiver(post_delete)
def delete_document(sender, **kwargs):
    """Remove an article from the index after it has been deleted."""
    app_label = sender._meta.app_label
    model_name = sender._meta.model_name
    instance = kwargs['instance']

    if app_label == 'articles' and model_name == 'article':
        registry.delete(instance, raise_on_error=False)
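This code essentially updates the index after an article is created, updated, or deleted.
It checks the app name and model name to make sure it only runs for the Article model.
For the receivers to be registered, the signals module has to be imported when the app starts, typically from the ready() method of your app's AppConfig.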
Serializing our Document
Now that we have indexed our model, we have to find a way to serialize it.
This is where django-elasticsearch-dsl-drf comes in.
The library provides us with a DocumentSerializer class that enables us to easily serialize documents.
Create a serializers.py file in your app and add:
from django_elasticsearch_dsl_drf.serializers import DocumentSerializer

from articles.documents import ArticleDocument


class ArticleDocumentSerializer(DocumentSerializer):
    class Meta:
        document = ArticleDocument
        fields = (
            'title',
            'category'
        )
It's very similar to your usual DRF serializer: you just give it a document and specify the fields that you want to serialize.
Creating our APIs
Finally, we have everything ready to create our APIs.
We are gonna take it step-by-step implementing things feature by feature.
Let's create a base document view, without any features.
In your views.py add:
from django_elasticsearch_dsl_drf.viewsets import DocumentViewSet

from articles.documents import ArticleDocument
from articles.serializers import ArticleDocumentSerializer


class ArticleDocumentView(DocumentViewSet):
    document = ArticleDocument
    serializer_class = ArticleDocumentSerializer
    filter_backends = []
This is a simple document view that only lists and retrieves articles from our Elasticsearch index.
Once again, this is very simple thanks to django-elasticsearch-dsl-drf.
Let's implement the features.
Search
To implement a simple search:
- Add the SearchFilterBackend to filter_backends.
- Add search_fields, which takes in a tuple of fields that you want to search by.
from django_elasticsearch_dsl_drf.filter_backends import SearchFilterBackend
from django_elasticsearch_dsl_drf.viewsets import DocumentViewSet

from articles.documents import ArticleDocument
from articles.serializers import ArticleDocumentSerializer


class ArticleDocumentView(DocumentViewSet):
    document = ArticleDocument
    serializer_class = ArticleDocumentSerializer
    filter_backends = [
        SearchFilterBackend
    ]
    search_fields = ('title',)
Filter
To implement a simple filter:
- Add the FilteringFilterBackend to filter_backends.
- Add filter_fields, which takes in a dictionary of fields that you want to filter by.
from django_elasticsearch_dsl_drf.filter_backends import FilteringFilterBackend
from django_elasticsearch_dsl_drf.viewsets import DocumentViewSet

from articles.documents import ArticleDocument
from articles.serializers import ArticleDocumentSerializer


class ArticleDocumentView(DocumentViewSet):
    document = ArticleDocument
    serializer_class = ArticleDocumentSerializer
    filter_backends = [
        FilteringFilterBackend
    ]
    filter_fields = {
        'category': 'category.id'
    }
You might be confused about this line:
'category': 'category.id'
This essentially means that we expose a query parameter called category that takes an id, and the filter is applied to the id field of the category object.
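The mapping from query parameter to index field can be pictured with a few lines of plain Python. This is a conceptual sketch only: build_filters is a hypothetical function that mimics the idea behind filter_fields by producing Elasticsearch term filters, and it is not the actual implementation in django-elasticsearch-dsl-drf.

```python
def build_filters(filter_fields: dict, query_params: dict) -> list:
    """Translate query parameters into Elasticsearch term filters (conceptual sketch)."""
    filters = []
    for param, es_field in filter_fields.items():
        # Only parameters that the client actually sent produce a filter.
        if param in query_params:
            filters.append({"term": {es_field: query_params[param]}})
    return filters


filter_fields = {"category": "category.id"}
print(build_filters(filter_fields, {"category": "2"}))
# [{'term': {'category.id': '2'}}]
```

The key on the left is what API clients see, and the value on the right is the path inside the indexed document, so the two can be renamed independently.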
Auto-Complete
To implement a simple auto-complete:
- Add the SuggesterFilterBackend to filter_backends.
- Add suggester_fields, which takes in a dictionary of fields that you want to auto-suggest by.
from django_elasticsearch_dsl_drf.constants import SUGGESTER_COMPLETION
from django_elasticsearch_dsl_drf.filter_backends import SuggesterFilterBackend
from django_elasticsearch_dsl_drf.viewsets import DocumentViewSet

from articles.documents import ArticleDocument
from articles.serializers import ArticleDocumentSerializer


class ArticleDocumentView(DocumentViewSet):
    document = ArticleDocument
    serializer_class = ArticleDocumentSerializer
    filter_backends = [
        SuggesterFilterBackend
    ]
    suggester_fields = {
        'title': {
            'field': 'title.suggest',
            'suggesters': [
                SUGGESTER_COMPLETION,
            ],
        },
    }
Let's break down this piece of code:
suggester_fields = {
    'title': {
        'field': 'title.suggest',
        'suggesters': [
            SUGGESTER_COMPLETION,
        ],
    },
}
The first rule is that we MUST auto-complete on a completion field.
If you remember, in our document file we declared a "suggest" sub-field on title as a completion field.
Next, we declare which type of suggester to use. As of October 2021, Elasticsearch supports three types of suggesters:
- Completion
- Phrase
- Term
You can read about the difference between them below.
Anyway, for our use case, completion is enough.
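For context, this is roughly what a completion suggester request looks like when it is sent to Elasticsearch's _search endpoint directly. The sketch only builds the request body; completion_suggest_body and the "title_suggest" label are names chosen for this example, and the request that the library builds on your behalf may differ in detail.

```python
import json


def completion_suggest_body(prefix: str, field: str = "title.suggest") -> dict:
    """Build the body of an Elasticsearch _search request that uses a completion suggester."""
    return {
        "suggest": {
            # "title_suggest" is an arbitrary label chosen for this example.
            "title_suggest": {
                "prefix": prefix,
                "completion": {"field": field},
            }
        }
    }


print(json.dumps(completion_suggest_body("how"), indent=2))
```

Note that the field points at title.suggest, the completion sub-field, which is why the document has to declare it up front.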
Final Version of View
from django_elasticsearch_dsl_drf.constants import SUGGESTER_COMPLETION
from django_elasticsearch_dsl_drf.filter_backends import (
    FilteringFilterBackend,
    SearchFilterBackend,
    SuggesterFilterBackend,
)
from django_elasticsearch_dsl_drf.viewsets import DocumentViewSet

from articles.documents import ArticleDocument
from articles.serializers import ArticleDocumentSerializer


class ArticleDocumentView(DocumentViewSet):
    document = ArticleDocument
    serializer_class = ArticleDocumentSerializer
    filter_backends = [
        FilteringFilterBackend,
        SearchFilterBackend,
        SuggesterFilterBackend,
    ]
    search_fields = (
        'title',
    )
    filter_fields = {
        'category': 'category.id'
    }
    suggester_fields = {
        'title': {
            'field': 'title.suggest',
            'suggesters': [
                SUGGESTER_COMPLETION,
            ],
        },
    }
Don't forget to add it to urls.py:
# urls.py
from django.contrib import admin
from django.urls import path
from rest_framework import routers

from articles.views import ArticleDocumentView

router = routers.SimpleRouter(trailing_slash=False)
router.register(r'article-search', ArticleDocumentView, basename='article-search')

urlpatterns = [
    path('admin/', admin.site.urls),
]

urlpatterns += router.urls
APIs in Action
DocumentViewSet automatically creates four different APIs.
- One for listing and retrieving records.
- One for normal functionality (search, filter, etc...).
- One for auto-complete functionality.
- One for functional suggestions.
In our example, we only care about the normal and suggest functionalities.
List
URL: localhost:8000/article-search
Search
URL: localhost:8000/article-search?search=programming
If you want to search on multiple terms, you can do it using:
localhost:8000/article-search?search=term1&search=term2
You can read more about it below.
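The search URLs above can also be built programmatically, which takes care of escaping spaces and special characters in the search terms. This is a small sketch using only the standard library; search_url is a hypothetical helper, and the base URL assumes the local setup from this article.

```python
from urllib.parse import urlencode


def search_url(terms, base="http://localhost:8000/article-search") -> str:
    """Build a search URL that repeats the search parameter once per term."""
    query = urlencode([("search", term) for term in terms])
    return f"{base}?{query}"


print(search_url(["programming"]))
# http://localhost:8000/article-search?search=programming
print(search_url(["term1", "term2"]))
# http://localhost:8000/article-search?search=term1&search=term2
```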
Filter
URL: localhost:8000/article-search?category=2
There are many different ways to filter.
For example, if you want to filter by multiple categories, you can do:
localhost:8000/article-search?category=id1__id2__id3
You can read more about different filters below.
Auto-Suggest
URL: localhost:8000/article-search/suggest?title__completion=how
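On the client side, the suggestions usually need to be pulled out of the response before they can be shown in a dropdown. The sketch below assumes the response is keyed by the query parameter name (title__completion) and contains a list of entries, each with an options list; that shape is an assumption worth checking against your own responses, and the sample payload is made up for illustration.

```python
def suggestion_texts(payload: dict, key: str = "title__completion") -> list[str]:
    """Collect the suggested titles from a suggest response, in the order returned."""
    texts = []
    for entry in payload.get(key, []):
        for option in entry.get("options", []):
            texts.append(option["text"])
    return texts


# Made-up response for illustration; real responses carry more fields per option.
sample = {
    "title__completion": [
        {
            "text": "how",
            "offset": 0,
            "length": 3,
            "options": [
                {"text": "How to learn Django"},
                {"text": "How to use Docker"},
            ],
        }
    ]
}
print(suggestion_texts(sample))
```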
Conclusion
First of all, congratulations on making it this far.
To wrap it up, let's review what we learned today:
- We created a local instance of Elasticsearch using Docker and Docker-Compose.
- We connected Django with Elasticsearch using the django-elasticsearch-dsl library.
- We indexed our models using documents.
- We synced our models and indexes using signals.
- We serialized our documents.
- We created basic search, filter, and auto-complete functionalities.
If you have successfully implemented the APIs, push them to GitHub and tweet at me @tamerlan_dev.
Thanks for reading!