Running Elasticsearch in a Docker Container


The current project I'm working on has a heavy dependency on Elasticsearch and I thought it would be good to walk through my setup using Docker. I wanted to manage Elasticsearch using Docker for a few reasons:

  1. I usually have multiple projects running on my laptop, and I love being able to manage services on a per-project basis. I find it a lot easier to manage a service with Docker than to install everything on my local OS and then worry about starting and stopping the services.
  2. The other reason is that I can deploy the service in the same environment I run on my laptop. I don't have to worry about doing any installations, as my Dockerfiles take care of them.

There are plenty of references on the web about setting up services using Docker, but I figured I would walk you through what I've been doing and how it's been working for me.

Dockerfile

Before we get started, note that I am using Elasticsearch 1.7, as you can see in my Dockerfile. I have not upgraded to 2.x, and I know my configuration will have to change when I do, as Elasticsearch can no longer run as root. I wanted to keep as much control over the Docker install as I could, so I built my own Dockerfile based on the official Elasticsearch Dockerfile rather than using it directly. This also lets me upgrade my Elasticsearch instance whenever I want. Since starting the project I have upgraded three times, and it has been painless because I am still in development and it is easy to keep things moving. I haven't made many changes to the base file. Below is my Dockerfile for Elasticsearch.

##
#
# Dockerfile for our Elasticsearch cluster.  The Dockerfile is based on
#  https://github.com/dockerfile/elasticsearch.
#

# Pull base image.
FROM ubuntu:14.04

ENV ES_PKG_NAME elasticsearch-1.7.3

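# Install system-level packages needed to install Elasticsearch or its dependencies.
# Refresh the package lists first; apt-get install cannot find packages without them.
RUN apt-get update && apt-get -y install software-properties-common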

# Install Java.
RUN echo oracle-java8-installer shared/accepted-oracle-license-v1-1 \
    select true | debconf-set-selections && \
    add-apt-repository -y ppa:webupd8team/java && \
    apt-get update && \
    apt-get install -y oracle-java8-installer && \
    rm -rf /var/lib/apt/lists/* && \
    rm -rf /var/cache/oracle-jdk8-installer

# Define commonly used JAVA_HOME variable
ENV JAVA_HOME /usr/lib/jvm/java-8-oracle

# Install Elasticsearch.
RUN \
    cd / && \
    wget https://download.elasticsearch.org/elasticsearch/elasticsearch/$ES_PKG_NAME.tar.gz && \
    tar xvzf $ES_PKG_NAME.tar.gz && \
    rm -f $ES_PKG_NAME.tar.gz && \
    mv /$ES_PKG_NAME /elasticsearch

# Copy the elasticsearch.yml config into the image.
ADD config/elasticsearch.yml /elasticsearch/config/elasticsearch.yml

# Expose ports.
#   - 9200: HTTP
#   - 9300: transport
EXPOSE 9200
EXPOSE 9300

As you can see, it is pretty much the same as the official Elasticsearch Dockerfile, with a few very minor tweaks such as the Elasticsearch version.

Docker Compose

While in development I use Docker Compose to manage the Docker container. It's a great tool for managing and controlling your containers, and it becomes really useful when you are managing more than one container at a time. My docker-compose.yml file is very basic and sets the build directory, working directory, startup command, volumes and ports. I like to keep attributes that might change in my Docker Compose file so I can make changes without having to rebuild the image.

elasticsearch:
  build: elasticsearch
  working_dir: /data
  command: /elasticsearch/bin/elasticsearch -Des.config=/data/config/elasticsearch.yml
  volumes:
    - /var/lib/docker/<app_name>/elasticsearch/data:/data
    - ./elasticsearch/config:/data/config
  ports:
    - "9200:9200"
    - "9300:9300"

Docker recommends using docker-compose in development and staging environments but not production.
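
Because the compose file publishes port 9200, a script on the host can check when the node has finished starting before it tries to index anything. Below is a minimal sketch of one way to do that in Python. It assumes the container was started with docker-compose up -d and that port 9200 is reachable at localhost (if Docker runs inside a VM, substitute the VM's IP). The names is_ready and wait_for_elasticsearch are illustrative and not part of any library.

```python
import http.client
import json
import time
import urllib.request


def is_ready(health):
    """Return True when a /_cluster/health response reports a usable cluster."""
    # A single node whose indices have replicas configured reports "yellow",
    # so both "green" and "yellow" are treated as ready here.
    return health.get("status") in ("green", "yellow")


def wait_for_elasticsearch(base_url="http://localhost:9200", timeout=60.0, interval=1.0):
    """Poll the cluster health endpoint until it is ready or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(base_url + "/_cluster/health", timeout=5) as resp:
                if is_ready(json.loads(resp.read().decode("utf-8"))):
                    return True
        except (OSError, ValueError, http.client.HTTPException):
            # Connection refused or reset while the container is still starting,
            # or a response body that is not valid JSON yet: try again.
            pass
        time.sleep(interval)
    return False
```

Calling wait_for_elasticsearch() right after docker-compose up -d returns True once the node answers with a green or yellow status, and False if nothing usable comes back within the timeout.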

Elasticsearch Settings

One of the more important parts of the setup is making sure my Elasticsearch configuration file is mounted into the container. The Dockerfile copies a default copy into the image, while the Docker Compose file mounts ./elasticsearch/config at /data/config and points Elasticsearch at it with -Des.config, so I can change settings without rebuilding. I've made a few small changes to the Elasticsearch config file. I recommend reading this article from Loggly's blog as well as this optimization article on the Elasticsearch site. I found both of these extremely useful when configuring my server. I've simply set my cluster name details and made a few minor configuration tweaks. Since I haven't deployed to production and haven't been able to monitor performance, I just wanted to make some of the changes recommended in the above articles. The configuration file will evolve as I deploy to production and am able to look at the server performance and metrics.

path:
  data: /data/data
  logs: /data/log
  plugins: /data/plugins
  work: /data/work

# Cluster information
cluster.name: <application_name>
node.name: <application_name>

# Security settings
script.disable_dynamic: true

# Index settings
action.auto_create_index: false

# Lock the process memory in RAM to get a big speed increase.  This prevents the
# ES server's memory from being swapped out on the node. https://www.elastic.co/guide/en/elasticsearch/reference/current/setup-configuration.html#setup-configuration-memory
bootstrap.mlockall: true

# Disable deleting all indices from the api.
action.disable_delete_all_indices: true

indices.cluster.send_refresh_mapping: false
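
Two of these settings have effects a client can observe. With action.auto_create_index set to false, an index has to be created explicitly before documents can be written to it, and whether bootstrap.mlockall actually took effect is reported by the nodes info API (GET /_nodes/process). The following Python sketch shows both under stated assumptions: the index name "articles" and the shard and replica counts are made-up values for a single development node, and the function names are illustrative only.

```python
import json
import urllib.request


def create_index_request(base_url, index_name, shards=1, replicas=0):
    """Build a PUT request that creates an index explicitly."""
    # Hypothetical settings for a one-node development cluster.
    body = {"settings": {"number_of_shards": shards, "number_of_replicas": replicas}}
    return urllib.request.Request(
        "{}/{}".format(base_url.rstrip("/"), index_name),
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def mlockall_by_node(nodes_info):
    """Map each node name to the mlockall flag from a GET /_nodes/process response."""
    return {
        node.get("name", node_id): node.get("process", {}).get("mlockall")
        for node_id, node in nodes_info.get("nodes", {}).items()
    }
```

Passing create_index_request("http://localhost:9200", "articles") to urllib.request.urlopen sends the request. If mlockall_by_node reports False for a node, the memory lock was not applied, which can happen when the process is not permitted to lock memory inside the container.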

 

Final Thoughts

Overall I've found working with Elasticsearch and Docker to be really helpful. I love the fact that I can easily start and stop the service and have it isolated on my local machine. Doing full re-indexes and upgrades to Elasticsearch has become a lot easier with very minimal work.