Hacks by Ruddra

Docker: Tips on Writing Dockerfile, Reduce Sizes and Build Time of Images

For the past couple of years, I have been creating Dockerfiles for different projects. Based on those experiences, I am going to share some tips on writing Dockerfiles and on reducing the size and build time of images.

FYI: before reading this article, please read the article on the official Docker blog to learn about best practices for creating Dockerfiles.

Look for official Docker Images

When you are trying to run a project in Docker, it's better to use official images rather than writing your own. If you need to install packages that do not come with the official image, you can extend the official image. For example, if you want to install Flask in your Docker container, instead of creating an image from Ubuntu or Alpine Linux, use Python’s official image:

FROM python:3.7-alpine

RUN pip install flask

Reduce the Size of Docker Image

Reducing the size of a Docker image is very important, because before you know it, your image might take up 1-3 GB of space. There are several ways to reduce it. For example:

#1 Use Alpine Base

For most of the official Docker images, there is an Alpine variant. Alpine variants are much more lightweight than their Ubuntu/Debian counterparts. Also, the Alpine base image itself is only about 5 MB.

#2 Delete packages after dependent applications are installed

Let's say you want to install the psycopg2 Python package in your Docker image. To build it, you need packages like postgresql-dev, python3-dev, musl-dev, etc. But the build tools won’t be needed once psycopg2 is installed, so it's better to delete them. On Alpine, you can use the following code:

RUN apk update \
    && apk add --virtual build-deps gcc python3-dev musl-dev \
    && apk add postgresql-dev \
    && pip install psycopg2 \
    && apk del build-deps

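Explanation: here I install the build packages with the --virtual build-deps option, which groups them under a virtual package named build-deps. After psycopg2 is installed, I can remove all of them at once by deleting build-deps. This only saves space because the install and the delete happen in the same RUN instruction.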

#3 Remove cache

Keep the package cache out of an Alpine image by passing --no-cache to apk add. The downside is that the package index is downloaded again every time that layer is rebuilt.
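As a minimal sketch, reusing the gcc and musl-dev packages from the earlier example purely for illustration, the flag goes right after apk add:

```dockerfile
FROM python:3.7-alpine

# --no-cache fetches a fresh package index for this command and does not
# store it in the image, so no separate "apk update" or cleanup is needed.
RUN apk add --no-cache gcc musl-dev
```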

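On Ubuntu or Debian based images, you can get a similar saving by passing --no-install-recommends to apt-get install, which skips optional recommended packages, and by deleting the apt package lists in /var/lib/apt/lists in the same RUN instruction.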
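A rough sketch for a Debian based image follows. The python:3.7-slim tag and the gcc package are only assumptions chosen for illustration. --no-install-recommends skips optional packages, and removing /var/lib/apt/lists in the same RUN drops the downloaded package index from the layer:

```dockerfile
# Assumption: a Debian based official Python image
FROM python:3.7-slim

# Update, install and clean up in one RUN so the package lists
# never end up in a committed layer.
RUN apt-get update \
    && apt-get install -y --no-install-recommends gcc \
    && rm -rf /var/lib/apt/lists/*
```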

You can do the same thing with language package managers, for example pip:

RUN pip install --no-cache-dir django

#4 Reduce layers in Dockerfile

Merge multiple instructions into one where possible, because each layer takes up space in the image. For example, reduce the following:

RUN apk add gcc
RUN apk add python3-dev

To this one:

RUN apk add gcc python3-dev
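Merging matters most when a later step deletes something an earlier step added: files removed in a separate RUN still occupy space in the earlier layer. The sketch below puts the psycopg2 example from tip #2 into a single layer and adds the no-cache flags from tip #3. Treat it as an illustration under those assumptions, not a tested recipe:

```dockerfile
FROM python:3.7-alpine

# Anti-pattern (left as comments): build-deps would stay in the first
# layer even though the second RUN deletes it.
# RUN apk add --no-cache --virtual build-deps gcc python3-dev musl-dev
# RUN apk del build-deps

# Install, use and remove the build tools inside ONE layer instead.
RUN apk add --no-cache --virtual build-deps gcc python3-dev musl-dev \
    && apk add --no-cache postgresql-dev \
    && pip install --no-cache-dir psycopg2 \
    && apk del build-deps
```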

#5 No need to install debug tools

You don’t need debug tools like cURL or Vim in a Docker image (at least not in production). Install only the necessary packages.

Reduce Build Time by Caching Layers

Sometimes it's better to cache layers that take a long time to install or download. For example:

COPY /PLUGIN_DIR/pom.xml .
RUN mvn dependency:go-offline
RUN mvn package

In this code, I use an extra layer, RUN mvn dependency:go-offline. When the image is built for the first time, this layer executes to resolve dependencies. On subsequent builds the layer is not executed again because it is already cached, as long as pom.xml has not changed. This reduces build time significantly.
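The caching only pays off if the source code is copied after the dependency layer. Here is a fuller sketch of that ordering. The maven:3-jdk-8 base image, the /app working directory, and a pom.xml and src directory at the root of the build context are all assumptions made for illustration:

```dockerfile
# Assumption: an official Maven image tag
FROM maven:3-jdk-8
WORKDIR /app

# Copy only the build descriptor first. This layer and the next one
# are reused from cache until pom.xml changes.
COPY pom.xml .
RUN mvn dependency:go-offline

# Source changes invalidate the cache only from this point on.
COPY src ./src
RUN mvn package
```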

Here is another example:

RUN pip3 install \
    pandas==0.25.2 \
    numpy==1.17.3 \
    psycopg2==2.8.4 \
    gensim==3.8.1 

RUN pip3 install -r requirements.pip

Here you can see I am using an extra layer to install pandas, numpy, gensim, etc., and that layer will be cached after the first build.

That's it for now. I will add more to this article or to future articles, so stay tuned :). Finally, feel free to share your feedback in the comment section below.
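For the second layer to be cached as well, requirements.pip has to be copied on its own before the rest of the project. Below is a sketch of the full ordering, assuming a python:3.7 base image, an /app working directory, and a requirements.pip file at the root of the build context:

```dockerfile
# Assumption: the Debian based official Python image
FROM python:3.7
WORKDIR /app

# Heavy, rarely changing packages get their own cached layer.
RUN pip3 install --no-cache-dir \
    pandas==0.25.2 \
    numpy==1.17.3

# Copy only the requirements file, so this layer is rebuilt only
# when the file itself changes.
COPY requirements.pip .
RUN pip3 install --no-cache-dir -r requirements.pip

# Application code goes last because it changes most often.
COPY . .
```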

Cheers!!
