Understanding the Dockerfile VOLUME statement

Docker volumes are used to store persistent data outside of their containers. They allow configuration files, databases, and caches used by your application to survive individual container instances.

Volumes can be mounted when you start containers with the docker run command’s -v flag. This can either reference a named volume or bind mount a host directory into the container’s file system.

It is also possible to define volumes at image build time by using the VOLUME instruction in your Dockerfiles. This mechanism guarantees that containers started from the image will have persistent storage available. In this article, you’ll learn how to use the instruction and the use cases where it makes sense.

Defining Volumes in Dockerfiles

The Dockerfile VOLUME instruction creates a volume mount point at a specified container path. A volume will be mounted from your Docker host’s file system each time a container is started.

The Dockerfile in the following example defines a volume at the /opt/app/data container path. New containers will automatically mount a volume to that directory.

FROM ubuntu:22.04
VOLUME /opt/app/data

Build your image so you can test the volume mount:

$ docker build -t volumes-test:latest .

Retrieve the list of existing volumes for reference:

$ docker volume ls
DRIVER   VOLUME NAME
local    demo-volume

Now start a container using your test image:

$ docker run -it volumes-test:latest
root@<container-id>:/#

Repeat the docker volume ls command to confirm that a new volume has been created:

$ docker volume ls
DRIVER   VOLUME NAME
local    3198bf857fdcbb8758c5ec7049f2e31a40b79e329f756a68725d83e46976b7a8
local    demo-volume

Exit the shell of your test container so that the container stops:

root@<container-id>:/# exit
exit

The volume and its data will still persist:

$ docker volume ls
DRIVER   VOLUME NAME
local    3198bf857fdcbb8758c5ec7049f2e31a40b79e329f756a68725d83e46976b7a8
local    demo-volume

You can define multiple volumes in one statement as a space-delimited string or a JSON array. The following two forms create and mount two unique volumes when the containers are started:

VOLUME /opt/app/data /opt/app/config
# OR
VOLUME ["/opt/app/data", "/opt/app/config"]

Populating initial volume content

Volumes are automatically populated with any content placed in the mount directory by earlier image build steps:

FROM ubuntu:22.04
COPY default-config.yaml /opt/app/config/default-config.yaml
VOLUME /opt/app/config

This Dockerfile defines a volume that will be initialized with the existing default-config.yaml file. The container will be able to read /opt/app/config/default-config.yaml without having to check whether the file exists.

Changes to a volume’s content made after the VOLUME instruction will be discarded. In this example, the default-config.yaml file is still available after containers are started because the rm command comes after /opt/app/config is marked as a volume. Note that Docker’s Dockerfile reference describes this as the behavior of the legacy builder and states that BuildKit keeps such changes instead, so the result can depend on which builder you use.

FROM ubuntu:22.04
COPY default-config.yaml /opt/app/config/default-config.yaml
VOLUME /opt/app/config
RUN rm /opt/app/config/default-config.yaml

Overriding VOLUME statements when starting a container

Volumes created by the VOLUME instruction are automatically named with a long unique hash. Their names cannot be changed, so it can be difficult to identify which volumes are actively being used by your containers.
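One way to tell these hash-named volumes apart from named ones is to filter volume names by their shape. The sketch below assumes that anonymous volume names are always 64 lowercase hexadecimal characters, as in the listing shown earlier. The function name filter_anonymous_volumes is hypothetical and not part of Docker.

```shell
# Hypothetical helper: reads volume names on stdin, one per line, and prints
# only those that look like anonymous volumes.
# Assumption: anonymous volume names are exactly 64 lowercase hex characters.
filter_anonymous_volumes() {
  # "|| true" keeps the exit status at 0 when nothing matches.
  grep -E '^[0-9a-f]{64}$' || true
}

# Usage (requires a running Docker daemon):
#   docker volume ls -q | filter_anonymous_volumes
# To list the containers, running or stopped, that mount a given volume:
#   docker ps -a --filter volume=<volume-name>
```

A name-based filter is only a heuristic: a volume that someone deliberately named with 64 hex characters would also match.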

You can prevent these volumes from appearing by manually defining volumes for your containers with docker run -v as usual. The following command explicitly mounts a named volume to the container’s /opt/app/config directory, making the Dockerfile’s VOLUME instruction redundant.

$ docker run -it -v config:/opt/app/config volumes-test:latest

When should you use VOLUME statements?

VOLUME instructions can be useful in situations where you want to enforce persistence, such as in images that package a database server or file store. Using VOLUME instructions makes it easier to start containers without having to remember which -v flags to apply.

VOLUME also serves as documentation of the container paths that store persistent data. Including these instructions in your Dockerfile lets anyone determine where your container stores its data, even if they are unfamiliar with your application.

VOLUME pitfalls

VOLUME is not without drawbacks. Its biggest problem is how it interacts with image builds. Using an image with a VOLUME instruction as the base image of your build will behave unexpectedly if you change content inside the volume’s mount point.

The gotcha from earlier still applies: the effects of commands that come after the VOLUME instruction will be discarded. Because VOLUME resides in the base image, everything in your own Dockerfile comes after the instruction, so you can’t modify the directory’s default contents. Behind the scenes, starting the temporary container for the build creates a new volume on your host that is destroyed at the end of the build. The changes will not be copied back to the output image.

Automatic volume mounting can also be problematic in other situations. Users may sometimes prefer to start a temporary container without any volumes, perhaps for evaluation or debugging purposes. VOLUME removes this possibility because automatic mounts cannot be disabled. As a result, many redundant volumes can accumulate on the host if containers that use the instruction are started regularly.
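If leftover volumes become a nuisance, they can be cleaned up with standard CLI options. The sketch below relies on docker run --rm, which removes a container’s anonymous volumes together with the container when it exits. The DOCKER variable and the run_disposable function name are hypothetical, included only so that the command can be previewed with DOCKER=echo before running it for real.

```shell
# DOCKER is a hypothetical override for previewing commands, e.g. DOCKER=echo.
run_disposable() {
  # --rm removes the container, and its anonymous volumes, when it exits.
  "${DOCKER:-docker}" run --rm "$@"
}

# Usage (requires a running Docker daemon):
#   run_disposable -it volumes-test:latest
# Remove unused volumes left behind by earlier containers (asks to confirm):
#   docker volume prune
```

Note that docker volume prune only deletes volumes that no container references, so volumes attached to stopped containers remain until those containers are removed.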

Summary

Dockerfile VOLUME instructions allow volume mounts to be defined at image build time. They guarantee that containers started from the image will have persistent data storage available, even if the user omits the docker run command’s -v flag.

This behavior can be useful for images where persistence is critical or many volumes are needed. However, the VOLUME instruction also breaks some user expectations and introduces unique behaviors, so it needs to be used carefully. Providing a Docker Compose file that automatically creates the required volumes is often a better solution.
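As a rough illustration of the Compose alternative, the following writes a minimal Compose file that mounts a named volume at the config path used in this article. The service name app and the volume name config are assumptions made for the example, and the image is the test image built earlier.

```shell
# Write a minimal Compose file; "app" and "config" are illustrative names.
cat > compose.yaml <<'EOF'
services:
  app:
    image: volumes-test:latest
    volumes:
      # Named volume mounted at the path the image would otherwise
      # receive an anonymous volume for.
      - config:/opt/app/config

volumes:
  config:
EOF

# Usage (requires Docker with the Compose plugin):
#   docker compose up -d
```

With this approach the volume has a readable name and is declared alongside the service, so users do not need to remember any -v flags.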