Persistent Storage in Docker: Volumes, Bind Mounts, and tmpfs

Introduction

As developers, we often run stateless containers in development or production, where containers can be destroyed and recreated without any loss of state. However, many real-world applications require stateful behavior—whether for databases, file storage, or logs—that persists across container restarts, re-deployments, or updates. This is where Docker volumes and persistent storage come into play.

In this blog, we’ll take a deep dive into Docker volumes, covering how they work, different types of storage options, when and how to use them, and advanced topics like volume management and data persistence in multi-container applications.


Why Persistent Storage?

Docker containers are ephemeral by nature. When a container is destroyed, its filesystem is also destroyed. This stateless behavior is excellent for immutable application deployments, but for many cases—like running a database, saving logs, or storing uploaded files—state persistence is essential.

Without persistent storage, every time your container is recreated, any data it wrote during its previous lifecycle is lost. For example:

  • If you're running a MySQL container, the database is wiped out when the container is removed.

  • If you're storing logs or user-uploaded content, those files disappear when the container is removed. (Stopping and restarting the same container keeps its writable layer; removing the container does not.)

By leveraging persistent storage, we can ensure that our data persists beyond the lifecycle of individual containers.


Types of Persistent Storage in Docker

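Docker provides three primary ways to mount storage into a container. The first two persist data; the third is deliberately temporary:

  1. Volumes: Managed by Docker and stored in a Docker-controlled directory on the host (by default under /var/lib/docker/volumes/ on Linux).

  2. Bind mounts: Link a file or directory on the host directly to a container.

  3. tmpfs mounts: Store data in the host's memory only; nothing is written to the host's filesystem.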

1. Volumes

Volumes are the preferred method for persisting data in Docker. Docker manages them entirely, so they do not depend on the host's directory layout, which makes them easier to move, back up, and share than bind mounts.

Key Features of Docker Volumes:
  • Volumes are stored outside the container's writable layer, so they remain intact even if the container is deleted.

  • Docker takes care of the management and mounting of volumes, offering an abstraction over the host filesystem.

  • Volumes work on both Linux and Windows environments.

  • They can be shared across multiple containers.

Creating and Using Volumes

You can create and manage Docker volumes using the following commands:

  • Creating a Volume:

      docker volume create myvolume
    
  • Listing Volumes:

      docker volume ls
    
  • Inspecting a Volume:

      docker volume inspect myvolume
    
  • Attaching a Volume to a Container: You can mount a volume when running a container using the -v or --mount flag.

      docker run -d -v myvolume:/data my-container
    

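In this example, the myvolume volume is mounted inside the container at /data (my-container stands in for your image name). Anything written under /data lives in the volume, so it survives even if the container is removed and a new one is started with the same volume.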
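As a minimal sketch of that behavior, the helper below writes a file through one short-lived container and reads it back from a second one, using the more explicit --mount syntax. The function name persistence_demo, the default volume name demo_vol, and the alpine image are assumptions chosen for illustration, not part of the original example.

```shell
# Hypothetical helper: writes a file through one container and reads it
# back from a second one. Volume name and image are illustrative only.
persistence_demo() {
  vol="${1:-demo_vol}"

  # The first container writes into the volume, then is removed (--rm).
  docker run --rm --mount "type=volume,source=$vol,target=/data" \
    alpine sh -c 'echo "hello from container 1" > /data/greeting.txt'

  # A brand-new container sees the same file, because the volume lives
  # outside any container's writable layer.
  docker run --rm --mount "type=volume,source=$vol,target=/data" \
    alpine cat /data/greeting.txt
}
```

Calling persistence_demo myvolume should print the greeting from the second container. Docker creates the named volume on first use if it does not exist yet.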

Volume Lifecycle Management

Volumes are persistent by default, even after the container using them is removed. However, if you want to delete a volume when it’s no longer needed, you can do so manually:

  • Removing a Volume:

      docker volume rm myvolume
    
  • Removing all Unused Volumes:

      docker volume prune
    

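On Docker 23.0 and later, this removes only anonymous volumes that no container is using; add --all (-a) to remove unused named volumes as well. Older releases remove all unused volumes by default.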


2. Bind Mounts

Bind mounts are an alternative to volumes, allowing you to mount a directory from the host filesystem directly into a container. This gives the container access to files on the host, making it especially useful in development environments when you want to share code or configuration files between the host and container.

Key Differences from Volumes:
  • Bind mounts have no abstraction; the path you specify is mounted directly into the container.

  • With --mount, the source path must already exist on the host, or Docker reports an error. With -v, Docker creates a missing path as an empty directory.

  • Bind mounts are less portable and flexible than volumes, as they are tied to specific paths on the host machine.

Creating a Bind Mount:

You can create a bind mount by specifying the full path on the host that should be mounted into the container:

docker run -d --mount type=bind,source=/path/on/host,target=/app my-container

In this example, the /path/on/host directory is mounted into the container at /app. Any changes made to files in /app are directly reflected in /path/on/host, and vice versa.
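When the container only needs to read the host files (configuration, for instance), the mount can be made read-only. The sketch below assumes a hypothetical helper name, run_with_readonly_bind, and reuses the placeholder image my-container from the example above.

```shell
# Hypothetical helper: mounts a host directory read-only at /app.
# "my-container" is the placeholder image name used in this article.
run_with_readonly_bind() {
  # --mount refuses to start when the source path is missing, so check
  # first and fail with a clearer message.
  if [ ! -d "$1" ]; then
    echo "error: $1 is not a directory" >&2
    return 1
  fi

  # Bind mount sources must be absolute paths; resolve relative input.
  host_dir=$(cd "$1" && pwd) || return 1

  # "readonly" stops the container from modifying the host files.
  docker run -d \
    --mount "type=bind,source=$host_dir,target=/app,readonly" \
    my-container
}
```

For example, run_with_readonly_bind ./config would expose the host's config directory at /app without letting the container change it.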


3. tmpfs Mounts

tmpfs mounts store data in memory rather than on disk, and are available only when running Docker on Linux. Any data written to a tmpfs mount is lost when the container stops. They are useful when you need fast temporary storage that should not persist after the container shuts down, such as for sensitive information or caching.

Using tmpfs Mounts:
docker run -d --mount type=tmpfs,tmpfs-size=64M,tmpfs-mode=1777,target=/cache my-container

In this example, a tmpfs mount of size 64 MB is created at /cache, and the data written to it is held in the host's memory rather than in the container's writable layer.
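One way to confirm the mount behaves as described is to start a throwaway container and ask df what backs the target path. This is only a sketch: the helper name show_tmpfs and the alpine image are assumptions, not part of the original command.

```shell
# Hypothetical helper: starts a throwaway container with a tmpfs mount
# and prints the filesystem that backs the target path.
show_tmpfs() {
  size="${1:-64M}"
  docker run --rm \
    --mount "type=tmpfs,target=/cache,tmpfs-size=$size,tmpfs-mode=1777" \
    alpine df -h /cache
}
```

Running show_tmpfs 32M should list a tmpfs filesystem mounted on /cache with roughly the requested size.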


Advanced Topics in Docker Volumes

1. Sharing Volumes Between Multiple Containers

In a typical microservices architecture, multiple containers may need to share access to the same data. Docker volumes provide an easy way to share files and directories between containers.

For example, let’s say you have two containers, one for generating data and one for consuming it. You can use a shared volume between them:

docker run -d --name producer -v sharedvolume:/data producer-container
docker run -d --name consumer -v sharedvolume:/data consumer-container

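In this case, both the producer and consumer containers can access the data stored in /data, allowing them to share state without needing to pass data over the network. Docker does not coordinate concurrent access, so the applications themselves must handle file locking or write ordering.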
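If only one side should write, the reader's mount can be made read-only with the :ro suffix. The sketch below wraps the two commands in a hypothetical helper, start_producer_consumer, and reuses the placeholder image names from the example above.

```shell
# Hypothetical helper: starts a writer and a read-only reader on one volume.
# Image names are the placeholders used in the example above.
start_producer_consumer() {
  vol="${1:-sharedvolume}"

  # The producer gets the default read-write mount.
  docker run -d --name producer -v "$vol:/data" producer-container

  # The trailing :ro makes the mount read-only inside the consumer.
  docker run -d --name consumer -v "$vol:/data:ro" consumer-container
}
```

With this setup, a bug in the consumer cannot corrupt the producer's files, since writes to /data fail inside the consumer.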

2. Volume Drivers

Volume drivers allow you to store data outside of the Docker host, such as in cloud storage solutions or network file systems. This is useful for highly available, distributed applications that require persistent storage across multiple hosts.

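Some common options include:

  • local: The default built-in driver, which stores volumes on the local filesystem of the Docker host. Through mount options it can also mount remote filesystems such as NFS.

  • Third-party volume plugins: Installed separately, these provide drivers for network or distributed filesystems (GlusterFS, for example) and for cloud storage.

Using a Volume Driver:
docker volume create --driver local --opt type=nfs --opt o=addr=host.docker.internal,rw --opt device=:/data nfs-volume

This example creates a volume backed by an NFS share, using the local driver's NFS support (Docker has no built-in driver named nfs). Here host.docker.internal and :/data stand in for your NFS server's address and exported path. Containers on different hosts can access the same data if each host creates a volume pointing at the same export.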


Ensuring Data Persistence Across Container Restarts

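Ensuring data persistence across container restarts and re-creation is essential in production environments, especially when running databases or any service that writes to disk. A local volume protects against losing a container, not against losing the host; for that you need backups or a remote volume driver.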

Let’s look at an example of how to persist data for a MySQL container:

docker run -d \
  --name mysql \
  -e MYSQL_ROOT_PASSWORD=root \
  -v mysql_data:/var/lib/mysql \
  mysql:latest

In this example, the mysql_data volume is mounted at /var/lib/mysql, the directory where MySQL stores its data. Even if the container is stopped and removed, the volume will retain the database files, ensuring that your data is not lost.
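To see this in practice, the container can be removed and started again on the same volume. The helper name recreate_mysql below is hypothetical; the container name, volume, and image are taken from the example above.

```shell
# Hypothetical helper: replaces the mysql container while keeping its data.
recreate_mysql() {
  # Stop and remove the container. The mysql_data volume is left
  # untouched, because docker rm does not delete named volumes.
  docker stop mysql && docker rm mysql || return 1

  # Start a fresh container on the same volume. MySQL finds its existing
  # data directory and skips first-time initialization.
  docker run -d \
    --name mysql \
    -e MYSQL_ROOT_PASSWORD=root \
    -v mysql_data:/var/lib/mysql \
    mysql:latest
}
```

One consequence worth knowing: the official image applies MYSQL_ROOT_PASSWORD only when it initializes an empty data directory, so changing the value here does not change the password of an existing database.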


Backing Up and Restoring Docker Volumes

In production, data integrity is critical, so understanding how to back up and restore volumes is an important skill.

Backing Up a Volume:

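To back up a Docker volume, you can use the docker run command to create a tar archive of the volume contents. For a database, stop the container that uses the volume first so the files are in a consistent state:

docker run --rm -v mysql_data:/data -v "$(pwd)":/backup busybox tar cvf /backup/mysql_data_backup.tar -C /data .

This command mounts the mysql_data volume to /data inside a busybox container, then creates a tarball of the volume’s contents and stores it in the current directory on the host. The -C /data . arguments store paths relative to the volume root, so the archive does not contain a leading data/ directory. Quoting "$(pwd)" keeps the command working when the current path contains spaces.

Restoring a Volume:

To restore a backup, you can use a similar command to extract the tar archive back into the volume (Docker creates the volume if it does not exist yet):

docker run --rm -v mysql_data:/data -v "$(pwd)":/backup busybox tar xvf /backup/mysql_data_backup.tar -C /data

This command extracts the backup tarball into the mysql_data volume, restoring the files to their original locations inside it.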
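For repeated backups it can help to wrap these commands in small functions that add a timestamp to the file name. The following is a sketch under stated assumptions: the function names (backup_name, backup_volume, restore_volume) and the naming scheme are hypothetical, and the archive is written relative to the volume root so it extracts straight into /data.

```shell
# Hypothetical helpers; the names and the file naming scheme are
# illustrative choices, not a standard.

# Prints a file name such as mysql_data-20240131-093000.tar
backup_name() {
  printf '%s-%s.tar\n' "$1" "$(date +%Y%m%d-%H%M%S)"
}

# Usage: backup_volume VOLUME [DEST_DIR]   (DEST_DIR must be absolute)
backup_volume() {
  vol="$1"
  dest="${2:-$PWD}"
  file="$(backup_name "$vol")"
  # Mount the volume read-only so the backup cannot modify it, and
  # archive paths relative to /data so they restore cleanly.
  docker run --rm -v "$vol:/data:ro" -v "$dest:/backup" busybox \
    tar cf "/backup/$file" -C /data . &&
    echo "$dest/$file"
}

# Usage: restore_volume VOLUME ARCHIVE_PATH
restore_volume() {
  vol="$1"
  archive_dir="$(cd "$(dirname "$2")" && pwd)" || return 1
  archive_file="$(basename "$2")"
  docker run --rm -v "$vol:/data" -v "$archive_dir:/backup:ro" busybox \
    tar xf "/backup/$archive_file" -C /data
}
```

A typical round trip would be backup_volume mysql_data /srv/backups followed later by restore_volume mysql_data_restored with the path that the first call printed.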


Cleaning Up Unused Volumes

Docker volumes can accumulate over time, especially when creating and destroying containers during development. To free up disk space, it’s important to periodically clean up unused volumes.

Removing Unused Volumes:

docker volume prune

This command removes volumes that are not used by any container (only anonymous ones by default on Docker 23.0 and later, unless you pass --all). Be careful with this command, as it permanently deletes the data in the volumes it removes.
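Because pruning is irreversible, it can be worth listing the candidates first. The sketch below uses a hypothetical helper, prune_volumes_interactive, which shows volumes that no container references and asks for confirmation before pruning. Any extra arguments, such as --all, are passed through to docker volume prune.

```shell
# Hypothetical helper: lists unreferenced volumes and asks before pruning.
prune_volumes_interactive() {
  echo "Volumes not referenced by any container:"
  docker volume ls --filter dangling=true

  printf 'Remove unused volumes? [y/N] '
  read -r answer
  case "$answer" in
    # --force skips Docker's own prompt, since we already asked.
    y|Y) docker volume prune --force "$@" ;;
    *)   echo "Nothing removed." ;;
  esac
}
```

Note that the listing includes named volumes, which newer Docker releases keep unless the helper is called as prune_volumes_interactive --all.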


Conclusion

In this blog, we’ve explored the intricacies of persistent storage in Docker, focusing on Docker volumes, bind mounts, and tmpfs mounts. We’ve also covered advanced topics such as sharing volumes between containers, using volume drivers for distributed storage, and backing up/restoring volumes in production environments.

Docker volumes provide a powerful and flexible way to manage persistent data in containerized applications. By understanding the different storage options and best practices for managing volumes, you can build stateful applications that are resilient, scalable, and easy to manage.

In the next blog, we’ll dive into Docker Compose and how it simplifies the orchestration of multi-container applications.


Key Takeaways for Developers:

  • Use Docker Volumes for data persistence, as they are managed by Docker and can be easily shared across containers.

  • Bind Mounts give containers direct access to host files, which is convenient in development, but they are less portable because they tie containers to specific host directories.

  • tmpfs Mounts provide fast, in-memory storage for temporary data.

  • Always back up critical volumes and clean up unused volumes to maintain storage efficiency.
