Docker - Volumes - Persistent Storage

TODO…

The approach that seems to work best for production is to use a data-only container.

The data-only container runs on a bare-bones image and does nothing except expose a data volume.

Then you can run any other container with access to the data container's volumes:

docker run --volumes-from data-container some-other-container command-to-execute
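The command above assumes the data container already exists. Below is a minimal sketch of creating it first and then attaching to it. The busybox image, the /data path, and the helper names create_data_container and run_with_data are illustrative assumptions, not something the original notes prescribe.

```shell
# Create a data-only container: it declares a volume and never needs to run.
# $1 = container name, $2 = volume path inside the container (both illustrative).
create_data_container() {
    docker create -v "$2" --name "$1" busybox /bin/true
}

# Run a one-off command in another image with the data container's volumes attached.
# $1 = data container name, $2 = image, remaining arguments = command to execute.
run_with_data() {
    data_container=$1
    image=$2
    shift 2
    docker run --rm --volumes-from "$data_container" "$image" "$@"
}

# Example usage (requires a running Docker daemon):
#   create_data_container data-container /data
#   run_with_data data-container busybox ls /data
```

The data container only has to exist, not run, which is why docker create is enough here.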

Here you can get a good picture of how to arrange the different containers.
Here is a good insight into how volumes work.

UPDATE:

In this blog post there is a good description of the so-called container-as-volume pattern, which clarifies the main point of having data-only containers.

UPDATE 2:

The Docker documentation now has the DEFINITIVE description of the container as volume/s pattern.

UPDATE 3:

Updated docs with backup/restore procedure

BACKUP:

sudo docker run --rm --volumes-from DATA -v $(pwd):/backup busybox tar cvf /backup/backup.tar /data

--rm: remove the container when it exits
--volumes-from DATA: attach to the volumes shared by the DATA container
-v $(pwd):/backup: bind mount the current directory into the container, so the tar file can be written to it
busybox: a small, simple image, good for quick maintenance
tar cvf /backup/backup.tar /data: creates an uncompressed tar file of all the files in the /data directory
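The backup command can be wrapped in a small helper so the container name and archive name are parameters. This is only a sketch: backup_volume is a hypothetical name, and it assumes, as in the command above, that the volume is mounted at /data.

```shell
# Archive the /data volume of a data container into a tar file
# in the current host directory.
# $1 = data container name, $2 = archive file name (both illustrative).
backup_volume() {
    docker run --rm --volumes-from "$1" -v "$(pwd)":/backup busybox \
        tar cvf "/backup/$2" /data
}

# Example usage (requires a running Docker daemon):
#   backup_volume DATA "backup-$(date +%Y%m%d).tar"
```

Putting a date in the archive name, as in the usage comment, keeps successive backups from overwriting each other.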

RESTORE:

# create a new data container
$ sudo docker run -v /data --name DATA2 busybox true
# untar the backup files into the new container's data volume
$ sudo docker run --rm --volumes-from DATA2 -v $(pwd):/backup busybox tar xvf /backup/backup.tar
data/
data/sven.txt
# compare to the original container
$ sudo docker run --rm --volumes-from DATA -v `pwd`:/backup busybox ls /data
sven.txt
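The restore steps can be sketched as two helpers: one that lists the archive before anything is extracted, and one that creates the new data container and unpacks into it. The names list_backup and restore_volume are hypothetical, and the sketch assumes the archive sits in the current directory and was created from /data as shown earlier.

```shell
# List the archive contents without extracting, to check it before restoring.
# $1 = archive file name in the current directory.
list_backup() {
    docker run --rm -v "$(pwd)":/backup busybox tar tvf "/backup/$1"
}

# Create a fresh data container and extract the archive into its /data volume.
# $1 = new data container name, $2 = archive file name.
# tar stores the members as data/..., and busybox starts in /, so the files
# land back in /data.
restore_volume() {
    docker run -v /data --name "$1" busybox true &&
    docker run --rm --volumes-from "$1" -v "$(pwd)":/backup busybox \
        tar xvf "/backup/$2"
}

# Example usage (requires a running Docker daemon):
#   list_backup backup.tar
#   restore_volume DATA2 backup.tar
```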

UPDATE 4:

A nice article from the excellent Brian Goff explaining why it is good to use the same image for a container and a data container.

UPDATE 5:

Docker 1.9.0 has a new volume API!

docker volume create --name hello
docker run -d -v hello:/container/path/for/volume container_image my_command

This means that the data-only container pattern must be abandoned in favour of the new volumes.

In practice, the volume API is simply a better way to achieve what the data-container pattern did.

If you create a container with -v volume_name:/container/fs/path, Docker will automatically create a named volume for you that can:

Be listed through docker volume ls.
Be identified through docker volume inspect volume_name.
Be backed up as a normal dir.
Be backed up as before through a --volumes-from connection.
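The points in the list above can be sketched as follows, assuming a named volume such as hello. The helper names are hypothetical. Mounting the volume read-only by name is a variation on the --volumes-from backup, not the exact procedure from the notes.

```shell
# Print the host directory that backs a named volume
# (this is the "normal dir" that can be backed up directly).
# $1 = volume name.
volume_mountpoint() {
    docker volume inspect --format '{{ .Mountpoint }}' "$1"
}

# Archive a named volume by mounting it read-only into a throwaway container.
# $1 = volume name, $2 = archive file name written to the current directory.
backup_named_volume() {
    docker run --rm -v "$1":/data:ro -v "$(pwd)":/backup busybox \
        tar cvf "/backup/$2" /data
}

# Example usage (requires a running Docker daemon):
#   volume_mountpoint hello
#   backup_named_volume hello hello.tar
```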

The new volume API adds a useful command that lets you identify dangling volumes:

docker volume ls -f dangling=true

And then remove each one by its name:

docker volume rm <volume name>
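The two commands can be combined to clean up every dangling volume in one pass. This is a sketch: remove_dangling_volumes is a hypothetical helper, and it relies on the -q option of docker volume ls, which prints only the volume names.

```shell
# Remove every dangling volume, one at a time.
# docker volume ls -q prints one volume name per line.
remove_dangling_volumes() {
    docker volume ls -q -f dangling=true | while IFS= read -r name; do
        docker volume rm "$name"
    done
}

# Example usage (requires a running Docker daemon):
#   remove_dangling_volumes
```

Removing the volumes one by one means a failure on one volume does not stop the others from being removed.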

References

http://stackoverflow.com/questions/18496940/how-to-deal-with-persistent-storage-e-g-databases-in-docker?rq=1

http://txt.fliglio.com/2013/11/creating-a-mysql-docker-container/