Distributing confidential Docker images

Here’s another pet peeve of mine with Docker: the infrastructure for distributing images is simply Not There Yet. What do we have now? There’s a public image index (I still don’t fully get the distinction between the index and the registry, but it looks like a way for DotCloud to keep a centralized service that is needed even for private images). I can run my own registry, either keeping access completely open (limited only by IP address or network interface), or delegating authentication to DotCloud’s central index. Even if I choose to authenticate against the index, there doesn’t seem to be any way to actually limit access to the registry: it looks like anyone who has an index account and HTTP(S) access to the registry can download or push images.
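
To make this concrete, “running my own registry” boils down to something like the following; a rough sketch, assuming the stock registry image from the index and a Docker version that can bind a published port to a specific interface:

```
# Completely open registry: anyone who can reach port 5000 can pull and push.
docker run -d -p 5000:5000 registry

# "Restricted" registry: bound to the loopback interface only, so access is
# limited by where the client runs, not by who the client is.
docker run -d -p 127.0.0.1:5000:5000 registry

# Using it means tagging images with the registry's address:
docker tag my-image localhost:5000/my-image
docker push localhost:5000/my-image
```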

There doesn’t seem to be any way in the protocol to authenticate users against anything other than the central index – not even plain HTTP auth. Just to get HTTPS, I need to put Apache or nginx in front of the registry. And did I mention that there is no way to move full images between Docker hosts without a registry, not even a tarball export?
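
For what it’s worth, the HTTPS part is just a plain reverse proxy; a minimal sketch, assuming the registry listens on 127.0.0.1:5000, with the hostname and certificate paths as placeholders:

```
server {
    listen 443 ssl;
    server_name registry.example.com;

    ssl_certificate     /etc/nginx/ssl/registry.crt;
    ssl_certificate_key /etc/nginx/ssl/registry.key;

    # Image layers can be big; don't let nginx cap the upload size.
    client_max_body_size 0;

    location / {
        proxy_pass http://127.0.0.1:5000;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```

That gives me transport security, but it still does nothing about deciding who may pull or push.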

I fully understand that Docker is still in development, and the fact that these are the problems I’m hitting means there aren’t many bigger showstopper issues, which is actually good. However, this seems to seriously limit Docker’s usefulness in production environments: I either have to give up control over who is able to download my images, or I have to build the image locally on each Docker host, which prevents me from building an image once, testing it, and then using the very same image everywhere.

And the distribution problem is not only about in-house, confidential software. A lot of open source projects run on Java (off the top of my head: Jenkins, RunDeck, Logstash + Elasticsearch, almost anything from the Apache Software Foundation…). While I support OpenJDK with all my heart, Oracle’s JVM still wins in terms of performance and reliability, and Oracle’s JVM may not be distributed except internally within an organization. I may also want to keep my Docker images partially configured: the software is open, but I’d prefer not to publish internal passwords, access keys, or IP addresses.

I hope that in the long run it will be possible to exchange images in different ways (plain old rsync, distribution via BitTorrent, a git-annex network, shared filesystems… I could go on and on). Right now I have found only one way, and it doesn’t seem obvious, so I want to share it. Here it is:

Docker’s registry server doesn’t keep any local data; everything it knows lives in its storage backend (an on-disk directory, or an Amazon S3 bucket). This means it’s possible to run the registry locally (on 127.0.0.1) and move access control to the storage backend: you don’t control Docker’s access to the registry, but the registry’s access to the storage.

The shared storage may be a shared filesystem (GlusterFS, or even NFS), an automatically synced directory on disk, or – my preferred option – a shared S3 bucket. Each Docker host runs its own registry, attached to the same bucket, with a read-only key pair (to make sure it can’t overwrite tags or push images). The central server that is allowed to build and tag images is the only one with write access. Images stay confidential, and there is even a crude form of access control (read-only vs. read-write). It’s not the best-performing way to distribute images, but it gets the job done until there’s a more direct way to export and import a whole image.
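
Concretely, each Docker host runs something along these lines; a sketch, assuming the standard registry image with its S3 storage backend, where the bucket name, the keys, and the exact environment variable names are placeholders to check against your registry version:

```
# On every "consumer" host: a loopback-only registry whose only knowledge
# of the world is the shared S3 bucket, using a READ-ONLY key pair.
docker run -d -p 127.0.0.1:5000:5000 \
  -e SETTINGS_FLAVOR=s3 \
  -e AWS_BUCKET=my-private-images \
  -e STORAGE_PATH=/registry \
  -e AWS_KEY=$READ_ONLY_KEY \
  -e AWS_SECRET=$READ_ONLY_SECRET \
  registry

# Pulling goes through the local registry:
docker pull localhost:5000/my-image

# On the single build host: the same registry, but with a key pair that is
# allowed to write to the bucket.
docker run -d -p 127.0.0.1:5000:5000 \
  -e SETTINGS_FLAVOR=s3 \
  -e AWS_BUCKET=my-private-images \
  -e STORAGE_PATH=/registry \
  -e AWS_KEY=$READ_WRITE_KEY \
  -e AWS_SECRET=$READ_WRITE_SECRET \
  registry

docker tag my-image localhost:5000/my-image
docker push localhost:5000/my-image
```

The “read-only key pair” is then nothing more than an S3 key whose policy only allows listing and reading the bucket, roughly like this (the bucket name is again a placeholder):

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket", "s3:GetObject"],
      "Resource": [
        "arn:aws:s3:::my-private-images",
        "arn:aws:s3:::my-private-images/*"
      ]
    }
  ]
}
```

The build server’s key pair gets the same policy plus the write actions (s3:PutObject and friends) on the same bucket.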

I hope this approach is useful; have fun with it!