Basic

Specification of image:

remote-dockerhub.com/Namespace/bar:latest
        ^               ^       ^   ^
        |               |       |   |
       hub         namespace  repo  tag

You can use commands below to save/load images:

docker save -o busybox.tar busybox
docker load -i busybox.tar

Lifecycle of one image:

Screen Shot 2018-11-17 at 9.13.24 PM.png

Structure of Image

Docker image is a read-only template and supplies rootfs for container.

One image is made up of

  • meta-data (JSON)
  • data (image layers)

My environment:

root@ubuntu:~# uname -a

Linux ubuntu 4.15.0-29-generic #31~16.04.1-Ubuntu \
SMP Wed Jul 18 08:54:04 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

root@ubuntu:~# docker version

Client:
 Version:           18.09.0
 API version:       1.39
 Go version:        go1.10.4
 Git commit:        4d60db4
 Built:             Wed Nov  7 00:48:57 2018
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          18.09.0
  API version:      1.39 (minimum version 1.12)
  Go version:       go1.10.4
  Git commit:       4d60db4
  Built:            Wed Nov  7 00:16:44 2018
  OS/Arch:          linux/amd64
  Experimental:     false

root@ubuntu:~# docker info | grep Storage
Storage Driver: overlay2

So the storage driver is overlay2. The following is from Use the OverlayFS storage driver#How the overlay2 driver works and Docker存储驱动—Overlay/Overlay2「译」.

Let’s pull a image of nginx:

root@ubuntu:~# docker pull nginx

Using default tag: latest
latest: Pulling from library/nginx
a5a6f2f73cd8: Pull complete
67da5fbcb7a0: Pull complete
e82455fa5628: Pull complete
Digest: sha256:31b8e90a349d1fce7621f5a5a08e4fc519b634f7d3feb09d53fac9b12aa4d991
Status: Downloaded newer image for nginx:latest

Then see the overview:

root@ubuntu:~# cat /var/lib/docker/image/overlay2/repositories.json | python -m json.tool

{
    "Repositories": {
        "nginx": {
            "nginx:latest": "sha256:e81eb098537d6c4a75438eacc6a2ed94af74ca168076f719f3a0558bd24d646a",
            "nginx@sha256:31b8e90a349d1fce7621f5a5a08e4fc519b634f7d3feb09d53fac9b12aa4d991": "sha256:e81eb098537d6c4a75438eacc6a2ed94af74ca168076f719f3a0558bd24d646a"
        }
    }
}

This image is made up of 3 image layers (data):

root@ubuntu:~# tree -R -L 2 /var/lib/docker/overlay2/

/var/lib/docker/overlay2/
├── b8551a26cd364214406c2cec1e64ea115ec025ba6d23172a9cb4012e58d6a0f5
│   ├── diff
│   ├── link
│   ├── lower
│   └── work
├── d4b863030b2f01b90868e77c7b83be8e24e1958efae7d8ac53bb4c858ba1fa3b
│   ├── diff
│   ├── link
│   ├── lower
│   └── work
├── ec1ee7a13bfe58ae85641434a941946314a1c9fb3342ed13ba4a98e4778d260a
│   ├── diff
│   └── link
└── l
    ├── 276G4S52H7YYCV5RQ5DXHIZJMO -> ../d4b863030b2f01b90868e77c7b83be8e24e1958efae7d8ac53bb4c858ba1fa3b/diff
    ├── AD4UF4CJHYFTIGELNJCTDK62QK -> ../ec1ee7a13bfe58ae85641434a941946314a1c9fb3342ed13ba4a98e4778d260a/diff
    └── XOMHG6QGR2RTJI7O46V34SA4DY -> ../b8551a26cd364214406c2cec1e64ea115ec025ba6d23172a9cb4012e58d6a0f5/diff

The new l (lowercase L) directory contains shortened layer identifiers as symbolic links. These identifiers are used to avoid hitting the page size limitation on arguments to the mount command.

The lowest layer contains a file called link, which contains the name of the shortened identifier, and a directory called diff which contains the layer’s contents.

The second-lowest layer, and each higher layer, contain a file called lower, which denotes its parent, and a directory called diff which contains its contents. It also contains a merged directory, which contains the unified contents of its parent layer and itself, and a work directory which is used internally by OverlayFS.

Following the lower file, we can figure out the layers:

+--------------------------------------------------------------------+
|  d4b863030b2f01b90868e77c7b83be8e24e1958efae7d8ac53bb4c858ba1fa3b  |
+--------------------------------------------------------------------+
|  b8551a26cd364214406c2cec1e64ea115ec025ba6d23172a9cb4012e58d6a0f5  |
+--------------------------------------------------------------------+
|  ec1ee7a13bfe58ae85641434a941946314a1c9fb3342ed13ba4a98e4778d260a  |
+--------------------------------------------------------------------+

From Docker1.10 the name of directory is different from the image layer’s id.

The meta-data is:

cat /var/lib/docker/image/overlay2/imagedb/content/sha256/e81eb098537d6c4a75438eacc6a2ed94af74ca168076f719f3a0558bd24d646a | python -m json.tool

It is just the content of docker inspect nginx. And we can find the layers and directories:

"GraphDriver": {
    "Data": {
        "LowerDir": "/var/lib/docker/overlay2/b8551a26cd364214406c2cec1e64ea115ec025ba6d23172a9cb4012e58d6a0f5/diff:/var/lib/docker/overlay2/ec1ee7a13bfe58ae85641434a941946314a1c9fb3342ed13ba4a98e4778d260a/diff",
        "MergedDir": "/var/lib/docker/overlay2/d4b863030b2f01b90868e77c7b83be8e24e1958efae7d8ac53bb4c858ba1fa3b/merged",
        "UpperDir": "/var/lib/docker/overlay2/d4b863030b2f01b90868e77c7b83be8e24e1958efae7d8ac53bb4c858ba1fa3b/diff",
        "WorkDir": "/var/lib/docker/overlay2/d4b863030b2f01b90868e77c7b83be8e24e1958efae7d8ac53bb4c858ba1fa3b/work"
    },
    "Name": "overlay2"
},
"RootFS": {
    "Type": "layers",
    "Layers": [
        "sha256:ef68f6734aa485edf13a8509fe60e4272428deaf63f446a441b79d47fc5d17d3",
        "sha256:876456b964239fb297770341ec7e4c2630e42b64b7bbad5112becb1bd2c72795",
        "sha256:9a8f339aeebe1e8bcef322376e1274360653fb802abd4b94c69ea45a54f71a2b"
    ]
},

The diagram below shows the structure of overlay, which is very similar to overlay2 (just the number of layers changes):

overlay_constructs.jpg

How Container Reads&writes Work with Overlay or Overlay2

This part is from Use the OverlayFS storage driver#How container reads and writes work with overlay or overlay2.

Reading Files

Consider three scenarios where a container opens a file for read access with overlay.

  • The file does not exist in the container layer: If a container opens a file for read access and the file does not already exist in the container (upperdir) it is read from the image (lowerdir). This incurs very little performance overhead.
  • The file only exists in the container layer: If a container opens a file for read access and the file exists in the container (upperdir) and not in the image (lowerdir), it is read directly from the container.
  • The file exists in both the container layer and the image layer: If a container opens a file for read access and the file exists in the image layer and the container layer, the file’s version in the container layer is read. Files in the container layer (upperdir) obscure files with the same name in the image layer (lowerdir).

Modifying Files or Directories

Consider some scenarios where files in a container are modified.

  • Writing to a file for the first time: The first time a container writes to an existing file, that file does not exist in the container (upperdir). The overlay/overlay2 driver performs a copy_up operation to copy the file from the image (lowerdir) to the container (upperdir). The container then writes the changes to the new copy of the file in the container layer. However, OverlayFS works at the file level rather than the block level. This means that all OverlayFS copy_up operations copy the entire file, even if the file is very large and only a small part of it is being modified. This can have a noticeable impact on container write performance. However, two things are worth noting:
    • The copy_up operation only occurs the first time a given file is written to. Subsequent writes to the same file operate against the copy of the file already copied up to the container.
    • OverlayFS only works with two layers. This means that performance should be better than AUFS, which can suffer noticeable latencies when searching for files in images with many layers. This advantage applies to both overlay and overlay2 drivers. overlayfs2 is slightly less performant than overlayfs on initial read, because it must look through more layers, but it caches the results so this is only a small penalty.
  • Deleting files and directories:
    • When a file is deleted within a container, a whiteout file is created in the container (upperdir). The version of the file in the image layer (lowerdir) is not deleted (because the lowerdir is read-only). However, the whiteout file prevents it from being available to the container.
    • When a directory is deleted within a container, an opaque directory is created within the container (upperdir). This works in the same way as a whiteout file and effectively prevents the directory from being accessed, even though it still exists in the image (lowerdir).
  • Renaming directories: Calling rename(2) for a directory is allowed only when both the source and the destination path are on the top layer. Otherwise, it returns EXDEV error (“cross-device link not permitted”). Your application needs to be designed to handle EXDEV and fall back to a “copy and unlink” strategy.

Experiment: Dirty Our Hands

Now let’s have fun with Overlay!

Firstly check whether your system supports overlay-filesystem:

root@ubuntu:~# cat /proc/filesystems | grep overlay
nodev	overlay

bingo!

Supposing that we want to construct a building. Firstly we fetch some materials:

root@ubuntu:~# mkdir material
root@ubuntu:~# echo "bad concrete" > material/concrete
root@ubuntu:~# echo "rebar" > material/rebar

And we find the concrete is bad, so we fetch something good and at the same time, we get marble:

root@ubuntu:~# mkdir material2
root@ubuntu:~# echo "good concrete" > material2/concrete
root@ubuntu:~# echo "marble" > material2/marble

Now all the things are OK. Let’s create some directories and build is the constructing layer (upperdir):

root@ubuntu:~# mkdir merge work build
root@ubuntu:~# ls
build  material  material2  merge  work

Then mount the overlay filesystem:

root@ubuntu:~# mount -t overlay overlay -olowerdir=material2:material,upperdir=build,workdir=work merge

Now let’s see what happens:

root@ubuntu:~# tree ./*
./build
./material
├── concrete
└── rebar
./material2
├── concrete
└── marble
./merge
├── concrete
├── marble
└── rebar
./work
└── work

root@ubuntu:~# cat merge/concrete
good concrete

It is so interesting! The good concrete is ready in merge (You must learn about the essence of merge from the parts before if you want to know why).

Now we create our main structure in merge:

root@ubuntu:~# echo "main structure" > merge/frame
root@ubuntu:~# tree ./*
./build
└── frame
./material
├── concrete
└── rebar
./material2
├── concrete
└── marble
./merge
├── concrete
├── frame
├── marble
└── rebar
./work
└── work

You should read the part How Container Reads&writes Work with Overlay or Overlay2 carefully and then you will know what happened.

Now let’s say that our guest does not want the marble, so we should delete it:

root@ubuntu:~# rm merge/marble
root@ubuntu:~# tree ./*
./build
├── frame
└── marble
./material
├── concrete
└── rebar
./material2
├── concrete
└── marble
./merge
├── concrete
├── frame
└── rebar
./work
└── work

root@ubuntu:~# ls -l build/marble
c--------- 1 root root 0, 0 Nov 18 06:19 build/marble

Look! The marble is not really deleted, but replaced with a whiteout file!

That’s all. Thank you!

root@ubuntu:~# umount ./merge/

Architecture of Registry

abc