Bug 1396022: combine docker docs, talk about hsahes; r=garndt
author Dustin J. Mitchell <dustin@mozilla.com>
Fri, 01 Sep 2017 17:47:46 +0000
changeset 428390 237ae943564521f3fcac1300d1207f0cb9a01e79
parent 428389 1bf2e9fab3ac2a60f0f5e446e7fe29c71230dadd
child 428391 979541296fc3230821ad18f2b69748d70199b96f
push id 7761
push user jlund@mozilla.com
push date Fri, 15 Sep 2017 00:19:52 +0000
reviewers garndt
bugs 1396022
milestone 57.0a1
Bug 1396022: combine docker docs, talk about hsahes; r=garndt MozReview-Commit-ID: A27Qoemw2T3
taskcluster/docker/README.md
taskcluster/docs/docker-images.rst
deleted file mode 100644
--- a/taskcluster/docker/README.md
+++ /dev/null
@@ -1,161 +0,0 @@
-# Docker Images for use in TaskCluster
-
-This folder contains various docker images used in [taskcluster](http://docs.taskcluster.net/) as well as other misc docker images which may be useful for
-hacking on gecko.
-
-## Organization
-
-Each folder describes a single docker image.  We have two types of images that can be defined:
-
-1. [Task Images (build-on-push)](#task-images-build-on-push)
-2. [Docker Images (prebuilt)](#docker-registry-images-prebuilt)
-
-These images depend on one another, as described in the [`FROM`](https://docs.docker.com/v1.8/reference/builder/#from)
-line at the top of the Dockerfile in each folder.
-
-Images could either be an image intended for pushing to a docker registry, or one that is meant either
-for local testing or being built as an artifact when pushed to vcs.
-
-### Task Images (build-on-push)
-
-Images can be uploaded as a task artifact, [indexed](#task-image-index-namespace) under
-a given namespace, and used in other tasks by referencing the task ID.
-
-Important to note, these images do not require building and pushing to a docker registry, and are
-build per push (if necessary) and uploaded as task artifacts.
-
-The decision task that is run per push will [determine](#context-directory-hashing)
-if the image needs to be built based on the hash of the context directory and if the image
-exists under the namespace for a given branch.
-
-As an additional convenience, and a precaution to loading images per branch, if an image
-has been indexed with a given context hash for mozilla-central, any tasks requiring that image
-will use that indexed task.  This is to ensure there are not multiple images built/used
-that were built from the same context. In summary, if the image has been built for mozilla-central,
-pushes to any branch will use that already built image.
-
-To use within an in-tree task definition, the format is:
-
-```yaml
-image:
-  type: 'task-image'
-  path: 'public/image.tar.zst'
-  taskId: '{{#task_id_for_image}}builder{{/task_id_for_image}}'
-```
-
-##### Context Directory Hashing
-
-Decision tasks will calculate the sha256 hash of the contents of the image
-directory and will determine if the image already exists for a given branch and hash
-or if a new image must be built and indexed.
-
-Note: this is the contents of *only* the context directory, not the
-image contents.
-
-The decision task will:
-1. Recursively collect the paths of all files within the context directory
-2. Sort the filenames alphabetically to ensure the hash is consistently calculated
-3. Generate a sha256 hash of the contents of each file.
-4. All file hashes will then be combined with their path and used to update the hash
-of the context directory.
-
-This ensures that the hash is consistently calculated and path changes will result
-in different hashes being generated.
-
-##### Task Image Index Namespace
-
-Images that are built on push and uploaded as an artifact of a task will be indexed under the
-following namespaces.
-
-* docker.images.v2.level-{level}.{image_name}.latest
-* docker.images.v2.level-{level}.{image_name}.pushdate.{year}.{month}-{day}-{pushtime}
-* docker.images.v2.level-{level}.{image_name}.hash.{context_hash}
-
-Not only can images be browsed by the pushdate and context hash, but the 'latest' namespace
-is meant to view the latest built image.  This functions similarly to the 'latest' tag
-for docker images that are pushed to a registry.
-
-### Docker Registry Images (prebuilt)
-
-***Deprecation Warning: Use of prebuilt images should only be used for base images (those that other images
-will inherit from), or private images that must be stored in a private docker registry account.  Existing
-public images will be converted to images that are built on push and any newly added image should
-follow this pattern.***
-
-These are images that are intended to be pushed to a docker registry and used by specifying the
-folder name in task definitions.  This information is automatically populated by using the 'docker_image'
-convenience method in task definitions.
-
-Example:
-  image: {#docker_image}builder{/docker_image}
-
-Each image has a hash and a version, given by its `HASH` and `VERSION` files.
-When rebuilding a prebuilt image the `VERSION` should be bumped. Once a new
-version of the image has been built the `HASH` file should be updated with the
-hash of the image.
-
-The `HASH` file is the image hash as computed by docker; this is always in the
-format `sha256:<digest>`. Note that Docker produces a number of hashes in this
-format; the hash used in this context is the one returned from `docker push`.
-
-In production, images will be referenced by image hash.  This mitigates attacks
-against the registry as well as simplifying validation of correctness. The
-`VERSION` file only serves to provide convenient names, such that old versions
-are easy to discover in the registry (and ensuring old versions aren't deleted
-by garbage-collection).
-
-This way, older tasks which were designed to run on an older version of the image
-can still be executed in taskcluster, while new tasks can use the new version.
-Furthermore, this mitigates attacks against the registry as docker will verify
-the image hash when loading the image.
-
-Each image also has a `REGISTRY`, defaulting to the `REGISTRY` in this directory,
-and specifying the image registry to which the completed image should be uploaded.
-
-## Building images
-
-Generally, images can be pulled from the [registry](./REGISTRY) rather than
-built locally, however, for developing new images it's often helpful to hack on
-them locally.
-
-To build an image, invoke `mach taskcluster-build-image` with the name of the
-folder (without a trailing slash):
-```sh
-./mach taskcluster-build-image <image-name>
-```
-
-This is a tiny wrapper around `docker build -t $REGISTRY/$FOLDER:$VERSION`.
-Once a new version of the image has been built and pushed to the remote registry using
-`docker push $REGISTRY/$FOLDER:$VERSION`, the `HASH` file must be updated for the
-change to take effect in production.
-
-Note: If no "VERSION" file present in the image directory, the tag 'latest' will be used and no
-registry will be defined. The image is only meant to run locally and will overwrite
-any existing image with the same name and tag.
-
-## Adding a new image
-
-The docker image primitives are very basic building block for
-constructing an "image" but generally don't help much with tagging it
-for deployment so we have a wrapper (./build.sh) which adds some sugar
-to help with tagging/versioning... Each folder should look something
-like this:
-
-```
-  - your_amazing_image/
-    - your_amazing_image/Dockerfile: Standard docker file syntax
-    - your_amazing_image/VERSION: The version of the docker file
-      (required* used during tagging)
-    - your_amazing_image/REGISTRY: Override default registry
-      (useful for secret registries)
-```
-
-## Conventions
-
-In some image folders you will see `.env` files these can be used in
-conjunction with the `--env-file` flag in docker to provide a
-environment with the given environment variables. These are primarily
-for convenience when manually hacking on the images.
-
-You will also see a `system-setup.sh` script used to build the image.
-Do not replicate this technique - prefer to include the commands and options directly in the Dockerfile.
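
For reference (this note is not part of the patch): the four context-directory hashing steps listed in the README above, which are carried over into `docker-images.rst` below, can be sketched roughly as follows. This is a minimal illustration and not the in-tree implementation. The function names `hash_file` and `context_directory_hash` are hypothetical, and the exact way each file hash is combined with its path (here `"<hash> <relative path>\n"`) is an assumption, since the text only says that the two are combined.

```python
import hashlib
import os


def hash_file(path):
    """Return the hex sha256 digest of a single file's contents (step 3)."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def context_directory_hash(context_dir):
    """Hash a context directory following the four documented steps.

    Hypothetical helper: the serialization of (hash, path) pairs is an
    assumption, not taken from the original text.
    """
    # Step 1: recursively collect the paths of all files, relative to the
    # context directory so the result does not depend on where it lives.
    rel_paths = []
    for dirpath, _dirnames, filenames in os.walk(context_dir):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, context_dir).replace(os.sep, "/")
            rel_paths.append(rel)

    digest = hashlib.sha256()
    # Step 2: sort the paths so the hash is calculated consistently.
    for rel in sorted(rel_paths):
        # Step 3: hash the contents of each file.
        file_hash = hash_file(os.path.join(context_dir, rel))
        # Step 4: combine the file hash with its path and feed the result
        # into the directory-level hash.
        digest.update("{} {}\n".format(file_hash, rel).encode("utf-8"))
    return digest.hexdigest()
```

Because the relative path is fed into the digest alongside each file hash, renaming a file changes the result even when its contents are untouched, which matches the property the text describes.
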
--- a/taskcluster/docs/docker-images.rst
+++ b/taskcluster/docs/docker-images.rst
@@ -3,20 +3,183 @@
 =============
 Docker Images
 =============
 
 TaskCluster Docker images are defined in the source directory under
 ``taskcluster/docker``. Each directory therein contains the name of an
 image used as part of the task graph.
 
-More information is available in the ``README.md`` file in that directory.
+Organization
+------------
+
+Each folder describes a single docker image.  We have two types of images that can be defined:
+
+1. Task Images (build-on-push)
+2. Docker Images (prebuilt)
+
+These images depend on one another, as described in the `FROM
+<https://docs.docker.com/v1.8/reference/builder/#from>`_ line at the top of the
+Dockerfile in each folder.
+
+An image is either intended to be pushed to a docker registry, or meant for
+local testing or for being built as an artifact when changes are pushed to
+vcs.
+
+Task Images (build-on-push)
+:::::::::::::::::::::::::::
+
+Images can be uploaded as a task artifact, indexed (see `Task Image Index Namespace`_) under
+a given namespace, and used in other tasks by referencing the task ID.
+
+Note that these images do not need to be built and pushed to a docker registry; they are
+built per push (if necessary) and uploaded as task artifacts.
+
+The decision task that is run per push will determine (see `Context Directory Hashing`_)
+whether the image needs to be built, based on the hash of the context directory and whether
+the image exists under the namespace for a given branch.
+
+As an additional convenience, and a precaution to loading images per branch, if an image
+has been indexed with a given context hash for mozilla-central, any tasks requiring that image
+will use that indexed task.  This is to ensure there are not multiple images built/used
+that were built from the same context. In summary, if the image has been built for mozilla-central,
+pushes to any branch will use that already built image.
+
+To use within an in-tree task definition, the format is:
+
+.. code-block:: yaml
+
+    image:
+      type: 'task-image'
+      path: 'public/image.tar.zst'
+      taskId: '{{#task_id_for_image}}builder{{/task_id_for_image}}'
+
+Context Directory Hashing
+.........................
+
+Decision tasks will calculate the sha256 hash of the contents of the image
+directory and will determine if the image already exists for a given branch and hash
+or if a new image must be built and indexed.
+
+Note: this is the contents of *only* the context directory, not the
+image contents.
+
+The decision task will:
+
+1. Recursively collect the paths of all files within the context directory.
+2. Sort the filenames alphabetically to ensure the hash is consistently calculated.
+3. Generate a sha256 hash of the contents of each file.
+4. Combine each file hash with its path and use the result to update the hash of the context directory.
+
+This ensures that the hash is consistently calculated and path changes will result
+in different hashes being generated.
+
+Task Image Index Namespace
+..........................
+
+Images that are built on push and uploaded as an artifact of a task will be indexed under the
+following namespaces.
+
+* docker.images.v2.level-{level}.{image_name}.latest
+* docker.images.v2.level-{level}.{image_name}.pushdate.{year}.{month}-{day}-{pushtime}
+* docker.images.v2.level-{level}.{image_name}.hash.{context_hash}
+
+Images can be browsed by pushdate and context hash, and the 'latest' namespace
+points to the most recently built image.  This functions similarly to the 'latest' tag
+for docker images that are pushed to a registry.
+
+Docker Registry Images (prebuilt)
+:::::::::::::::::::::::::::::::::
+
+**Warning: Prebuilt images should only be used for base images (those that other images
+will inherit from), or private images that must be stored in a private docker registry account.**
 
-Adding Extra Files to Images
-============================
+These are images that are intended to be pushed to a docker registry and used
+by specifying the docker image name in task definitions.  They are generally
+referred to by a ``<repo>@<repodigest>`` string:
+
+Example:
+
+.. code-block:: none
+
+    image: taskcluster/decision:0.1.10@sha256:c5451ee6c655b3d97d4baa3b0e29a5115f23e0991d4f7f36d2a8f793076d6854
+
+Each image has a repo digest, an image hash, and a version. The repo digest is
+stored in the ``HASH`` file in the image directory and used to refer to the
+image as above.  The version is in ``VERSION``.  The image hash is used in
+chain-of-trust verification in `scriptworker
+<https://github.com/mozilla-releng/scriptworker>`_.
+
+The version file only serves to provide convenient names, so that old
+versions are easy to discover in the registry (and to ensure old versions aren't
+deleted by garbage-collection).
+
+Each image directory also has a ``REGISTRY`` file, defaulting to the ``REGISTRY`` in
+the ``taskcluster/docker`` directory, which specifies the image registry to
+which the completed image should be uploaded.
+
+Docker Hashes and Digests
+.........................
+
+There are several hashes involved in this process:
+
+ * Image Hash -- the long version of the image ID; can be seen with
+   ``docker images --no-trunc`` or in the ``Id`` field in ``docker inspect``.
+
+ * Repo Digest -- hash of the image manifest; seen when running ``docker
+   push`` or ``docker pull``.
+
+ * Context Directory Hash -- see above (not a Docker concept at all)
+
+The use of hashes allows older tasks which were designed to run on an older
+version of the image to be executed in Taskcluster while new tasks use the new
+version.  Furthermore, this mitigates attacks against the registry as docker
+will verify the image hash when loading the image.
+
+(Re)-Building images
+--------------------
+
+Generally, images can be pulled from the Docker registry rather than built
+locally. However, when developing new images it's often helpful to hack on them
+locally.
+
+To build an image, invoke ``mach taskcluster-build-image`` with the name of the
+folder (without a trailing slash):
+
+.. code-block:: none
+
+    ./mach taskcluster-build-image <image-name>
+
+This is a wrapper around ``docker build -t $REGISTRY/$FOLDER:$VERSION``.
+
+It's a good idea to bump the ``VERSION`` early in this process, to avoid
+``docker push``-ing over any old tags.
+
+For task images, test your image locally or push to try. This is all that is
+required.
+
+Docker Registry Images
+::::::::::::::::::::::
+
+Landing docker registry images takes a little more care.
+
+Once a new version of the image has been built and tested locally, push it to
+the docker registry and make note of the resulting repo digest.  Put this value
+in the ``HASH`` file, and update any references to the image in the code or
+task definitions.
+
+The change is now safe to use in Try pushes.  However, if the image is used in
+building releases then it is *not* safe to land to an integration branch until
+the whitelists in `scriptworker
+<https://github.com/mozilla-releng/scriptworker/blob/master/scriptworker/constants.py>`_
+have also been updated. These whitelists use the image hash, not the repo
+digest.
+
+Special Dockerfile Syntax
+-------------------------
 
 Dockerfile syntax has been extended to allow *any* file from the
 source checkout to be added to the image build *context*. (Traditionally
 you can only ``ADD`` files from the same directory as the Dockerfile.)
 
 Simply add the following syntax as a comment in a Dockerfile::
 
    # %include <path>