taskcluster/docs/how-tos.rst
author Andrew Halberstadt <ahalberstadt@mozilla.com>
Tue, 27 Jun 2017 13:33:20 -0700
changeset 422889 7af941648ca6696d49cbeb5f37b7c81b9292e824
parent 421034 ec7c309bf6f168ecc579a327bc1ef116d8d292ef
child 427442 a43cb4709418ca9b44a9328d79fa635fb6418c4b
permissions -rw-r--r--
Bug 1380306 - Create a new 'try_task_config' method for scheduling tasks on try, r=dustin This introduces a 'try_task_config' method of scheduling. En lieu of (or in addition to) try syntax, you can now check in a file called 'try_task_config.json' to the root of the source tree. The format is either a list of task labels, or dict where task labels are the keys. Taskcluster will simply schedule any tasks that are listed there. This file is primarily meant to be generated by tools (which don't exist yet), as the json format is much easier for tools to generate or consume. These tools should use an in-memory commit to add the file so it is automatically removed again after the push. A server-side hook will be added in bug 1380357 to prevent this file from accidentally landing on non-try trees. MozReview-Commit-ID: 2zKfZXuuDhH

How Tos
=======

All of this equipment is here to help you get your work done more efficiently.
However, learning how task-graphs are generated is probably not the work you
are interested in doing.  This section should help you accomplish some of the
more common changes to the task graph with minimal fuss.

.. important::

    If you cannot accomplish what you need with the information provided here,
    please consider whether you can achieve your goal in a different way.
    Perhaps something simpler would cost a bit more in compute time, but save
    the much more expensive resource of developers' mental bandwidth.
    Task-graph generation is already complex enough!

    If you want to proceed, you may need to delve into the implementation of
    task-graph generation.  The documentation and code are designed to help, as
    are the authors - ``hg blame`` may help track down helpful people.

    As you write your new transform or add a new kind, please consider the next
    developer.  Where possible, make your change data-driven and general, so
    that others can make a much smaller change.  Document the semantics of what
    you are changing clearly, especially if it involves modifying a transform
    schema.  And if you are adding complexity temporarily while making a
    gradual transition, please open a new bug to remind yourself to remove the
    complexity when the transition is complete.

Hacking Task Graphs
-------------------

The recommended process for changing task graphs is this:

1. Find a recent decision task on the project or branch you are working on, and
   download its ``parameters.yml`` artifact.  Alternately, you
   can simply take note of the artifact URL, or just the decision task's
   ``task-id``.  This file contains all of the inputs to the task-graph
   generation process.  Its contents are simple enough if you would like to
   modify it, and it is documented in :doc:`parameters`.

2. Run one of the ``mach taskgraph`` subcommands (see :doc:`taskgraph`) to
   generate a baseline against which to measure your changes, passing the
   parameters you found in the previous step.  For example:

   .. code-block:: none

       ./mach taskgraph tasks --json -p parameters.yml > old-tasks.json
       ./mach taskgraph tasks --json -p url/to/parameters.yml > old-tasks.json
       ./mach taskgraph tasks --json -p task-id=<task-id> > old-tasks.json

3. Make your modifications under ``taskcluster/``.

4. Run the same ``mach taskgraph`` command, sending the output to a new file,
   and use ``diff`` to compare the old and new files.  Make sure your changes
   have the desired effect and no undesirable side-effects.

5. When you are satisfied with the changes, push them to try to ensure that the
   modified tasks work as expected.

Hacking Actions
...............

If you are working on an action task and wish to test it out locally, use the
``./mach taskgraph test-action-callback`` command:

   .. code-block:: none

        ./mach taskgraph test-action-task \
            --task-id I4gu9KDmSZWu3KHx6ba6tw --task-group-id sMO4ybV9Qb2tmcI1sDHClQ \
            -p parameters.yml --input input.yml \
            hello_world_action

This invocation will run the hello world callback with the given inputs and
print any created tasks to stdout, rather than actually creating them.

Common Changes
--------------

Changing Test Characteristics
.............................

First, find the test description.  This will be in
``taskcluster/ci/*/tests.yml``, for the appropriate kind (consult
:doc:`kinds`).  You will find a YAML stanza for each test suite, and each
stanza defines the test's characteristics.  For example, the ``chunks``
property gives the number of chunks to run.  This can be specified as a simple
integer if all platforms have the same chunk count, or it can be keyed by test
platform.  For example:

.. code-block:: yaml

    chunks:
        by-test-platform:
            linux64/debug: 10
            default: 8

The full set of available properties is in
``taskcluster/taskgraph/transform/tests/test_description.py``.  Some other
commonly-modified properties are ``max-run-time`` (useful if tests are being
killed for exceeding maxRunTime) and ``treeherder-symbol``.

.. note::

    Android tests are also chunked at the mozharness level, so you will need to
    modify the relevant mozharness config, as well.

Adding a Test Suite
...................

To add a new test suite, you will need to know the proper mozharness invocation
for that suite, and which kind it fits into (consult :doc:`kinds`).

Add a new stanza to ``taskcluster/ci/<kind>/tests.yml``, copying from the other
stanzas in that file.  The meanings should be clear, but authoritative
documentation is in
``taskcluster/taskgraph/transform/tests/test_description.py`` should you need
it.  The stanza name is the name by which the test will be referenced in try
syntax.

Add your new test to a test set in ``test-sets.yml`` in the same directory.  If
the test should only run on a limited set of platforms, you may need to define
a new test set and reference that from the appropriate platforms in
``test-platforms.yml``.  If you do so, include some helpful comments in
``test-sets.yml`` for the next person.

Greening Up a New Test
......................

When a test is not yet reliably green, configuration for that test should not
be landed on integration branches.  Of course, you can control where the
configuration is landed!  For many cases, it is easiest to green up a test in
try: push the configuration to run the test to try along with your work to fix
the remaining test failures.

When working with a group, check out a "twig" repository to share among your
group, and land the test configuration in that repository.  Once the test is
green, merge to an integration branch and the test will begin running there as
well.

Adding a New Task
.................

If you are adding a new task that is not a test suite, there are a number of
options.  A few questions to consider:

 * Is this a new build platform or variant that will produce an artifact to
   be run through the usual test suites?

 * Does this task depend on other tasks?  Do other tasks depend on it?

 * Is this one of a few related tasks, or will you need to generate a large
   set of tasks using some programmatic means (for example, chunking)?

 * How is the task actually excuted?  Mozharness?  Mach?

 * What kind of environment does the task require?

Armed with that information, you can choose among a few options for
implementing this new task.  Try to choose the simplest solution that will
satisfy your near-term needs.  Since this is all implemented in-tree, it
is not difficult to refactor later when you need more generality.

Existing Kind
`````````````

The simplest option is to add your task to an existing kind.  This is most
practical when the task "makes sense" as part of that kind -- for example, if
your task is building an installer for a new platform using mozharness scripts
similar to the existing build tasks, it makes most sense to add your task to
the ``build`` kind.  If you need some additional functionality in the kind,
it's OK to modify the implementation as necessary, as long as the modification
is complete and useful to the next developer to come along.

Tasks in the ``build`` kind generate Firefox installers, and the ``test`` kind
will add a full set of Firefox tests for each ``build`` task.

New Kind
````````

The next option to consider is adding a new kind.  A distinct kind gives you
some isolation from other task types, which can be nice if you are adding an
experimental kind of task.

Kinds can range in complexity.  The simplest sort of kind uses the transform
loader to read a list of jobs from the ``jobs`` key, and applies the standard
``job`` and ``task`` transforms:

.. code-block:: yaml

    implementation: taskgraph.task.transform:TransformTask
    transforms:
       - taskgraph.transforms.job:transforms
       - taskgraph.transforms.task:transforms
    jobs:
       - ..your job description here..

Job descriptions are defined and documented in
``taskcluster/taskgraph/transforms/job/__init__.py``.

Custom Kind Loader
``````````````````

If your task depends on other tasks, then the decision of which tasks to create
may require some code.  For example, the ``test`` kind iterates over
the builds in the graph, generating a full set of test tasks for each one.  This specific
post-build behavior is implemented as a loader defined in ``taskcluster/taskgraph/loader/test.py``.

A custom loader is useful when the set of tasks you want to create is not
static but based on something else (such as the available builds) or when the
dependency relationships for your tasks are complex.

Custom Transforms
`````````````````

Most loaders apply a series of ":doc:`transforms <transforms>`" that start with
an initial human-friendly description of a task and end with a task definition
suitable for insertion into a Taskcluster queue.

Custom transforms can be useful to apply defaults, simplifying the YAML files
in your kind. They can also apply business logic that is more easily expressed
in code than in YAML.

Transforms need not be one-to-one: a transform can produce zero or more outputs
for each input. For example, the test transforms perform chunking by producing
an output for each chunk of a given input.

Ideally those transforms will produce job descriptions, so you can use the
existing ``job`` and ``task`` transforms:

.. code-block:: yaml

    transforms:
       - taskgraph.transforms.my_stuff:transforms
       - taskgraph.transforms.job:transforms
       - taskgraph.transforms.task:transforms

Try to keep transforms simple, single-purpose and well-documented!

Custom Run-Using
````````````````

If the way your task is executed is unique (so, not a mach command or
mozharness invocation), you can add a new implementation of the job
description's "run" section.  Before you do this, consider that it might be a
better investment to modify your task to support invocation via mozharness or
mach, instead.  If this is not possible, then adding a new file in
``taskcluster/taskgraph/transforms/jobs`` with a structure similar to its peers
will make the new run-using option available for job descriptions.

Something Else?
...............

If you make another change not described here that turns out to be simple or
common, please include an update to this file in your patch.


Schedule a Task on Try
----------------------

There are two methods for scheduling a task on try.

The first method is a command line string called ``try syntax`` which is passed
into the decision task via the commit message. An example try syntax might look
like:

.. parsed-literal::

    try: -b o -p linux64 -u mochitest-1 -t none

This gets parsed by ``taskgraph.try_option_syntax:TryOptionSyntax`` and returns
a list of matching task labels. For more information see the
`TryServer wiki page <https://wiki.mozilla.org/Try>`_.

The second method uses a checked-in file called ``try_task_config.json`` which
lives at the root of the source dir. The format of this file is either a list
of task labels, or a JSON object where task labels make up the keys. For
example, the ``try_task_config.json`` file might look like:

.. parsed-literal::

    [
      "test-windows10-64/opt-web-platform-tests-12",
      "test-windows7-32/opt-reftest-1",
      "test-windows7-32/opt-reftest-2",
      "test-windows7-32/opt-reftest-3",
      "build-linux64/debug",
      "source-test-mozlint-eslint"
    ]

Very simply, this will run any task label that gets passed in as well as their
dependencies. While it is possible to manually commit this file and push to
try, it is mainly meant to be a generation target for various trychooser tools.

A list of all possible task labels can be obtained by running:

.. parsed-literal::

    $ ./mach taskgraph tasks

A list of task labels relevant to a tree (defaults to mozilla-central) can be
obtained with:

.. parsed-literal::

    $ ./mach taskgraph target