docs: improve running locally and debugging pages (no bug), r=releng-reviewers,jcristau
author Andrew Halberstadt <ahal@mozilla.com>
Wed, 23 Mar 2022 14:02:47 +0000
changeset 386 51f75d1727026eb1efa2e2105e431ef804d944e4
parent 385 49b51b0a80dbedcb4f60ac92b9e39e439a91df67
child 387 50fedb350a6f9e137ca4b42a95fc285d4a3fe88b
push id 213
push user ahalberstadt@mozilla.com
push date Wed, 23 Mar 2022 14:04:50 +0000
reviewers releng-reviewers, jcristau
Jira: RELENG-709
Differential Revision: https://phabricator.services.mozilla.com/D141790
docs/concepts/task-graphs.rst
docs/howto/debugging.rst
docs/howto/debugging_and_testing.rst
docs/howto/index.rst
docs/howto/run-locally.rst
docs/tutorials/connecting-taskcluster.rst
docs/tutorials/creating-a-task-graph.rst
--- a/docs/concepts/task-graphs.rst
+++ b/docs/concepts/task-graphs.rst
@@ -83,17 +83,17 @@ each phase locally:
    # generates the full_task_graph and exits
    $ taskgraph full
    # generates the target_task_graph and exits
    $ taskgraph target
    # etc..
 
 The decision task uses the command ``taskgraph decision``, which spans the
 whole generation process all the way to task creation in the last step. See
-:doc:`/howto/debugging_and_testing` for more information on running the
+:doc:`/howto/run-locally` for more information on running the
 ``taskgraph`` command locally.
 
 Transitive Closure
 ..................
 
 Transitive closure is a fancy name for this sort of operation:
 
  * start with a set of tasks
rename from docs/howto/debugging_and_testing.rst
rename to docs/howto/debugging.rst
--- a/docs/howto/debugging_and_testing.rst
+++ b/docs/howto/debugging.rst
@@ -1,76 +1,82 @@
-Debugging and Testing
-=====================
+Debug Taskgraph
+===============
+
+There are several approaches you could take when you need to debug changes
+in Taskgraph.
 
-When you first get started you'll notice that pushing changes to your PR (or testing branch) then waiting for all tasks to complete is a slow and wasteful way to iterate.
-Luckily there are several tools you can use locally to verify that changes you make to kinds and custom transforms or loaders
-(if used), don't error out due to bugs.
+Print Debugging
+---------------
+
+It's possible to add print statements to transforms which will show up in the
+log. Just beware that because transforms tend to loop over every task, it can
+be difficult to tell which output was generated from the task(s) you're trying
+to debug.
 
-Tools for Local Development
----------------------------
+Here's an example pattern you can use to limit debug output:
+
+.. code-block:: python
 
-There are several debugging tools built into Taskgraph that can be used to either view a diff of changes or to generate a local graph based on pre-defined parameters. 
-See :ref:`working-on-taskgraph` for local setup instructions.
- 
-To generate a local graph, cd into your project (make sure its in Taskgraphs PATH) and download a parameters.yml file from a recent Decision task (click
-into the Taskgraph UI and click on a Decision tasks Artifacts list). Parameters are passed to a Decision task and provide project and task details,
-such as what type of tasks to run (default, nightly, beta or release.)
+   @transforms.add
+   def some_transform(config, tasks):
+       import sys
+       for task in tasks:
+           def debug(msg):
+               # "name" is not a standard field; use whichever field fits.
+               if task["name"] == "my-task":
+                   print(msg, file=sys.stderr)
+           # Only printed while "my-task" is being processed.
+           debug("FOO")
+           yield task
 
-Run this command to generate the local graph. You'll see the `kinds` it will generate and will throw
-errors if you are missing required attributes in yaml files or if there is a schema validation bug. 
+pdb
+---
 
-.. code-block::
+It's also possible to use the built-in Python debugger, `pdb`_. Simply add
+calls to ``pdb.set_trace()`` and run the ``taskgraph`` binary as normal. Note
+that as with print debugging, you'll want to ensure that breakpoints are only
+triggered on the tasks you are aiming to debug:
 
-  taskgraph target-graph -p <path-to-your-parameters.yml> 
-  taskgraph target-graph -p task-id=<decision task id> # alternatively, you pass the decision task id with the parameters you want to use
+.. code-block:: python
 
-One of the most useful aspects of this command, is that you can use it to verify that you are generating the 
-correct graph of tasks by changing the `target_tasks_method` from `default` to `release` or `nightly` (assuming you have this type of support already set up).
+   @transforms.add
+   def some_transform(config, tasks):
+       import pdb
+       for task in tasks:
+           # "name" is not a standard field; use whichever field fits.
+           if task["name"] == "my-task":
+               pdb.set_trace()
+           yield task
+
+.. _pdb: https://docs.python.org/3/library/pdb.html
 
-You can also use the `--diff` flag to generate and view diffs (as json blobs) of multiple taskgraphs e.g. github releases, [nightly] cron, [release promotion] actions, github-push, and pull-requests and see
-exactly what the differences are. This can be useful to track down an issue where, for example, github releases is broken but everything else is running smoothly, and to verify that any fix you
-make to the release graph hasn't broken something in the other graphs. If you want to diff against multiple parameters at a time, simply pass `--parameters` multiple times (or pass a directory containing parameters files).
+debugpy
+-------
 
-Debugging
----------
+`Debugpy`_ is the Python debugger that comes bundled with VSCode, but it can
+also be used with other editors, such as Neovim via the `nvim-dap-python`_
+plugin. The advantage of ``debugpy`` is that you can set breakpoints directly
+in your editor (so you don't need to edit your source). You can trigger new
+runs from within your editor as well.
 
-Step-debugging through functions can be the fastest way to troubleshoot problems. Besides using `pdb`, you may want to have your IDE as a debugger for setting breakpoints and use it's UI for controlling the debug session.
-In VSCode, you can achieve it with the following configuration:
+How to set up ``debugpy`` varies from editor to editor and is out of scope for
+these docs, but typically there will be a way of defining a "launch config". For
+``taskgraph`` it should look similar to:
 
 .. code-block::
 
   {
       "name": "Taskgraph Full",
       "type": "python",
       "request": "launch",
-      "program": "/Users/myuser/.pyenv/versions/projectname/bin/taskgraph",
-      "args": ["full", "-p", "taskcluster/test/params/main-push.yml"],
+      "program": "/path/to/bin/taskgraph",
+      "args": ["full"],
       "console": "integratedTerminal",
       "justMyCode": false
   }
 
-`Note: You should adjust "program" as needed. To find the taskgraph executable when using pyenv, use the following:`
-
-.. code::
-
-  pyenv which taskgraph
-
-Retriggering and Rerunning Tasks
---------------------------------
-
-These can either be triggered in the Taskcluster UI (by clicking into the appropriate task, and hoovering over the bubble in the bottom right corner), via the Taskcluster CLI
-or in the Treeherder UI (currently only pushes on `main` and `master` are consumed and displayed).
+Make sure to adjust "program" to point to your ``taskgraph`` binary. Also
+be sure to tweak "args" as you see fit. See :ref:`useful arguments` for more
+information on available arguments.
 
-Staging Repositories
---------------------
-
-Release Engineering typically creates a staging repository of Github projects that use Taskgraph. These clones of production projects live in `mozilla-releng <https://github.com/mozilla-releng>`_ 
-and are managed by Release Engineering. 
-
-These staging repos have historically been created for Releng use, but we are now opening them up for members of each project to also use (if your Mozilla project uses Taskgraph/Taskcluster and does not
-currently have a staging repo you can access, please reach out to someone on the Releng team).
-
-These staging repos are very useful for testing changes to secrets (with the caveat, you need to be careful you're not also affecting production secrets),
-or simulating release actions, such as pushing apks to the Google Play Store without actually doing so.
-
-Something to note, is that the scriptworker-scripts that are used for certain tasks (defined in a projects `taskcluster/ci/config.yml` file) will still point to production scriptworkers unless you modfiy the config file
-to point to <some-scriptworker>-dev (be sure to change it back before pushing to a production repository). You can see all of the available dev options `here <https://scriptworker-scripts.readthedocs.io/en/latest/README.html#overview-of-existing-workers>`_.
+.. _debugpy: https://github.com/microsoft/debugpy
+.. _nvim-dap-python: https://github.com/mfussenegger/nvim-dap-python
--- a/docs/howto/index.rst
+++ b/docs/howto/index.rst
@@ -1,10 +1,11 @@
 How To
 ======
 
 A collection of how-to guides.
 
 .. toctree::
    :maxdepth: 1
 
-   debugging_and_testing
+   run-locally
+   debugging
    bootstrap-taskgraph
new file mode 100644
--- /dev/null
+++ b/docs/howto/run-locally.rst
@@ -0,0 +1,242 @@
+Run Taskgraph Locally
+=====================
+
+When first starting out with Taskgraph, it's tempting to test changes by
+pushing to a pull request (or try server) and checking whether your
+modifications have the desired effect on the impacted task(s). This isn't ideal
+because the turnaround time is slow, you may hit easily preventable errors in
+the :term:`Decision Task`, and it wastes money running tasks that are
+irrelevant to your changes.
+
+So before you even push your changes, it's best practice to verify whether
+graph generation succeeds as well as to sanity check that your changes are
+having the desired effect.
+
+Generating Graphs
+-----------------
+
+.. note::
+
+   If you haven't done so already, make sure :ref:`Taskgraph is installed
+   <installation>`.
+
+Graphs can be generated via the ``taskgraph`` binary. This binary provides
+subcommands that correspond to the :ref:`phases of graph generation <graph
+generation>`. For instance:
+
+* Running ``taskgraph full`` produces the ``full_task_graph``.
+* Running ``taskgraph target`` produces the ``target_task_graph``.
+* etc.
+
+For a list of available sub-commands, see:
+
+.. code-block:: shell
+
+   taskgraph --help
+
+.. _useful arguments:
+
+Useful Arguments
+~~~~~~~~~~~~~~~~
+
+Here are some useful arguments accepted by most ``taskgraph`` subcommands.
+
+``-J/--json``
++++++++++++++
+
+By default only the task labels are displayed as output, but when ``-J/--json``
+is used, the full JSON representation of all task definitions is displayed.
+
+.. note::
+
+   Using ``-J/--json`` can often result in a massive amount of output. Consider
+   using the ``--tasks`` and/or ``--target-kind`` flags in conjunction to
+   filter the result down to a manageable level.
+
+``--tasks/--tasks-regex``
++++++++++++++++++++++++++
+
+A regular expression that matches against task labels. Useful for filtering
+down the output to only display desired tasks.
+
+``--target-kind``
++++++++++++++++++
+
+Only generate tasks of the given ``kind`` or any kinds listed in that kind's
+``kind-dependencies`` key.
+
+``-p/--parameters``
++++++++++++++++++++
+
+Generate the graph with the specified :term:`parameter set <Parameters>`. The
+following formats are accepted:
+
+* Path to a ``parameters.yml`` file.
+* A value of ``task-id=<decision task id>``. The ``parameters.yml`` artifact
+  from the decision task specified by ``<decision task id>`` will be downloaded
+  and used.
+* A value of ``project=<project>``. The ``parameters.yml`` artifact from the
+  latest decision task on ``<project>`` will be downloaded and used.
+* Path to a directory containing multiple parameter files. Any ``.yml`` file in
+  the directory will be considered a parameter set.
+
+The ``-p/--parameters`` flag can be passed in multiple times. If passed in
+multiple times, or if a directory containing more than one parameter set is
+specified, then one graph generation per parameter set will occur in parallel.
+Log generation will be disabled and output will be captured and emitted at the
+end under parameter specific headings. This feature is primarily useful for
+:ref:`diffing graphs`.
+
+.. note::
+
+   If you do not specify parameters, the default values for each parameter will
+   be used instead. This may result in a different graph than what is generated
+   in CI.
+
+``--fast``
+++++++++++
+
+Skip schema validation and other extraneous checks. This results in a faster
+generation at the expense of correctness.
+
+.. note::
+
+   When using ``--fast`` you may miss errors that will cause the decision task
+   to fail in CI.
+
+Validating Your Changes
+-----------------------
+
+Most changes to your Taskgraph configuration will likely fall under one of two
+buckets:
+
+1. Modifications to the task definitions. This involves changes to the ``kind.yml``
+   files or any transform files that they reference.
+2. Modifications to where the task runs. This is a subset of the above, but
+   occurs when you modify values that affect the ``target_task`` phase, such as
+   the ``run-on-projects`` or ``run-on-tasks-for`` keys.
+
+Different testing approaches are needed to validate each type.
+
+Validating Changes to Task Definitions
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+If you're only modifying the definition of tasks, then you want to generate the
+``full_task_graph``. This is because task definitions are frozen (with minor
+exceptions) after this phase. You'll also want to use the ``-J/--json`` flag and
+likely also the ``--tasks`` flag to filter down the result.
+
+For example, let's say you modify a task called ``build-android``. Then you
+would run the following command:
+
+.. code-block:: shell
+
+   taskgraph full -J --tasks "build-android"
+
+Then you can inspect the resulting task definition and validate that everything
+is configured as you expect.
+
+Validating Changes to Where Tasks Run
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+If you're modifying *where* a task runs, e.g. by changing a key that impacts the
+``target_tasks_method`` parameter (such as ``run-on-projects`` or
+``run-on-tasks-for``), you'll want to generate up until the
+``target_task_graph`` phase.
+
+Unlike when modifying the definition, we don't care about the contents of the
+task so passing the ``-J/--json`` flag is unnecessary. Instead, we can simply
+inspect whether the label exists or not. However, it *is* important to make sure
+we're generating under the appropriate context(s) via the ``-p/--parameters``
+flag.
+
+For example, let's say you want to modify the ``test-sensitive`` task so it
+runs on pushes to the ``main`` branch, but *does not* run on pull requests
+(because it needs sensitive secrets you don't want to expose to PRs). First you
+would go to the main branch, find a decision task and copy its ``taskId``.
+
+Then you would run:
+
+.. code-block:: shell
+
+   taskgraph target -p task-id=<decision taskId from main push>
+
+Now you would verify that the ``test-sensitive`` label shows up in the
+resulting output.
+
+Next you would go to a pull request, find a decision task, and again
+copy its ``taskId``. Then you'd again run:
+
+.. code-block:: shell
+
+   taskgraph target -p task-id=<decision taskId from PR>
+
+This time, you'd verify that the label *does not* show up.
+
+.. note::
+
+   If there are certain parameter sets you find yourself needing over and over,
+   consider checking them into your repo under ``taskcluster/test/params``,
+   like the `Fenix repository does`_. This way you can pass a path to the
+   appropriate parameters file rather than searching for a decision task.
+
+.. _Fenix repository does: https://github.com/mozilla-mobile/fenix/tree/main/taskcluster/test/params
+
+.. _diffing graphs:
+
+Diffing Graphs
+--------------
+
+Another strategy for testing your changes is to generate a graph with and
+without your changes, and then diffing the output of the two. Taskgraph has a
+built-in ``--diff`` flag that makes this process simple. Both Mercurial and Git
+are supported.
+
+Because the ``--diff`` flag will actually update your VCS's working directory,
+make sure you don't have any uncommitted changes (the ``taskgraph`` binary will
+error out if you do). Then run:
+
+.. code-block:: shell
+
+   taskgraph full -p <params> --diff
+
+Taskgraph will automatically determine which revision to diff against
+(defaulting to your entire local stack). But you may optionally pass in a
+revision specifier, e.g.:
+
+.. code-block:: shell
+
+   # git
+   taskgraph full -p <params> --diff HEAD~1
+
+   # hg
+   taskgraph full -p <params> --diff .~1
+
+Instead of the normal output (either labels or JSON), a diff will be displayed.
+
+.. note::
+
+   The ``--diff`` flag composes with every other flag on the ``taskgraph``
+   binary, meaning you can still filter using ``--tasks`` or ``--target-kind``.
+   It can also diff any output format (labels or JSON).
+
+Excluding Keys from the Diff
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Sometimes you might be making changes that impact many tasks (in the case of
+Firefox's CI, this is often thousands). You might have some expected changes
+you know you made, but you want to check that there aren't any *additional*
+changes beyond that. You can pass in the ``--exclude-key`` flag to filter out
+certain properties of the task definition.
+
+For example, let's say you added an environment variable called "FOO" to every
+task. You now want to make sure that you didn't make any changes beyond this, but
+the diff is so large this is difficult. You can run:
+
+.. code-block:: shell
+
+   taskgraph full -p <params> --diff --exclude-key "task.payload.env.FOO"
+
+This will first remove the ``task.payload.env.FOO`` key from every task before
+performing the diff, ensuring that the only differences left over are the ones
+you didn't expect.
--- a/docs/tutorials/connecting-taskcluster.rst
+++ b/docs/tutorials/connecting-taskcluster.rst
@@ -350,20 +350,20 @@ here <example-taskcluster.yml>`.
 
 
 Testing it Out
 ~~~~~~~~~~~~~~
 
 From here you should be ready to commit to your repo (directly or via pull
 request) and start testing things out! It's very likely that you'll run into
 some error or another at first. If you suspect a problem in the task
-configuration, see :doc:`/howto/debugging_and_testing` for tips on how to solve
-it. Otherwise you might need to tweak the ``.taskcluster.yml`` or make changes
-to your repo's Taskcluster configuration. If the latter is necessary, reach out
-to your Taskcluster administrators for assistance.
+configuration, see :doc:`/howto/run-locally` for tips on how to solve it.
+Otherwise you might need to tweak the ``.taskcluster.yml`` or make changes to
+your repo's Taskcluster configuration. If the latter is necessary, reach out to
+your Taskcluster administrators for assistance.
 
 Phew! While that was a lot, this only scratches the surface. You may also want
 to incorporate:
 
 * Dependencies
 * Artifacts
 * Docker images
 * Action / Cron tasks
--- a/docs/tutorials/creating-a-task-graph.rst
+++ b/docs/tutorials/creating-a-task-graph.rst
@@ -218,11 +218,11 @@ Now run:
 The ``-J/--json`` flag will display the full JSON definition of your task.
 Morphed is the final phase of :ref:`graph generation <graph generation>`, so
 represents your task's final form before it would get submitted to Taskcluster.
 In fact, if we hadn't made up the trust domain and worker pool in
 ``config.yml``, you could even copy / paste this definition into Taskcluster's
 `task creator`_!
 
 Next you can check out the :doc:`connecting-taskcluster` tutorial or learn more
-about :doc:`generating the taskgraph locally </howto/debugging_and_testing>`.
+about :doc:`generating the taskgraph locally </howto/run-locally>`.
 
 .. _task creator: https://firefox-ci-tc.services.mozilla.com/tasks/create