Bug 1132771 - Add Files to moz.build with ability to define Bugzilla component draft
author Gregory Szorc <gps@mozilla.com>
Wed, 25 Feb 2015 18:09:20 -0800
changeset 246104 8c6ff59a920caee2a26c131de4e9789ae56ee64d
parent 246103 fc2b82383481a73f7f3b39a15be6cc2411eb2564
child 246105 ffcbd348623c83b72194d21b44f91f60f0f4a3ba
push id 818
push user gszorc@mozilla.com
push date Thu, 26 Feb 2015 02:28:11 +0000
The Files sub-context allows us to attach metadata to files based on pattern matching rules. Patterns are matched against files in a last-write-wins fashion.

The sub-context defines the BUG_COMPONENT variable, which is a 2-tuple (actually a named tuple) defining the Bugzilla product and component for files.

There are no consumers yet. But an eventual use case will be to suggest a bug component for a patch/commit. Another will be to automatically suggest a bug component for a failing test.
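
The last-write-wins matching described in the commit message can be sketched in plain Python. This is an illustration only, not the code in this patch: `pattern_to_regex`, `resolve`, and `RULES` are hypothetical names, and the regex translation only approximates the `*` and `**` semantics documented for `Files` (the patch itself relies on `mozpath.match`).

```python
import re


def pattern_to_regex(pattern):
    """Translate a Files-style pattern into a compiled regex.

    Assumed semantics: ``*`` stays within one directory level, ``**``
    spans any number of directory levels.
    """
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith('**/', i):
            parts.append('(?:.*/)?')   # zero or more leading directories
            i += 3
        elif pattern.startswith('**', i):
            parts.append('.*')         # anything, including slashes
            i += 2
        elif pattern[i] == '*':
            parts.append('[^/]*')      # anything except a slash
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile('^' + ''.join(parts) + '$')


def resolve(path, rules):
    """Apply (srcdir, pattern, values) rules in order; later matches win."""
    metadata = {}
    for srcdir, pattern, values in rules:
        prefix = srcdir + '/' if srcdir else ''
        if not path.startswith(prefix):
            continue  # the rule's moz.build is not an ancestor of path
        relpath = path[len(prefix):]
        if pattern_to_regex(pattern).match(relpath):
            metadata.update(values)
    return metadata


# Rules listed in evaluation order: root moz.build first, then foo/moz.build.
RULES = [
    ('', '*.cpp', {'BUG_COMPONENT': ('Core', 'XPCOM')}),
    ('', '**/*.js', {'BUG_COMPONENT': ('Firefox', 'General')}),
    ('foo', '*.js', {'BUG_COMPONENT': ('Another', 'Component')}),
]

print(resolve('foo/test.js', RULES))
```

Because each matching rule simply overwrites earlier values, the order in which `moz.build` files are evaluated (root first, leaf-most last) decides which value a file ends up with.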
new file mode 100644
--- /dev/null
+++ b/build/docs/files-metadata.rst
@@ -0,0 +1,172 @@
+.. _mozbuild_files_metadata:
+Files Metadata
+:ref:`mozbuild-files` provide a mechanism for attaching metadata to
+files. Essentially, you define some flags to set on a file or file
+pattern. Later, some tool or process queries for metadata attached to a
+file of interest and it does something intelligent with that data.
+Defining Metadata
+Files metadata is defined by utilizing the
+:ref:`Files Sub-Context <mozbuild_subcontext_Files>` in ``moz.build``
+files. e.g.::
+    with Files('**/Makefile.in'):
+        BUG_COMPONENT = ('Core', 'Build Config')
+This working example says, *for all Makefile.in files, set the Bugzilla
+component to Core :: Build Config*.
+For more info, read the
+:ref:`docs on Files <mozbuild_subcontext_Files>`.
+How Metadata is Read
+``Files`` metadata is extracted in :ref:`mozbuild_fs_reading_mode`.
+Reading starts by specifying a set of files whose metadata you are
+interested in. For each file, the filesystem is walked to the root
+of the source directory. Any ``moz.build`` files encountered during this
+walk are marked as relevant to the file.
+Let's say you have the following filesystem content::
+   /moz.build
+   /root_file
+   /dir1/moz.build
+   /dir1/foo
+   /dir1/subdir1/foo
+   /dir2/foo
+For ``/root_file``, the relevant ``moz.build`` file is just
+``/moz.build``.
+For ``/dir1/foo`` and ``/dir1/subdir1/foo``, the relevant files are
+``/moz.build`` and ``/dir1/moz.build``.
+For ``/dir2/foo``, the relevant file is just ``/moz.build``.
+Once the list of relevant ``moz.build`` files is obtained, each
+``moz.build`` file is evaluated. Root ``moz.build`` file first,
+leaf-most files last. This follows the rules of
+:ref:`mozbuild_fs_reading_mode`, with the set of evaluated ``moz.build``
+files being controlled by filesystem content, not ``DIRS`` variables.
+The file whose metadata is being resolved maps to a set of ``moz.build``
+files which in turn evaluates to a list of
+:py:class:`mozbuild.frontend.context.Context` instances. For
+file metadata, we only care about one ``Context`` type:
+:py:class:`mozbuild.frontend.context.Files` (which is a
+sub-context class).
+We start with an empty ``Files`` instance to represent the file. As
+we encounter a *files sub-context*, we see if it is appropriate to
+this file. If it is, we apply its values. This process is repeated
+until all *files sub-contexts* have been applied or skipped. The final
+state of the ``Files`` instance is used to represent the metadata for
+this particular file.
+It may help to visualize this. Say we have 2 ``moz.build`` files::
+    # /moz.build
+    with Files('*.cpp'):
+        BUG_COMPONENT = ('Core', 'XPCOM')
+    with Files('**/*.js'):
+        BUG_COMPONENT = ('Firefox', 'General')
+    # /foo/moz.build
+    with Files('*.js'):
+        BUG_COMPONENT = ('Another', 'Component')
+Querying for metadata for the file ``/foo/test.js`` will reveal 3
+relevant ``Files`` sub-contexts. They are evaluated as follows:
+1. ``/moz.build - Files('*.cpp')``. Does ``/*.cpp`` match
+   ``/foo/test.js``? **No**. Ignore this context.
+2. ``/moz.build - Files('**/*.js')``. Does ``/**/*.js`` match
+   ``/foo/test.js``? **Yes**. Apply ``BUG_COMPONENT = ('Firefox', 'General')``
+   to us.
+3. ``/foo/moz.build - Files('*.js')``. Does ``/foo/*.js`` match
+   ``/foo/test.js``? **Yes**. Apply
+   ``BUG_COMPONENT = ('Another', 'Component')``.
+At the end of execution, we have
+``BUG_COMPONENT = ('Another', 'Component')`` as the metadata for
+``/foo/test.js``.
+One way to look at file metadata is as a stack of data structures.
+Each ``Files`` sub-context relevant to a given file is applied on top
+of the previous state, starting from an empty state. The final state
+is the metadata reported for the file.
+.. _mozbuild_files_metadata_finalizing:
+Finalizing Values
+The default behavior of ``Files`` sub-context evaluation is to apply new
+values on top of old. In most circumstances, this results in desired
+behavior. However, there are circumstances where this may not be
+desired. There is thus a mechanism to *finalize* or *freeze* values.
+Finalizing values is useful for scenarios where you want to prevent
+wildcard matches from overwriting previously-set values. This is useful
+for one-off files.
+Let's take ``Makefile.in`` files as an example. The build system module
+policy dictates that ``Makefile.in`` files are part of the ``Build
+Config`` module and should be reviewed by peers of that module. However,
+there exist ``Makefile.in`` files in many directories in the source
+tree. Without finalization, a ``*`` or ``**`` wildcard matching rule
+would match ``Makefile.in`` files and overwrite their metadata.
+Finalizing of values is performed by setting the ``FINAL`` variable
+on ``Files`` sub-contexts. See the
+:ref:`Files documentation <mozbuild_subcontext_Files>` for more.
+Here is an example with ``Makefile.in`` files, showing how it is
+possible to finalize the ``BUG_COMPONENT`` value::
+    # /moz.build
+    with Files('**/Makefile.in'):
+        BUG_COMPONENT = ('Core', 'Build Config')
+        FINAL += ['BUG_COMPONENT']
+    # /foo/moz.build
+    with Files('**'):
+        BUG_COMPONENT = ('Another', 'Component')
+If we query for metadata of ``/foo/Makefile.in``, both ``Files``
+sub-contexts match the file pattern. However, since ``BUG_COMPONENT`` is
+marked as finalized by ``/moz.build``, the assignment from
+``/foo/moz.build`` is ignored. The final value for ``BUG_COMPONENT``
+is ``('Core', 'Build Config')``.
+Another use case for finalizing is one-off files. For example::
+    with Files('*.cpp'):
+        BUG_COMPONENT = ('One-Off', 'For C++')
+        FINAL += ['BUG_COMPONENT']
+    with Files('**'):
+        BUG_COMPONENT = ('Regular', 'Component')
+For every file except ``foo.cpp``, the bug component will be resolved
+as ``Regular :: Component``. However, ``foo.cpp`` has its value of
+``One-Off :: For C++`` preserved because it is finalized.
+Guidelines for Defining Metadata
+In general, values defined towards the root of the source tree are
+generic and become more specific towards the leaves. For example,
+the ``BUG_COMPONENT`` for ``/browser`` might be ``Firefox :: General``
+whereas ``/browser/components/preferences`` would list
+``Firefox :: Preferences``.
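
The finalization rules documented in files-metadata.rst above can be modelled with plain dictionaries. The sketch below is an assumption-laden illustration rather than the `Files.__iadd__` implementation from this patch: `apply_files` is a hypothetical helper, and it deliberately applies values before extending the `FINAL` list so that a single sub-context can both set and freeze a variable.

```python
def apply_files(state, update):
    """Merge ``update`` into ``state``, honoring names listed under 'FINAL'.

    Both arguments are plain dicts standing in for Files sub-contexts.
    """
    final = state.setdefault('FINAL', [])

    # Apply ordinary values first, skipping anything already frozen.
    for key, value in update.items():
        if key == 'FINAL' or key in final:
            continue
        state[key] = value

    # Then union the FINAL lists so later updates see the new freezes.
    for name in update.get('FINAL', []):
        if name not in final:
            final.append(name)
    return state


# /moz.build: Files('**/Makefile.in') sets and freezes BUG_COMPONENT.
state = apply_files({}, {
    'BUG_COMPONENT': ('Core', 'Build Config'),
    'FINAL': ['BUG_COMPONENT'],
})
# /foo/moz.build: Files('**') tries to overwrite it and is ignored.
apply_files(state, {'BUG_COMPONENT': ('Another', 'Component')})

print(state['BUG_COMPONENT'])
```

Without the `FINAL` entry in the first update, the second call would overwrite the value, which is the default last-write-wins behavior.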
--- a/build/docs/index.rst
+++ b/build/docs/index.rst
@@ -8,16 +8,17 @@ Important Concepts
    :maxdepth: 1
    Mozconfig Files <mozconfigs>
+   files-metadata
    Profile Guided Optimization <pgo>
--- a/build/docs/mozbuild-files.rst
+++ b/build/docs/mozbuild-files.rst
@@ -25,16 +25,18 @@ different, it's doubtful most ``moz.buil
 error if executed by a vanilla Python interpreter (e.g. ``python
 The following properties make execution of ``moz.build`` files special:
 1. The execution environment exposes a limited subset of Python.
 2. There is a special set of global symbols and an enforced naming
    convention of symbols.
+3. Some symbols are inherited from previously-executed ``moz.build``
+   files.
 The limited subset of Python is actually an extremely limited subset.
 Only a few symbols from ``__builtins__`` are exposed. These include
 ``True``, ``False``, and ``None``. Global functions like ``import``,
 ``print``, and ``open`` aren't available. Without these, ``moz.build``
 files can do very little. *This is by design*.
 The execution sandbox treats all ``UPPERCASE`` variables specially. Any
@@ -63,26 +65,87 @@ data structures in this module are consu
 the sandbox. There are tests to ensure that the set of symbols exposed
 to an empty sandbox are all defined in the ``context`` module.
 This module also contains documentation for each symbol, so nothing can
 sneak into the sandbox without being explicitly defined and documented.
 Reading and Traversing moz.build Files
-The process responsible for reading ``moz.build`` files simply starts at
-a root ``moz.build`` file, processes it, emits the globals namespace to
-a consumer, and then proceeds to process additional referenced
-``moz.build`` files from the original file. The consumer then examines
-the globals/``UPPERCASE`` variables set as part of execution and then
-converts the data therein to Python class instances.
+The process for reading ``moz.build`` files roughly consists of:
+1. Start at the root ``moz.build`` (``<topsrcdir>/moz.build``)
+2. Evaluate the ``moz.build`` file in a new sandbox
+3. Emit the main *context* and any *sub-contexts* from the executed
+   sandbox
+4. Extract a set of ``moz.build`` files to execute next.
+5. For each additional ``moz.build`` file, go to step 2 and repeat until all
+   referenced files have executed.
+From the perspective of the consumer, the output of reading is a stream
+of :py:class:`mozbuild.frontend.context.Context` instances. Each
+``Context`` defines a particular aspect of data. Consumers iterate over
+these objects and do something with the data inside. Each object is
+essentially a dictionary of all the ``UPPERCASE`` variables populated
+during its execution.
+.. note::
+   Historically, there was only one ``context`` per ``moz.build`` file.
+   As the number of things tracked by ``moz.build`` files grew and more
+   and more complex processing was desired, it was necessary to split these
+   contexts into multiple logical parts. It is now common to emit
+   multiple contexts per ``moz.build`` file.
+Build System Reading Mode
+The traditional mode of evaluation of ``moz.build`` files is what's
+called *build system traversal mode.* In this mode, the ``CONFIG``
+variable in each ``moz.build`` sandbox is populated from data coming
+from ``config.status``, which is produced by ``configure``.
-The executed Python sandbox is essentially represented as a dictionary
-of all the special ``UPPERCASE`` variables populated during its
+During evaluation, ``moz.build`` files often make decisions conditional
+on the state of the build configuration. e.g. *only compile foo.cpp if
+feature X is enabled*.
+In this mode, traversal of ``moz.build`` files is governed by variables
+with ``DIRS`` in them. For example, to execute a child directory,
+``foo``, you would add ``DIRS += ['foo']`` to a ``moz.build`` file and
+``foo/moz.build`` would be evaluated.
+.. _mozbuild_fs_reading_mode:
+Filesystem Reading Mode
+There is an alternative reading mode that doesn't involve the build
+system and doesn't utilize ``DIRS`` variables to control traversal into
+child directories. This mode is called *filesystem reading mode*.
+In this reading mode, the ``CONFIG`` variable is a dummy, mostly empty
+object. Accessing all but a few special variables will return an empty
+value. This means that nearly all ``if CONFIG['FOO']:`` branches will
+not be taken.
+Instead of utilizing content from within the evaluated ``moz.build``
+file to drive traversal into subsequent ``moz.build`` files, the set
+of files to evaluate is controlled by the thing doing the reading.
+A single ``moz.build`` file is not guaranteed to be executable in
+isolation. Instead, we must evaluate all *parent* ``moz.build`` files
+first. For example, in order to evaluate ``/foo/moz.build``, one must
+execute ``/moz.build`` and have its state influence the execution of
+``/foo/moz.build``.
+Filesystem reading mode is utilized to power the
+:ref:`mozbuild_files_metadata` feature.
+Technical Details
 The code for reading ``moz.build`` files lives in
 :py:mod:`mozbuild.frontend.reader`. The Python sandboxes evaluation results
 (:py:class:`mozbuild.frontend.context.Context`) are passed into
 :py:mod:`mozbuild.frontend.emitter`, which converts them to classes defined
 in :py:mod:`mozbuild.frontend.data`. Each class in this module defines a
 domain-specific component of tree metdata. e.g. there will be separate
 classes that represent a JavaScript file vs a compiled C++ file or test
@@ -95,19 +158,16 @@ each. Depending on the content of the ``
 object derived or 100.
 The purpose of the ``emitter`` layer between low-level sandbox execution
 and metadata representation is to facilitate a unified normalization and
 verification step. There are multiple downstream consumers of the
 ``moz.build``-derived data and many will perform the same actions. This
 logic can be complicated, so we have a component dedicated to it.
-Other Notes
 :py:class:`mozbuild.frontend.reader.BuildReader`` and
 :py:class:`mozbuild.frontend.reader.TreeMetadataEmitter`` have a
 stream-based API courtesy of generators. When you hook them up properly,
 the :py:mod:`mozbuild.frontend.data` classes are emitted before all
 ``moz.build`` files have been read. This means that downstream errors
 are raised soon after sandbox execution.
 Lots of the code for evaluating Python sandboxes is applicable to
--- a/python/mozbuild/mozbuild/frontend/context.py
+++ b/python/mozbuild/mozbuild/frontend/context.py
@@ -14,28 +14,28 @@ If you are looking for the absolute auth
 contain, you've come to the right place.
 from __future__ import unicode_literals
 import os
 from collections import OrderedDict
-from contextlib import contextmanager
 from mozbuild.util import (
+    TypedNamedTuple,
 import mozpack.path as mozpath
 from types import FunctionType
 from UserString import UserString
 import itertools
@@ -391,25 +391,105 @@ def ContextDerivedTypedList(type, base_c
                 def __new__(cls, obj):
                     return type(context, obj)
             self.TYPE = _Type
             super(_TypedList, self).__init__(iterable)
     return _TypedList
+BugzillaComponent = TypedNamedTuple('BugzillaComponent',
+                        [('product', unicode), ('component', unicode)])
+class Files(SubContext):
+    """Metadata attached to files or directories.
+    It is common to want to annotate files or directories with metadata, such
+    as which Bugzilla component tracks issues with certain files. This context is
+    where we stick that metadata.
+    The argument to this sub-context is a file matching pattern that is applied
+    against the host file's directory. If the pattern matches a file whose info
+    is currently being sought, the metadata attached to this instance will be
+    applied to that file. e.g. a pattern of ``foo.html`` will match exactly
+    the ``foo.html`` file in the current directory. ``*.jsm`` will match all
+    ``.jsm`` files in the current directory. ``**/*.cpp`` will match all
+    ``.cpp`` files in this and all child directories.
+    """
+    VARIABLES = {
+        'BUG_COMPONENT': (BugzillaComponent, tuple,
+            """The bug component that tracks changes to these files.
+            Values are a 2-tuple of unicode describing the Bugzilla product and
+            component. e.g. ``('Core', 'Build Config')``.
+            """, None),
+        'FINAL': (TypedList(unicode), list,
+            """A list of variables whose values are "frozen".
+            During normal processing, values from newer Files contexts
+            overwrite previously set values. Last write wins. This behavior is
+            not always desired. ``FINAL`` provides a mechanism to prevent
+            further updates to a variable.
+            When a variable name is assigned to ``FINAL``, the value of that
+            variable is frozen and subsequent writes to it are ignored.
+            See :ref:`mozbuild_files_metadata_finalizing` for more info.
+            """, None),
+    }
+    def __init__(self, parent, pattern=None):
+        super(Files, self).__init__(parent)
+        self.pattern = pattern
+    def __iadd__(self, other):
+        assert isinstance(other, Files)
+        for k, v in other.items():
+            # Ignore updates to finalized flags.
+            if k in self['FINAL']:
+                continue
+            # Finalized flags should be unioned.
+            if k == 'FINAL':
+                for ov in other['FINAL']:
+                    if ov not in self['FINAL']:
+                        self['FINAL'].append(ov)
+                continue
+            self[k] = v
+        return self
+    def asdict(self):
+        """Return this instance as a dict with built-in data structures.
+        Call this to obtain an object suitable for serializing.
+        """
+        d = {}
+        if 'BUG_COMPONENT' in self:
+            bc = self['BUG_COMPONENT']
+            d['bug_component'] = (bc.product, bc.component)
+        return d
 # This defines functions that create sub-contexts.
 # Values are classes that are SubContexts. The class name will be turned into
 # a function that when called emits an instance of that class.
 # Arbitrary arguments can be passed to the class constructor. The first
 # argument is always the parent context. It is up to each class to perform
 # argument validation.
+    Files,
 for cls in SUBCONTEXTS:
     if not issubclass(cls, SubContext):
         raise ValueError('SUBCONTEXTS entry not a SubContext class: %s' % cls)
     if not hasattr(cls, 'VARIABLES'):
         raise ValueError('SUBCONTEXTS entry does not have VARIABLES: %s' % cls)
--- a/python/mozbuild/mozbuild/frontend/reader.py
+++ b/python/mozbuild/mozbuild/frontend/reader.py
@@ -57,16 +57,17 @@ from .sandbox import (
 from .context import (
+    Files,
@@ -1228,8 +1229,45 @@ class BuildReader(object):
         result = {}
         for path, paths in path_mozbuilds.items():
             result[path] = reduce(lambda x, y: x + y, (contexts[p] for p in paths), [])
         return result, all_contexts
+    def get_metadata_for_files(self, paths):
+        """Obtain metadata for a set of files.
+        Given a set of input paths, determine which moz.build files may
+        define metadata for them, evaluate those moz.build files, and
+        apply file metadata rules defined within to determine metadata
+        values for each file requested.
+        Essentially, for each input path:
+        1. Determine the set of moz.build files relevant to that file by
+           looking for moz.build files in ancestor directories.
+        2. Evaluate moz.build files starting with the most distant.
+        3. Iterate over Files sub-contexts.
+        4. If the file pattern matches the file we're seeking info on,
+           apply attribute updates.
+        5. Return the most recent value of attributes.
+        """
+        paths, _ = self.read_relevant_mozbuilds(paths)
+        r = {}
+        for path, ctxs in paths.items():
+            flags = Files(Context())
+            for ctx in ctxs:
+                if not isinstance(ctx, Files):
+                    continue
+                relpath = mozpath.relpath(path, ctx.relsrcdir)
+                if mozpath.match(relpath, ctx.pattern):
+                    flags += ctx
+            r[path] = flags
+        return r
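
Step 1 of the docstring above (finding `moz.build` files in ancestor directories) is delegated to `read_relevant_mozbuilds`, which this diff does not show. A minimal standalone sketch of the idea, assuming an in-memory set of existing paths instead of a real filesystem and using the hypothetical names `relevant_mozbuilds` and `EXISTING`, could look like this:

```python
import posixpath


def relevant_mozbuilds(path, existing):
    """Return the moz.build paths relevant to ``path``, root first.

    ``existing`` is a set of moz.build paths assumed to exist; a real
    reader would check the filesystem instead.
    """
    found = []
    directory = posixpath.dirname(path)
    while True:
        candidate = posixpath.join(directory, 'moz.build')
        if candidate in existing:
            found.append(candidate)
        if directory in ('', '/'):
            break  # reached the root of the source directory
        directory = posixpath.dirname(directory)
    # The walk goes leaf to root; evaluation order is root to leaf.
    found.reverse()
    return found


# Mirrors the filesystem layout used in files-metadata.rst.
EXISTING = {'/moz.build', '/dir1/moz.build'}

for f in ('/root_file', '/dir1/subdir1/foo', '/dir2/foo'):
    print(f, relevant_mozbuilds(f, EXISTING))
```

Returning the list root-first matches the evaluation order the documentation describes, so values from leaf-most `moz.build` files are applied last.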
new file mode 100644
--- /dev/null
+++ b/python/mozbuild/mozbuild/test/frontend/data/files-metadata/bug_component/bad-assignment/moz.build
@@ -0,0 +1,2 @@
+with Files('*'):
+    BUG_COMPONENT = 'bad value'
new file mode 100644
--- /dev/null
+++ b/python/mozbuild/mozbuild/test/frontend/data/files-metadata/bug_component/different-matchers/moz.build
@@ -0,0 +1,4 @@
+with Files('*.jsm'):
+    BUG_COMPONENT = ('Firefox', 'JS')
+with Files('*.cpp'):
+    BUG_COMPONENT = ('Firefox', 'C++')
new file mode 100644
--- /dev/null
+++ b/python/mozbuild/mozbuild/test/frontend/data/files-metadata/bug_component/final/moz.build
@@ -0,0 +1,3 @@
+with Files('**/Makefile.in'):
+    BUG_COMPONENT = ('Core', 'Build Config')
+    FINAL += ['BUG_COMPONENT']
new file mode 100644
--- /dev/null
+++ b/python/mozbuild/mozbuild/test/frontend/data/files-metadata/bug_component/final/subcomponent/moz.build
@@ -0,0 +1,2 @@
+with Files('**'):
+    BUG_COMPONENT = ('Another', 'Component')
new file mode 100644
--- /dev/null
+++ b/python/mozbuild/mozbuild/test/frontend/data/files-metadata/bug_component/moz.build
@@ -0,0 +1,2 @@
+with Files('**'):
+    BUG_COMPONENT = ('default_product', 'default_component')
new file mode 100644
--- /dev/null
+++ b/python/mozbuild/mozbuild/test/frontend/data/files-metadata/bug_component/simple/moz.build
@@ -0,0 +1,2 @@
+with Files('*'):
+    BUG_COMPONENT = ('Core', 'Build Config')
new file mode 100644
--- a/python/mozbuild/mozbuild/test/frontend/test_reader.py
+++ b/python/mozbuild/mozbuild/test/frontend/test_reader.py
@@ -5,16 +5,17 @@
 from __future__ import unicode_literals
 import os
 import sys
 import unittest
 from mozunit import main
+from mozbuild.frontend.context import BugzillaComponent
 from mozbuild.frontend.reader import BuildReaderError
 from mozbuild.frontend.reader import BuildReader
 from mozbuild.test.common import MockConfig
 import mozpack.path as mozpath
@@ -308,11 +309,61 @@ class TestBuildReader(unittest.TestCase)
         self.assertEqual([ctx.relsrcdir for ctx in paths['d1/every-level/a/file']],
             ['', 'd1', 'd1/every-level', 'd1/every-level/a'])
         self.assertEqual([ctx.relsrcdir for ctx in paths['d1/every-level/b/file']],
             ['', 'd1', 'd1/every-level', 'd1/every-level/b'])
         self.assertEqual([ctx.relsrcdir for ctx in paths['d2/file']],
             ['', 'd2'])
+    def test_files_bad_bug_component(self):
+        reader = self.reader('files-metadata')
+        with self.assertRaises(BuildReaderError):
+            reader.get_metadata_for_files(['bug_component/bad-assignment/moz.build'])
+    def test_files_bug_component_simple(self):
+        reader = self.reader('files-metadata')
+        v = reader.get_metadata_for_files(['bug_component/simple/moz.build'])
+        self.assertEqual(len(v), 1)
+        flags = v['bug_component/simple/moz.build']
+        self.assertEqual(flags['BUG_COMPONENT'].product, 'Core')
+        self.assertEqual(flags['BUG_COMPONENT'].component, 'Build Config')
+    def test_files_bug_component_different_matchers(self):
+        reader = self.reader('files-metadata')
+        v = reader.get_metadata_for_files([
+            'bug_component/different-matchers/foo.jsm',
+            'bug_component/different-matchers/bar.cpp',
+            'bug_component/different-matchers/baz.misc'])
+        self.assertEqual(len(v), 3)
+        js_flags = v['bug_component/different-matchers/foo.jsm']
+        cpp_flags = v['bug_component/different-matchers/bar.cpp']
+        misc_flags = v['bug_component/different-matchers/baz.misc']
+        self.assertEqual(js_flags['BUG_COMPONENT'], BugzillaComponent('Firefox', 'JS'))
+        self.assertEqual(cpp_flags['BUG_COMPONENT'], BugzillaComponent('Firefox', 'C++'))
+        self.assertEqual(misc_flags['BUG_COMPONENT'], BugzillaComponent('default_product', 'default_component'))
+    def test_files_bug_component_final(self):
+        reader = self.reader('files-metadata')
+        v = reader.get_metadata_for_files([
+            'bug_component/final/foo',
+            'bug_component/final/Makefile.in',
+            'bug_component/final/subcomponent/Makefile.in',
+            'bug_component/final/subcomponent/bar'])
+        self.assertEqual(v['bug_component/final/foo']['BUG_COMPONENT'],
+            BugzillaComponent('default_product', 'default_component'))
+        self.assertEqual(v['bug_component/final/Makefile.in']['BUG_COMPONENT'],
+            BugzillaComponent('Core', 'Build Config'))
+        self.assertEqual(v['bug_component/final/subcomponent/Makefile.in']['BUG_COMPONENT'],
+            BugzillaComponent('Core', 'Build Config'))
+        self.assertEqual(v['bug_component/final/subcomponent/bar']['BUG_COMPONENT'],
+            BugzillaComponent('Another', 'Component'))
 if __name__ == '__main__':