Bug 1497769 [wpt PR 13447] - [wptrunner] Discard corrupted message queues, a=testonly
author jugglinmike <mike@mikepennisi.com>
Thu, 11 Oct 2018 10:04:16 +0000
changeset 489299 693b05802d989054d34e7ede25ae2f3b0443007d
parent 489298 6867ffea8e8058b9ed7503c38a13eabc56bf3204
child 489300 54091f6d7ec7d753b14a7220c6d282090d4d9d48
push id 247
push user fmarier@mozilla.com
push date Sat, 27 Oct 2018 01:06:44 +0000
reviewers testonly
bugs 1497769, 13447, 13446
milestone 64.0a1
Bug 1497769 [wpt PR 13447] - [wptrunner] Discard corrupted message queues, a=testonly

Automatic update from web-platform-tests
[wptrunner] Discard corrupted message queues (#13447)

"TestRunner" sub-processes forward their standard output streams to the
"TestRunnerManager" process via a Python multiprocessing Queue. When such
a process produces a large amount of output (e.g. in failing WebDriver
specification tests), the data may be buffered in the underlying
operating system pipe. In this state, such a process will not exit
naturally:

> Bear in mind that a process that has put items in a queue will wait
> before terminating until all the buffered items are fed by the
> "feeder" thread to the underlying pipe. [1]

Previously, the TestRunnerManager forcibly terminated the sub-process and
re-used the message queue, providing it to a new sub-process and waiting
for new items to be inserted. However, the queue's behavior is
unpredictable in this state. It has been observed to block indefinitely
on GNU/Linux and macOS systems [2].

To avoid this behavior, discard the queue and create a new instance for
use in subsequent tests.

[1] https://docs.python.org/2/library/multiprocessing.html#all-platforms
[2] https://github.com/web-platform-tests/wpt/issues/13446

--

wpt-commits: f6bca7b6218f591edc1bcb87c9ab0837ca41970b
wpt-pr: 13447
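
The discard-and-recreate pattern described in the message can be sketched outside wptrunner roughly as follows. This is a minimal illustration, not the patched code: the helper names `replace_queue` and `stop_runner` are hypothetical, and `proc` is assumed to be any object that offers the `join`, `is_alive` and `terminate` methods of `multiprocessing.Process`.

```python
import multiprocessing


def replace_queue(old_queue):
    """Close a possibly corrupted queue and return a fresh one.

    Hypothetical helper; the actual patch does this inline for each queue.
    """
    old_queue.close()
    return multiprocessing.Queue()


def stop_runner(proc, command_queue, remote_queue, timeout=10):
    """Stop `proc` and return the (command, remote) queues to use next.

    If the process has to be terminated forcibly, its queues may still
    hold buffered data in the underlying pipe, so both are discarded and
    replaced. After a clean exit the original queues are returned as-is.
    """
    proc.join(timeout)
    if proc.is_alive():
        proc.terminate()
        command_queue = replace_queue(command_queue)
        remote_queue = replace_queue(remote_queue)
    return command_queue, remote_queue
```

A caller would rebind both queues from the return value before starting the next sub-process, so that no new process is ever handed a queue that outlived a forced termination.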
testing/web-platform/tests/tools/wptrunner/wptrunner/testrunner.py
--- a/testing/web-platform/tests/tools/wptrunner/wptrunner/testrunner.py
+++ b/testing/web-platform/tests/tools/wptrunner/wptrunner/testrunner.py
@@ -688,17 +688,26 @@ class TestRunnerManager(threading.Thread
 
         self.logger.debug("waiting for runner process to end")
         self.test_runner_proc.join(10)
         self.logger.debug("After join")
         if self.test_runner_proc.is_alive():
             # This might leak a file handle from the queue
             self.logger.warning("Forcibly terminating runner process")
             self.test_runner_proc.terminate()
-            self.test_runner_proc.join(10)
+
+            # Multiprocessing queues are backed by operating system pipes. If
+            # the pipe in the child process had buffered data at the time of
+            # forced termination, the queue is no longer in a usable state
+            # (subsequent attempts to retrieve items may block indefinitely).
+            # Discard the potentially-corrupted queue and create a new one.
+            self.command_queue.close()
+            self.command_queue = Queue()
+            self.remote_queue.close()
+            self.remote_queue = Queue()
         else:
             self.logger.debug("Runner process exited with code %i" % self.test_runner_proc.exitcode)
 
     def runner_teardown(self):
         self.ensure_runner_stopped()
         return RunnerManagerState.stop()
 
     def send_message(self, command, *args):