Details

Reviewers

Group Reviewers

Restricted Project

Commits

rSTAGING6434a8f4417a: Merge #13105: [qa] Add --failfast option to functional test runner
rABC6434a8f4417a: Merge #13105: [qa] Add --failfast option to functional test runner

Summary

58f9a0a Use --failfast when running functional tests on Travis (James O'Beirne)
bf720c1 Add --failfast option to functional test runner (James O'Beirne)

Pull request description:

Add the option (`--failfast`) to stop the functional test runner's execution when it encounters the first failure.

Also cleans up run_test's arguments list ([no more mutable default for `args`](http://docs.python-guide.org/en/latest/writing/gotchas/#mutable-default-arguments)) and call site.

Tree-SHA512: e854b1b1634bf613ae8ae88e715df1460982fa68db9d785aafeb5eccf5bf324c7f20dded2ca6840ebf18a28347ecac2138d6c7592507b34939b02609ef55e1b3

Partial backport of Core PR13105

This backport only includes some of the --failfast behavior since our test_runner has diverged so much.
This patch implements --failfast in a way that does not kill currently running tests like the original PR does.
This is due to a limitation of the thread-level parallelization that we implemented in test_runner.

Though the documentation on these matters is extremely scattered, I will attempt to sum up my conclusions:

Threading implementation (current) + modifications that match this backport:

Attempting to kill running tests either fails completely or leaves child bitcoind processes running. My understanding is this is due to a limitation in the way Python handles threads that spawn processes.

Multiprocessing implementation:

I attempted a full migration from Threading to Multiprocessing to alleviate the above.
Despite documentation indicating that killing a Multiprocess will also kill it's children, this behavior does not appear to work as intended. Some example issues that have persisted even in recent Python versions that might be related: https://bugs.python.org/issue9205 https://bugs.python.org/issue22393 While it's difficult to say exactly which of these issues is related to the root cause, it appears that we will not find a stable version of Python that supports our Multiprocessing needs in the near future. It also looks like I'm not the only one with such frustration: https://stackoverflow.com/questions/1408356/keyboard-interrupts-with-pythons-multiprocessing-pool#comment101399084_1408476 (note the timeline of the entire conversation: 2010 - 2019)
Despite getting a mostly-working implementation using various workarounds, I ran into all sorts of signal passing issues where SIGKILL, SIGTERM, and SIGINT would not behave as expected when these signals originated from worker processes. This was especially annoying, as it effectively had the same behavior as the Threading implementation when you attempt to Ctrl+C test_runner: leaving dangling bitcoind processes for an unknown amount of time after test_runner had quit.

After much experimentation, it appears there is no stable way to kill the currently running tests with widely
available Python versions (my current is 3.7.3). Even if we could get something to work, it is clear to me
that the maintenance cost of doing so just to pull in --failfast is not worth it. We should expect the
default to be that the test suite passes, so I'm not concerned with a small performance hit on --failfast.

It also appears that Core's test_runner runs tests in parallel now, so doing a migration to a more up-to-date
version of their test_runner is likely our best course of action in the long run.

Pulling in this patch sooner rather than later will help improve UX of our future land bot when test failures
occur by reducing patch queuing time.

Test Plan

Add assert True == False to some test(s).

test_runner.py
test_runner.py --failfast

Remove the assert from those tests and re-run the above for sanity.

Diff Detail

Repository

rABC Bitcoin ABC

Lint

Lint Not Applicable

Unit

Tests Not Applicable

Event Timeline

jasonbcox created this revision.Apr 15 2020, 21:37

Herald added a reviewer: Restricted Project. · View Herald TranscriptApr 15 2020, 21:37

[Bot Message]
One or more PR numbers were detected in the summary.
Links to those PRs have been inserted into the summary for reference.

Build Bitcoin-ABC / Diffs / Diff Testing (build-diff) started.

Remove unnecessary newline

Build Bitcoin-ABC / Diffs / Diff Testing (build-diff) started.

Fix formatting weirdness not caught by linter

Build Bitcoin-ABC / Diffs / Diff Testing (build-diff) started.

Build Bitcoin-ABC / Diffs / Diff Testing (build-diff) passed.

Harbormaster completed remote builds in B10238: Diff 18842.Apr 15 2020, 21:55

Build Bitcoin-ABC / Diffs / Diff Testing (build-diff) passed.

Harbormaster completed remote builds in B10239: Diff 18843.Apr 15 2020, 22:03

Build Bitcoin-ABC / Diffs / Diff Testing (build-diff) passed.

Harbormaster completed remote builds in B10240: Diff 18844.Apr 15 2020, 22:18

deadalnix requested changes to this revision.Apr 16 2020, 13:29

deadalnix added a subscriber: deadalnix.

deadalnix added inline comments.

test/functional/test_runner.py
449 ↗	(On Diff #18844)	This doesn't strike me as a very good way to do this. You should simply raise a flag and let other tasks deal with the flag. This function is managing the test itself, and unless you want to scatter the work queue management scattered all over the place.