Currently, when a functional test fails, the debug logs are printed sequentially to the Travis log, which makes it hard to debug race conditions from that log. Instead, all log events should be combined, sorted by timestamp, and then appended to the Travis log.
This diff replaces the PYTHON_DEBUG environment variable with a proper
--combinedlogslen argument that lets the user choose how many log lines
to combine and print in case of failure. The intent is to help debug
possible races between tests when run on CI.
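For illustration only, the idea of merging logs by timestamp and printing the last N lines can be sketched as below. This is not the code from the PR: the function name `combine_logs`, the `logs` mapping, and the "timestamp first on each line" format are all assumptions made for the example.

```python
import heapq
from collections import deque


def combine_logs(logs, combined_logs_len):
    """Merge per-source log lines by timestamp and keep the last N lines.

    Assumptions (hypothetical, not taken from the PR):
    - `logs` maps a source name such as "node0" to a list of lines.
    - Each line starts with a lexicographically sortable timestamp
      followed by a space.
    - Lines are already in chronological order within each source.
    """
    if combined_logs_len <= 0:
        return []
    # Tag each line with (timestamp, source, line) so the merge sorts
    # by timestamp first and breaks ties by source name.
    tagged = (
        [(line.split(" ", 1)[0], name, line) for line in lines]
        for name, lines in logs.items()
    )
    merged = heapq.merge(*tagged)
    # A bounded deque keeps only the most recent entries.
    tail = deque(merged, maxlen=combined_logs_len)
    return ["{} {}".format(name, line) for _, name, line in tail]
```

Calling `combine_logs(logs, n)` with the value given to --combinedlogslen would return at most `n` interleaved lines, each prefixed with its source, which makes the ordering of events across sources visible in one place.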
I tried to keep this backport as close as possible to the PR despite
the divergences in the test_runner.py code.
Backport of core PR11789
https://github.com/bitcoin/bitcoin/pull/11789/files