Page MenuHomePhabricator

[chronik-client] Set mocha timeout based on rpc timeout
AbandonedPublicDraft

Authored by bytesofman on Feb 26 2024, 23:31.

Details

Reviewers
None
Group Reviewers
Restricted Project
Summary

Troubleshooting CI failures in cc mocha tests.

Mocha can timeout before the python setup script is complete. In this case, mocha timeout will not kill hte py process.

Ensure the child process is killed at the end of all tests in a file.

Test Plan

CI or see D14915 for build instructions

Event Timeline

clean up logs, use chronikUrl throughout instead of awaiting every time

The build failed due to an unexpected infrastructure outage. The administrators have been notified to investigate. Sorry for the inconvenience.
The build failed due to an unexpected infrastructure outage. The administrators have been notified to investigate. Sorry for the inconvenience.
The build failed due to an unexpected infrastructure outage. The administrators have been notified to investigate. Sorry for the inconvenience.
The build failed due to an unexpected infrastructure outage. The administrators have been notified to investigate. Sorry for the inconvenience.

Better handling of timeouts and process termination

we do not need to bump the global timeout

no need to change test kill log

better type defn of setTimeout

I get the idea: there is no timeout callback on mocha so we define our own that will trigger prior to mocha. This is clever, as there is probably no much we can do with mocha itself.

Here are the steps that can be done to achieve this:

  1. Factor the chronik message in all the tests so we don't repeat. This makes later changes very simple as they will impact a single file.
  2. Change the chronik message to test_info and add the timeout, configure mocha to use this timeout.
  3. Add the new timeout from the message, and bump the mocha timeout far in the future so we never hit it (or disable it if that's possible).
modules/chronik-client/test/integration/block_and_blocks.ts
61 ↗(On Diff #45648)

This is probably not enough of a margin (you need time for the child process to get killed), but I get the idea

69 ↗(On Diff #45648)

We still want an error if a timeout occurs, so exit(1) or something like that.
Otherwise if the timeout occurs the CI will return no error.

test/functional/setup_scripts/chronik-client_block_and_blocks.py
25 ↗(On Diff #45648)

you can do a single message with both infos, not sure what the benefit is of having 2 messages

bytesofman marked 3 inline comments as done.

pass test info in single msg, exit in error condition on timeout exit, update margin for timeout exit to 5s and not 100ms

Add the new timeout from the message, and bump the mocha timeout far in the future so we never hit it (or disable it if that's possible).

The timeout set in the test will override the global timeout. We can disable the global timeout by setting it to zero, but imo we should keep something there to handle tests that have not covered it (e.g. npm run test).

Note that the global timeout does cover things like before() in this test, where we have not yet set the test timeout.

modules/chronik-client/package.json
14 ↗(On Diff #45647)

only test this one during draft review

modules/chronik-client/test/integration/block_and_blocks.ts
48 ↗(On Diff #45644)

we use before, which runs before all tests, to get data that will be used in all tests (rpc_timeout and chronik_url). We set the timeout for all tests in this file and the chronikUrl var that will be used by all tests in this file

48 ↗(On Diff #45644)

we need to use async function() {} instead of async () => {} to correctly access mocha's this and set the timeout

this is a to-do for all mocha tests...apparently arrow functions are bad practice because we can't access this. First time I have needed to access it.

https://stackoverflow.com/questions/23492043/change-default-timeout-for-mocha

60 ↗(On Diff #45644)

I could not find a specific way to trigger a callback in the event of a mocha timeout. However, looks like they work the same way as any other mocha failure. e.g. if I set the timeout to 1s and run this test, here's the output:

> chronik-client@0.23.0 integration-tests
> mocha -j1 -r ts-node/register test/integration/block_and_blocks.ts

Starting test_runner for chronik-client_block_and_blocks
Test runner for chronik-client_block_and_blocks started


  /block and /blocks
2024-02-26T23:52:25.607000Z TestFramework (INFO): Initializing test directory /home/joey/github/abc/bitcoin-abc/build/test/tmp/test_runner_₿₵_🏃_20240226_155225/setup_scripts/chronik-client_block_and_blocks_0
    1) "before all" hook for "gives us the block and blocks"
Tests complete, shutting down child process


  0 passing (1s)
  1 failing

  1) /block and /blocks
       "before all" hook for "gives us the block and blocks":
     Error: Timeout of 1210ms exceeded. For async tests and hooks, ensure "done()" is called; if returning a Promise, ensure it resolves. (/home/joey/github/abc/bitcoin-abc/modules/chronik-client/test/integration/block_and_blocks.ts)
      at listOnTimeout (node:internal/timers:573:17)
      at processTimers (node:internal/timers:514:7)



Setting chronik url to  http://127.0.0.1:30001
rpc_timeout for block_and_blocks: 60
Mocha timeout set to 240s
2024-02-26T23:52:26.722000Z TestFramework (INFO): Passed test setup data to mocha
2024-02-26T23:52:26.722000Z TestFramework (INFO): Received a stop message, exiting
2024-02-26T23:52:26.773000Z TestFramework (INFO): Stopping nodes
2024-02-26T23:52:27.026000Z TestFramework (INFO): Cleaning up /home/joey/github/abc/bitcoin-abc/build/test/tmp/test_runner_₿₵_🏃_20240226_155225/setup_scripts/chronik-client_block_and_blocks_0 on exit
2024-02-26T23:52:27.027000Z TestFramework (INFO): Tests successful
Running Unit Tests for Test Framework Modules
setup_scripts/chronik-client_block_and_blocks.py started
setup_scripts/chronik-client_block_and_blocks.py passed, Duration: 2 s

TEST                                             | STATUS    | DURATION

setup_scripts/chronik-client_block_and_blocks.py | ✓ Passed  | 2 s

ALL                                              | ✓ Passed  | 2 s (accumulated) 
Runtime: 2 s

mb another failure that we get setup_scripts/chronik-client_block_and_blocks.py passed, Duration: 2 s ... but this does confirm the process is killed right?

20

we move this setup, which applies to the whole test, out of before

test/functional/setup_scripts/chronik-client_block_and_blocks.py
39 ↗(On Diff #45644)

In this test specifically, the step numbers are confusingly named. We have a test for the above yield True which tests initial regtest before 10 blocks are mined. Not directly related to this improvement, but since we kind of add a "step 0" with setup, was difficult to not correct this here.

31

we add a zero step to send the timeout (and chronik info), which is used in every test

modules/chronik-client/test/integration/block_and_blocks.ts
60 ↗(On Diff #45644)

If you want your test to be relevant, add:

import time

time.sleep(10) # wait 10 seconds

in the setup script. Make sure to sleep more than the mocha timeout so the script still runs when it occurs.

test/functional/setup_scripts/chronik-client_block_and_blocks.py
27 ↗(On Diff #45666)

OK, this should be automated by the setup framework at some point because it will be identical for all the setups

39 ↗(On Diff #45644)

The step 0 should be moved out from the scripts and moved to the class so it's automated. Something like the setup_nodes or setup_network from the test framework that can be overridden by the child script class if needed. Maybe setup_test_info ?

bytesofman marked 2 inline comments as done.

Add log statement when mocha tests exit due to mocha timeout, add more logic to setup_test_info function

modules/chronik-client/test/integration/block_and_blocks.ts
60 ↗(On Diff #45644)

ok -- running with this, and setting mocharc.js to 3s, get

> chronik-client@0.23.0 integration-tests
> mocha -j1 -r ts-node/register test/integration/block_and_blocks.ts

Starting test_runner for chronik-client_block_and_blocks
Test runner for chronik-client_block_and_blocks started


  /block and /blocks
2024-02-27T17:15:08.871000Z TestFramework (INFO): Initializing test directory /home/joey/github/abc/bitcoin-abc/build/test/tmp/test_runner_₿₵_🏃_20240227_091508/setup_scripts/chronik-client_block_and_blocks_0
    1) "before all" hook for "gives us the block and blocks"
Tests complete, shutting down child process


  0 passing (3s)
  1 failing

  1) /block and /blocks
       "before all" hook for "gives us the block and blocks":
     Error: Timeout of 3000ms exceeded. For async tests and hooks, ensure "done()" is called; if returning a Promise, ensure it resolves. (/home/joey/github/abc/bitcoin-abc/modules/chronik-client/test/integration/block_and_blocks.ts)
      at listOnTimeout (node:internal/timers:573:17)
      at processTimers (node:internal/timers:514:7)



chronikUrl set to http://127.0.0.1:30001
Mocha timeout set to 240 seconds
2024-02-27T17:15:15.098000Z TestFramework (INFO): Passed test setup data to mocha
2024-02-27T17:15:15.098000Z TestFramework (INFO): Received a stop message, exiting
2024-02-27T17:15:15.149000Z TestFramework (INFO): Stopping nodes
Exited test setup after exceeding mocha timeout.

So -- now missing the Cleaning up /home/joey/github/abc/bitcoin-abc/build/test/tmp/test_runner_₿₵_🏃_20240226_155225/setup_scripts/chronik-client_block_and_blocks_0 on exit statement -- does this mean we are not exiting as intended?

test/functional/setup_scripts/chronik-client_block_and_blocks.py
39 ↗(On Diff #45644)

I can get it to work by defining it this way at the top of this py script

def setup_test_info(self):
        test_info = {}
        test_info["chronik"] = f"http://127.0.0.1:{self.nodes[0].chronik_port}"
        test_info["setup_script_timeout"] = self.rpc_timeout
        send_ipc_message({"test_info": test_info})
        self.log.info("Passed test setup data to mocha")

If I add yield True to the function (instead of running this after calling the function) -- tests fail (seems like the before() routine in mocha does not properly wait for the statusEvent ready, as it fails due to testInfo being undefined)

1) /block and /blocks
       "before all" hook for "gives us the block and blocks":
     TypeError: Cannot read properties of undefined (reading 'chronik')

I'm not sure how to implement in setup_framework.py class -- seems like this was not done initially because we need node = self.nodes[0] to get the chronik url?

test/functional/setup_scripts/chronik-client_block_and_blocks.py
39 ↗(On Diff #45644)

all tests have at least 1 node, it's fine to use self.nodes[0] and let the script override this if needed

So -- now missing the Cleaning up /home/joey/github/abc/bitcoin-abc/build/test/tmp/test_runner_₿₵_🏃_20240226_155225/setup_scripts/chronik-client_block_and_blocks_0 on exit statement -- does this mean we are not exiting as intended?

That means that we have the logged data not destroyed, which is what we want

move setup_test_info to class

superseded by D15585

modules/chronik-client/test/integration/block_and_blocks.ts
23 ↗(On Diff #45725)

ugh, if I do this for all the tests, then they work one at a time -- but not all together

> chronik-client@0.23.0 integration-tests
> mocha -j1 -r ts-node/register test/integration/*.ts

Starting test_runner for chronik-client_block_and_blocks
Test runner for chronik-client_block_and_blocks started
Starting test_runner for chronik-client_blockchain_info
Test runner for chronik-client_blockchain_info started
Starting test_runner for chronik-client_blocktxs_and_tx_and_rawtx
Test runner for chronik-client_blocktxs_and_tx_and_rawtx started
Starting test_runner for chronik-client_broadcast_txs
Test runner for chronik-client_broadcast_txs started
Test runner for chronik-client_blockchain_info completed with code 1