Page MenuHomePhabricator

[avalanche] Fix multiple flakiness issues in abc_p2p_avalanche_proof_voting
AbandonedPublic

Authored by sdulfari on Jul 25 2022, 23:59.

Details

Reviewers
Fabien
Group Reviewers
Restricted Project
Summary

This piece of code faces the following issues/challenges:

  1. The expected debug log line may be emitted while wait_until times out. Successive retries miss the log line because assert_debug_log does not look into the past.
  2. lambda: not self.can_find_proof_in_poll(...) is not expected to return immediately once the proof has finalized/stalled, making #1 worse.
  3. The retry loop may retry for reasons unrelated to its intended purpose.
  4. In the stalled proof check, the timeout is insufficient for TSAN.

This patch fixes all of these issues by replacing the complex retry loop with a dumb for loop that resides inside of the assert_debug_context. This ensures that the log will always be captured. The tradeoff is multiple calls to can_find_proof_in_poll, which gives a small hit to test timing (a few seconds in total).

Test Plan
for I in {0..10} ; do ./test/functional/test_runner.py abc_p2p_avalanche_proof_voting ; done

Diff Detail

Repository
rABC Bitcoin ABC
Branch
apv-flaky-poll-responses
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 19721
Build 39161: Build Diffbuild-diff · build-without-wallet · build-debug · build-clang-tidy · build-clang
Build 39160: arc lint + arc unit

Event Timeline

Fabien requested changes to this revision.Jul 26 2022, 09:31
Fabien added a subscriber: Fabien.
Fabien added inline comments.
test/functional/abc_p2p_avalanche_proof_voting.py
92

That doesn't solve the issue, just make it less likely to occur. We need a better way to achieve this, and we just added a finalized proof count to the RPC so better use that instead so we get rid of the log check entirely.

This revision now requires changes to proceed.Jul 26 2022, 09:31

You are right. This is the better solution: D11802