HomePhabricator

[avalanche] Fix a lock order inversion

Description

[avalanche] Fix a lock order inversion

Summary:
And reduce the locking/unlocking of the peer manager in the processor event loop.
There is a potential deadlock due to a lock order inversion:
cs_peerManager -> cs_main -> cs_vNodes -> cs_peerManager

  • The processor holds cs_peerManager when the tip is updated
  • The peer manager proof registration holds cs_main to check the proof validity (cs_peerManager -> cs_main)
  • When a new block is mined, the NewPoWValidBlock holds cs_main and takes cs_vNodes to loop over all the peers (cs_main -> cs_vNodes)
  • The processor event loop holds cs_vNodes (via connman->ForNode()) then takes the peer manager lock (cs_vNodes ->cs_peerManager)

This diff fixes the potential deadlock by consistently locking cs_peerManager before cs_vNodes. This has the added benefit to reduce (by 3x)) the number of peermanager lock/unlock in the event loop.

Test Plan:
Build with TSAN enabled, then:

for i in {1..100}; do TSAN_OPTIONS="second_deadlock_stack=1:suppressions=<path/to>/bitcoin-abc/test/sanitizer_suppressions/tsan" ./test/functional/test_runner.py abc_rpc_getavalancheinfo.py || break; done

Before this patch the test will fail about 50% of the time.

Reviewers: #bitcoin_abc, sdulfari

Reviewed By: #bitcoin_abc, sdulfari

Differential Revision: https://reviews.bitcoinabc.org/D11735

Details

Provenance
FabienAuthored on Jul 11 2022, 13:41
FabienPushed on Jul 11 2022, 18:50
Reviewer
Restricted Project
Differential Revision
D11735: [avalanche] Fix a lock order inversion
Parents
rABC8374ab043f5c: [Chronik] Remove `dead_code`, `elided_lifetimes_in_paths` and…
Branches
Unknown
Tags
Unknown