Page MenuHomePhabricator

[avalanche] Fix a lock order inversion
ClosedPublic

Authored by Fabien on Jul 11 2022, 13:48.

Details

Reviewers
sdulfari
Group Reviewers
Restricted Project
Commits
rABCf02d22dc206a: [avalanche] Fix a lock order inversion
Summary

And reduce the locking/unlocking of the peer manager in the processor event loop.
There is a potential deadlock due to a lock order inversion:
cs_peerManager -> cs_main -> cs_vNodes -> cs_peerManager

  • The processor holds cs_peerManager when the tip is updated
  • The peer manager proof registration holds cs_main to check the proof validity (cs_peerManager -> cs_main)
  • When a new block is mined, the NewPoWValidBlock holds cs_main and takes cs_vNodes to loop over all the peers (cs_main -> cs_vNodes)
  • The processor event loop holds cs_vNodes (via connman->ForNode()) then takes the peer manager lock (cs_vNodes ->cs_peerManager)

This diff fixes the potential deadlock by consistently locking cs_peerManager before cs_vNodes. This has the added benefit to reduce (by 3x)) the number of peermanager lock/unlock in the event loop.

Test Plan

Build with TSAN enabled, then:

for i in {1..100}; do TSAN_OPTIONS="second_deadlock_stack=1:suppressions=<path/to>/bitcoin-abc/test/sanitizer_suppressions/tsan" ./test/functional/test_runner.py abc_rpc_getavalancheinfo.py || break; done

Before this patch the test will fail about 50% of the time.

Diff Detail

Repository
rABC Bitcoin ABC
Lint
Lint Not Applicable
Unit
Tests Not Applicable