Page MenuHomePhabricator

[avalanche] Fix a data race where updating availability scores can coincide with initialization of the state container
AbandonedPublic

Authored by sdulfari on Oct 20 2022, 20:04.

Details

Reviewers
Fabien
Group Reviewers
Restricted Project
Summary

There is a data race where UpdateAvalancheStatistics() may be called while
receiving an AVAHELLO message, where m_avalanche_state is initialized.
A similar race exists if addavalanchenode is called during
UpdateAvalancheStatistics().

This patch fixes the race and also fixes the design flaw that enabled it. The
design flaw resulted in:

  • No lock around the m_avalanche_state structure since it resided within the structure itself.
  • Initialization of m_avalanche_state on receipt of AVAHELLO instead of first use (attempt to poll the peer for the first time), resulting in unused memory initialized for poll-only peers.
  • Unclear intent of m_avalanche_state since it was also used to determine whether a peer had avalanche enabled.

With this patch, the intent is now more clear and the state is only initialized
for peers where it will be used.

Depends on D12325

Test Plan
ninja check check-functional

Diff Detail

Event Timeline

Fabien requested changes to this revision.Oct 21 2022, 08:30
Fabien added a subscriber: Fabien.

I think a better approach is to remove the state altogether, or make it a POD.

  • keep the cs_statistics with a nice comment explaining how it's used only for logical block locks over several variables
  • update GetAvalailabilityScore() to return 0 if m_avalanche_enabled is false (or use an optional, but not sure it's better in this case)

This was not possible before and will greatly simplify the code.

src/avalanche/processor.cpp
818

That's trivially wrong, you access it above

test/functional/abc_p2p_getavaaddr.py
378

Why ?

This revision now requires changes to proceed.Oct 21 2022, 08:30

Redid this patch per feedback. New one is here: D12355