Page MenuHomePhabricator

Rework receive buffer pushback
ClosedPublic

Authored by PiRK on Thu, Nov 14, 15:23.

Details

Reviewers
Fabien
Group Reviewers
Restricted Project
Commits
rABC94d6cf737131: Rework receive buffer pushback
Summary

It has been observed that if both sides of a P2P connection have a significant amount of data to send, a stall can occur, where both try to drain their own send queue before trying to receive. The same issue seems to apply to the current Bitcoin Core codebase, though I don't know whether it's a frequent issue for us.

The core issue is that whenever our optimistic send fails to fully send a message, we do subsequently not even select() for receiving; if it then turns out that sending is not possible either, no progress is made at all. To address this, the solution used in this PR is to still select() for both sending and receiving when an optimistic send fails, but skip receiving if sending succeeded, and (still) doesn't fully drain the send queue.

This is a significant reduction in how aggressive the "receive pushback" mechanism is, because now it will only mildly push back while sending progress is made; if the other side stops receiving entirely, the pushback disappears. I don't think that's a serious problem though:

  • We still have a pushback mechanism at the application buffer level (when the application receive buffer overflows, receiving is paused until messages in the buffer get processed; waiting on our own net_processing thread, not on the remote party).
  • There are cases where the existing mechanism is too aggressive; e.g. when the send queue is non-empty, but tiny, and can be sent with a single send() call. In that case, I think we'd prefer to still receive within the same processing loop of the network thread.

Co-authored-by: Anthony Towns <aj@erisian.com.au>

core#28287:

rpc: add test-only sendmsgtopeer rpc

This rpc can be used when we want a node to send a message, but
cannot use a python P2P object, for example for testing of low-level
net transport behavior.

test: add basic tests for sendmsgtopeer to rpc_net.py

test: add functional test for deadlock situation

This is a backport of core#27981 and core#28287

Test Plan

ninja all check-all

Test that this passes now and the same functional test sometimes fails before this commit (>33% failure rate on my machine)
for i in {1..100}; do ~/dev/bitcoin-abc/build/test/functional/test_runner.py p2p_net_deadlock; done

Event Timeline

Tail of the build log:

[391/575] Building CXX object src/CMakeFiles/bitcoin-tx.dir/bitcoin-tx.cpp.o
[392/575] Linking CXX executable src/bitcoin-tx
[393/575] Building CXX object src/test/CMakeFiles/testutil.dir/util/blockindex.cpp.o
[394/575] Building CXX object src/CMakeFiles/bitcoin-wallet.dir/bitcoin-wallet.cpp.o
[395/575] Building CXX object src/test/CMakeFiles/testutil.dir/util/logging.cpp.o
[396/575] Building CXX object src/test/CMakeFiles/testutil.dir/util/coins.cpp.o
[397/575] Building CXX object src/test/CMakeFiles/testutil.dir/util/str.cpp.o
[398/575] Building CXX object src/test/CMakeFiles/testutil.dir/util/random.cpp.o
[399/575] Building CXX object src/test/CMakeFiles/testutil.dir/util/blockfilter.cpp.o
[400/575] Building CXX object src/test/CMakeFiles/testutil.dir/util/transaction_utils.cpp.o
[401/575] Building CXX object src/CMakeFiles/server.dir/rpc/net.cpp.o
[402/575] Building CXX object src/test/CMakeFiles/testutil.dir/util/txmempool.cpp.o
[403/575] Building CXX object src/CMakeFiles/server.dir/rpc/mining.cpp.o
[404/575] Building CXX object src/test/CMakeFiles/testutil.dir/util/validation.cpp.o
[405/575] Building CXX object src/wallet/CMakeFiles/wallet.dir/context.cpp.o
[406/575] Building CXX object src/CMakeFiles/server.dir/torcontrol.cpp.o
[407/575] Building CXX object src/CMakeFiles/server.dir/net_processing.cpp.o
[408/575] Building CXX object src/wallet/CMakeFiles/wallet.dir/coinselection.cpp.o
[409/575] Building CXX object src/CMakeFiles/server.dir/rpc/avalanche.cpp.o
[410/575] Building CXX object src/wallet/CMakeFiles/wallet.dir/bdb.cpp.o
[411/575] Building CXX object src/wallet/CMakeFiles/wallet.dir/coincontrol.cpp.o
[412/575] Building CXX object src/wallet/CMakeFiles/wallet.dir/crypter.cpp.o
[413/575] Building CXX object src/test/CMakeFiles/testutil.dir/util/wallet.cpp.o
[414/575] Building CXX object src/wallet/CMakeFiles/wallet.dir/db.cpp.o
[415/575] Building CXX object src/CMakeFiles/server.dir/wallet/init.cpp.o
[416/575] Building CXX object src/test/CMakeFiles/testutil.dir/util/mining.cpp.o
[417/575] Building CXX object src/test/CMakeFiles/testutil.dir/util/net.cpp.o
[418/575] Building CXX object src/wallet/CMakeFiles/wallet.dir/fees.cpp.o
[419/575] Building CXX object src/CMakeFiles/server.dir/rpc/blockchain.cpp.o
[420/575] Building CXX object src/wallet/CMakeFiles/wallet.dir/receive.cpp.o
[421/575] Building CXX object src/wallet/CMakeFiles/wallet.dir/load.cpp.o
[422/575] Building CXX object src/wallet/CMakeFiles/wallet.dir/rpc/signmessage.cpp.o
[423/575] Building CXX object src/wallet/CMakeFiles/wallet.dir/rpc/encrypt.cpp.o
[424/575] Building CXX object src/wallet/CMakeFiles/wallet.dir/interfaces.cpp.o
[425/575] Building CXX object src/wallet/CMakeFiles/wallet.dir/rpc/util.cpp.o
[426/575] Building CXX object src/wallet/CMakeFiles/wallet.dir/rpc/backup.cpp.o
[427/575] Building CXX object src/wallet/CMakeFiles/wallet.dir/salvage.cpp.o
[428/575] Building CXX object src/wallet/CMakeFiles/wallet.dir/sqlite.cpp.o
[429/575] Building CXX object src/test/CMakeFiles/testutil.dir/util/setup_common.cpp.o
[430/575] Building CXX object src/wallet/CMakeFiles/wallet.dir/scriptpubkeyman.cpp.o
[431/575] Building CXX object src/seeder/CMakeFiles/seeder-base.dir/dns.cpp.o
[432/575] Building CXX object src/wallet/CMakeFiles/wallet.dir/transaction.cpp.o
[433/575] Building CXX object src/wallet/CMakeFiles/wallet.dir/rpcwallet.cpp.o
[434/575] Building CXX object src/wallet/CMakeFiles/wallet.dir/walletutil.cpp.o
[435/575] Building CXX object src/iguana/CMakeFiles/iguana.dir/iguana_formatter.cpp.o
[436/575] Building CXX object src/seeder/CMakeFiles/seeder-base.dir/options.cpp.o
[437/575] Building CXX object src/iguana/CMakeFiles/iguana.dir/iguana_interpreter.cpp.o
[438/575] Building CXX object src/iguana/CMakeFiles/iguana.dir/iguana.cpp.o
[439/575] Linking CXX executable src/iguana/iguana
[440/575] Building CXX object src/wallet/CMakeFiles/wallet-tool.dir/wallettool.cpp.o
[441/575] Building CXX object src/wallet/CMakeFiles/wallet.dir/walletdb.cpp.o
[442/575] Building CXX object src/wallet/CMakeFiles/wallet.dir/spend.cpp.o
[443/575] Building CXX object src/seeder/CMakeFiles/seeder-base.dir/db.cpp.o
[444/575] Building CXX object src/wallet/CMakeFiles/wallet.dir/wallet.cpp.o
[445/575] Linking CXX static library src/wallet/libwallet.a
[446/575] Building CXX object src/CMakeFiles/server.dir/validation.cpp.o
[447/575] Building CXX object src/seeder/CMakeFiles/bitcoin-seeder.dir/main.cpp.o
[448/575] Building CXX object src/seeder/CMakeFiles/seeder-base.dir/bitcoin.cpp.o
ninja: build stopped: cannot make progress due to previous errors.
Build build-clang-tidy failed with exit code 1
PiRK published this revision for review.Mon, Nov 18, 11:16
Fabien requested changes to this revision.Mon, Nov 18, 19:49
Fabien added a subscriber: Fabien.

Can you please squash D17139 ? I don't think it makes sense to have the bugfix and the test be separate diffs.

This revision now requires changes to proceed.Mon, Nov 18, 19:49
PiRK edited the summary of this revision. (Show Details)
PiRK edited the test plan for this revision. (Show Details)

squash with tests

Fabien added inline comments.
test/functional/p2p_net_deadlock.py
1

This is the first time I see this pattern... It will likely not play nicely with out update tool

This revision is now accepted and ready to land.Tue, Nov 19, 14:21
This revision was automatically updated to reflect the committed changes.