Page MenuHomePhabricator

more optimal virtualsize accounting with increments
Changes PlannedPublic

Authored by markblundeberg on Jan 13 2020, 05:08.

Details

Reviewers
deadalnix
Group Reviewers
Restricted Project
Summary

After D4906/D4903, it turns out that all places using GetTxVirtualSize() and
GetVirtualSizeWithDescendants() are better if given the incremental
virtual sizes that would be calculated including ancestor information as
a baseline. The increment can be smaller than if we ignored that baseline,
because of the nonlinearity of the VirtualSize function. This is a
distinction that was not relevant before now. By taking advantage of the
distinction we can get more optimal package handling in cases where the
transactions have highly variable sigops density.

For example, consider a case of three transactions, where txC spends
txB output which in turn spends txA output.

txA size=100, sigops=10, fee=10000
txB size=100, sigops=0,  fee=100
txC size=200, sigops=2,  fee=700

We can mine either {txA}, {txA, txB}, or {txA, txB, txC}. The
virtualsizes (50 bytes/sigop) and fees for these packages are:

500, 500, 600
10000, 10100, 10800

Note that including txB is 'free', because it is riding on a sigop-heavy
ancestor so the virtualsize is sigop-dominated. Including txC is also
partially free -- only its sigops count matters. The total feerates for
these packages (sat/vbyte) are:

20, 20.2, 18

So {txA, txB} is definitely the first choice for mining. But prior
to this patch, that wouldn't be chosen. Here are the changes in
ancestor scoring with this patch:

txB: 20.2 sat/vbyte   [was 1 sat/vbyte]
txA: 20   sat/vbyte   [was 20 sat/vbyte]
txC:  7   sat/vbyte   [was 3.5 sat/vbyte]

i.e. the mining will first choose txB (with txA), then later on will
choose txC.

Likewise, if the mempool is full then we can evict either {txC},
{txB, txC}, or {txA, txB, txC}. Again there is no reason to evict
txB before evicting txA, and the new descendant scoring will give:

txC:  7  sat/vbyte    [was 3.5 sat/vbyte]
txA: 20  sat/vbyte    [was 20 sat/vbyte]
txB: inf sat/vbyte    [was 2.6 sat/vbyte]

i.e., the mempool will first evict txC, then later evict txA (with txB).

Depends on D4906 and D4903

Test Plan

ninja check-all

Diff Detail

Repository
rABC Bitcoin ABC
Branch
virtualsize_incremental
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 9084
Build 16130: Default Diff Build & Tests
Build 16129: arc lint + arc unit

Event Timeline

This could be integrated into the mentioned Diffs, but I figured the mathematical concepts here are a bit weird and deserve a separate think.

(the other diffs are not badly wrong without this, just nonoptimal)

Or should it be called GetVirtualSizeIncrement instead of GetIncrementalVirtualSize ?

src/miner.cpp
72 ↗(On Diff #15382)

So far I follow. The rest of this patch seems very confusing to me and I don't understand what problem it solves.

src/txmempool.h
154 ↗(On Diff #15382)

Why not WithAncestor? This API seems confusing to me.

155 ↗(On Diff #15382)

Can you explain why this is wanted?

deadalnix requested changes to this revision.Jan 13 2020, 16:01

Ok after a few pass I now get it. The validation.cpp change is not good, IMO, the the rest is pretty solid but would need some polish.

src/txmempool.h
154 ↗(On Diff #15382)

Ok So I dug a bit more and I understand this better. I'm still convinced that this is not a good API.

What you effectively want is GetIncrementalVirtualSize and GetIncrementalVirtualSizeWithDescendants . The rest is not something you want to expose in the API, or it simply looks lieka mixed bag of thing ou expose or not withut really any logic behind it.

GetVirtualSizeWithAncestorsAndDescendants specifically has one call site, which is a good indicator that this is not something that is generally useful and is just adding indirections in the code for no good reasons. Just inline it.

GetVirtualSizeOnlyAncestors can be renamed to something less confusing, such as GetVirtualSizeOfAncestors, only kinda imply that there is more to it, like a more complete version that don't do only that.You can also make it private and declare it at the end of the class as to not mix it up with the API that is intended for consumption.

You would generally benefit from clearer grouping in the API. For instance, GetVirtualSizeOfAncestors needs to come immediately before/after GetVirtualSizeWithAncestors.

src/validation.cpp
689 ↗(On Diff #15382)

This is probably not what you want to use here. It creates a Parent pays for child situation which, while more common IRL than child pay for parent, make things fairly unpredictable for wallets.

This revision now requires changes to proceed.Jan 13 2020, 16:01
markblundeberg added inline comments.
src/validation.cpp
689 ↗(On Diff #15382)

Just to make sure we're on the same page, do you agree there is no change in behaviour here compared to before?

Indeed if we were to use the actual incremental size (with UpdateEntryForAncestors data) it would create a PPFC parent-pays-for-child situation.

I don't think that would hurt wallets since it would be strictly more permissive, and wallets ought not to rely on PPFC. I think it's just simpler leaving it like written, without PPFC.

deadalnix requested changes to this revision.Jan 19 2020, 16:51
deadalnix added inline comments.
src/validation.cpp
689 ↗(On Diff #15382)

It create a situation where a transaction can stop being acceptable in the mempool once a block is mined. This is a no go.

This is a problem for wallet as the cannot predict the fee they need to pay, as this fee can go UP after a block is mined - which they can't possibly know ahead of time and can even happen concurrently to the tx being broadcasted.

The virtual size of the tx must be used to compute minimum fee.

This revision now requires changes to proceed.Jan 19 2020, 16:51

rebase; tweak ATMP virtualsize computation (same result)

src/txmempool.cpp
69 ↗(On Diff #15700)

I spent a fair amount of time verifying that these actually compute what I'd expect. It doesn't look like there are unit tests specifically for these. Even though some test are indirectly using them, it would be beneficial to add a unittest that verifies that the value actually are what we expect.

Note that doing it so before this refactoring and then changing the test to adapt to this refactoring would provide a strong guarantee that it is correct.