Page MenuHomePhabricator

add sigChecks value to script cache
Needs RevisionPublic

Authored by markblundeberg on Sun, Jan 5, 02:54.


Group Reviewers
Restricted Project
Maniphest Tasks
T704: sigChecks implementation

Currently the script cache is purely setlike, however for sigChecks we will need
to remember a transaction's sigChecks count in cache for a few reasons:

  • enforcement of a block-level limit means that ConnectBlock needs this info, and since it relies on script cache for speedup, the data must be available in cache.
  • when resurrecting txns back into the mempool (after a reorg), we use the script cache but again we need to get the sigchecks count back once again, in order to save it in the mempool structures. (block template construction needs sigchecks count, to satisfy the block level consensus rule)

This turns the script cache into a map with a slightly shorter key (256 -> 240 bits)
and a small value (16 bits) attached for tracking sigchecks.

The stored sigchecks value range of 0-65535 is more than enough for our purposes,
since after sigchecks activation, we will have a standardness rule (the per-input
sigchecks limit in D4617) that ultimately limits density of sigchecks, and since
there is a maximum standard txn size, the most sigchecks possible in a standard
transaction will be ~3030 (~30300 on testnet). Nonstandard transactions can
exceed this greatly (theoretically up to 5 million sigchecks), but we don't
care about them here since they never get inserted into the cache anyway.

(this uses the new cuckoocache map feature from D4621, D4643, D4641 )

Test Plan

ninja check

Diff Detail

rABC Bitcoin ABC
Lint OK
No Unit Test Coverage
Build Status
Buildable 9047
Build 16056: Bitcoin ABC Buildbot
Build 16055: arc lint + arc unit

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
markblundeberg added inline comments.Wed, Jan 8, 07:46
44 ↗(On Diff #15155)

Hmm, I interpret it as an API requirement of cuckoocache :-D

(anyway this struct is purely local to this cpp so I'm not bothered by it).

49 ↗(On Diff #15155)

Hmmm ok.... So, what is the advantage of having a 32 byte element rather than 34 byte?

1301 ↗(On Diff #15155)

Yeah I know what you mean, but in the long term this seemingly bad value is actually what I intend to have in the cache for the preactivation flagset.

Reason: before activation, someone can make a monster standard but slow tx with 100000 signature checks, and we definitely want that guy to be cached. So for the preactivation flags all tx including the monster will be cached with sigchecks=0. For postactivation flags they will have accurate sigchecks but that monster tx will be rejected anyway by the standard input-sigcheck limiter from D4617, so it won't even get to this point.

markblundeberg marked 7 inline comments as done.


deadalnix requested changes to this revision.Wed, Jan 8, 16:53
deadalnix added inline comments.
36 ↗(On Diff #15216)

You get so many assertions because you don't use proper types.

71 ↗(On Diff #15216)

You can multiply the result by a constant such as 11400714819323198486 ( 2^64 / phi ) and take the highest 32bits of the result so you get a good mix for cheap.

106 ↗(On Diff #15216)

Out of scope of this patch, but this API is actually really bad. Finalize should return an uint256.

119 ↗(On Diff #15216)

I'm not sure we want to ensure any value for nSigChecksRet when the value isn't in the cache, so you can simply remove that branch. Less branches = better code.

128 ↗(On Diff #15216)

This check is redundant if you use a uint16_t

54 ↗(On Diff #15216)

It seems like you should be using a uint16_t in the API. Alternatively, storing 32bits in the cache also is an option, though I'm not convinced we shoudl ever supports txns with more than 65k sigops :)

480 ↗(On Diff #15216)

If you really are going that road, I would advise to check for the boundary conditions as SCRIPT_CACHE_MAX_SIGCHECKS and SCRIPT_CACHE_MAX_SIGCHECKS + 1 in addition of a few randomly picked values. This is better than magic constant as it makes it clear what the code is actually doing (I wrote another comment before figuring out I was wrong and write this one).

Magic constant in the code are almost always a bad idea.

484 ↗(On Diff #15216)

You'd benefit encapsulating each of these tests and make the variable local. This code is more brittle than it needs to be.

Using a new freshly initialized int every time ensures that no spooky action at a distance is possible.

Maybe you do not want a new one for every test here, but you probably don't want to use the same one as line 465 or line 496.

497 ↗(On Diff #15216)

Is this the behavior we want? I did not ask myself the question, but it seems like a gotcha, no? Maybe we are just completely fucked if it ever comes to this anyways.

498 ↗(On Diff #15216)

You'll note that due to the call line 491, this may not be testing what you think it is. You can argue it's obvious from that patch and I'd agree, but the point is that it's fragile, as this may not remain obvious or even true going forward.

For instance a simple reordering of the previous test would look harmless when reviewed, yet may cause this test to become 100% moot.

This revision now requires changes to proceed.Wed, Jan 8, 16:53
deadalnix added inline comments.Wed, Jan 8, 17:17
49 ↗(On Diff #15155)

You can have way more of them in a given amount of memory. The entropy we have in there is actually 32bytes, so we want to make sure we don't use more.

110 ↗(On Diff #15155)

Regular assert are fine. size() returns a constant so the compiler will be able to remove all of this anyways.

130 ↗(On Diff #15155)

I see. They definitely have a point, but it seems like putting bubble wrap around the atrocity they have created rather than anything else :) Some of the argument they use are a bit bogus, as int also wrap around, except in completely unpredictable ways. We should probably get to a point where we use -Wconversion, but the codebase clearly ain't ready.

What's funny is the compiler will 100% transform this into unsigned(nSigChecks) >= SCRIPT_CACHE_MAX_SIGCHECKS

Another thing you want to consider is that the sigops count in metrics is public, which causes anyone to be able to play with it in way you can't predict. But in practice it is only incremented and no complex math operations are done on it. Even it it came to be the case, you could convert to an int then to do the computations.

void registerSigOps(uint16_t count = 1) {
    uint16_t oldCount = nSigCheck;
    nSigCheck += count;
    if (nSigCheck < oldCount) {
        throw std::range_error("sigchecks out of range");

And voila.

19 ↗(On Diff #15155)

Very good call.

1301 ↗(On Diff #15155)

Even then, why not put the right value in there? I get that it doesn't need to be put in there in this very patch, but it going to be computed anyway.

markblundeberg marked an inline comment as done.Thu, Jan 9, 00:29
markblundeberg added inline comments.
130 ↗(On Diff #15155)

No such transformation happens -- both sides are int.

For the sigops thing being struct-public, I don't think it matters much since the whole cache implementation details are local to this CPP file. Callers don't even have a way of determining a pointer to any Element object/member since the cache is static and nothing returns references.

markblundeberg marked an inline comment as done.Thu, Jan 9, 00:44
markblundeberg added inline comments.
1301 ↗(On Diff #15155)

Hmm how do you mean -- what's the right value?

markblundeberg added a comment.EditedSun, Jan 12, 06:45

Hmm weird I just noticed a bunch of these comments now (the first block) ... is it possible they didn't show up before? Anyway... back to work...

markblundeberg marked 9 inline comments as done.Mon, Jan 13, 07:55
markblundeberg added inline comments.
71 ↗(On Diff #15216)

That would actually end up throwing away one of the 'good' bits (the highest bit in u, 2**63) since 11400714819323198486 * 2**63 is a multiple of 2**64.

106 ↗(On Diff #15216)

I dunno, all that stuff under crypto/ (where CSHA256 lives) seems to be pretty independent of the bitcoin constructs. You could perhaps add a wrapper under hash.h.

128 ↗(On Diff #15216)

Yeah to be clear, I don't want to expose the internal storage format via a type here, for a few reasons:

  • Don't want silent conversion happening in caller when squishing down to a small type, rather I really want the exception.
  • The next thing we add to script cache might be a bool, in which case it would be totally legitimate to use a bitfield, trimming the ScriptCacheElement's sigchecks field be only 15 bits long. I definitely want caller to expect that SCRIPT_CACHE_MAX_SIGCHECKS may have any value, not just a type limit.

Alternately we can make AddKeyInScriptCache just return a bool false if it's unable to store the requested value, which frees the caller from having to perform range checks.

497 ↗(On Diff #15216)

Yeah that was my thinking, we'd be screwed either way. :D

markblundeberg marked 4 inline comments as done.
markblundeberg edited the summary of this revision. (Show Details)

update tests

I am thinking of making two changes here:

  1. switch AddKeyInScriptCache to return bool regarding insertion success, instead of exception.
  2. instead of passing SigChecks int, pass a ScriptExecutionMetrics.
deadalnix requested changes to this revision.Mon, Jan 13, 14:50

I think 2 is an excellent idea!

1 not so much, because people forget to check return codes, and this is not a path you are expected to hit in the code (one could say that it is an path you only exceptionally take).

45 ↗(On Diff #15386)

format \o/

I won't block for this if this is isolated, but make sure you pay attention to this kind of details as to avoid needless rounds of review.

This revision now requires changes to proceed.Mon, Jan 13, 14:50
markblundeberg updated this revision to Diff 15433.EditedTue, Jan 14, 08:34

big overhaul to use ScriptExecutionMetrics

Note that if we ever add more values in the future, we might start drawing the distinction between cached and uncached values of ScriptExecutionMetrics.

  • cached values: important block-level metrics for consensus, or perhaps some info that we really need to know in AcceptToMemoryPoolWorker.
  • uncached values: such values would be only examined in VerifyScript (input-level check) or in CheckInputs (tx-level check, standardness only) on a pass/fail basis. They would never be able to reliably get out of CheckInputs since sometimes it will only return what it knows from cache.
deadalnix requested changes to this revision.Wed, Jan 15, 14:18

So this is mostly in good shape, but ScriptExecutionMetrics has a bit of an identity crisis which causes a worst of both world situation. I would recommand giving up on trying to make it a POD, make its innards private, manipulate them via member function that ensure consistency and remove a bunch of error conditions down the road.

20 ↗(On Diff #15433)

I don't understand what's the prupose of this considering the struct is a pod. Or is the constructor make ti a non pod?

In any are, you got to chose if you go the pod road or want to encapsulate. I'd go for the former as this would allow you to detect errors at the point where they are "created" (an ScriptExecutionMetrics with an invalid state was created) rather than at the point the error is actually bad (the ScriptExecutionMetrics is tentatively cached, but cannot).

109 ↗(On Diff #15433)

Why are the first few bytes removed as opposed to the last?

std::copy(hash.begin(), hash.begin() + data.size(), data.begin());

This seems easier, no?

56 ↗(On Diff #15433)

Is this intended to be part of the public API?

441 ↗(On Diff #15433)

xor the value with the expected value, so you know it even works if you pass in 0x12345678

474 ↗(On Diff #15433)

You can skip this part of the test by committing to a pod

509 ↗(On Diff #15433)

You can skip this whole part of the test and lot of complication in the implementation by committing to a non pod.

1309 ↗(On Diff #15433)

You can remove all this by committing to a non pod.

This revision now requires changes to proceed.Wed, Jan 15, 14:18
markblundeberg added inline comments.Thu, Jan 16, 01:13
20 ↗(On Diff #15433)

Yes I prefer the former (PoD) as well.

The main reason I'm adding these members is so that code using ScriptExecutionMetrics does not start silently being buggy, if it turns out that we need to add another value:

  • Unlike struct initializer list (where you can leave off arguments), using the constructor forces you to provide the correct number of arguments, i.e., if we add another value to this struct then all the callsites creating ScriptExecutionMetrics with specific values will turn into compile errors until they provide the new value.
  • Likewise operator== makes sure that you're doing a proper comparison of all values; operator+= ensures that all metrics are being accumulated.

Besides that, I want access via PoD semantics.

markblundeberg added inline comments.Thu, Jan 16, 01:14
109 ↗(On Diff #15433)

Heh yeah, that's leftover behaviour from replicating the example in cuckoocache test :D

markblundeberg added inline comments.Thu, Jan 16, 01:19
1309 ↗(On Diff #15433)


markblundeberg added inline comments.Thu, Jan 16, 01:24
56 ↗(On Diff #15433)

Yes. See where it is used in this diff.

474 ↗(On Diff #15433)

?? I'm not sure how ScriptCacheKey's pod/nonpodness makes a difference. I think it's good to check the keys are distinct and that they are self-equal.

markblundeberg marked 6 inline comments as done.Sat, Jan 18, 04:48
markblundeberg added inline comments.
20 ↗(On Diff #15433)

Just to add to this, I mean I want neither a strict PoD type (bare struct) nor a fully encapsulated class. Is there no middle ground allowed here?

509 ↗(On Diff #15433)

Not sure what you mean...

rebase ; comments ; tweak to not use uint256

deadalnix requested changes to this revision.Sat, Jan 18, 23:51

From what I gather, it is fairly obvious that you actually don't want a but are trying to make things work as a pod. The problem when you distribute responsibilities is that you always end up with something inconsistent down the road or a huge pile of spaghetti because various pieces of code needs to be kept in sync.

20 ↗(On Diff #15433)

If you are concerned about semantic breaking when you change what's in it, it is that you don't want a POD. I don't think a middle ground makes any sense. Ether the struct itself is responsible for it's innards, or the code that uses it is. It cannot be both.

56 ↗(On Diff #15433)

It seems like this is exclusively used in the test, and if value mismatch in the test, the test will break anyways, so I'm not sure. DO youe xpect your user to be aware of this value and check against it? If so, then why SCRIPT_CACHE_MAX_SIGCHECKS ?

Who's responsibility is it to check?

509 ↗(On Diff #15433)

That is the metric class is not a pod, it can ensure that the value can always be cached, removing a whole bunch of error conditions. all over the place.

1309 ↗(On Diff #15433)

By having it ensure the value it has are in the proper range.

This revision now requires changes to proceed.Sat, Jan 18, 23:51
markblundeberg added inline comments.Sun, Jan 19, 00:43
509 ↗(On Diff #15433)

As discussed in the other diff D4941, the metric class in some edge cases may need to support values > 65535, at least up to 5 million and more if the script opcode count limit is increased in future. So we can either:

  • Use more bytes in cache, and remove error conditions.
  • Keep the error conditions and maintain a distinction between the general purpose usage/arithmetic type and the (space-efficient) private cache storage type.

I think the first option is a bit sad since we are trying to be economical in the cache and those bytes will always be 0 in regular operation, at least with current rules.

deadalnix added inline comments.Sun, Jan 19, 14:01
509 ↗(On Diff #15433)

The cache being of this size assumes that this is sufficient to store all the sigops of one transaction. You can have an accumulator to accumulate across transactions that use larger types internally.

Let just use an int in the cache and not a ScriptExecutionMetric in the cache.