
[Chronik] Optimize `GroupUtxoWriter` using merge ops and compaction filters
ClosedPublic

Authored by tobias_ruck on Feb 27 2024, 15:22.

Details

Summary

This is currently one of the slower parts of the indexer, but it's a natural fit for merge_cf: each merge operand simply carries a prefix that marks it as an insertion or a deletion.
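
To illustrate the idea, here is a minimal std-only sketch of such a merge function. The prefix bytes, the 4-byte entry size, the error handling and the name merge_utxos are all assumptions made for the example; they are not Chronik's actual encoding or code.

```rust
// Hypothetical encoding, NOT Chronik's actual format: each merge operand is
// one prefix byte followed by a fixed-size UTXO entry.
const PREFIX_INSERT: u8 = b'I';
const PREFIX_DELETE: u8 = b'D';
const ENTRY_LEN: usize = 4; // illustrative only; real entries would be larger

/// Fold merge operands into the existing value. The value is modelled as a
/// sorted concatenation of fixed-size entries.
fn merge_utxos(existing: Option<&[u8]>, operands: &[&[u8]]) -> Result<Vec<u8>, String> {
    // Split the existing value back into its entries.
    let mut entries: Vec<Vec<u8>> = existing
        .unwrap_or(&[])
        .chunks_exact(ENTRY_LEN)
        .map(|chunk| chunk.to_vec())
        .collect();
    for operand in operands {
        let (&prefix, body) = operand.split_first().ok_or("empty operand")?;
        if body.len() != ENTRY_LEN {
            return Err("bad entry length".into());
        }
        let entry = body.to_vec();
        // Keep entries sorted so lookups can use binary search.
        match (prefix, entries.binary_search(&entry)) {
            (PREFIX_INSERT, Err(idx)) => entries.insert(idx, entry),
            (PREFIX_DELETE, Ok(idx)) => {
                entries.remove(idx);
            }
            (PREFIX_INSERT, Ok(_)) => return Err("duplicate insert".into()),
            (PREFIX_DELETE, Err(_)) => return Err("delete of missing entry".into()),
            _ => return Err("unknown prefix".into()),
        }
    }
    // An empty entry list concatenates to the empty bytestring.
    Ok(entries.concat())
}
```

With this shape the writer never has to read the old value: it only queues small operands, and the database folds them into the stored value later.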

However, one issue is cleaning up the empty entries of scripts that are fully spent. This is possible using RocksDB's "compaction filters", which allow us to add an extra filter before "compaction" occurs, which is when key/value pairs in the DB are moved down the LSM tree.

We add such a compaction filter, which removes a member item's entry when its UTXO set is empty. However, an empty UTXO set is currently stored not as the empty bytestring but as the serialized value of an empty vector. To avoid having to deserialize the UTXO set again in the compaction filter, we change the serialization from db_serialize to db_serialize_vec, which gives us the empty bytestring for an empty vector and makes the compaction filter a very simple operation.
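
A rough sketch of why the serialization change matters, using only std. The two serializers below are illustrative stand-ins (a length-prefixed format versus plain concatenation) and not the real db_serialize / db_serialize_vec; the Decision enum and the function names are likewise assumptions for the example.

```rust
/// Stand-in for the keep/remove decision a compaction filter returns.
#[derive(Debug, PartialEq)]
enum Decision {
    Keep,
    Remove,
}

/// Illustrative length-prefixed encoding (toy: at most 255 entries).
/// An empty list still serializes to one byte, so "empty" cannot be
/// detected without decoding.
fn serialize_len_prefixed(entries: &[[u8; 4]]) -> Vec<u8> {
    let mut out = vec![entries.len() as u8];
    for entry in entries {
        out.extend_from_slice(entry);
    }
    out
}

/// Illustrative plain concatenation: an empty list serializes to the
/// empty bytestring.
fn serialize_concat(entries: &[[u8; 4]]) -> Vec<u8> {
    entries.concat()
}

/// With the concatenated encoding, the filter needs no deserialization:
/// an empty value means the UTXO set is empty and the entry can be dropped.
fn utxo_compaction_filter(_key: &[u8], value: &[u8]) -> Decision {
    if value.is_empty() {
        Decision::Remove
    } else {
        Decision::Keep
    }
}
```

Under the length-prefixed stand-in, the same filter would have to parse the value to find out that it holds zero entries, which is the cost the change avoids.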

This requires a new DB version, but since Chronik has already been released, we add an automatic upgrade mechanism. We can keep the upgrade code: once there's a version 12, we can upgrade from 10 to 11 if necessary and then from 11 to 12.
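
The chained upgrade could look roughly like the following sketch. The function names and the 11 to 12 step are hypothetical (only the 10 to 11 upgrade exists in this diff), and the real upgrade rewrites the stored UTXO values rather than just returning a version number.

```rust
/// Apply one upgrade step and return the version the DB is at afterwards.
/// Only 10 -> 11 exists in this diff; 11 -> 12 is a hypothetical future step.
fn upgrade_step(from: u64) -> Result<u64, String> {
    match from {
        10 => Ok(11), // the real step would rewrite the UTXO values here
        11 => Ok(12), // hypothetical future upgrade
        other => Err(format!("No upgrade path from version {}", other)),
    }
}

/// Run upgrade steps one after another until the DB version matches the
/// version this binary expects. Returns the (from, to) steps it performed.
fn upgrade_db(mut db_version: u64, chronik_version: u64) -> Result<Vec<(u64, u64)>, String> {
    if db_version > chronik_version {
        // The DB was written by a newer binary; downgrading is not supported.
        return Err(format!(
            "Chronik outdated: Chronik has version {}, but the database has version {}",
            chronik_version, db_version
        ));
    }
    let mut steps = Vec::new();
    while db_version < chronik_version {
        let next = upgrade_step(db_version)?;
        steps.push((db_version, next));
        db_version = next;
    }
    Ok(steps)
}
```

Because each step only knows how to go from one version to the next, adding a new version means adding one match arm, and older databases pick up all intermediate steps automatically.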

Benchmarks show a good speedup: syncing roughly the first 300000 blocks with only TxWriter, prepare_indexed_txs and ScriptUtxoWriter enabled, UTXO indexing gets faster by a factor of around 2.5:

| bench     | total time [s] | time utxos only [s] |
| master    | 6881.50        | 3202.27             |
| no utxos  | 3679.23        | 0                   |
| this diff | 4928.56        | 1249.33             |
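
For reference, the "around 2.5" figure can be reproduced from the table. The sketch below assumes the "utxos only" column is the total time minus the "no utxos" baseline; the summary does not say so explicitly, but the numbers are consistent with it.

```rust
/// Benchmark totals in seconds, copied from the table above.
const MASTER_TOTAL: f64 = 6881.50;
const NO_UTXOS_TOTAL: f64 = 3679.23;
const THIS_DIFF_TOTAL: f64 = 4928.56;

/// Time attributable to UTXO indexing, ASSUMING it is the total minus the
/// "no utxos" baseline (an inference from the numbers).
fn utxo_time(total: f64) -> f64 {
    total - NO_UTXOS_TOTAL
}

/// Speedup of UTXO indexing alone: master vs. this diff.
fn utxo_speedup() -> f64 {
    utxo_time(MASTER_TOTAL) / utxo_time(THIS_DIFF_TOTAL)
}
```
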
Test Plan

Test indexing

ninja check-functional

Database upgrade

  1. Build bitcoind on master
  2. Run it for a few blocks (e.g. 100000), with -chronik
  3. Stop the node
  4. Build bitcoind on this diff
  5. Running the node on the same datadir will now upgrade the database and then continue syncing normally:
2024-02-27T14:11:10Z Chronik has version 11, DB has version 10
2024-02-27T14:11:10Z Upgrading Chronik DB from version 10 to 11...
2024-02-27T14:11:10Z Upgrading Chronik UTXO set for script_utxo. Do not kill the process during upgrade, it will corrupt the database.
2024-02-27T14:11:10Z Upgraded 0 of 9428 (estimated)
2024-02-27T14:11:10Z Upgraded 10000 of 9428 (estimated)
2024-02-27T14:11:10Z Upgraded 20000 of 9428 (estimated)
2024-02-27T14:11:10Z Upgraded 30000 of 9428 (estimated)
2024-02-27T14:11:10Z Upgraded 40000 of 9428 (estimated)
2024-02-27T14:11:10Z Upgraded 50000 of 9428 (estimated)
2024-02-27T14:11:10Z Upgrade for script_utxo complete
2024-02-27T14:11:10Z Upgrading Chronik UTXO set for token_id_utxo. Do not kill the process during upgrade, it will corrupt the database.
2024-02-27T14:11:10Z Upgrade for token_id_utxo complete
2024-02-27T14:11:10Z Successfully upgraded Chronik DB from version 10 to 11.

Note that the UTXO set size is based on what RocksDB estimates for us, and may be off (as in my example).
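
The progress output above is consistent with logging once every 10000 entries against a fixed estimate. A minimal sketch of that pattern follows; the function name, the constant, and the entry count of 55000 in the usage note are assumptions, since the real count is not shown in the log.

```rust
/// Logging interval; matches the spacing of the log lines above.
const LOG_INTERVAL: usize = 10_000;

/// Build the progress lines for an upgrade over `num_entries` entries,
/// given the (possibly wrong) size estimate reported by the database.
fn progress_lines(num_entries: usize, estimated: usize) -> Vec<String> {
    (0..num_entries)
        .step_by(LOG_INTERVAL)
        .map(|idx| format!("Upgraded {} of {} (estimated)", idx, estimated))
        .collect()
}
```

For example, progress_lines(55_000, 9428) yields six lines, from "Upgraded 0 of 9428 (estimated)" to "Upgraded 50000 of 9428 (estimated)". The number of lines depends only on the real entry count, so a bad estimate affects the message text but not how much is logged.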

Outdated database

Running the master version again now gives us the expected error: Chronik outdated: Chronik has version 10, but the database has version 11. Upgrade your node to the appropriate version

Diff Detail

Repository
rABC Bitcoin ABC
Branch
chronik-optimize-utxos-merge
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 27468
Build 54499: Build Diff (build-chronik · build-chronik-plugins · chronik-client-integration-tests)
Build 54498: arc lint + arc unit

Event Timeline

tobias_ruck edited the test plan for this revision.
tobias_ruck added inline comments.
chronik/chronik-db/src/io/group_utxos.rs
339

This may be quite spammy, should we set it to 100k or even 1M?

Fabien added inline comments.
chronik/chronik-db/src/io/group_utxos.rs
339

I think it's fine, less than 100 lines is fine

This revision is now accepted and ready to land. Feb 28 2024, 08:48
chronik/chronik-db/src/io/group_utxos.rs
339

If there are 100000000 scripts in the DB, it'll be 10000 lines

chronik/chronik-db/src/io/group_utxos.rs
339

Stupid me I was counting blocks...
Let's keep the value, it's once for an upgrade and if it's really too spammy we can always adjust later