This is currently one of the slower parts of the indexer, but it is a natural fit for merge_cf: each merge operand simply carries a prefix that marks it as an insertion or a deletion.
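The prefix idea can be sketched without a database. This is only an illustration, not the code in this diff: the prefix bytes `PREFIX_INSERT` and `PREFIX_DELETE`, the 4-byte `Entry` type, and the function `merge_utxos` are all hypothetical names, and the function only mimics the shape of a full-merge callback (existing value plus an ordered list of operands).

```rust
use std::convert::TryFrom;

// Hypothetical operand prefixes; the real byte values are an assumption.
const PREFIX_INSERT: u8 = b'I';
const PREFIX_DELETE: u8 = b'D';

// Hypothetical fixed-size stand-in for a serialized UTXO entry.
type Entry = [u8; 4];

/// Applies the merge operands, in order, to the existing entries.
/// `existing` is assumed to be sorted; the result is kept sorted.
fn merge_utxos(existing: Option<&[Entry]>, operands: &[Vec<u8>]) -> Result<Vec<Entry>, String> {
    let mut utxos: Vec<Entry> = existing.map(|e| e.to_vec()).unwrap_or_default();
    for operand in operands {
        // The first byte says what to do, the rest is the entry itself.
        let (&prefix, payload) = operand.split_first().ok_or("empty operand")?;
        let entry = Entry::try_from(payload).map_err(|_| "bad payload length".to_string())?;
        match prefix {
            PREFIX_INSERT => {
                // Insert only if the entry is not present yet.
                if let Err(idx) = utxos.binary_search(&entry) {
                    utxos.insert(idx, entry);
                }
            }
            PREFIX_DELETE => {
                // Delete only if the entry is present.
                if let Ok(idx) = utxos.binary_search(&entry) {
                    utxos.remove(idx);
                }
            }
            other => return Err(format!("unknown prefix {}", other)),
        }
    }
    Ok(utxos)
}
```

With this shape, the writer never has to read the current value of a script: it only queues operands, and the merge resolves them later.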
However, one issue is cleaning up the empty entries left behind by scripts that are fully spent. This is possible using RocksDB's "compaction filters", which allow us to add an extra filter before "compaction" occurs, which is when key/value pairs in the DB are moved down the LSM tree.
We add such a compaction filter, which removes a member item's entry when its UTXO set is empty. However, "empty UTXOs" are currently stored not as the empty bytestring but as the serialized empty vector. To avoid having to deserialize the UTXO set of a member item again in the compaction filter, we change the serialization from db_serialize to db_serialize_vec, which yields the empty bytestring for an empty vector and reduces the compaction filter to a very simple check.
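A minimal sketch of why the serialization change matters, under stated assumptions: `serialize_with_len` is a hypothetical stand-in for an encoding that still emits bytes for an empty vector, `serialize_concat` is a hypothetical stand-in for one that emits nothing, and `Decision` and `compaction_filter` only loosely mirror the shape of a compaction filter callback. None of these are the actual functions in this diff.

```rust
// Hypothetical length-prefixed encoding: an empty set still occupies one byte.
fn serialize_with_len(utxos: &[[u8; 4]]) -> Vec<u8> {
    let mut out = vec![utxos.len() as u8];
    for utxo in utxos {
        out.extend_from_slice(utxo);
    }
    out
}

// Hypothetical concatenating encoding: an empty set is the empty bytestring.
fn serialize_concat(utxos: &[[u8; 4]]) -> Vec<u8> {
    utxos.concat()
}

#[derive(Debug, PartialEq)]
enum Decision {
    Keep,
    Remove,
}

/// Drops entries whose value is the empty bytestring.
/// No deserialization is needed, only a length check.
fn compaction_filter(_level: u32, _key: &[u8], value: &[u8]) -> Decision {
    if value.is_empty() {
        Decision::Remove
    } else {
        Decision::Keep
    }
}
```

With the length-prefixed stand-in, the same length check would keep the empty entry, so the filter would have to decode the value first.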
This requires a new DB version, but since Chronik has already been released, we add an automatic upgrade mechanism. We can keep the upgrade code: once there is a version 12, we can upgrade from 10 to 11 if necessary and then from 11 to 12.
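The chained upgrade can be sketched as a loop that applies one step per version. The names `CURRENT_VERSION`, `upgrade_10_to_11` and `upgrade_db` are hypothetical, and the `log` vector merely stands in for the actual rewrite of the stored data.

```rust
// Hypothetical: the DB version introduced by this change.
const CURRENT_VERSION: u64 = 11;

// Stand-in for the real step, which would rewrite the stored UTXO sets.
fn upgrade_10_to_11(log: &mut Vec<String>) {
    log.push("10 -> 11".to_string());
}

/// Upgrades one version at a time until the DB is current.
fn upgrade_db(mut version: u64, log: &mut Vec<String>) -> Result<u64, String> {
    if version > CURRENT_VERSION {
        return Err(format!("DB version {} is newer than this build", version));
    }
    while version < CURRENT_VERSION {
        match version {
            10 => upgrade_10_to_11(log),
            // A future version 12 would add one more arm here:
            // 11 => upgrade_11_to_12(log),
            other => return Err(format!("no upgrade path from version {}", other)),
        }
        version += 1;
    }
    Ok(version)
}
```

Because every step only knows about its own pair of versions, old steps never need to change when a new version is added.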
Benchmarks show a good speedup: syncing the first 300000ish blocks with only TxWriter, prepare_indexed_txs and ScriptUtxoWriter speeds up the UTXO indexing by a factor of around 2.5 (3202.27 s on master vs. 1249.33 s with this diff):
| bench     | total time [s] | time utxos only [s] |
| master    | 6881.50        | 3202.27             |
| no utxos  | 3679.23        | 0                   |
| this diff | 4928.56        | 1249.33             |