
[Chronik] Add `TxReader` and `TxWriter`

Description

[Chronik] Add TxReader and TxWriter

Summary:
Adds the structs and methods needed to store txs from the node. Nothing is stored yet: this diff does not hook them up to txs coming from the node.

Instead of having txids be the keys for the column family containing the txs, we use a 64-bit serial number "TxNum" that increments with every transaction in block order. This makes it easy to, for example, iterate over all the txs in a block just by knowing the first tx_num of the block. It also simplifies the address index and in particular reduces its space requirements, as we simply store a list of relatively small integers instead of txids.
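A minimal sketch of the TxNum idea, assuming an in-memory ordered map as a stand-in for the column family. `TxStore`, `insert_block` and `block_txids` are hypothetical names for illustration, not the structs added by this diff:

```rust
use std::collections::BTreeMap;

/// Serial number of a tx, assigned in block order.
pub type TxNum = u64;

/// Hypothetical stand-in for the tx column family: an ordered map
/// keyed by TxNum instead of by txid.
#[derive(Default)]
pub struct TxStore {
    txids: BTreeMap<TxNum, [u8; 32]>,
    next_tx_num: TxNum,
}

impl TxStore {
    /// Appends the txs of one block and returns the block's first TxNum.
    pub fn insert_block(&mut self, txids: &[[u8; 32]]) -> TxNum {
        let first_tx_num = self.next_tx_num;
        for txid in txids {
            self.txids.insert(self.next_tx_num, *txid);
            self.next_tx_num += 1;
        }
        first_tx_num
    }

    /// Returns the txids in the half-open TxNum range `first..end`,
    /// i.e. one block's txs if `end` is the next block's first TxNum.
    pub fn block_txids(&self, first: TxNum, end: TxNum) -> Vec<[u8; 32]> {
        self.txids.range(first..end).map(|(_, txid)| *txid).collect()
    }
}
```

Because the keys are consecutive, a block's txs form one contiguous key range, so a single range scan is enough and no per-block list of txids is needed.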

64 bits allow us to number up to 2^64 = 18446744073709551616 txs, which even at 1M tx/s would be enough for more than 500000 years.
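The estimate can be checked with a few lines of integer arithmetic. This is only a sanity check; the function name and the 365-day year are assumptions made here:

```rust
/// Number of distinct values of a 64-bit TxNum (2^64).
pub const TX_NUM_CAPACITY: u128 = 1u128 << 64;

/// Whole years until a 64-bit counter is exhausted at a constant tx rate,
/// assuming 365-day years.
pub fn years_until_exhausted(txs_per_second: u128) -> u128 {
    const SECONDS_PER_YEAR: u128 = 365 * 24 * 60 * 60;
    TX_NUM_CAPACITY / txs_per_second / SECONDS_PER_YEAR
}
```

At 1M tx/s this comes out to roughly 585000 years, consistent with the bound above.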

We only store the txid, data_pos, undo_pos and time_first_seen in the DB; everything else can be read from the block/undo files. Coinbase txs don't have undo data, and undo data for txs is never at position 0, so we set undo_pos = 0 for coinbase txs and treat every entry with undo_pos == 0 as a coinbase tx.
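The sentinel convention could look roughly like this. The field names come from the summary, but the struct name `TxEntry`, the field widths and the helper methods are assumptions for illustration:

```rust
use std::num::NonZeroU32;

/// Per-tx data kept in the DB. Field names follow the summary;
/// the integer widths are assumptions.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TxEntry {
    pub txid: [u8; 32],
    /// Position of the tx in the block file.
    pub data_pos: u32,
    /// Position of the tx's undo data in the undo file; 0 marks a coinbase tx.
    pub undo_pos: u32,
    pub time_first_seen: i64,
}

impl TxEntry {
    /// Coinbase txs have no undo data, which is encoded as undo_pos == 0.
    pub fn is_coinbase(&self) -> bool {
        self.undo_pos == 0
    }

    /// Undo position as an Option: None for coinbase txs.
    pub fn undo_location(&self) -> Option<NonZeroU32> {
        NonZeroU32::new(self.undo_pos)
    }
}
```

Reusing 0 as the marker avoids a separate is_coinbase flag in each stored entry, which only works because real undo data never starts at position 0.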

Just as we do for blocks, for the reverse index txid -> tx_num, we use ReverseLookup (which actually makes much more sense for txs). We use a 64-bit cheap hash to make collisions unlikely. In the future, it might be worthwhile to salt/seed the cheap hash with some number generated when indexing for the first time, just to prevent someone from spamming txs whose hashes deliberately collide.
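A rough sketch of a reverse lookup keyed by a cheap hash, including collision handling. This is not the actual ReverseLookup code: `TxIndex`, `cheap_hash` (here simply the first 8 bytes of the txid) and the in-memory maps are assumptions, and the salting idea is deliberately left out:

```rust
use std::collections::{BTreeMap, HashMap};

pub type TxNum = u64;
pub type TxId = [u8; 32];

/// Hypothetical cheap hash: the first 8 bytes of the txid as a u64.
pub fn cheap_hash(txid: &TxId) -> u64 {
    let mut prefix = [0u8; 8];
    prefix.copy_from_slice(&txid[..8]);
    u64::from_le_bytes(prefix)
}

/// In-memory stand-in for the forward and reverse column families.
#[derive(Default)]
pub struct TxIndex {
    /// Forward index: tx_num -> txid.
    txids: BTreeMap<TxNum, TxId>,
    /// Reverse index: cheap hash -> all tx_nums whose txid has that hash.
    by_hash: HashMap<u64, Vec<TxNum>>,
}

impl TxIndex {
    pub fn insert(&mut self, tx_num: TxNum, txid: TxId) {
        self.txids.insert(tx_num, txid);
        self.by_hash.entry(cheap_hash(&txid)).or_default().push(tx_num);
    }

    /// Resolves txid -> tx_num; colliding candidates are disambiguated
    /// by comparing against the full txid in the forward index.
    pub fn tx_num_by_txid(&self, txid: &TxId) -> Option<TxNum> {
        let candidates = self.by_hash.get(&cheap_hash(txid))?;
        candidates
            .iter()
            .copied()
            .find(|tx_num| self.txids.get(tx_num) == Some(txid))
    }
}
```

Collisions only cost extra forward lookups rather than wrong results, which is why an attacker who can force many collisions degrades performance and why seeding the hash is suggested as a possible mitigation.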

Incrementing CURRENT_INDEXER_VERSION is a technicality: we add the column families to the DB but don't add data from the node (yet), so reusing a DB created at this version later could result in an inconsistent state.

Depends on D13458.

Test Plan: ninja check-crates

Reviewers: Fabien, #bitcoin_abc

Reviewed By: Fabien, #bitcoin_abc

Differential Revision: https://reviews.bitcoinabc.org/D13437
