[Chronik] Add an in-memory bloom filter to GroupHistoryWriter

Authored by tobias_ruck on Feb 16 2024, 11:51.

Details

Reviewers

Fabien

Group Reviewers

Restricted Project

Summary

In benchmarks, GroupHistoryWriter is still the slowest part of Chronik. Surprisingly, the bottleneck is still determining how many txs a given script has. We added a column family that stores the number of txs per script, but that is not enough, because it does not help in the most common case: a script with no history at all.

This diff tackles exactly this common case by using an in-memory bloom filter to record whether a script has any tx history yet. Since a bloom filter cannot produce false negatives, if it reports that a script is not in the filter, we know for sure that the script's number of txs is 0, and we don't have to read from the DB at all.
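
To illustrate the idea, here is a minimal sketch of the short-circuit. It is not the code in this diff: `BloomFilter`, `MockDb`, `HistoryCounter`, the double-hashing scheme and the filter parameters are all hypothetical stand-ins, and a `HashMap` with a read counter plays the role of the DB column family.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical minimal bloom filter over byte strings (not Chronik's type).
pub struct BloomFilter {
    pub bits: Vec<u64>,
    pub num_bits: u64,
    pub num_hashes: u32,
}

impl BloomFilter {
    pub fn new(num_bits: u64, num_hashes: u32) -> Self {
        let num_bits = num_bits.max(64);
        let words = ((num_bits + 63) / 64) as usize;
        BloomFilter { bits: vec![0; words], num_bits, num_hashes }
    }

    /// Two base hashes; the i-th bit index is derived as a + i * b.
    fn hash_pair(item: &[u8]) -> (u64, u64) {
        let mut h1 = DefaultHasher::new();
        item.hash(&mut h1);
        let mut h2 = DefaultHasher::new();
        (item, 0x9e37_79b9_7f4a_7c15u64).hash(&mut h2);
        (h1.finish(), h2.finish() | 1)
    }

    fn bit_index(&self, a: u64, b: u64, i: u32) -> u64 {
        a.wrapping_add(b.wrapping_mul(i as u64)) % self.num_bits
    }

    pub fn insert(&mut self, item: &[u8]) {
        let (a, b) = Self::hash_pair(item);
        for i in 0..self.num_hashes {
            let idx = self.bit_index(a, b, i);
            self.bits[(idx / 64) as usize] |= 1u64 << (idx % 64);
        }
    }

    /// `false` means "definitely never inserted"; `true` means "maybe inserted".
    pub fn may_contain(&self, item: &[u8]) -> bool {
        let (a, b) = Self::hash_pair(item);
        (0..self.num_hashes).all(|i| {
            let idx = self.bit_index(a, b, i);
            self.bits[(idx / 64) as usize] & (1u64 << (idx % 64)) != 0
        })
    }
}

/// Stand-in for the "number of txs per script" column family.
pub struct MockDb {
    pub counts: HashMap<Vec<u8>, u64>,
    pub reads: u64,
}

pub struct HistoryCounter {
    pub bloom: BloomFilter,
    pub db: MockDb,
}

impl HistoryCounter {
    pub fn new(num_bits: u64, num_hashes: u32) -> Self {
        HistoryCounter {
            bloom: BloomFilter::new(num_bits, num_hashes),
            db: MockDb { counts: HashMap::new(), reads: 0 },
        }
    }

    /// Number of txs for a script; skips the DB when the filter says "absent".
    pub fn num_txs(&mut self, script: &[u8]) -> u64 {
        if !self.bloom.may_contain(script) {
            return 0; // no false negatives, so the count is certainly 0
        }
        self.db.reads += 1; // possible false positive: the DB decides
        self.db.counts.get(script).copied().unwrap_or(0)
    }

    /// Record one more tx for a script and mark the script in the filter.
    pub fn add_tx(&mut self, script: &[u8]) {
        let n = self.num_txs(script);
        self.db.counts.insert(script.to_vec(), n + 1);
        self.bloom.insert(script);
    }
}
```

In this sketch, a false positive only costs one extra DB read that returns 0, so correctness never depends on the filter's size or hash count; those parameters only affect how often the read is skipped.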

In benchmarks, this significantly speeds up indexing, especially when combined with an in-memory LRU cache for the number of txs in the script history, and it makes GroupHistoryWriter one of the fastest parts of the indexer.

Test Plan

ninja check-functional

Diff Detail

Repository

rABC Bitcoin ABC

Branch

chronik-optimize-group-history-bloom

Lint

Lint Passed

Unit

No Test Coverage

Build Status

Buildable 27181
Build 53928: Build Diff: build-diff · lint-circular-dependencies · chronik-client-integration-tests · build-without-wallet · build-debug · build-chronik-plugins · build-clang · build-chronik · build-clang-tidy
Build 53927: arc lint + arc unit