
[Chronik] Add an in-memory bloom filter to GroupHistoryWriter

Authored by tobias_ruck on Feb 16 2024, 11:51.
This is a draft revision that has not yet been submitted for review.


Group Reviewers
Restricted Project

In benchmarks, GroupHistoryWriter is still the slowest part of Chronik. Surprisingly, the bottleneck is still figuring out how many txs a given script has. We added a column family for the number of txs of a script, but it's not enough, because it doesn't help in the most common case: a script with no history at all.

This diff tackles exactly this common case by using a bloom filter to record whether a script has any tx history yet. Since a bloom filter cannot have false negatives, if it reports that a script is not in the filter, we know for sure that its number of txs is 0 and don't have to read from the DB at all. A false positive only costs the DB read we would have done anyway.
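The idea can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code in this diff: `BloomFilter`, `TxCountStore`, `num_txs`, `add_tx` and `db_reads` are hypothetical names, the `HashMap` stands in for the DB column family, and the hashing scheme (double hashing on top of `DefaultHasher`) is an assumption made for brevity.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Minimal bloom filter over byte strings (hypothetical, for illustration).
pub struct BloomFilter {
    bits: Vec<u64>,
    num_bits: u64,
    num_hashes: u32,
}

impl BloomFilter {
    pub fn new(num_bits: u64, num_hashes: u32) -> Self {
        assert!(num_bits > 0 && num_hashes > 0);
        let num_words = ((num_bits + 63) / 64) as usize;
        BloomFilter { bits: vec![0; num_words], num_bits, num_hashes }
    }

    /// Two base hashes; the i-th bit index is derived as a + i * b.
    fn hash_pair(item: &[u8]) -> (u64, u64) {
        let mut h1 = DefaultHasher::new();
        item.hash(&mut h1);
        let a = h1.finish();
        let mut h2 = DefaultHasher::new();
        (a, item).hash(&mut h2);
        // Force the step to be odd so it is never zero.
        (a, h2.finish() | 1)
    }

    fn bit_index(&self, (a, b): (u64, u64), i: u32) -> u64 {
        a.wrapping_add((i as u64).wrapping_mul(b)) % self.num_bits
    }

    pub fn insert(&mut self, item: &[u8]) {
        let pair = Self::hash_pair(item);
        for i in 0..self.num_hashes {
            let idx = self.bit_index(pair, i);
            self.bits[(idx / 64) as usize] |= 1u64 << (idx % 64);
        }
    }

    /// `false` means "definitely never inserted"; `true` means "maybe".
    pub fn may_contain(&self, item: &[u8]) -> bool {
        let pair = Self::hash_pair(item);
        (0..self.num_hashes).all(|i| {
            let idx = self.bit_index(pair, i);
            self.bits[(idx / 64) as usize] & (1u64 << (idx % 64)) != 0
        })
    }
}

/// Stand-in for the tx count lookup; `db` plays the role of the DB.
pub struct TxCountStore {
    db: HashMap<Vec<u8>, u64>,
    filter: BloomFilter,
    /// Number of simulated DB reads, to show which lookups were skipped.
    pub db_reads: u64,
}

impl TxCountStore {
    pub fn new() -> Self {
        TxCountStore {
            db: HashMap::new(),
            filter: BloomFilter::new(1 << 16, 4),
            db_reads: 0,
        }
    }

    /// Number of txs of a script; skips the DB if the filter says "no".
    pub fn num_txs(&mut self, script: &[u8]) -> u64 {
        if !self.filter.may_contain(script) {
            return 0; // no false negatives, so the count must be 0
        }
        self.db_reads += 1;
        self.db.get(script).copied().unwrap_or(0)
    }

    /// Record one more tx for a script and mark it in the filter.
    pub fn add_tx(&mut self, script: &[u8]) {
        let n = self.num_txs(script);
        self.filter.insert(script);
        self.db.insert(script.to_vec(), n + 1);
    }
}
```

In this sketch, the first `add_tx` for a new script never touches the simulated DB for its read, which is the case the diff targets. The filter size and number of hash functions here are arbitrary placeholders; a real setup would pick them based on the expected number of scripts and the acceptable false positive rate.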

In benchmarks, this significantly speeds up indexing, especially when combined with an in-memory LRU cache for the number of txs in the script history, and makes GroupHistoryWriter one of the fastest parts of the indexer.

Test Plan

ninja check-functional