Faster -reindex by initially deserializing only headers

Description

Summary:
When a block is initially read from a blk*.dat file during reindexing,
it can be added to the block index only if all of its ancestor blocks
have been added, which is rare. If the block's ancestors have not been
added, the block must be re-read from disk later when it can be added.

This commit deserializes only the block's header during the initial
read, rather than the entire block, since the header is sufficient to
determine whether the block's parent (and thus all of its ancestors) has
been added. This is a performance improvement.
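
The idea can be illustrated with a small, self-contained sketch. This is not the code from this commit: the record layout, `ToyHeader`, `AppendBlock`, `LoadToyFile` and `Stats` are all hypothetical names invented for illustration, and the "full deserialization" is simulated by touching every payload byte. It only shows the control flow: read the header, and pay for the full read only once the parent is known.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <deque>
#include <map>
#include <set>
#include <vector>

// Hypothetical toy record: [u32 payload_len][u64 id][u64 parent][payload].
// Far simpler than a real blk*.dat record; for illustration only.
struct ToyHeader {
    uint64_t id;
    uint64_t parent;
};

constexpr size_t kPrefixSize = sizeof(uint32_t);
constexpr size_t kHeaderSize = 2 * sizeof(uint64_t);

template <typename T>
T ReadAt(const std::vector<uint8_t>& file, size_t pos) {
    T v;
    std::memcpy(&v, file.data() + pos, sizeof v);
    return v;
}

// Append one toy record to an in-memory "file".
void AppendBlock(std::vector<uint8_t>& file, uint64_t id, uint64_t parent,
                 uint32_t payload_len) {
    auto put = [&](const void* p, size_t n) {
        const uint8_t* b = static_cast<const uint8_t*>(p);
        file.insert(file.end(), b, b + n);
    };
    put(&payload_len, sizeof payload_len);
    put(&id, sizeof id);
    put(&parent, sizeof parent);
    file.insert(file.end(), payload_len, 0xAB);
}

struct Stats {
    size_t full_reads = 0;   // records whose payload was fully read
    size_t header_only = 0;  // records deferred after reading only the header
    std::vector<uint64_t> accepted_order;
};

Stats LoadToyFile(const std::vector<uint8_t>& file) {
    Stats stats;
    std::set<uint64_t> known{0};  // id 0 stands for "no parent"
    std::multimap<uint64_t, size_t> pending;  // parent id -> record offset

    // Simulated full deserialization: touches every payload byte.
    auto accept = [&](size_t pos) {
        const uint32_t len = ReadAt<uint32_t>(file, pos);
        const uint64_t id = ReadAt<uint64_t>(file, pos + kPrefixSize);
        uint64_t checksum = 0;
        for (size_t i = 0; i < len; ++i) {
            checksum += file[pos + kPrefixSize + kHeaderSize + i];
        }
        (void)checksum;
        ++stats.full_reads;
        known.insert(id);
        stats.accepted_order.push_back(id);
        return id;
    };

    size_t pos = 0;
    while (pos + kPrefixSize + kHeaderSize <= file.size()) {
        const uint32_t len = ReadAt<uint32_t>(file, pos);
        const size_t record_size = kPrefixSize + kHeaderSize + len;
        if (pos + record_size > file.size()) break;  // truncated record

        // Initial read: header only.
        const ToyHeader hdr{
            ReadAt<uint64_t>(file, pos + kPrefixSize),
            ReadAt<uint64_t>(file, pos + kPrefixSize + sizeof(uint64_t))};

        if (known.count(hdr.parent)) {
            // Parent known: do the full read now, then any waiting descendants.
            std::deque<uint64_t> queue{accept(pos)};
            while (!queue.empty()) {
                const uint64_t parent = queue.front();
                queue.pop_front();
                auto range = pending.equal_range(parent);
                for (auto it = range.first; it != range.second; ++it) {
                    queue.push_back(accept(it->second));
                }
                pending.erase(range.first, range.second);
            }
        } else {
            // Parent unknown: remember the offset and skip the payload.
            pending.emplace(hdr.parent, pos);
            ++stats.header_only;
        }
        pos += record_size;
    }
    return stats;
}
```

With records stored in the order 3, 1, 2 (each naming the previous id as its parent), only record 3 is deferred, and its payload is read once, after record 2 has been accepted, instead of once up front and again later.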

Benchmark (to be compared to D15959):

|               ns/op |                op/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|       11,230,665.00 |               89.04 |    1.1% |      0.13 | `LoadExternalBlockFile`

More benchmark data from the PR description:

This reduces reindex time on mainnet by 7 hours on a Raspberry Pi, which translates to around a 25% reduction in the first part of reindexing (adding blocks to the index), and about a 6% reduction in overall reindex time.

This concludes the backport of core#16981
https://github.com/bitcoin/bitcoin/pull/16981/commits/db929893ef0bc86ea2708cdbcf41152240cd7c73
Depends on D15960

Test Plan: ninja all check-all

Reviewers: #bitcoin_abc, Fabien

Reviewed By: #bitcoin_abc, Fabien

Differential Revision: https://reviews.bitcoinabc.org/D15961

Details

Provenance
Larry Ruane <larryruane@gmail.com> Authored on Aug 7 2020, 21:07
PiRK Committed on Fri, Apr 12, 14:28
PiRK Pushed on Fri, Apr 12, 14:28
Reviewer
Restricted Project
Differential Revision
D15961: Faster -reindex by initially deserializing only headers
Parents
rABCb96630370588: util: add CBufferedFile::SkipTo() to move ahead in the stream