validation: write chainstate to disk every hour
909eaae6d6c6

Description

Summary:
Remove the 24-hour periodic flush interval and write the chainstate along with blocks and block index every hour.

From the PR description:

Since #28233, periodically writing the chainstate to disk every 24 hours does not clear the dbcache. Since #28280, the cost of periodically writing the chainstate to disk is proportional only to the number of dirty entries in the cache. Due to these changes, it is no longer beneficial to write the chainstate to disk only every 24 hours. The periodic flush interval used to be necessary because every write of the chainstate cleared the dbcache. Now, we can get rid of the periodic flush interval and simply write the chainstate along with blocks and block index at least every hour.
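The property described above (a write that persists only dirty entries and leaves the cache populated) can be illustrated with a toy model. This is a minimal sketch, not the node's actual cache code: the class `ToyCoinsCache` and its methods are hypothetical names invented for this example.

```python
class ToyCoinsCache:
    """Toy write-back cache. All names are hypothetical, not the node's API."""

    def __init__(self):
        self.entries = {}   # key -> value held in memory
        self.dirty = set()  # keys modified since the last write
        self.disk = {}      # stand-in for the on-disk database

    def put(self, key, value):
        """Insert or update an entry and mark it dirty."""
        self.entries[key] = value
        self.dirty.add(key)

    def sync(self):
        """Write only the dirty entries; keep the in-memory cache populated.

        Returns the number of entries written, so the cost of a write is
        visibly proportional to the number of dirty entries.
        """
        written = len(self.dirty)
        for key in self.dirty:
            self.disk[key] = self.entries[key]
        self.dirty.clear()
        return written


cache = ToyCoinsCache()
for i in range(1000):
    cache.put(i, f"coin-{i}")

first = cache.sync()              # everything is dirty on the first write
cache.put(1, "coin-1-updated")
second = cache.sync()             # only the one modified entry is written
```

In this model the second write touches a single entry and the cache still holds all 1000 entries afterwards, which is why frequent writes are cheap once writing no longer empties the cache.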

Three benefits of doing this:

1. For IBD or reindex-chainstate with a combination of a large dbcache setting, slow CPU, and slow internet speed or unreliable peers, it could be up to 24 hours until the chainstate is persisted to disk. A power outage or crash could lose up to 24 hours of progress. If there is a very large number of dirty cache entries, writing to disk when a flush finally does occur will take a very long time. Crashing during this window of writing can trigger the problem described in #11600 ("Rolling forward" at startup can take a long time, and is not interruptible (after unclean shutdown)). By syncing every hour in unison with the block index we avoid this problem. Only a maximum of one hour of progress can be lost, and the window for crashing during writing is much smaller. For IBD with lower dbcache settings, faster CPU, or better internet speed/reliable peers, chainstate writes are already triggered more often than every hour, so this change will have no effect on IBD.

2. Based on the discussion in #28280 ("Don't empty dbcache on prune flushes: >30% faster IBD"), writing only once every 24 hours during long-running operation of a node causes IO spikes. Writing smaller chainstate changes every hour, like we do with blocks and block index, will reduce IO spikes.

3. Faster shutdown. All dirty chainstate entries must be persisted to disk on shutdown. If we have a lot of dirty entries, such as when close to the 24-hour mark or when syncing with a large dbcache, it can take a long time to shut down. By keeping the chainstate clean we avoid this problem.
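The effect of the interval on the worst-case amount of unpersisted progress can be sketched with a small simulation. This is only an illustration under stated assumptions: the function `simulate`, the 10-minute check cadence, and the 48-hour run length are all made up for the example and do not come from the node's code.

```python
def simulate(total_minutes, interval_minutes, check_every=10):
    """Simulate periodic write checks over a run (hypothetical model).

    Every `check_every` minutes the node checks whether at least
    `interval_minutes` have passed since the last write and, if so, writes.
    Returns (number of writes, worst-case minutes of unpersisted progress).
    """
    last_write = 0
    writes = 0
    worst_gap = 0
    for now in range(check_every, total_minutes + 1, check_every):
        # Progress that would be lost if the node crashed right now.
        worst_gap = max(worst_gap, now - last_write)
        if now - last_write >= interval_minutes:
            last_write = now
            writes += 1
    return writes, worst_gap


hourly = simulate(48 * 60, 60)       # one-hour write interval
daily = simulate(48 * 60, 24 * 60)   # 24-hour flush interval
```

In this model the hourly schedule performs 48 small writes with at most 60 minutes of progress at risk, while the 24-hour schedule performs 2 large writes with up to 1440 minutes at risk.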

A tradeoff is that now we write slightly more data to the disk that will be deleted again an hour later, as utxos are often short-lived.

By writing every hour, we write about 50% more data, but we are only spending 20% more time writing.
On master we write about 42 MiB every 24 hours which locks the main thread for 2.3 seconds.
On this branch we write about 2.7 MiB every hour which locks the main thread for 115 milliseconds.
So it's about 0.9 MiB more data written every hour, which I think is a negligible amount. Note that this extra data is ephemeral and will be erased as these extra utxos are spent. It will not cause nodes to have to store more data.
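The figures above can be cross-checked with a few lines of arithmetic. The input numbers are the ones quoted in the description; the variable names are only for illustration.

```python
# Figures quoted in the description.
master_mib_per_write = 42.0   # MiB written once every 24 hours on master
master_lock_s = 2.3           # seconds the main thread is locked per write
branch_mib_per_write = 2.7    # MiB written once every hour on this branch
branch_lock_s = 0.115         # seconds the main thread is locked per write

# Normalize master to an hourly rate and the branch to a daily lock time.
master_mib_per_hour = master_mib_per_write / 24
extra_mib_per_hour = branch_mib_per_write - master_mib_per_hour
data_ratio = branch_mib_per_write / master_mib_per_hour
branch_lock_per_day_s = branch_lock_s * 24
time_ratio = branch_lock_per_day_s / master_lock_s

print(f"master: {master_mib_per_hour:.2f} MiB/hour")
print(f"extra data: {extra_mib_per_hour:.2f} MiB/hour")
print(f"data written: {100 * (data_ratio - 1):.0f}% more")
print(f"time spent writing: {100 * (time_ratio - 1):.0f}% more")
```

The results (roughly 0.95 MiB extra per hour, about 54% more data, and 20% more time spent writing) are consistent with the rounded figures of "about 0.9 MiB", "about 50%", and "20%" given above.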

This is a partial backport of core#30611
https://github.com/bitcoin/bitcoin/pull/30611/commits/d73bd9fbe483ad1397f62dc1d580314202351ace
Depends on D18639

Test Plan: ninja all check-all

Reviewers: #bitcoin_abc, Fabien

Reviewed By: #bitcoin_abc, Fabien

Differential Revision: https://reviews.bitcoinabc.org/D18640