Page MenuHomePhabricator

blockstorage: add return codes on flush functions, force callers to handle them
ClosedPublic

Authored by PiRK on Mon, Mar 24, 13:46.

Details

Summary

blockstorage: Mark FindBlockPos as nodiscard

A false return value indicates a fatal error (disk space being too low),
so make sure we always consume this error code.

This commit is part of an ongoing process for making the handling of
fatal errors more transparent and easier to understand.

blockstorage: Log on fatal block file flush error

By returning an error code if FlushBlockFile fails, the caller now has
to explicitly handle block file flushing errors. Before this change
such errors were non-explicitly ignored without a clear rationale.

Prior to this patch FlushBlockFile may have failed silently in
Chainstate::FlushStateToDisk. Improve this with a log line. Also add a
TODO comment to flesh out whether returning early in the case of an
error is appropriate or not. Returning early might be appropriate to
prohibit WriteBlockIndexDB from writing a block index entry that does
not refer to a fully flushed block.

Besides Chainstate::FlushStateToDisk, FlushBlockFile is also called
by FindBlockPos. Don't change the abort behavior there, since we don't
want to fail the function if the flushing of already written blocks
fails. Instead, just document it.

blockstorage: Log on fatal undo file flush error

By returning an error code if either FlushUndoFile or FlushBlockFile
fail, the caller now has to explicitly handle block undo file flushing
errors. Before this change such errors were non-explicitly ignored
without a clear rationale.

Besides the call to FlushUndoFile in FlushBlockFile, ignore its
return code at its call site in WriteUndoDataForBlock. There, a failed
flush of the undo data should not be indicative of a failed write.

Add nodiscard annotations to FlushUndoFile such that its return
value is not just ignored in the future.

This does not change the way flush errors are handled other than by adding log prints and comments to callsites. This makes ignoring the Flush errors more explicit, and will force potential future callsites to think about how to handle these errors.

This is a backport of core#27866

Test Plan

ninja all check-all

Diff Detail

Repository
rABC Bitcoin ABC
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

PiRK requested review of this revision.Mon, Mar 24, 13:46
Fabien requested changes to this revision.Mon, Mar 24, 14:31
Fabien added a subscriber: Fabien.

Does this diff makes sense since there is no notification for the flush error like core (yet) ? We call AbortNode immediately upon error.

src/node/blockstorage.cpp
557–559 ↗(On Diff #53241)
This revision now requires changes to proceed.Mon, Mar 24, 14:31
PiRK planned changes to this revision.Mon, Mar 24, 15:13

I think my summary (a copy paste of the initial message in the pull request) is misleading. The PR later change it's scope to make no change in behavior (besides additional logging) and basically just force future users of these functions to handle the return value.
I will fix this

Does this diff makes sense since there is no notification for the flush error like core (yet) ? We call AbortNode immediately upon error.

I don't think this makes a difference for the scope of this PR.

src/node/blockstorage.cpp
557–559 ↗(On Diff #53241)

OK but it will be reverted when we switch to the notifications system.

PiRK retitled this revision from blockstorage: Return on fatal flush errors to blockstorage: add return codes on flush functions, force callers to handle them.
PiRK edited the summary of this revision. (Show Details)

fix diff summary, implement review suggestion (return AbortNode....)

This revision is now accepted and ready to land.Mon, Mar 24, 16:12