Page MenuHomePhabricator

Fix segfault when node no longer connected
ClosedPublic

Authored by freetrader on Jul 13 2017, 23:07.

Details

Summary

nodestate can be NULL when node is no longer connected,
leading to a segmentation fault (refer T64).

Added assertions to other places where node state was
obtained and later used without checking if non-null.

Test Plan

make check

Diff Detail

Repository
rABC Bitcoin ABC
Branch
fix_segfault_on_bad_nodestate
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 500
Build 500: arc lint + arc unit

Event Timeline

Herald added 1 blocking reviewer(s): Restricted Project. · View Herald TranscriptJul 13 2017, 23:07
src/net_processing.cpp
450

I have omitted the check done in Core for 'nodestate->fSupportsDesiredCmpctVersion' since it's not clear to me that ABC should

// Never ask from peers who can't provide witnesses.

Open to discussion.

deadalnix requested changes to this revision.Jul 13 2017, 23:32
deadalnix added inline comments.
src/net_processing.cpp
1876

nullptr

2224

dito

2488

dito

2556

dito

This revision now requires changes to proceed.Jul 13 2017, 23:32
src/net_processing.cpp
1876

OK, in this case I'm going to move to using nullptr instead of NULL in the existing asserts in this file as well.

Man I feel like toxic LukeJr lately. All I can come up with are critical comments. Sorry ahead of time.

You guys know that in C++, the value '0' is a sort of universally castable type to any pointer? 'NULL' is a C-ism, and has a long history. I still see plenty of C++ programmers and code using NULL, or things like 'nullptr' needlessly.

It's ok to use 0. It's perfectly good C++.

You can just

assert(nodestate)

or if you want to be verbose:

assert(nodestate != 0)

But PLEASE .. stop using NULL. It just makes me feel like I'm back in 1989 programming in C or something. C++ has 0. 0 is always null. On every architecture. No need to use "NULL". It looks.. well, strange these days. Like we are a bunch of C programmers that just learned how to use this C++ thing because it's what the kids these days are doing...

Gladly moving to nullptr, I just didn't want to mix refactoring with semantic change here.

So please let's look at D325, so I can rebase this one on that one .

OK, aside from my bitching about the use of NULL.. the real thing is here the end result will be the same with these asserts, namely the program will EXIT (effectively crash).

We should recode all these methods to at least return gracefully with some error or something if notestate is NULL. Not exit the program!

asserts() really just are about as bad as dereferencing nodestate, obviously, when it comes to reliability of the client.

No disagreement there (about the asserts not doing much for reliability of the software).

I just consider this a hotfix - we should look at the big picture and rewrite the methods doing this, but that will take much longer and users are experiencing the issue.

dagurval added inline comments.
src/net_processing.cpp
450

fSupportsDesiredCmpctVersion can be removed from the codebase entirely since we only support one version (version == 1)

Nit: This would be easier to read if we instead just did

if (!nodestate || !nodestate->fProvidesHeaderAndIDs) {
    return;
}

so we didn't have to look for further use of nodestate after the if statement

src/net_processing.cpp
450

Thank you for the constructive feedback. That makes good sense.

freetrader edited edge metadata.

Add changes from review feedback

freetrader edited the summary of this revision. (Show Details)
freetrader marked 5 inline comments as done.
freetrader added inline comments.
src/net_processing.cpp
1876

The general move to nullptr in other places in this file will be addressed in a subsequent Diff.

src/net_processing.cpp
1876 ↗(On Diff #809)

What makes you think this must be true ?

src/net_processing.cpp
1876 ↗(On Diff #809)

If you mean why nodestate should not be null, then it's because of the dereference on line 1912:

nodestate->pindexBestHeaderSent = pindex ? pindex : chainActive.Tip();

src/net_processing.cpp
1876 ↗(On Diff #809)

There are various control flow path in between the 2, some of them do not deref that pointer.

deadalnix requested changes to this revision.Jul 14 2017, 11:11

These asserts do not seem related to fixing the segfault. Fixing the segfault is hi pri, so there is no treason to bundle it with change on which there are still open questions.

This revision now requires changes to proceed.Jul 14 2017, 11:11
freetrader edited edge metadata.

Remove the asserts as requested by deadalnix

deadalnix added inline comments.
src/net_processing.cpp
448 ↗(On Diff #810)

On I think nodestate being null is not expected. Maybe logging something when it happen is a good idea.

This revision is now accepted and ready to land.Jul 14 2017, 11:43

Add a log message when nodestate is null, as requested

freetrader edited edge metadata.

Please have a look if you think the log message is ok.

src/net_processing.cpp
448 ↗(On Diff #810)

Added

deadalnix requested changes to this revision.Jul 14 2017, 12:15
deadalnix added inline comments.
src/net_processing.cpp
449 ↗(On Diff #811)

What the hell. `!nodestate->fProvidesHeaderAndIDs``` is not an unexpected state.

This revision now requires changes to proceed.Jul 14 2017, 12:15

Question: when is that lNodesAnnouncingHeaderAndIDs private list cleaned up if a node should disconnect early. I don't see it being cleaned up in FinalizeNode()...

Could this be a source of bad peer id's that then don't exist in the CNoteState map?

Just a thought...

src/net_processing.cpp
449 ↗(On Diff #811)

Fixed

This revision is now accepted and ready to land.Jul 14 2017, 12:25
This revision was automatically updated to reflect the committed changes.