Page MenuHomePhabricator

Added a script to automate collecting and updating seeds
Needs RevisionPublic

Authored by jasonbcox on Jun 11 2019, 22:59.

Details

Reviewers
deadalnix
Fabien
Group Reviewers
Restricted Project
Summary

Updating seeds has been a pain point for releases for some time now.
In order to contribute to updating seeds reliably, a contributor must have
access to both mainnet and testnet seeders. This poses the following challenges:

  1. Contributors must be either:

    a) A sysadmin with the skills to setup their own seeder(s), or

    b) Given SSH access to a server by a sysadmin with the skills to setup their own seeder(s) and provide the appropriate security for a multi-user environment.
  2. The seeder(s) must be running for 2-4 weeks BEFORE updating seeds, making it difficult to pull in new contributors.

Adding this new script decouples the above problems from the process of updating the
seeds by providing a way for any contributor to update seeds without themselves needing
to setup seeders. This leaves 1.b for which I have setup seederdump.bitframe.org and
will provide (in the future) a script and setup instructions to easily replicate for
other server hosts for additional redundancy.

Test Plan
cd contrib/seeds

# Fails with help message
./collect-and-update-seeds.sh

# Help message
./collect-and-update-seeds.sh -h
./collect-and-update-seeds.sh --help

# Works as expected
./collect-and-update-seeds.sh seederdump.bitframe.org

# Running a second time overwrites seeds_*.txt as expected
./collect-and-update-seeds.sh seederdump.bitframe.org

# Running from top-level (or some other directory) works as expected
cd ../..
./collect-and-update-seeds.sh seederdump.bitframe.org

# Simulate too-few nodes failure:
# 1. Comment out the `curl` line in collect-and-update-seeds.sh
# 2. Open seeds_main.txt and remove all lines after line 99
./collect-and-update-seeds.sh seederdump.bitframe.org
# Skips mainnet seeds due to too few good nodes, but still completes

# 1. Comment out the `curl` line in collect-and-update-seeds.sh
# 2. Open seeds_test.txt and remove all lines after line 9
./collect-and-update-seeds.sh seederdump.bitframe.org
# Skips testnet seeds due to too few good nodes, but still completes

# Simulate patch-mismatch failure:
# 1. Comment out the `makeseeds` line in collect-and-update-seeds.sh
# 2. Open seeds_main.txt and/or seeds_test.txt and add a newline anywhere in the file
./collect-and-update-seeds.sh seederdump.bitframe.org
# Patch sanity check fails due to diff line mismatch

Diff Detail

Repository
rABC Bitcoin ABC
Branch
collect-seeds
Lint
Lint OK
Unit
No Unit Test Coverage
Build Status
Buildable 6462
Build 10971: Bitcoin ABC Teamcity Staging
Build 10970: arc lint + arc unit

Event Timeline

jasonbcox created this revision.Jun 11 2019, 22:59
jasonbcox planned changes to this revision.Jun 11 2019, 23:00

Still need to update docs, but looking for a concept ACK before continuing.

jasonbcox edited the summary of this revision. (Show Details)Jun 11 2019, 23:08
jasonbcox updated this revision to Diff 9461.Jun 14 2019, 22:00
  • Added sanity checks at various steps in the seed updating process
  • Added basic error checking
  • Added ability to skip one set of seeds if it fails

And a note about later improvements: Fabien and I talked about improving the
robustness of this script by pulling seeds from multiple seeders and merging
them. While this is still a good idea, I don't think it's necessary for MVP.

jasonbcox edited the test plan for this revision. (Show Details)Jun 14 2019, 22:04
Fabien requested changes to this revision.Jun 24 2019, 07:54
Fabien added a subscriber: Fabien.
Fabien added inline comments.
contrib/seeds/collect-and-update-seeds.sh
33 ↗(On Diff #9461)

After the call to makeseeds.py you can remove the $SEEDS_INPUT files.

40 ↗(On Diff #9461)

git checkout HEAD -- ":/contrib/seeds/nodes_${1}.txt"

44 ↗(On Diff #9461)

The generate-seeds.py script expects both file to be here, or fails. If you can't get both you should bail, or eventually create an empty nodes_xxx.txt.

61 ↗(On Diff #9461)

Dito, use -- before the path for all the git checkout calls.

This revision now requires changes to proceed.Jun 24 2019, 07:54
jasonbcox added inline comments.Jun 24 2019, 18:44
contrib/seeds/collect-and-update-seeds.sh
44 ↗(On Diff #9461)

The nodes_* files are tracked by source control, so they should always be there.

jasonbcox updated this revision to Diff 9626.Jun 24 2019, 18:45
  • Fix according to feedback
  • Refactored help message
  • Added support for -h and --help
jasonbcox edited the test plan for this revision. (Show Details)Jun 24 2019, 18:46
Fabien added inline comments.Jun 25 2019, 06:44
contrib/seeds/collect-and-update-seeds.sh
62

Optimization: you can replace grep [...] | wc -l with grep -c [...]

deadalnix requested changes to this revision.Jun 26 2019, 00:57

The whole thing doesn't actually test what we are interested in: that the node effectively have a good list of node to bootstrap from.

contrib/seeds/.gitignore
1

Scripts shouldn't dump crap in tree.

contrib/seeds/collect-and-update-seeds.sh
39

You return and yet never check the return value.

40

So if there isn't, we go for several weeks without update? That doesn't sound optimal.

43

This can fail.

58

I'm pretty this could fail.

65

This rely on unchecked assumption - that rows comes in canonical order, and that the diff tool will chose to diff in a certain way. This is at best fragile and doesn't reall test what we want to test: that we have seed to bootstrap from.

74

git reset --hard

But arguably, this is not up to this patch to decide what to cleanup. If things are fubared such as the script fails, then we shouldn't expect the script to work properly, which is an assumption you make by giving it the responsibility of cleaning up.

This revision now requires changes to proceed.Jun 26 2019, 00:57