Page MenuHomePhabricator

[explorer] Show the miner name in the block page
ClosedPublic

Authored by bytesofman on Jul 25 2024, 20:09.

Details

Reviewers
tobias_ruck
Group Reviewers
Restricted Project
Commits
rABCe228ff422832: [explorer] Show the miner name in the block page
Summary

T3543

Parse coinbase string for miner. Display on block page.

Test Plan
cd explorer-exe
cargo run

check rendering for a few blocks

cd explorer-server
cargo test

tests pass

Diff Detail

Repository
rABC Bitcoin ABC
Branch
explorer-miner-name
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 29777
Build 59089: Build Diff
Build 59088: arc lint + arc unit

Event Timeline

minimal implementation with tests

Add unit tests to cover all miners covered by ecash-herald

bytesofman added inline comments.
web/explorer/explorer-server/src/templating/filters.rs
123 ↗(On Diff #48871)

It's probably overkill to also check the coinbase address in the explorer. In practice, it is rare to see a miner that cannot be identified by the hex string. Probably not worth the additional checks required on every single block just to get < 1% of miners by their coinbase address.

434 ↗(On Diff #48871)

Test cases are from ecash-herald tests

Going forward, when we have a new miner, diff should update both ecash-herald and here. There is some syntax difference but otherwise the implementations are quite similar, easy to maintain modularly.

Perhaps there is some way to pull test mocks from one place in the monorepo, but I am not sure how to do this. I do not see it as a major inefficiency to keep the rust tests with rust file, JS tests with JS files, given the shims that might be neccessary to make these repos work together, tho there is always the possibility of divergence in supported miners.

web/explorer/explorer-server/templates/pages/block.html
112 ↗(On Diff #48871)

I'm not sure what "safe" means here, syntax is copied from other instances of render function calls in template

Fabien added inline comments.
web/explorer/explorer-server/templates/pages/block.html
112 ↗(On Diff #48871)

safe means that the string will not be escaped by the template engine: https://djc.github.io/askama/filters.html#safe

Since you are always setting the value from the filter (trusted source) this is fine here.

PiRK added inline comments.
web/explorer/explorer-server/src/templating/filters.rs
245–322 ↗(On Diff #48871)

Could be refactored

something like

let matches = [
    ("736f6c6f706f6f6c2e6f7267", "solopool.org"),
    ("7370622e78797a", "p2p-spb"),
];

for &(str_to_match, str_to_show) in &matches {
    if coinbase_string.contains(str_to_match) {
        let output = html! {
            str_to_show
        };
        return Ok(output.into_string());
    };
}
434 ↗(On Diff #48871)

Having to maintain two lists of miners in two different codebases is a bit annoying, but I'm not sure how we could keep the miner parsing in a single place and make it work for both js and rust code. I can think of a few ideas, but it would all take some work and we are not getting new miners very often, so not sure if the extra effort would pay off vs just maintaining the logic in two places in the foreseeable future.

tobias_ruck added inline comments.
web/explorer/explorer-server/src/templating/filters.rs
128 ↗(On Diff #48871)

this says Mining-Dutch in hex?

129 ↗(On Diff #48871)

I think all this hex makes it very hard to read / maintain this code (and there's already an issue in the comment above). Instead, I think it makes more sense for each miner to use a regex, like this: https://docs.rs/regex/latest/regex/bytes/struct.Regex.html#method.captures

Since each miner has a relatively simple condition, this seems like a better approach.

For example, the regex for ViaBTC could be (not tested as I don't have an example coinbase):

use regex::bytes::Regex;

let reg_viabtc = Regex::new("ViaBTC(.*/Mined by (?P<mined_by>\w+)/)?")?;
if let Some(captures) = reg_viabtc.captures(coinbase_data) {
   if let Some(mined_by) = captures.name("mined_by") {
      return Ok(format!("ViaBTC | Mined by {}", String::from_utf_lossy(mined_by.as_bytes())));
   }
   return Ok("ViaBTC".to_string());
}
web/explorer/explorer-server/templates/pages/block.html
112 ↗(On Diff #48871)

safe means that it's HTML safe, but since miner data (e.g. for ViaBTC) could be anything this likely could be used inject HTML; the safe should be removed

This revision now requires changes to proceed.Jul 29 2024, 09:10
bytesofman added inline comments.
web/explorer/explorer-server/src/templating/filters.rs
129 ↗(On Diff #48871)

yes, this is much better. I didn't think this would work since coinbase data can sometimes parse with crazy stuff in utf8 (line breaks, unknown characters, etc).

In testing, it works.

245–322 ↗(On Diff #48871)

nice, much cleaner approach thanks

bytesofman marked 2 inline comments as done.

use regex instead of raw hex, use for loop instead of distinct if blocks for pure string miners

improve maintainability with miner categories

web/explorer/explorer-server/src/templating/filters.rs
301 ↗(On Diff #48880)

could keep these vectors in an array. but, since they are long and must be manually formatted to meet line break requirements, I think it is easier to maintain / read in this "one at a time" format.

tobias_ruck added inline comments.
web/explorer/explorer-server/src/templating/filters.rs
126 ↗(On Diff #48880)

Much better!

Something we should do (maybe this diff) is to only build the regex once; Rust 1.80 just added LazyCell/LazyLock to do just that.

static reg_viabtc = LazyLock::new(|| Regex::new(r"ViaBTC(.*/Mined by (?P<mined_by>\w+)/)?").unwrap());

We'd have to upgrade the Rust version we're using to 1.80 first though (which should be trivial unless they added new lints we're not aware of).

Lmk if that makes sense

168 ↗(On Diff #48880)

For a trivial case like this, it's a bit overkill to build a regex each time (although it's probably still quite performant); probably better to add a simple function like this to find a substring in a byteslice (see here):

fn contains_subslice(haystack: &[u8], needle: &[u8]) -> bool {
    haystack.windows(needle.len()).any(|window| window == needle)
}

You can use str_to_match.as_bytes() to get its byte slice.

186–189 ↗(On Diff #48880)
303–307 ↗(On Diff #48880)

I think this is one of the rare cases where having a byte literal + lots of escapes is the cleanest approach. What do you think? This way it's immediately clear what the parsed output should be. Similar for the other coinbase strings.

web/explorer/explorer-server/templates/pages/block.html
112 ↗(On Diff #48880)

I just realized we just hard coded "Unknown" here since forever, incredible :D

I recall writing the explorer in a shitty apartment in Garapan, Saipan, hope that explains it

This revision now requires changes to proceed.Jul 29 2024, 11:46
web/explorer/explorer-server/src/templating/filters.rs
129 ↗(On Diff #48871)

There's regex::bytes::Regex for exactly this, regex::Regex indeed wouldn't work.

bytesofman marked 4 inline comments as done.

minimize regex, add tests for new fn

web/explorer/explorer-server/src/templating/filters.rs
126 ↗(On Diff #48880)

This does make sense but I am not sure what we need to do to bump the rust version here. Sounds like that should be its own diff? Could happen before this lands?

Would we need to bump rust for the whole monorepo or just the explorer?

After implementing the byte substring finding described below -- I moved these regex methods to under those methods. So we are only making regex if the miner is not one of the exact matches.

303–307 ↗(On Diff #48880)

The byte literal + lots of escapes approach allows more specific and legible unit testing. However the function will only ever be used on real coinbase data, so I think it's appropriate to use these in the mocks ... and I'm not sure how to easily construct these as byte literal + lots of escapes (at least, not easier than copy pasting the coinbase for the block we want to know we can parse).

web/explorer/explorer-server/src/templating/filters.rs
126 ↗(On Diff #48880)

A bump to 1.79 was attempted recently (D16361) and was not successful (it was breaking the windows build).
I think this should be this own diff if needed. I would advise against doing it in that diff or make that diff depend on the new feature because it can be more difficult that you think to change the version.

web/explorer/explorer-server/src/templating/filters.rs
303–307 ↗(On Diff #48880)

I'm not sure what you mean—this just changes how the code has the test data written; my suggestion is that instead of having it in hex to have it as byte string literal instead. The way I converted them was via python:

>>>bytes.fromhex('0378120c182f5669614254432f4d696e6564206279203236303738362f103b6fa20ff3648a69acc31ed9b4946c00')

b'\x03x\x12\x0c\x18/ViaBTC/Mined by 260786/\x10;o\xa2\x0f\xf3d\x8ai\xac\xc3\x1e\xd9\xb4\x94l\x00'
tobias_ruck added inline comments.
web/explorer/explorer-server/src/templating/filters.rs
298–299 ↗(On Diff #48889)

This works, but IMO best practice (and what I use in Chronik) is to use normal import, it makes it clearer where stuff is coming from (since the super module might import stuff we don't want to have here).

321–325 ↗(On Diff #48889)

I still recommend the byte slice literal for the coinbase data, you can use python to get it:

>>>bytes.fromhex('0378120c182f5669614254432f4d696e6564206279203236303738362f103b6fa20ff3648a69acc31ed9b4946c00')

b'\x03x\x12\x0c\x18/ViaBTC/Mined by 260786/\x10;o\xa2\x0f\xf3d\x8ai\xac\xc3\x1e\xd9\xb4\x94l\x00'
This revision now requires changes to proceed.Jul 30 2024, 10:27
bytesofman added inline comments.
web/explorer/explorer-server/src/templating/filters.rs
298–299 ↗(On Diff #48889)

ok cool. I don't have any insight or preference to what is here, just what I happened to find first that led to passing tests.

bytesofman marked an inline comment as done.

remove super import, use bytes for mocks in render miner tests

This revision is now accepted and ready to land.Jul 30 2024, 11:58