Page MenuHomePhabricator

Added OP_REVERSEBYTES+implementation, added (always disabled) activation flag
ClosedPublic

Authored by tobias_ruck on Jan 8 2020, 19:19.

Details

Summary

Implementation of OP_REVERSEBYTES is very straightforward. T707.

Previously named OP_REVERSE, then OP_BSWAP, then OP_ENDIAN_REVERSE.

Test Plan

Diff Detail

Repository
rABC Bitcoin ABC
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
deadalnix requested changes to this revision.Jan 11 2020, 17:26
deadalnix added inline comments.
src/test/data/script_tests.json
964 ↗(On Diff #15288)

You are missing the empty stack case.

src/test/op_endian_reverse_tests.cpp
103 ↗(On Diff #15288)

You can use emplace back.

114 ↗(On Diff #15288)

Instead of having all this setup, you'd be better off having some template that takes 2 argument, a stack element and its reverse and check inversion and double inversion on these. Then you can essentially express that part of the test as test vectors. You can have generated test vectors as well.

This revision now requires changes to proceed.Jan 11 2020, 17:26

Renamed to OP_ENDIAN_REVERSE, reasons can be found in the PR:

In a previous proposal, this opcode has been named OP_REVERSE. After that, it has been renamed to OP_BSWAP, as that is a more technically accurate term, which is commonly used for reversing the byteorder of integers [14] [15]. However, after some more consideration, it has been renamed to OP_ENDIAN_REVERSE following Boost‘s nomenclature [16]. This is because OP_BSWAP is lexically very similar to the already existing OP_SWAP and would make Script harder to read. Also, while the technical term for the instruction is indeed bswap, it isn‘t well known for developers of higher level languages and could thus spark confusion that would be avoided by using the name OP_ENDIAN_REVERSE, which is more self-descriptive.

Also added tests for unexecuted branches, clarified tests for not yet activated case.

Merged D4859 into this diff.

I'm not a fan of the new name... "Endian" to me seems hardware specific, and usually implies some power-of-2 number of bytes.... If you're worried about confusion, using the word "Endian" to me seems more confusing. It's possible people may use for things other than endian conversion, since it can reverse the byte order for any arbitrary-length string.

How about "OP_REVERSE_BYTES", or "OP_BYTE_SWAP"?

In any case, I guess this is a bit bike-sheddy, so not super-important either way.

Renamed to OP_ENDIAN_REVERSE, reasons can be found in the PR:

In a previous proposal, this opcode has been named OP_REVERSE. After that, it has been renamed to OP_BSWAP, as that is a more technically accurate term, which is commonly used for reversing the byteorder of integers [14] [15]. However, after some more consideration, it has been renamed to OP_ENDIAN_REVERSE following Boost‘s nomenclature [16]. This is because OP_BSWAP is lexically very similar to the already existing OP_SWAP and would make Script harder to read. Also, while the technical term for the instruction is indeed bswap, it isn‘t well known for developers of higher level languages and could thus spark confusion that would be avoided by using the name OP_ENDIAN_REVERSE, which is more self-descriptive.

Also added tests for unexecuted branches, clarified tests for not yet activated case.

Merged D4859 into this diff.

I'm not a fan of the new name... "Endian" to me seems hardware specific, and usually implies some power-of-2 number of bytes.... If you're worried about confusion, using the word "Endian" to me seems more confusing. It's possible people may use for things other than endian conversion, since it can reverse the byte order for any arbitrary-length string.

How about "OP_REVERSE_BYTES", or "OP_BYTE_SWAP"?

In any case, I guess this is a bit bike-sheddy, so not super-important either way.

https://github.com/bitcoincashorg/bitcoincash.org/pull/358

As BigBlockIfTrue pointed out, we should use a name that follows the same scheme existing opcodes use. I think OP_REVERSEBYTES might be the best choice here. Your argument about endianess makes sense

markblundeberg added a comment.EditedJan 12 2020, 02:12

I'm not a fan of the new name... "Endian" to me seems hardware specific, and usually implies some power-of-2 number of bytes....

Yeah the same thing was bugging me, but I couldn't quite explain what it was. (Also remember that technically there are things besides big-endian and little-endian. ;-P )

OP_REVERSEBYTES does sound nice.

tobias_ruck added inline comments.Jan 14 2020, 16:16
src/test/data/script_tests.json
964 ↗(On Diff #15288)

well spotted.

src/test/op_endian_reverse_tests.cpp
103 ↗(On Diff #15288)

Why? Should I use some variadic generics magic here?

tobias_ruck updated this revision to Diff 15459.Jan 14 2020, 17:31

Renamed opcode to OP_REVERSEBYTES (which is the final name now), refactored & amended tests.

Snippet of first build failure:

[17:32:02] :			 [Update checkout directory (/home/teamcity/buildAgent/work/c4a5708f2bae7929)] /usr/bin/git config core.sparseCheckout true
[17:32:02] :			 [Update checkout directory (/home/teamcity/buildAgent/work/c4a5708f2bae7929)] /usr/bin/git config http.sslCAInfo
[17:32:02] :			 [Update checkout directory (/home/teamcity/buildAgent/work/c4a5708f2bae7929)] /usr/bin/git show-ref
[17:32:02] :			 [Update checkout directory (/home/teamcity/buildAgent/work/c4a5708f2bae7929)] /usr/bin/git show-ref refs/tags/phabricator/diff/15459
[17:32:02] :			 [Update checkout directory (/home/teamcity/buildAgent/work/c4a5708f2bae7929)] /usr/bin/git log -n1 --pretty=format:%H%x20%s 9dc27d55a40e7fe6acb5ff7b28bdb6d54423860b --
[17:32:02] :			 [Update checkout directory (/home/teamcity/buildAgent/work/c4a5708f2bae7929)] /usr/bin/git -c credential.helper= checkout -q -f phabricator/diff/15459
[17:32:02] :			 [Update checkout directory (/home/teamcity/buildAgent/work/c4a5708f2bae7929)] /usr/bin/git show-ref refs/tags/phabricator/diff/15459
[17:32:02] :			 [Update checkout directory (/home/teamcity/buildAgent/work/c4a5708f2bae7929)] Cleaning Bitcoin ABC Staging in /home/teamcity/buildAgent/work/c4a5708f2bae7929 the file set ALL_UNTRACKED
[17:32:02] :			 [Update checkout directory (/home/teamcity/buildAgent/work/c4a5708f2bae7929)] /usr/bin/git clean -f -d -x
[17:32:02] : Build preparation done
[17:32:02]E: Step 1/1: Command Line
[17:32:02] :	 [Step 1/1] Ant JUnit report watcher
[17:32:02] :		 [Ant JUnit report watcher] Watching paths:
[17:32:02] :		 [Ant JUnit report watcher] +:build/test_bitcoin.xml
[17:32:02] :		 [Ant JUnit report watcher] +:test/functional/junit_results.xml
[17:32:02] :		 [Ant JUnit report watcher] +:build/junit_results*.xml
[17:32:02] :	 [Step 1/1] Starting: /home/teamcity/buildAgent/temp/agentTmp/custom_script135283986362505923
[17:32:02] :	 [Step 1/1] in directory: /home/teamcity/buildAgent/work/c4a5708f2bae7929
[17:32:02] :	 [Step 1/1] Running build configuration 'build-diff'...
[17:32:02]W:	 [Step 1/1] + : build-diff
[17:32:02]W:	 [Step 1/1] + '[' -z build-diff ']'
[17:32:02]W:	 [Step 1/1] + echo 'Running build configuration '\''build-diff'\''...'
[17:32:02]W:	 [Step 1/1] ++ git rev-parse --show-toplevel
[17:32:02]W:	 [Step 1/1] + TOPLEVEL=/home/teamcity/buildAgent/work/c4a5708f2bae7929
[17:32:02]W:	 [Step 1/1] + export TOPLEVEL
[17:32:02]W:	 [Step 1/1] + trap print_sanitizers_log ERR
[17:32:02]W:	 [Step 1/1] +++ dirname ./contrib/teamcity/build-configurations.sh
[17:32:02]W:	 [Step 1/1] ++ cd ./contrib/teamcity
[17:32:02]W:	 [Step 1/1] ++ pwd
[17:32:02]W:	 [Step 1/1] + CI_SCRIPTS_DIR=/home/teamcity/buildAgent/work/c4a5708f2bae7929/contrib/teamcity
[17:32:02]W:	 [Step 1/1] + setup
[17:32:02]W:	 [Step 1/1] + : /home/teamcity/buildAgent/work/c4a5708f2bae7929/build
[17:32:02]W:	 [Step 1/1] + mkdir -p /home/teamcity/buildAgent/work/c4a5708f2bae7929/build/output
[17:32:02]W:	 [Step 1/1] ++ cd /home/teamcity/buildAgent/work/c4a5708f2bae7929/build
[17:32:02]W:	 [Step 1/1] ++ pwd
[17:32:02]W:	 [Step 1/1] + BUILD_DIR=/home/teamcity/buildAgent/work/c4a5708f2bae7929/build
[17:32:02]W:	 [Step 1/1] + export BUILD_DIR
[17:32:02]W:	 [Step 1/1] + TEST_RUNNER_FLAGS=--tmpdirprefix=output
[17:32:02]W:	 [Step 1/1] + cd /home/teamcity/buildAgent/work/c4a5708f2bae7929/build
[17:32:02]W:	 [Step 1/1] ++ nproc
[17:32:02]W:	 [Step 1/1] + THREADS=4
[17:32:02]W:	 [Step 1/1] + export THREADS
[17:32:02]W:	 [Step 1/1] + SAN_SUPP_DIR=/home/teamcity/buildAgent/work/c4a5708f2bae7929/test/sanitizer_suppressions
[17:32:02]W:	 [Step 1/1] + SAN_LOG_DIR=/tmp/sanitizer_logs
[17:32:02]W:	 [Step 1/1] + mkdir -p /tmp/sanitizer_logs
[17:32:02]W:	 [Step 1/1] + rm -rf '/tmp/sanitizer_logs/*'
[17:32:02]W:	 [Step 1/1] + export ASAN_OPTIONS=malloc_context_size=0:log_path=/tmp/sanitizer_logs/asan.log
[17:32:02]W:	 [Step 1/1] + ASAN_OPTIONS=malloc_context_size=0:log_path=/tmp/sanitizer_logs/asan.log
[17:32:02]W:	 [Step 1/1] + export LSAN_OPTIONS=suppressions=/home/teamcity/buildAgent/work/c4a5708f2bae7929/test/sanitizer_suppressions/lsan:log_path=/tmp/sanitizer_logs/lsan.log
[17:32:02]W:	 [Step 1/1] + LSAN_OPTIONS=suppressions=/home/teamcity/buildAgent/work/c4a5708f2bae7929/test/sanitizer_suppressions/lsan:log_path=/tmp/sanitizer_logs/lsan.log
[17:32:02]W:	 [Step 1/1] + export TSAN_OPTIONS=suppressions=/home/teamcity/buildAgent/work/c4a5708f2bae7929/test/sanitizer_suppressions/tsan:log_path=/tmp/sanitizer_logs/tsan.log
[17:32:02]W:	 [Step 1/1] + TSAN_OPTIONS=suppressions=/home/teamcity/buildAgent/work/c4a5708f2bae7929/test/sanitizer_suppressions/tsan:log_path=/tmp/sanitizer_logs/tsan.log
[17:32:02]W:	 [Step 1/1] + export UBSAN_OPTIONS=suppressions=/home/teamcity/buildAgent/work/c4a5708f2bae7929/test/sanitizer_suppressions/ubsan:print_stacktrace=1:halt_on_error=1:log_path=/tmp/sanitizer_logs/ubsan.log
[17:32:02]W:	 [Step 1/1] + UBSAN_OPTIONS=suppressions=/home/teamcity/buildAgent/work/c4a5708f2bae7929/test/sanitizer_suppressions/ubsan:print_stacktrace=1:halt_on_error=1:log_path=/tmp/sanitizer_logs/ubsan.log
[17:32:02]W:	 [Step 1/1] + case "$ABC_BUILD_NAME" in
[17:32:02] :	 [Step 1/1] Error: Invalid build name 'build-diff'
[17:32:02]W:	 [Step 1/1] + echo 'Error: Invalid build name '\''build-diff'\'''
[17:32:02]W:	 [Step 1/1] + exit 2
[17:32:02]W:	 [Step 1/1] Process exited with code 2
[17:32:02]E:	 [Step 1/1] Process exited with code 2 (Step: Command Line)
tobias_ruck added inline comments.Jan 14 2020, 17:33
src/test/op_reversebytes_tests.cpp
112 ↗(On Diff #15459)

Can someone confirm that this is not UB? When will data be freed?

Snippet of first build failure:

[17:32:02]W:	 [Step 1/1] + case "$ABC_BUILD_NAME" in
[17:32:02] :	 [Step 1/1] Error: Invalid build name 'build-diff'
[17:32:02]W:	 [Step 1/1] + echo 'Error: Invalid build name '\''build-diff'\'''
[17:32:02]W:	 [Step 1/1] + exit 2
[17:32:02]W:	 [Step 1/1] Process exited with code 2
[17:32:02]E:	 [Step 1/1] Process exited with code 2 (Step: Command Line)

This seems like a configuration error. Does someone know what‘s going on there?

Snippet of first build failure:

[17:32:02]W:	 [Step 1/1] + case "$ABC_BUILD_NAME" in
[17:32:02] :	 [Step 1/1] Error: Invalid build name 'build-diff'
[17:32:02]W:	 [Step 1/1] + echo 'Error: Invalid build name '\''build-diff'\'''
[17:32:02]W:	 [Step 1/1] + exit 2
[17:32:02]W:	 [Step 1/1] Process exited with code 2
[17:32:02]E:	 [Step 1/1] Process exited with code 2 (Step: Command Line)

This seems like a configuration error. Does someone know what‘s going on there?

Hey, I think @jasonbcox tweaked something in the CI, so pull the latest master and rebase your commit on that.

tobias_ruck updated this revision to Diff 15464.Jan 14 2020, 20:39

Rebased onto the changes from master.

That fixed it. Thanks Mark!

Tiny quibble but otherwise good.

src/test/data/script_tests.json
1278 ↗(On Diff #15464)

Hmm, actually the prior versions of this test and the one below still pass just fine, right? Maybe we can just leave them unmodified for now, that is the best proof of not changing behaviour. I like that these ones explicitly mention 0xbc.

tobias_ruck updated this revision to Diff 15658.Jan 18 2020, 19:07

Fixed the, and I quote, quibble, which has been pointed out by @markblundeberg

deadalnix requested changes to this revision.Jan 19 2020, 17:31

A few minor improvement, but it's getting there. You should consider starting to work on activation logic and tests.

src/script/interpreter.cpp
1250 ↗(On Diff #15464)

I'd remove that empty line, but it's not that important.

src/test/data/script_tests.json
1278 ↗(On Diff #15464)

This would benefit from only updating the comment to reflect the new state of the code, without changing the tests, indeed.

src/test/op_reversebytes_tests.cpp
88 ↗(On Diff #15464)

likestamp

101 ↗(On Diff #15464)

likestamp

104 ↗(On Diff #15464)

You probably want to use size_t here

109 ↗(On Diff #15464)

emplace_back and you won't need to cast

112 ↗(On Diff #15464)

Note that you are making a copy of reversed_data here and then immediately destroying it. This is not a huge deal, but not doing so is actually less code, so I'm wondering what's up.

test_cases.push_back({data, {data.rbegin(), data.rend()}});
116 ↗(On Diff #15464)

Why not use the same outer loop as the previous one?

121 ↗(On Diff #15464)

dito

136 ↗(On Diff #15464)
for(const ReverseTestCase&test_case : test_cases) { ... }
143 ↗(On Diff #15464)

You probably want to run this guy over a few palindromes to check edge cases. 0, odd, even and maximum size come to mind, and do it in the flag loop.

This revision now requires changes to proceed.Jan 19 2020, 17:31

@tobias_ruck Could you change the name to OP_REVERSEBYTES in the Diff title?

(Also, it's not really necessary to mention tests in the title..... it should be a given that every Diff contains the appropriate tests)

tobias_ruck updated this revision to Diff 15714.Jan 21 2020, 14:42
tobias_ruck marked an inline comment as done.

Removed empty line in interpreter.cpp. Fused loops in op_reversebytes_test.cpp and other refactorings, expanded palindromes test vector.

tobias_ruck retitled this revision from Added OP_ENDIAN_REVERSE+implementation, added (always disabled) activation flag, added tests for OP_ENDIAN_REVERSE. to Added OP_REVERSEBYTES+implementation, added (always disabled) activation flag.Jan 21 2020, 14:47
tobias_ruck edited the summary of this revision. (Show Details)
tobias_ruck edited the test plan for this revision. (Show Details)
tobias_ruck updated this revision to Diff 15716.Jan 21 2020, 14:53

Updated comments

tobias_ruck added inline comments.Jan 21 2020, 15:19
src/test/op_reversebytes_tests.cpp
112 ↗(On Diff #15464)

what‘s going on it that I‘m used to Rust‘s move semantics ;)
thanks for clarifying that

143 ↗(On Diff #15464)

Go Hang a Salami. I'm a Lasagna Hog!

tobias_ruck updated this revision to Diff 15945.Feb 2 2020, 18:42

Rebased onto master

deadalnix requested changes to this revision.Feb 5 2020, 00:11

One small change, but I think this is pretty much good to go.

src/test/data/script_tests.json
1279 ↗(On Diff #15945)

You need to move the comment about FIRST_UNDEFINED_OP_VALUE here.

This revision now requires changes to proceed.Feb 5 2020, 00:11
tobias_ruck updated this revision to Diff 16029.Feb 5 2020, 14:16

Moved the comment about FIRST_UNDEFINED_OP_VALUE.

deadalnix accepted this revision.Feb 6 2020, 00:00
This revision is now accepted and ready to land.Feb 6 2020, 00:00
tobias_ruck edited the test plan for this revision. (Show Details)Feb 18 2020, 18:23