Page MenuHomePhabricator

Added OP_REVERSEBYTES+implementation, added (always disabled) activation flag
Needs ReviewPublic

Authored by tobias_ruck on Wed, Jan 8, 19:19.

Details

Reviewers
deadalnix
markblundeberg
Fabien
Group Reviewers
Restricted Owners Package(Owns No Changed Paths)
Restricted Project
Summary

Implementation of OP_REVERSEBYTES is very straightforward. T707.

Previously named OP_REVERSE, then OP_BSWAP, then OP_ENDIAN_REVERSE.

Test Plan

Diff Detail

Repository
rABC Bitcoin ABC
Branch
OP_REVERSEBYTES
Lint
Lint OK
Unit
No Unit Test Coverage
Build Status
Buildable 9096
Build 16154: Bitcoin ABC Buildbot
Build 16153: arc lint + arc unit

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

Coordinated SCRIPT_ENABLE_OP_BSWAP script flag bit with @markblundeberg

It looks like this depends on D4859 so make sure to note that in the summary as "Depends on D4859" -- phabricator should add a child-parent relationship.

markblundeberg requested changes to this revision.Thu, Jan 9, 00:43
markblundeberg added inline comments.
src/test/data/script_tests.json
257 ↗(On Diff #15251)

It makes sense to adjust these 0xbc test banks but you do still need: at least one test of BAD_OPCODE error when flag is off, and at least one test of unexecuted BSWAP when the flag is off, just to prove it is only invalid when executed.

964 ↗(On Diff #15251)

Hmm this test seems messed up -- do you want to test BAD_OPCODE or do you want to test empty stack?

This revision now requires changes to proceed.Thu, Jan 9, 00:43

It looks like this depends on D4859 so make sure to note that in the summary as "Depends on D4859" -- phabricator should add a child-parent relationship.

Or just merge both into one patch.

tobias_ruck updated this revision to Diff 15278.EditedThu, Jan 9, 16:41

Renamed to OP_ENDIAN_REVERSE, reasons can be found in the PR:

In a previous proposal, this opcode has been named OP_REVERSE. After that, it has been renamed to OP_BSWAP, as that is a more technically accurate term, which is commonly used for reversing the byteorder of integers [14] [15]. However, after some more consideration, it has been renamed to OP_ENDIAN_REVERSE following Boost‘s nomenclature [16]. This is because OP_BSWAP is lexically very similar to the already existing OP_SWAP and would make Script harder to read. Also, while the technical term for the instruction is indeed bswap, it isn‘t well known for developers of higher level languages and could thus spark confusion that would be avoided by using the name OP_ENDIAN_REVERSE, which is more self-descriptive.

Also added tests for unexecuted branches, clarified tests for not yet activated case.

Merged D4859 into this diff.

tobias_ruck retitled this revision from Implemented OP_BSWAP logic, added (always disabled) activation flag, added tests for OP_BSWAP. to Added OP_ENDIAN_REVERSE+implementation, added (always disabled) activation flag, added tests for OP_ENDIAN_REVERSE..Thu, Jan 9, 16:45
tobias_ruck edited the summary of this revision. (Show Details)
tobias_ruck edited the test plan for this revision. (Show Details)
tobias_ruck edited the summary of this revision. (Show Details)
jasonbcox added inline comments.
src/test/op_endian_reverse_tests.cpp
109 ↗(On Diff #15278)

This can be collapsed down to valtype reversed_data(data.rbegin(), data.rend());

tobias_ruck updated this revision to Diff 15288.Fri, Jan 10, 01:01

Collapsed reversed_data generation code.

deadalnix requested changes to this revision.Sat, Jan 11, 17:26
deadalnix added inline comments.
src/test/data/script_tests.json
964 ↗(On Diff #15288)

You are missing the empty stack case.

src/test/op_endian_reverse_tests.cpp
103 ↗(On Diff #15288)

You can use emplace back.

114 ↗(On Diff #15288)

Instead of having all this setup, you'd be better off having some template that takes 2 argument, a stack element and its reverse and check inversion and double inversion on these. Then you can essentially express that part of the test as test vectors. You can have generated test vectors as well.

This revision now requires changes to proceed.Sat, Jan 11, 17:26

Renamed to OP_ENDIAN_REVERSE, reasons can be found in the PR:
In a previous proposal, this opcode has been named OP_REVERSE. After that, it has been renamed to OP_BSWAP, as that is a more technically accurate term, which is commonly used for reversing the byteorder of integers [14] [15]. However, after some more consideration, it has been renamed to OP_ENDIAN_REVERSE following Boost‘s nomenclature [16]. This is because OP_BSWAP is lexically very similar to the already existing OP_SWAP and would make Script harder to read. Also, while the technical term for the instruction is indeed bswap, it isn‘t well known for developers of higher level languages and could thus spark confusion that would be avoided by using the name OP_ENDIAN_REVERSE, which is more self-descriptive.
Also added tests for unexecuted branches, clarified tests for not yet activated case.
Merged D4859 into this diff.

I'm not a fan of the new name... "Endian" to me seems hardware specific, and usually implies some power-of-2 number of bytes.... If you're worried about confusion, using the word "Endian" to me seems more confusing. It's possible people may use for things other than endian conversion, since it can reverse the byte order for any arbitrary-length string.

How about "OP_REVERSE_BYTES", or "OP_BYTE_SWAP"?

In any case, I guess this is a bit bike-sheddy, so not super-important either way.

Renamed to OP_ENDIAN_REVERSE, reasons can be found in the PR:
In a previous proposal, this opcode has been named OP_REVERSE. After that, it has been renamed to OP_BSWAP, as that is a more technically accurate term, which is commonly used for reversing the byteorder of integers [14] [15]. However, after some more consideration, it has been renamed to OP_ENDIAN_REVERSE following Boost‘s nomenclature [16]. This is because OP_BSWAP is lexically very similar to the already existing OP_SWAP and would make Script harder to read. Also, while the technical term for the instruction is indeed bswap, it isn‘t well known for developers of higher level languages and could thus spark confusion that would be avoided by using the name OP_ENDIAN_REVERSE, which is more self-descriptive.
Also added tests for unexecuted branches, clarified tests for not yet activated case.
Merged D4859 into this diff.

I'm not a fan of the new name... "Endian" to me seems hardware specific, and usually implies some power-of-2 number of bytes.... If you're worried about confusion, using the word "Endian" to me seems more confusing. It's possible people may use for things other than endian conversion, since it can reverse the byte order for any arbitrary-length string.
How about "OP_REVERSE_BYTES", or "OP_BYTE_SWAP"?
In any case, I guess this is a bit bike-sheddy, so not super-important either way.

https://github.com/bitcoincashorg/bitcoincash.org/pull/358

As BigBlockIfTrue pointed out, we should use a name that follows the same scheme existing opcodes use. I think OP_REVERSEBYTES might be the best choice here. Your argument about endianess makes sense

markblundeberg added a comment.EditedSun, Jan 12, 02:12

I'm not a fan of the new name... "Endian" to me seems hardware specific, and usually implies some power-of-2 number of bytes....

Yeah the same thing was bugging me, but I couldn't quite explain what it was. (Also remember that technically there are things besides big-endian and little-endian. ;-P )

OP_REVERSEBYTES does sound nice.

tobias_ruck added inline comments.Tue, Jan 14, 16:16
src/test/data/script_tests.json
964 ↗(On Diff #15288)

well spotted.

src/test/op_endian_reverse_tests.cpp
103 ↗(On Diff #15288)

Why? Should I use some variadic generics magic here?

tobias_ruck updated this revision to Diff 15459.Tue, Jan 14, 17:31

Renamed opcode to OP_REVERSEBYTES (which is the final name now), refactored & amended tests.

Snippet of first build failure:

[17:32:02] :			 [Update checkout directory (/home/teamcity/buildAgent/work/c4a5708f2bae7929)] /usr/bin/git config core.sparseCheckout true
[17:32:02] :			 [Update checkout directory (/home/teamcity/buildAgent/work/c4a5708f2bae7929)] /usr/bin/git config http.sslCAInfo
[17:32:02] :			 [Update checkout directory (/home/teamcity/buildAgent/work/c4a5708f2bae7929)] /usr/bin/git show-ref
[17:32:02] :			 [Update checkout directory (/home/teamcity/buildAgent/work/c4a5708f2bae7929)] /usr/bin/git show-ref refs/tags/phabricator/diff/15459
[17:32:02] :			 [Update checkout directory (/home/teamcity/buildAgent/work/c4a5708f2bae7929)] /usr/bin/git log -n1 --pretty=format:%H%x20%s 9dc27d55a40e7fe6acb5ff7b28bdb6d54423860b --
[17:32:02] :			 [Update checkout directory (/home/teamcity/buildAgent/work/c4a5708f2bae7929)] /usr/bin/git -c credential.helper= checkout -q -f phabricator/diff/15459
[17:32:02] :			 [Update checkout directory (/home/teamcity/buildAgent/work/c4a5708f2bae7929)] /usr/bin/git show-ref refs/tags/phabricator/diff/15459
[17:32:02] :			 [Update checkout directory (/home/teamcity/buildAgent/work/c4a5708f2bae7929)] Cleaning Bitcoin ABC Staging in /home/teamcity/buildAgent/work/c4a5708f2bae7929 the file set ALL_UNTRACKED
[17:32:02] :			 [Update checkout directory (/home/teamcity/buildAgent/work/c4a5708f2bae7929)] /usr/bin/git clean -f -d -x
[17:32:02] : Build preparation done
[17:32:02]E: Step 1/1: Command Line
[17:32:02] :	 [Step 1/1] Ant JUnit report watcher
[17:32:02] :		 [Ant JUnit report watcher] Watching paths:
[17:32:02] :		 [Ant JUnit report watcher] +:build/test_bitcoin.xml
[17:32:02] :		 [Ant JUnit report watcher] +:test/functional/junit_results.xml
[17:32:02] :		 [Ant JUnit report watcher] +:build/junit_results*.xml
[17:32:02] :	 [Step 1/1] Starting: /home/teamcity/buildAgent/temp/agentTmp/custom_script135283986362505923
[17:32:02] :	 [Step 1/1] in directory: /home/teamcity/buildAgent/work/c4a5708f2bae7929
[17:32:02] :	 [Step 1/1] Running build configuration 'build-diff'...
[17:32:02]W:	 [Step 1/1] + : build-diff
[17:32:02]W:	 [Step 1/1] + '[' -z build-diff ']'
[17:32:02]W:	 [Step 1/1] + echo 'Running build configuration '\''build-diff'\''...'
[17:32:02]W:	 [Step 1/1] ++ git rev-parse --show-toplevel
[17:32:02]W:	 [Step 1/1] + TOPLEVEL=/home/teamcity/buildAgent/work/c4a5708f2bae7929
[17:32:02]W:	 [Step 1/1] + export TOPLEVEL
[17:32:02]W:	 [Step 1/1] + trap print_sanitizers_log ERR
[17:32:02]W:	 [Step 1/1] +++ dirname ./contrib/teamcity/build-configurations.sh
[17:32:02]W:	 [Step 1/1] ++ cd ./contrib/teamcity
[17:32:02]W:	 [Step 1/1] ++ pwd
[17:32:02]W:	 [Step 1/1] + CI_SCRIPTS_DIR=/home/teamcity/buildAgent/work/c4a5708f2bae7929/contrib/teamcity
[17:32:02]W:	 [Step 1/1] + setup
[17:32:02]W:	 [Step 1/1] + : /home/teamcity/buildAgent/work/c4a5708f2bae7929/build
[17:32:02]W:	 [Step 1/1] + mkdir -p /home/teamcity/buildAgent/work/c4a5708f2bae7929/build/output
[17:32:02]W:	 [Step 1/1] ++ cd /home/teamcity/buildAgent/work/c4a5708f2bae7929/build
[17:32:02]W:	 [Step 1/1] ++ pwd
[17:32:02]W:	 [Step 1/1] + BUILD_DIR=/home/teamcity/buildAgent/work/c4a5708f2bae7929/build
[17:32:02]W:	 [Step 1/1] + export BUILD_DIR
[17:32:02]W:	 [Step 1/1] + TEST_RUNNER_FLAGS=--tmpdirprefix=output
[17:32:02]W:	 [Step 1/1] + cd /home/teamcity/buildAgent/work/c4a5708f2bae7929/build
[17:32:02]W:	 [Step 1/1] ++ nproc
[17:32:02]W:	 [Step 1/1] + THREADS=4
[17:32:02]W:	 [Step 1/1] + export THREADS
[17:32:02]W:	 [Step 1/1] + SAN_SUPP_DIR=/home/teamcity/buildAgent/work/c4a5708f2bae7929/test/sanitizer_suppressions
[17:32:02]W:	 [Step 1/1] + SAN_LOG_DIR=/tmp/sanitizer_logs
[17:32:02]W:	 [Step 1/1] + mkdir -p /tmp/sanitizer_logs
[17:32:02]W:	 [Step 1/1] + rm -rf '/tmp/sanitizer_logs/*'
[17:32:02]W:	 [Step 1/1] + export ASAN_OPTIONS=malloc_context_size=0:log_path=/tmp/sanitizer_logs/asan.log
[17:32:02]W:	 [Step 1/1] + ASAN_OPTIONS=malloc_context_size=0:log_path=/tmp/sanitizer_logs/asan.log
[17:32:02]W:	 [Step 1/1] + export LSAN_OPTIONS=suppressions=/home/teamcity/buildAgent/work/c4a5708f2bae7929/test/sanitizer_suppressions/lsan:log_path=/tmp/sanitizer_logs/lsan.log
[17:32:02]W:	 [Step 1/1] + LSAN_OPTIONS=suppressions=/home/teamcity/buildAgent/work/c4a5708f2bae7929/test/sanitizer_suppressions/lsan:log_path=/tmp/sanitizer_logs/lsan.log
[17:32:02]W:	 [Step 1/1] + export TSAN_OPTIONS=suppressions=/home/teamcity/buildAgent/work/c4a5708f2bae7929/test/sanitizer_suppressions/tsan:log_path=/tmp/sanitizer_logs/tsan.log
[17:32:02]W:	 [Step 1/1] + TSAN_OPTIONS=suppressions=/home/teamcity/buildAgent/work/c4a5708f2bae7929/test/sanitizer_suppressions/tsan:log_path=/tmp/sanitizer_logs/tsan.log
[17:32:02]W:	 [Step 1/1] + export UBSAN_OPTIONS=suppressions=/home/teamcity/buildAgent/work/c4a5708f2bae7929/test/sanitizer_suppressions/ubsan:print_stacktrace=1:halt_on_error=1:log_path=/tmp/sanitizer_logs/ubsan.log
[17:32:02]W:	 [Step 1/1] + UBSAN_OPTIONS=suppressions=/home/teamcity/buildAgent/work/c4a5708f2bae7929/test/sanitizer_suppressions/ubsan:print_stacktrace=1:halt_on_error=1:log_path=/tmp/sanitizer_logs/ubsan.log
[17:32:02]W:	 [Step 1/1] + case "$ABC_BUILD_NAME" in
[17:32:02] :	 [Step 1/1] Error: Invalid build name 'build-diff'
[17:32:02]W:	 [Step 1/1] + echo 'Error: Invalid build name '\''build-diff'\'''
[17:32:02]W:	 [Step 1/1] + exit 2
[17:32:02]W:	 [Step 1/1] Process exited with code 2
[17:32:02]E:	 [Step 1/1] Process exited with code 2 (Step: Command Line)
tobias_ruck added inline comments.Tue, Jan 14, 17:33
src/test/op_reversebytes_tests.cpp
112 ↗(On Diff #15459)

Can someone confirm that this is not UB? When will data be freed?

Build Bitcoin-ABC / Diffs / Diff Testing failed. Snippet of first build failure: ` [17:32:02]W: [Step 1/1] + case "$ABC_BUILD_NAME" in [17:32:02] : [Step 1/1] Error: Invalid build name 'build-diff' [17:32:02]W: [Step 1/1] + echo 'Error: Invalid build name '\''build-diff'\''' [17:32:02]W: [Step 1/1] + exit 2 [17:32:02]W: [Step 1/1] Process exited with code 2 [17:32:02]E: [Step 1/1] Process exited with code 2 (Step: Command Line) `

This seems like a configuration error. Does someone know what‘s going on there?

Build Bitcoin-ABC / Diffs / Diff Testing failed. Snippet of first build failure: ` [17:32:02]W: [Step 1/1] + case "$ABC_BUILD_NAME" in [17:32:02] : [Step 1/1] Error: Invalid build name 'build-diff' [17:32:02]W: [Step 1/1] + echo 'Error: Invalid build name '\''build-diff'\''' [17:32:02]W: [Step 1/1] + exit 2 [17:32:02]W: [Step 1/1] Process exited with code 2 [17:32:02]E: [Step 1/1] Process exited with code 2 (Step: Command Line) `

This seems like a configuration error. Does someone know what‘s going on there?

Hey, I think @jasonbcox tweaked something in the CI, so pull the latest master and rebase your commit on that.

tobias_ruck updated this revision to Diff 15464.Tue, Jan 14, 20:39

Rebased onto the changes from master.

That fixed it. Thanks Mark!

Tiny quibble but otherwise good.

src/test/data/script_tests.json
1278 ↗(On Diff #15464)

Hmm, actually the prior versions of this test and the one below still pass just fine, right? Maybe we can just leave them unmodified for now, that is the best proof of not changing behaviour. I like that these ones explicitly mention 0xbc.

tobias_ruck updated this revision to Diff 15658.Sat, Jan 18, 19:07

Fixed the, and I quote, quibble, which has been pointed out by @markblundeberg

deadalnix requested changes to this revision.Sun, Jan 19, 17:31

A few minor improvement, but it's getting there. You should consider starting to work on activation logic and tests.

src/script/interpreter.cpp
1250 ↗(On Diff #15464)

I'd remove that empty line, but it's not that important.

src/test/data/script_tests.json
1278 ↗(On Diff #15464)

This would benefit from only updating the comment to reflect the new state of the code, without changing the tests, indeed.

src/test/op_reversebytes_tests.cpp
88 ↗(On Diff #15464)

likestamp

101 ↗(On Diff #15464)

likestamp

104 ↗(On Diff #15464)

You probably want to use size_t here

109 ↗(On Diff #15464)

emplace_back and you won't need to cast

112 ↗(On Diff #15464)

Note that you are making a copy of reversed_data here and then immediately destroying it. This is not a huge deal, but not doing so is actually less code, so I'm wondering what's up.

test_cases.push_back({data, {data.rbegin(), data.rend()}});
116 ↗(On Diff #15464)

Why not use the same outer loop as the previous one?

121 ↗(On Diff #15464)

dito

136 ↗(On Diff #15464)
for(const ReverseTestCase&test_case : test_cases) { ... }
143 ↗(On Diff #15464)

You probably want to run this guy over a few palindromes to check edge cases. 0, odd, even and maximum size come to mind, and do it in the flag loop.

This revision now requires changes to proceed.Sun, Jan 19, 17:31

@tobias_ruck Could you change the name to OP_REVERSEBYTES in the Diff title?

(Also, it's not really necessary to mention tests in the title..... it should be a given that every Diff contains the appropriate tests)

tobias_ruck updated this revision to Diff 15714.Tue, Jan 21, 14:42
tobias_ruck marked an inline comment as done.

Removed empty line in interpreter.cpp. Fused loops in op_reversebytes_test.cpp and other refactorings, expanded palindromes test vector.

tobias_ruck retitled this revision from Added OP_ENDIAN_REVERSE+implementation, added (always disabled) activation flag, added tests for OP_ENDIAN_REVERSE. to Added OP_REVERSEBYTES+implementation, added (always disabled) activation flag.Tue, Jan 21, 14:47
tobias_ruck edited the summary of this revision. (Show Details)
tobias_ruck edited the test plan for this revision. (Show Details)
tobias_ruck updated this revision to Diff 15716.Tue, Jan 21, 14:53

Updated comments

tobias_ruck added inline comments.Tue, Jan 21, 15:19
src/test/op_reversebytes_tests.cpp
112 ↗(On Diff #15464)

what‘s going on it that I‘m used to Rust‘s move semantics ;)
thanks for clarifying that

143 ↗(On Diff #15464)

Go Hang a Salami. I'm a Lasagna Hog!