Page MenuHomePhabricator

Add ECMH multiset module to libsecp256k1
ClosedPublic

Authored by tomtomtom7 on Feb 9 2018, 12:55.

Details

Reviewers
deadalnix
Group Reviewers
Restricted Owners Package(Owns No Changed Paths)
Restricted Project
Commits
rSTAGINGf4f00f4ed342: Add ECMH multiset module to libsecp256k1
rABCf4f00f4ed342: Add ECMH multiset module to libsecp256k1
Summary

This can be used for UTXO commitments.

Specification at
https://github.com/tomasvdw/bips/blob/master/ecmh.mediawiki

Test Plan

Tests included

Diff Detail

Repository
rABC Bitcoin ABC
Branch
arcpatch-D1072
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 1977
Build 2103: Bitcoin ABC Buildbot (legacy)
Build 2102: arc lint + arc unit

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
src/secp256k1/src/bench_multiset.c
48 ↗(On Diff #2825)

Copy paste mistake. Fixed

src/secp256k1/src/modules/multiset/main_impl.h
65 ↗(On Diff #2825)

Good point. That is much better. I have changed this.

101 ↗(On Diff #2825)

The normalizations here seem to be needed, as it crashes otherwise; note that these are field normalizations, not GE normalizations. The add leaves the fields less normalized.

123 ↗(On Diff #2825)

i combined the two methods.

170 ↗(On Diff #2825)

Changed to 0.

Left the infinite as per discussion above.

I am a bit uncertain about finalize. Is there actually a point in hashing?

We could also just return the X value. This means set A collides with "negative" set A but negative sets are only intermediates, so if you are finalizing them something is already terribly wrong.

src/secp256k1/src/modules/multiset/main_impl.h
47 ↗(On Diff #2825)

The problem is that you skew the proba chaining that way. The hash that result from a hash of a high value will be disproportionally reoresented. It is also harder to create collision via extension attack with double sha.

If you are worried about the bound, make the integer 64 bits.

41 ↗(On Diff #2839)

remove

Using suggested trial-and-hash algorithm with incrementing prefix

Please make sure the style match the rest of libsecp256k1. The comments and various other things do not match. They are also various weird empty lines.

Except that and the test that needs some improvement, I think we have a great piece of software here. I'd like for someone else to review this as it's very critical.

src/secp256k1/src/modules/multiset/README.md
68 ↗(On Diff #2854)

This is also the description of the concept of multiset hash itself, so should probably be presented as such. Even if they use a different curve, and so a different algorithm to hash onto a point on the curve, we don't really care about that part.

src/secp256k1/src/modules/multiset/tests_impl.h
237 ↗(On Diff #2854)

About the test, you still to add explanation about what are the scenarii tested, because it's not very clear. It also needs to have various tests involving results of specific function so someone can check a 3rd party implementation against this.

  • Better align comment/indent styling
  • Fix test-failing typo in new hash-to-EC-point algorithm
  • Clarify reference to PDF as source of ECMH algorithm
  • Add more test comments
  • Add duplicate element tests

TODO

  • Update spec
  • Add fixed-value testvectors

Add test vectors as included in the specification

One thing I forgot: cmake build. See CmakeLists.txt

Here is the relevant discussion on Bitcoin Core, just for reference.
https://github.com/bitcoin-core/secp256k1/pull/477

Add cmake switch for multiset module

Enable multiset module in cmake configuration

Fix in cmake file for multiset module

LGTM

src/secp256k1/src/modules/multiset/main_impl.h
25 ↗(On Diff #2864)

Is the autoformatter turned off on this file? This should be on the previous line.

172 ↗(On Diff #2935)

@matiu Suggested possibly encoding as a public key rather than hashing the points in a buffer.

ACK: Code review, testing, and a lot of playing...

My only suggestion is to consider using compressed pub key encoding for finalizing instead of hashing. Then the result will be 33bytes long (consisting of 0x02 (Y is even) or 0x03 (Y is odd), followed by 32 bytes for the X coordinate.). It depends whether we want the multiset to be able to continue to be built after finalization (ie: if you receive a multiset from the networt if you will be able to continue building from it)

deadalnix requested changes to this revision.Feb 22 2018, 18:06

This still needs to build with cmake.

src/secp256k1/src/bench_multiset.c
17 ↗(On Diff #2935)

why static ?

31 ↗(On Diff #2935)

braces

src/secp256k1/src/modules/multiset/README.md
81 ↗(On Diff #2935)

You should swap 3 and 4

This revision now requires changes to proceed.Feb 22 2018, 18:06
  • Fixed braces
  • Removed static from bench
  • Reordered references in readme

This still needs to build with cmake.

Curious yet not surprising.

My first introduction to CMake has resulted in a long and bloody war. I thought I had won, but the bastard keeps pulling new tricks from its sleeve.

I'll give it another try.

I think we should just return an all zero vector for the empty set instead of hashing zeros. In addition, providing test vector for ge_from_data_var is valuable, IMO as it'll help anyone who wish to implement.

src/secp256k1/src/modules/multiset/main_impl.h
50 ↗(On Diff #3007)

Please add test to ensure that this produce expected values. This is useful as reference for 3rd party implementation.

164 ↗(On Diff #3007)

Just return after this. There is no point hashing zeros. Plus we know the empty set is all zeros, which is convenient.

src/secp256k1/src/modules/multiset/tests_impl.h
274 ↗(On Diff #3007)

Is there a reason why all the vectors are the same length ?

tomtomtom7 marked an inline comment as done.

Return 0 instead of SHA(0) for the empty multiset

src/secp256k1/src/modules/multiset/main_impl.h
50 ↗(On Diff #3007)

This is currently done with the testvector m1,m2,m3 in the tests. They are tested after adding to empty (a noop) and hashing.

If we'd test them directly from this function we would be using Jacobian x,y,z coordinates, which are implementation specific; I would think it is better to leave the multiset variable as a black box as there is no reason other implementations would use the same structure.

The (x,y) coordinates and individual loop steps from this function are in the spec:
https://github.com/tomasvdw/bips/blob/master/ecmh.mediawiki

164 ↗(On Diff #3007)

Good point. I fixed this as suggested.

src/secp256k1/src/modules/multiset/tests_impl.h
274 ↗(On Diff #3007)

I am using "real" UTXOs, the coinbase transactions of blocks 1,2,3 as testvectors (at least, how I will propose we serialize them). This allows us to easily use these testvectors also from rpc tests.

I realize that this means they are the same length and some bytes are the same, but I don't think this is a limitation that outweighs the benefit. SHA hashing different input sizes is well tested already.

I don't think I have anything to add here. It looks good to me. I will leave some time for someone else to shime in before accepting, due to the fact this code is sensible, but I do not have anything else to add.

src/secp256k1/src/modules/multiset/tests_impl.h
309 ↗(On Diff #3032)

Except maybe checking the value of the empty set in there.

Add testvector test for empty set

Changed testvector 3 input and output

This way it correctly represents the serialization of UTXOs and
can be used in the RPC tests

This revision is now accepted and ready to land.Apr 8 2018, 14:05
This revision was automatically updated to reflect the committed changes.