Specialized double-SHA256 for 64-byte inputs with SSE4.1 and AVX2
Summary:
- Benchmark Merkle root computation
- Refactor SHA256 code
- Specialized double-SHA256 for 64-byte inputs
- Use SHA256D64 in Merkle root computation
- 4-way SSE4.1 implementation for double SHA256 on 64-byte inputs
- 8-way AVX2 implementation for double SHA256 on 64-byte inputs
- [MOVEONLY] Move unused Merkle branch code to tests
- Enable double-SHA256-for-64-byte code on 32-bit x86
- For AVX2 code, also check for AVX, XSAVE, and OS support
This is a backport of Core PR13191, PR13393, and PR13471.
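For reviewers unfamiliar with the feature gating in the last summary item, here is a minimal sketch of the decision logic. It is not the code from this diff. The `CpuRegs` struct and all function names are hypothetical, and the register values are assumed to have been read elsewhere with CPUID and XGETBV. The bit positions are the documented x86 ones. The sketch assumes the "XSAVE" check in the summary corresponds to the OSXSAVE bit.

```cpp
#include <cstdint>

// Hypothetical holder for raw register values, read elsewhere via CPUID/XGETBV.
struct CpuRegs {
    uint32_t leaf1_ecx; // ECX returned by CPUID leaf 1
    uint32_t leaf7_ebx; // EBX returned by CPUID leaf 7, subleaf 0
    uint32_t xcr0_low;  // low 32 bits of XCR0 (XGETBV with ECX = 0)
};

// Test a single bit of a register value.
constexpr bool HasBit(uint32_t reg, unsigned bit)
{
    return ((reg >> bit) & 1u) != 0;
}

// SSE4.1 is reported in CPUID.1:ECX bit 19.
constexpr bool HaveSSE41(const CpuRegs& r)
{
    return HasBit(r.leaf1_ecx, 19);
}

// XCR0 may only be read when OSXSAVE (CPUID.1:ECX bit 27) is set.
// XCR0 bits 1 and 2 say the OS saves and restores XMM and YMM state.
constexpr bool OsEnabledAVX(const CpuRegs& r)
{
    return HasBit(r.leaf1_ecx, 27) && (r.xcr0_low & 0x6u) == 0x6u;
}

// AVX2 code is safe only if the CPU reports AVX (CPUID.1:ECX bit 28)
// and AVX2 (CPUID.7.0:EBX bit 5), and the OS has enabled YMM state.
constexpr bool CanUseAVX2(const CpuRegs& r)
{
    return HasBit(r.leaf1_ecx, 28) && OsEnabledAVX(r) && HasBit(r.leaf7_ebx, 5);
}
```

Checking the AVX2 bit alone is not enough: a CPU can advertise AVX2 while the OS has not enabled saving of the YMM registers, in which case executing AVX2 instructions would fault. Keeping the bit tests separate from the instructions that read the registers also lets the logic be tested on any platform.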
Test Plan:
make check
Reviewers: #bitcoin_abc, schancel
Reviewed By: #bitcoin_abc, schancel
Subscribers: teamcity
Differential Revision: https://reviews.bitcoinabc.org/D1846