Unroll the ChaCha20 inner loop for performance
Summary:
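This is a backport of core#24946. In the bench-bitcoin results below, throughput (byte/s) improves by roughly 1% to 4% depending on the benchmark.
Benchmark results before the change: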
| ns/byte | byte/s | err% | total | benchmark |
| --------------------: | --------------------: | --------: | ----------: | :---------- |
| 1.06 | 943,492,622.22 | 0.2% | 0.01 | CHACHA20_1MB |
| 1.07 | 931,649,398.02 | 0.1% | 0.01 | CHACHA20_256BYTES |
| 1.09 | 915,389,200.84 | 0.3% | 0.01 | CHACHA20_64BYTES |
| 2.84 | 352,270,814.64 | 0.2% | 0.03 | CHACHA20_POLY1305_AEAD_1MB_ENCRYPT_DECRYPT |
| 1.42 | 703,771,093.98 | 0.1% | 0.02 | CHACHA20_POLY1305_AEAD_1MB_ONLY_ENCRYPT |
| 3.80 | 262,822,693.77 | 1.0% | 0.01 | CHACHA20_POLY1305_AEAD_256BYTES_ENCRYPT_DECRYPT |
| 1.92 | 521,278,538.03 | 0.4% | 0.01 | CHACHA20_POLY1305_AEAD_256BYTES_ONLY_ENCRYPT |
| 6.61 | 151,253,004.43 | 0.2% | 0.01 | CHACHA20_POLY1305_AEAD_64BYTES_ENCRYPT_DECRYPT |
| 3.32 | 301,294,073.34 | 0.2% | 0.01 | CHACHA20_POLY1305_AEAD_64BYTES_ONLY_ENCRYPT |
Benchmark results after the change:
| ns/byte | byte/s | err% | total | benchmark |
| --------------------: | --------------------: | --------: | ----------: | :---------- |
| 1.02 | 979,048,846.19 | 0.0% | 0.01 | CHACHA20_1MB |
| 1.04 | 957,913,356.57 | 0.2% | 0.01 | CHACHA20_256BYTES |
| 1.08 | 923,578,665.64 | 0.3% | 0.01 | CHACHA20_64BYTES |
| 2.77 | 360,586,442.66 | 0.1% | 0.03 | CHACHA20_POLY1305_AEAD_1MB_ENCRYPT_DECRYPT |
| 1.39 | 721,889,319.70 | 0.1% | 0.02 | CHACHA20_POLY1305_AEAD_1MB_ONLY_ENCRYPT |
| 3.71 | 269,635,408.14 | 0.1% | 0.01 | CHACHA20_POLY1305_AEAD_256BYTES_ENCRYPT_DECRYPT |
| 1.86 | 537,671,468.95 | 0.2% | 0.01 | CHACHA20_POLY1305_AEAD_256BYTES_ONLY_ENCRYPT |
| 6.53 | 153,238,853.30 | 0.3% | 0.01 | CHACHA20_POLY1305_AEAD_64BYTES_ENCRYPT_DECRYPT |
| 3.27 | 305,392,217.41 | 0.1% | 0.01 | CHACHA20_POLY1305_AEAD_64BYTES_ONLY_ENCRYPT |
Test Plan: ninja all check-all bench-bitcoin
Reviewers: #bitcoin_abc, Fabien
Reviewed By: #bitcoin_abc, Fabien
Subscribers: Fabien
Differential Revision: https://reviews.bitcoinabc.org/D18815