Yesterday I modded btcd to use a SIMD SHA256 hash function, raised its parallelism to 3 goroutines per CPU thread, and set it to run with the garbage collector at 300% of its normal heap allowance.
so, in short, it lets the heap use a lot more memory between garbage collections and it runs triple the normal number of threads (normal being the number of CPU threads, which is double the number of cores, so that's 4 cores on my lenovo ideapad3 with ryzen 5, meaning 8 threads and 24 parallel goroutines).
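for anyone who wants to try the two runtime settings, here is a minimal sketch in Go. it is not the actual btcd patch: the names workersPerThread, gcPercent and workerCount are made up for illustration, and only runtime.NumCPU and debug.SetGCPercent are real standard library calls.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
)

// workersPerThread is a hypothetical name for the multiplier described
// above: three goroutines for every CPU thread the runtime reports.
const workersPerThread = 3

// gcPercent mirrors GOGC=300. The default is 100, where a collection is
// triggered once the newly allocated data reaches about 100% of the live
// heap; at 300 the heap may grow to roughly four times the live data first.
const gcPercent = 300

// workerCount returns how many worker goroutines to start for a machine
// with the given number of logical CPUs (hardware threads).
func workerCount(logicalCPUs int) int {
	return logicalCPUs * workersPerThread
}

func main() {
	// SetGCPercent returns the previous setting so it can be restored.
	prev := debug.SetGCPercent(gcPercent)

	cpus := runtime.NumCPU()
	fmt.Printf("GC percent %d -> %d, %d workers on %d logical CPUs\n",
		prev, gcPercent, workerCount(cpus), cpus)
}
```

the GC part can also be done without touching code by setting the GOGC=300 environment variable before starting the process.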
the first of these is especially good, but all three together show up as a performance boost in initial block download, where validating the transactions and putting them into the database is way faster:
2023-10-25 07:48:38.126 [INF] SYNC: Processed 11 blocks in the last 10.66s (47083 transactions, height 797983, 2023-07-09 15:53:47 -0100 -01)
2023-10-25 07:48:49.143 [INF] SYNC: Processed 7 blocks in the last 11.01s (20232 transactions, height 797990, 2023-07-09 17:01:00 -0100 -01)
2023-10-25 07:48:59.190 [INF] SYNC: Processed 8 blocks in the last 10.04s (27234 transactions, height 797998, 2023-07-09 18:01:45 -0100 -01)
47k transactions in 10.66 seconds works out to about 4400 tx/s, which is nearly 2.5x the vanilla configuration with the thread limit at the standard setting and GC set to 100.
clearly the SIMD SHA256 is helping as well.
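the reason the hash function matters so much: bitcoin hashes transactions and block headers with SHA256 applied twice, so validation calls it constantly. a minimal sketch of that double hash, using the standard library crypto/sha256 so it runs anywhere. the function names are hypothetical, and which SIMD library I swapped in is not shown here. as an assumption, any package exposing the same Sum256(data []byte) [32]byte function (github.com/minio/sha256-simd is one such package) could replace the import.

```go
package main

import (
	"crypto/sha256" // assumption: a SIMD package with the same Sum256 could be imported here instead
	"encoding/hex"
	"fmt"
)

// doubleSHA256 applies SHA256 twice, the way bitcoin hashes transactions
// and block headers.
func doubleSHA256(data []byte) [32]byte {
	first := sha256.Sum256(data)
	return sha256.Sum256(first[:])
}

// doubleSHA256Hex is a small helper that returns the digest as lowercase hex
// in the byte order produced by the hash function.
func doubleSHA256Hex(s string) string {
	sum := doubleSHA256([]byte(s))
	return hex.EncodeToString(sum[:])
}

func main() {
	fmt.Println(doubleSHA256Hex("hello"))
}
```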
some more, coming through now:
2023-10-25 07:54:15.237 [INF] SYNC: Processed 10 blocks in the last 10.04s (39948 transactions, height 798219, 2023-07-11 04:35:03 -0100 -01)
2023-10-25 07:54:26.682 [INF] SYNC: Processed 9 blocks in the last 11.44s (37835 transactions, height 798228, 2023-07-11 05:38:01 -0100 -01)
2023-10-25 07:54:36.739 [INF] SYNC: Processed 6 blocks in the last 10.05s (21489 transactions, height 798234, 2023-07-11 06:21:06 -0100 -01)
2023-10-25 07:54:46.868 [INF] SYNC: Processed 7 blocks in the last 10.12s (20932 transactions, height 798241, 2023-07-11 07:22:57 -0100 -01)
as you can see, it's staying above 2000 tx/s throughput and getting close to 4000 tx/s at the top end.
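to check the throughput numbers yourself, here is a small sketch that pulls the transaction count and interval out of SYNC lines like the ones above and divides them. the regular expression and the txPerSecond name are my own for this example, not anything from btcd.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// syncLine captures the interval in seconds and the transaction count
// from a "Processed N blocks in the last Xs (T transactions, ..." line.
var syncLine = regexp.MustCompile(`in the last ([0-9.]+)s \((\d+) transactions`)

// txPerSecond returns transactions per second for one log line, or
// false if the line does not look like a SYNC progress line.
func txPerSecond(line string) (float64, bool) {
	m := syncLine.FindStringSubmatch(line)
	if m == nil {
		return 0, false
	}
	secs, err := strconv.ParseFloat(m[1], 64)
	if err != nil || secs <= 0 {
		return 0, false
	}
	txs, err := strconv.Atoi(m[2])
	if err != nil {
		return 0, false
	}
	return float64(txs) / secs, true
}

func main() {
	lines := []string{
		"2023-10-25 07:48:38.126 [INF] SYNC: Processed 11 blocks in the last 10.66s (47083 transactions, height 797983, 2023-07-09 15:53:47 -0100 -01)",
		"2023-10-25 07:54:15.237 [INF] SYNC: Processed 10 blocks in the last 10.04s (39948 transactions, height 798219, 2023-07-11 04:35:03 -0100 -01)",
		"2023-10-25 07:54:46.868 [INF] SYNC: Processed 7 blocks in the last 10.12s (20932 transactions, height 798241, 2023-07-11 07:22:57 -0100 -01)",
	}
	for _, l := range lines {
		if rate, ok := txPerSecond(l); ok {
			fmt.Printf("%.0f tx/s\n", rate)
		}
	}
}
```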
this is nearly a doubling of performance. 100% worth it and only took about 60 lines of code to change it this way.
i also proposed adding part of this (not the SIMD SHA256 library) to btcd over a year ago.
nobody cares, apparently.