ok, so, i removed all the logging bits from my encoder because i discovered they were a massive part of the benchmark timings (and of the runtime performance), and i was pleased to see this result:

cpu: AMD Ryzen 5 PRO 4650G with Radeon Graphics

BenchmarkBinaryEncoding

BenchmarkBinaryEncoding/event2.EventToBinary

BenchmarkBinaryEncoding/event2.EventToBinary-12 788 1467852 ns/op

BenchmarkBinaryEncoding/gob.Encode

BenchmarkBinaryEncoding/gob.Encode-12 94 13787646 ns/op

BenchmarkBinaryEncoding/binary.Marshal

BenchmarkBinaryEncoding/binary.Marshal-12 48 24622387 ns/op

BenchmarkBinaryDecoding

BenchmarkBinaryDecoding/event2.BinaryToEvent

BenchmarkBinaryDecoding/event2.BinaryToEvent-12 600 1957643 ns/op

BenchmarkBinaryDecoding/gob.Decode

BenchmarkBinaryDecoding/gob.Decode-12 27 43210743 ns/op

BenchmarkBinaryDecoding/binary.Unmarshal

BenchmarkBinaryDecoding/binary.Unmarshal-12 852 1379193 ns/op

i've slimmed it down to mine, the one from nostr:npub180cvv07tjdrrgpa0j7j7tmnyl2yr6yr7l8j4s3evf6u64th6gkwsyjh6w6 and the built-in gob encoder/decoder from the Go standard library

going by ns/op, my binary encoding is about 9x as fast as gob and nearly 17x as fast as binary.Marshal, i did not expect it to be that much faster, but ok, not sure if that's because i reused the decode write buffer
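on the buffer reuse question, here is a minimal sketch of how to measure it in isolation with just the standard library. the `event` struct and `appendEvent` are hypothetical stand-ins for illustration, not the real event2 encoder, so the numbers it prints say nothing about the results above; it only shows the shape of the comparison.

```go
package main

import (
	"fmt"
	"testing"
)

// event is a hypothetical stand-in for a nostr event, not the real type.
type event struct {
	ID      [32]byte
	Content string
}

// appendEvent appends a toy binary form of ev to dst and returns the
// extended slice, so the caller decides whether dst is fresh or reused.
func appendEvent(dst []byte, ev *event) []byte {
	dst = append(dst, ev.ID[:]...)
	return append(dst, ev.Content...)
}

// sink keeps the compiler from optimising the encoded result away.
var sink []byte

func main() {
	ev := &event{Content: "hello nostr"}

	// A new buffer is grown from nothing on every iteration.
	fresh := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink = appendEvent(nil, ev)
		}
	})

	// The same backing array is truncated and refilled every iteration.
	reused := testing.Benchmark(func(b *testing.B) {
		buf := make([]byte, 0, 1024)
		for i := 0; i < b.N; i++ {
			buf = appendEvent(buf[:0], ev)
			sink = buf
		}
	})

	fmt.Println("fresh: ", fresh.String(), fresh.MemString())
	fmt.Println("reused:", reused.String(), reused.MemString())
}
```

if the allocs/op column differs a lot between the two, buffer reuse is a real part of the gap, and the other encoders would need the same treatment for a like-for-like comparison.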

the decoder gets 600 iterations versus 852 for binary.Unmarshal (1957643 versus 1379193 ns/op), so it's about 1.4x slower, but not that much slower

here it is again, this time using 10,000 of those sample events. the iteration counts are much lower of course, but the numbers may be more accurate since a bigger sample smooths out statistical variation

cpu: AMD Ryzen 5 PRO 4650G with Radeon Graphics

BenchmarkBinaryEncoding

BenchmarkBinaryEncoding/event2.EventToBinary

BenchmarkBinaryEncoding/event2.EventToBinary-12 139 8155043 ns/op

BenchmarkBinaryEncoding/gob.Encode

BenchmarkBinaryEncoding/gob.Encode-12 19 62083345 ns/op

BenchmarkBinaryEncoding/binary.Marshal

BenchmarkBinaryEncoding/binary.Marshal-12 9 112449500 ns/op

BenchmarkBinaryDecoding

BenchmarkBinaryDecoding/event2.BinaryToEvent

BenchmarkBinaryDecoding/event2.BinaryToEvent-12 100 13966114 ns/op

BenchmarkBinaryDecoding/gob.Decode

BenchmarkBinaryDecoding/gob.Decode-12 5 223875643 ns/op

BenchmarkBinaryDecoding/binary.Unmarshal

BenchmarkBinaryDecoding/binary.Unmarshal-12 121 9243299 ns/op

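the encoding lead is a little less here, around 14x over binary.Marshal (about 7.6x over gob). the decoder, however, comes out only about 17% behind on iteration count in this test (100 versus 121), against about 30% behind in the first run (600 versus 852). going by ns/op it's 13966114 versus 9243299, so about 1.5x the time per op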
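for anyone who wants to check the ratios, a small sketch that works them out from the ns/op column of the 10,000 event run. the iteration count column is just how many times the benchmark loop ran, so ns/op is the safer number to divide. `speedup` is a helper made up for this example.

```go
package main

import "fmt"

// speedup reports how many times faster a is than b, given ns/op for each.
func speedup(aNsPerOp, bNsPerOp float64) float64 {
	return bNsPerOp / aNsPerOp
}

func main() {
	// ns/op figures copied from the 10,000 event run.
	const (
		eventToBinary   = 8155043.0
		gobEncode       = 62083345.0
		binaryMarshal   = 112449500.0
		binaryToEvent   = 13966114.0
		binaryUnmarshal = 9243299.0
	)

	fmt.Printf("EventToBinary vs gob.Encode:       %.1fx faster\n",
		speedup(eventToBinary, gobEncode))
	fmt.Printf("EventToBinary vs binary.Marshal:   %.1fx faster\n",
		speedup(eventToBinary, binaryMarshal))
	fmt.Printf("binary.Unmarshal vs BinaryToEvent: %.1fx faster\n",
		speedup(binaryUnmarshal, binaryToEvent))
}
```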

anyway, everyone knows boys like pissing contests, i'm no different

one thing this has revealed to me is that, for benchmarking at least, i need a way to disable the logging more completely. these numbers are with all logging overhead utterly removed, and the difference is amazing. it shows how expensive my logging is in runtime throughput
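one way to get logging that can be switched off almost for free is to check an atomic level first and only build the arguments when the message will actually be printed. this is a sketch under my own assumptions, not how the logging library here works; `Debugf`, `level` and the level constants are hypothetical names.

```go
package main

import (
	"fmt"
	"os"
	"sync/atomic"
)

// Levels for the hypothetical logger below.
const (
	Off int32 = iota
	Error
	Debug
)

// level is read on every log call, so it is an atomic value rather than
// something guarded by a mutex.
var level atomic.Int32

// Debugf prints only when debug logging is enabled. The arguments come
// from a closure, so a disabled logger never pays for building them.
func Debugf(format string, args func() []any) {
	if level.Load() < Debug {
		return
	}
	fmt.Fprintf(os.Stderr, format+"\n", args()...)
}

func main() {
	level.Store(Debug)
	Debugf("encoded %d bytes", func() []any { return []any{42} })

	// For example at the top of a benchmark: the closure is never called.
	level.Store(Off)
	Debugf("never formatted %v", func() []any { panic("not reached") })
}
```

with the level set to `Off`, each call costs one atomic load and a branch, which should be close enough to "utterly removed" for benchmark purposes without deleting the log lines from the code.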

so, it took all morning, but i changed one of those stupid string fields to bytes now, the ID

cpu: AMD Ryzen 5 PRO 4650G with Radeon Graphics

BenchmarkBinaryEncoding/event2.MarshalJSON-12 14 73695929 ns/op

BenchmarkBinaryEncoding/event2.EventToBinary-12 157 7280781 ns/op

BenchmarkBinaryEncoding/easyjson.Marshal-12 64 19049795 ns/op

BenchmarkBinaryEncoding/gob.Encode-12 18 62296062 ns/op

BenchmarkBinaryEncoding/binary.Marshal-12 10 110174020 ns/op

BenchmarkBinaryDecoding/event2.BinaryToEvent-12 100 13698489 ns/op

BenchmarkBinaryDecoding/easyjson.Unmarshal-12 56 19677698 ns/op

BenchmarkBinaryDecoding/gob.Decode-12 5 226144867 ns/op

BenchmarkBinaryDecoding/binary.Unmarshal-12 122 9011042 ns/op

BenchmarkBinaryDecoding/binary.UnmarshalBinary-12 277 4849297 ns/op

BenchmarkBinaryDecoding/easyjson.Unmarshal+sig-12 1 1624027475 ns/op

BenchmarkBinaryDecoding/binary.Unmarshal+sig-12 1 1596453002 ns/op

just one of the fields, and from 139 to 157 - makes sense because it's not decoding hex anymore

i'm surprised the decode didn't get faster, since it now literally only makes a new slice header (subslicing)

small result for a lot of work but the pubkey and signature are still to come

nostr:npub180cvv07tjdrrgpa0j7j7tmnyl2yr6yr7l8j4s3evf6u64th6gkwsyjh6w6 btw the UnmarshalBinary function you wrote is extremely fast, but the reason why its counterpart MarshalBinary isn't in there is because it can't deal with events over 64kb in size, otherwise i'd re-enable it
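a 64kb ceiling like that is what you typically get from a 2-byte length prefix. i'm assuming that is the cause here, i haven't confirmed it against that library's code. this sketch shows the limit and one way around it with a varint prefix; `putContent16` and `putContentVarint` are hypothetical names.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"math"
)

var errTooLong = errors.New("content does not fit a 16-bit length prefix")

// putContent16 writes a 2-byte big-endian length followed by the content.
// Anything longer than 65535 bytes cannot be represented.
func putContent16(content []byte) ([]byte, error) {
	if len(content) > math.MaxUint16 {
		return nil, errTooLong
	}
	out := make([]byte, 2, 2+len(content))
	binary.BigEndian.PutUint16(out, uint16(len(content)))
	return append(out, content...), nil
}

// putContentVarint uses a uvarint length prefix instead: one byte for
// short content, more bytes only when the length needs them.
func putContentVarint(content []byte) []byte {
	out := binary.AppendUvarint(nil, uint64(len(content)))
	return append(out, content...)
}

func main() {
	big := make([]byte, 70000)

	_, err := putContent16(big)
	fmt.Println("16-bit prefix:", err)

	enc := putContentVarint(big)
	fmt.Println("varint prefix:", len(enc)-len(big), "byte(s) of length")
}
```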

yes, UnmarshalBinary is not printing errors, but the speed of it is because it's actually skipping a shit-ton of things

i'm gonna just remove it

cpu: AMD Ryzen 5 PRO 4650G with Radeon Graphics

BenchmarkBinaryEncoding/event2.MarshalJSON-12 15 74245789 ns/op

BenchmarkBinaryEncoding/event2.EventToBinary-12 162 7429203 ns/op

BenchmarkBinaryEncoding/easyjson.Marshal-12 60 21221235 ns/op

BenchmarkBinaryEncoding/gob.Encode-12 18 62426577 ns/op

BenchmarkBinaryEncoding/binary.Marshal-12 9 112783137 ns/op

BenchmarkBinaryDecoding/event2.UnmarshalJSON-12 14 90464147 ns/op

BenchmarkBinaryDecoding/event2.BinaryToEvent-12 100 12832888 ns/op

BenchmarkBinaryDecoding/easyjson.Unmarshal-12 57 23280946 ns/op

BenchmarkBinaryDecoding/gob.Decode-12 5 226555916 ns/op

BenchmarkBinaryDecoding/binary.Unmarshal-12 128 8563056 ns/op

the output from https://mleku.dev/nostrbench

now with the pubkey encoding as binary... not as dramatic an improvement it seems

oh well, anyway, it's done now, both id and pubkey fields now do not need any hex encode/decode so that's still a good thing

last is the signature... this will be another 5-10 ops more i figure, then i'm gonna get out the profiler

goos: linux

goarch: amd64

pkg: mleku.net/nostrbench

cpu: AMD Ryzen 5 PRO 4650G with Radeon Graphics

BenchmarkBinaryEncoding/event2.MarshalJSON-12 14 72883244 ns/op

BenchmarkBinaryEncoding/event2.EventToBinary-12 166 7090886 ns/op

BenchmarkBinaryEncoding/easyjson.Marshal-12 67 18620409 ns/op

BenchmarkBinaryEncoding/gob.Encode-12 19 62998941 ns/op

BenchmarkBinaryEncoding/binary.Marshal-12 10 110464204 ns/op

BenchmarkBinaryDecoding/event2.UnmarshalJSON-12 13 87762713 ns/op

BenchmarkBinaryDecoding/event2.BinaryToEvent-12 100 12388974 ns/op

BenchmarkBinaryDecoding/easyjson.Unmarshal-12 57 23463646 ns/op

BenchmarkBinaryDecoding/gob.Decode-12 5 226411371 ns/op

BenchmarkBinaryDecoding/binary.Unmarshal-12 123 8851251 ns/op

#devstr #benchmark #nostr

since switching up the binary encoder for sure the relay is running way way faster... also because i fixed the shit slow logging library, that was a big part of it, none of these other libraries even have logging! if mine has an error it prints a log! so it's still faster and dev friendly at the same time

bah humbug of course i has bugs now, very weirdness

it looks like i broke the filters, that would be why it returns zero counts all the time and zero results get sent, and then idk why not sending eose either, but hey gotta expect gotchas like this... i'll fix em tomorrow most likely, and become aware of how they got broken so when i do the signatures i don't get surprised like this

it does send out the messages to subscribers when i post it but filter searches are b0rked

ok, bug found... funny enough the first change i made was bugged... the event IDs... was all wrong totally... only took a few minutes to get it generating the correct output and plugging that into the filters and suddenly all these results are coming back from the filters coming in from clients

also, the bug was slowing down the encoder... these are now typical results

goos: linux

goarch: amd64

pkg: mleku.net/nostrbench

cpu: AMD Ryzen 5 PRO 4650G with Radeon Graphics

BenchmarkBinaryEncoding/event2.MarshalJSON-12 14 74456910 ns/op

BenchmarkBinaryEncoding/event2.EventToBinary-12 172 6911613 ns/op

BenchmarkBinaryEncoding/easyjson.Marshal-12 63 18776462 ns/op

BenchmarkBinaryEncoding/gob.Encode-12 19 62966594 ns/op

BenchmarkBinaryEncoding/binary.Marshal-12 9 118801192 ns/op

BenchmarkBinaryDecoding/event2.UnmarshalJSON-12 14 88550098 ns/op

BenchmarkBinaryDecoding/event2.BinaryToEvent-12 100 12272410 ns/op

BenchmarkBinaryDecoding/easyjson.Unmarshal-12 58 23138193 ns/op

BenchmarkBinaryDecoding/gob.Decode-12 5 227175445 ns/op

BenchmarkBinaryDecoding/binary.Unmarshal-12 126 8782942 ns/op

i'm still puzzled why the BinaryToEvent is unaffected though... gonna get to it yet, for now, all is well with the world, my relay is actually finding events and sending them back

actually, no, i only fixed one case, looks like the pubkeys are broken