Replying to Daniel Wigton

It depends on where you are in the tech stack. UTF-8, or even ASCII, is not human readable by itself. It is a sequence of bytes that has to be interpreted as text and rendered as glyphs via a lookup table and a graphics engine.
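To make that concrete, here is a minimal Python sketch. The byte values are just an illustrative sample I picked, not anything from a real protocol. It shows that the "text" is nothing more than numbers until a decoder maps them to code points (and a font then maps those to glyphs).

```python
# Raw bytes as they would sit in memory or on the wire.
raw = bytes([0x68, 0x65, 0x6C, 0x6C, 0x6F, 0x20, 0xE2, 0x82, 0xAC])

# Without a decoder, all we have is a list of integers.
as_numbers = list(raw)

# The "decoder ring": interpret the bytes as UTF-8 code points.
text = raw.decode("utf-8")

print(as_numbers)
print(text)
```

Note that the last three bytes decode to a single character, so the byte count and the character count differ.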

If required, you can do the same for any enumerated type. The nice thing about text encodings is that they are widely accepted and have many implementations of renderers.
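As a rough sketch of "doing the same for an enumerated type": the `MessageType` enum and its values below are hypothetical, invented only for illustration. The point is that a number on the wire becomes readable once the reader has the lookup table, just like a text encoding.

```python
from enum import IntEnum


class MessageType(IntEnum):
    """Hypothetical wire-level message types (illustrative values only)."""
    HANDSHAKE = 1
    DATA = 2
    CLOSE = 3


def render(value: int) -> str:
    """Turn a raw number into a human-friendly label, if we know it."""
    try:
        return MessageType(value).name
    except ValueError:
        # No entry in our lookup table: show the raw number instead.
        return f"UNKNOWN({value})"


print(render(2))
print(render(99))
```

The difference from UTF-8 is only how widely the table is distributed: everyone ships a UTF-8 decoder, while a table like this exists only where the protocol's implementers put it.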

But if you are doing the tech right, most of your base protocol is not going to be human readable, not because legibility is undesirable, but because it is nigh impossible.

Think of the raw data on the line. You need source and destination IP addresses and port numbers. Then you need something like source and destination node IDs, and an ephemeral pubkey or nonce (depending on your primitives).

The rest is just gibberish because it is encrypted. None of that can easily be made parsable by humans.

Next you have the task of turning that encrypted blob into something useful. You need more keys, signatures, etc. Eventually you get some final decrypted, unpackaged data that you hand off to the client application. The underlying protocol doesn't care what the bytes it encapsulates/decapsulates are. It can't: if it knew anything about them, you'd be leaking metadata for men in the middle to hoover up.

Once you get to the client application, I agree with you. You want your data to be human friendly.

Generally I am a fan of being careful about how complex data in social media etc. gets interpreted, because people tell lies. For instance, do you translate an npub into the name its owner chooses, the name the viewer chooses, or a name the community chooses? Same with the profile picture.

The first sounds good, but you get impersonation or other lies. Numbers on the wire have to be interpreted the way the recipient wants, not the way the sender wants.

the decoder ring for UTF-8 is always available. a decoder ring for the meaning of kind numbers or BIPs is not.

you don't even need type values if your structure is rigid, like nostr events. you just define an encoding order, and if a field is fixed length it's redundant to have a length prefix for it.
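A minimal sketch of that idea in Python, assuming a made-up binary layout (32-byte pubkey, 8-byte timestamp, 2-byte kind, then content). This is not the actual nostr wire format, and the names `encode_event` and `decode_event` are hypothetical. The fixed-length fields carry no type tag and no length prefix; only the variable-length content needs one.

```python
import struct

# Fixed-order, fixed-size header: pubkey (32 bytes), created_at (u64), kind (u16).
# ">" means big-endian with no padding, so the size is exactly 32 + 8 + 2.
HEADER = struct.Struct(">32sQH")


def encode_event(pubkey: bytes, created_at: int, kind: int, content: str) -> bytes:
    body = content.encode("utf-8")
    # Only the variable-length field gets a length prefix (u32).
    return HEADER.pack(pubkey, created_at, kind) + struct.pack(">I", len(body)) + body


def decode_event(buf: bytes):
    # Position alone tells us which field is which.
    pubkey, created_at, kind = HEADER.unpack_from(buf, 0)
    (length,) = struct.unpack_from(">I", buf, HEADER.size)
    start = HEADER.size + 4
    content = buf[start:start + length].decode("utf-8")
    return pubkey, created_at, kind, content
```

The trade-off is that the layout cannot change without a new version of the whole structure, since nothing in the bytes says which field is which.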

also, speaking of network addresses, even those are normally handled in a human readable form and not the native one. the native form of an IPv4 address is 4 bytes for the address and 2 bytes for the port.
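For illustration, here is a small Python sketch of the two forms, using the standard `socket` and `struct` modules. The helper names `pack_endpoint` and `unpack_endpoint` are mine, and the address is from the documentation range.

```python
import socket
import struct


def pack_endpoint(host: str, port: int) -> bytes:
    """Dotted-quad string plus port -> 6 raw bytes (network byte order)."""
    return socket.inet_aton(host) + struct.pack("!H", port)


def unpack_endpoint(raw: bytes):
    """6 raw bytes -> (dotted-quad string, port)."""
    host = socket.inet_ntoa(raw[:4])
    (port,) = struct.unpack("!H", raw[4:6])
    return host, port


raw = pack_endpoint("192.0.2.1", 8080)
print(raw.hex())             # c00002011f90
print(unpack_endpoint(raw))  # ('192.0.2.1', 8080)
```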

anyhow, yeah, of course, you use whatever fits the task best. but one of the biggest reasons i favor mnemonic sentinels in binary data is that these are stable values. if you define a series of numbers and a whole bunch of them become deprecated, there are holes in the number space, and you can't reuse them without a breaking change. a 2 or 3 character sentinel (or even 4) is never going to change, and has enough space that you can just add more, for example with an extra character signifying the version or whatever.
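A rough sketch of what mnemonic sentinels could look like, in Python. The tags (`PK0`, `TS0`, `CT0`), their sizes, and the convention that the third character is a version are all assumptions made up for this example, not part of any existing spec.

```python
import struct

# Hypothetical sentinel table: tag -> fixed size in bytes, or None if the
# value is variable length and therefore carries a u32 length prefix.
FIELDS = {b"PK0": 32, b"TS0": 8, b"CT0": None}


def encode_fields(pairs) -> bytes:
    out = bytearray()
    for tag, value in pairs:
        size = FIELDS[tag]
        if size is None:
            out += tag + struct.pack(">I", len(value)) + value
        else:
            if len(value) != size:
                raise ValueError(f"{tag!r} must be {size} bytes")
            out += tag + value
    return bytes(out)


def decode_fields(buf: bytes):
    pos, result = 0, []
    while pos < len(buf):
        tag = buf[pos:pos + 3]
        pos += 3
        if tag not in FIELDS:
            raise ValueError(f"unknown sentinel {tag!r}")
        size = FIELDS[tag]
        if size is None:
            (size,) = struct.unpack_from(">I", buf, pos)
            pos += 4
        result.append((tag, buf[pos:pos + size]))
        pos += size
    return result
```

Under this scheme a revised pubkey format would get a new tag such as `PK1` next to `PK0`, so nothing is renumbered and no value is ever reused. The tags are also readable in a hex dump, at the cost of a few extra bytes per field.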
