Nostr Web Client

Something is off: I have to remove the two special chars in the message to match the id. Are we supposed to remove non-UTF-8 chars?

Peter Todd 2y ago

You certainly should not, and if the spec works that way, the spec is wrong. Don't display anything that isn't hashed or otherwise verified.

Discussion

Vitor Pamplona 2y ago

Some weird String to ByteArray differences going on... Looks like Nostr follows whatever JavaScript is doing.

Peter Todd 2y ago

Nostr's underlying serialization format is a really stupid thing to use for a cryptographic system. I won't be surprised if this has something to do with the bug.

Vitor Pamplona 2y ago

ok, now I understand it.

It's about escaping the content field to create the JSON string. Some implementations escape the newline character into "\n" (2 chars) but do not escape other characters, such as LSEP (U+2028), into "\u2028" (6 chars) before hashing.

Everything looks fine in interfaces because they automatically parse and unescape things.

But the hash is sensitive to those differences.

The issue is that some implementations only PARTIALLY escape: they escape some characters and leave others as they are.

I am not sure if there is a standard for these choices.
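To make the mismatch concrete, here is a minimal Python sketch. It assumes the NIP-01 array layout `[0, pubkey, created_at, kind, tags, content]`; the pubkey, timestamp and content are made-up placeholders, and Python's `ensure_ascii` flag merely stands in for the two escaping behaviours described above. It is not taken from any particular client.

```python
import hashlib
import json

# Hypothetical event fields in the NIP-01 order:
# [0, pubkey, created_at, kind, tags, content]
PUBKEY = "00" * 32  # placeholder, not a real key
CONTENT = "line one\nline two\u2028line three"  # newline plus LSEP (U+2028)
FIELDS = [0, PUBKEY, 1700000000, 1, [], CONTENT]


def serialize(fields, escape_non_ascii):
    """Compact JSON. Both modes escape the newline as \\n; only
    escape_non_ascii=True also turns U+2028 into the 6 chars \\u2028."""
    return json.dumps(fields, separators=(",", ":"), ensure_ascii=escape_non_ascii)


def event_id(serialized):
    """SHA-256 over the UTF-8 bytes of the serialized string."""
    return hashlib.sha256(serialized.encode("utf-8")).hexdigest()


raw = serialize(FIELDS, escape_non_ascii=False)
escaped = serialize(FIELDS, escape_non_ascii=True)

# Both strings decode to exactly the same event...
print(json.loads(raw) == json.loads(escaped))
# ...but they hash to different ids.
print(event_id(raw) == event_id(escaped))
```

Both serializations are valid JSON and decode to the same content, which is why nothing looks wrong in an interface. Only the bytes fed to the hash differ.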

Vitor Pamplona 2y ago

Turns out, we hit something as new as last week: https://github.com/google/gson/issues/2295

JSON lib devs are struggling to agree on a standard for whether or not to escape \u2028 and \u2029.

#[4] we need to clarify in NIP-01 how to escape the content String before it is converted to bytes using UTF-8 and then hashed: which characters get escaped and which ones don't?
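As an illustration of what such a clarification could look like, the sketch below uses one possible rule: escape a small fixed set of characters and emit everything else verbatim. The rule, the `ESCAPES` table and the `escape_content` name are assumptions made for this example, not something decided in this thread.

```python
import json

# Hypothetical fixed escape table: only these characters are rewritten.
ESCAPES = {
    '"': '\\"',
    "\\": "\\\\",
    "\n": "\\n",
    "\r": "\\r",
    "\t": "\\t",
    "\b": "\\b",
    "\f": "\\f",
}


def escape_content(content: str) -> str:
    """Return content as a quoted JSON string, escaping only the characters
    in ESCAPES. Everything else, including U+2028 and U+2029, is emitted
    verbatim. Other control characters below U+0020 are not handled here
    and would need their own rule."""
    return '"' + "".join(ESCAPES.get(ch, ch) for ch in content) + '"'


sample = 'say "hi"\nthen\u2028stop'
quoted = escape_content(sample)

# The newline becomes the two chars \n; the LSEP stays a single raw character.
print(quoted.encode("utf-8"))
# A standard JSON parser recovers the original content.
print(json.loads(quoted) == sample)
```

With a fixed table like this, every implementation that follows it produces the same bytes for the same content, so the id no longer depends on what a given JSON library decides to escape.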

fc3b6a5b... 2y ago

https://github.com/nostr-protocol/nips/issues/354

3f152ab6... 2y ago

That is concerning.

3f152ab6... 2y ago

I wonder whether there are any emoji that can be encoded as different UTF-8 depending on the implementation, too.

I think the underlying problem is that UTF-8 is mostly concerned with how non-ASCII characters can be packed into ASCII text, which is what JSON is. The encoding can vary as long as the decoder can restore it correctly. Nostr can't allow varying encodings of the same character, otherwise the id will differ. 😔

3f152ab6... 2y ago

Found an example: https://stackoverflow.com/questions/50220973/two-different-eye-emojis
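A small Python sketch of this kind of case. It assumes, for illustration only, an eye emoji written with and without the variation selector U+FE0F; that may or may not be the exact pair in the linked question. UTF-8 itself is deterministic for a given sequence of code points. The bytes differ because the two strings contain different code points, even though they can look alike on screen.

```python
import unicodedata


def utf8_hex(text: str) -> str:
    """Hex dump of the UTF-8 bytes of text."""
    return text.encode("utf-8").hex()


eye_plain = "\U0001F441"            # EYE
eye_with_vs16 = "\U0001F441\uFE0F"  # EYE + VARIATION SELECTOR-16

print(utf8_hex(eye_plain))      # f09f9181
print(utf8_hex(eye_with_vs16))  # f09f9181efb88f

# Unicode normalization does not remove the variation selector,
# so NFC does not make the two strings equal.
print(unicodedata.normalize("NFC", eye_plain)
      == unicodedata.normalize("NFC", eye_with_vs16))  # False
```

An id computed over the content would differ between the two strings, so a client has to hash exactly the code points it received rather than a re-typed or re-rendered version.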