Nostr Web Client

Then you know you can just add trackers to the many parts of the code in order to store who is creating Queues and how and when both Bob and Alice connect to the same Queue, map to their IPs and capture their rough locations. Over time the amount of data allows you to de-noise any IP re-use and specify which locations talk to which other locations. By roughly knowing locations over time, you can pin-point an exact user (for instance, if you roughly know where I work and roughly know where I live, you can easily filter your data set to get my exact queues). Knowing an exact user and some rough knowledge of my contacts (say through Nostr or other spaces), it allows you to start mapping out other users in the network. Soon enough, you can safely identify the owner of every single queue.

It's work, but the metadata is there.

Clients can make that work harder by using Tor and other tricks but most people are not doing that today.

Now picture a similar server but without the creation of queues and each time Bob and Alice connect to send a message, they use a separate random key you have no knowledge about. You know IP X is receiving stuff using pubkey A, but the stuff is coming from everywhere. Everywhere can be another client or a relay that is just re-broadcasting the message, which can be old or new. You don't know if it is the same conversation or not. You can track IPs of each sender and try to group messages by pairs of IPs, but when the IP changes, there is nothing you can use to link the two IP pairs used by the same person. I will look like a new user. Because you have less information, you will need to rely on more external data to identify each conversation thread, which creates more uncertainty on the outcomes.

Now add Tor on top of that.

SimpleX Chat 1y ago

nostr:npub1gcxzte5zlkncx26j68ez60fzkvtkm9e0vrwdcvsjakxf9mu9qewqlfnj5z Thanks for the suggestions. We did indeed consider various ideas about how to reduce the persistence queue ID, but the to rotate the queues to another server periodically seemed simpler and providing better metadata protection. The current approach also allows some basic protection from resource exhaustion attacks.

What I don't quite follow - if the same key A is used to retrieve the messages how it is any better than having queue ID from the perspective of correlating messages to the users? Wouldn't it still identify the user any better, and now instead of mapping IP addresses to queues, the server could map IP addresses to the messages... Maybe I don't understand what you propose?

The ideas we considered were about trying to avoid any persistent IDs entirely, e.g. via some kind of DHT tables, and something like this might happen in "v3" of the protocols (we are currently still moving to "v2").

Reply to this note

Please Login to reply.

Discussion

Vitor Pamplona 1y ago

The difference is that each client can create as many rotating keys to receive the message as needed, including a new one per message if it is really necessary. These are created by the client, the server doesnt know about them.

But the main component is that you can encrypt the already encrypted message to a separate pubkey as many times as you need. You can basically create an onion route using multiple queue IDs where every server only knows where the which node is coming from and going to but not the author and the receiver. Messages can use complely separate routes, making it impossible to track a conversation sequence.

On top of that, dates and times are random. Servers don't know which messages are new and which are not. Many payloads might contain the same messages, adding noise to the network.

All of that is done with a single message type: A GiftWrap event.