anyone have a good idea on how to make hashes of phone numbers rainbow-table resistant? Trying to think of a way to find friends on nostr by phone number but without leaking the number. maybe argon2(first name + number) ?
We should spec this if so.
anyone have a good idea on how to make hashes of phone numbers rainbow-table resistant? Trying to think of a way to find friends on nostr by phone number but without leaking the number. maybe argon2(first name + number) ?
We should spec this if so.
How does vexl do it?
👀
GM Will hitting the ground running this morning I see, good on you! (I haven’t the foggiest idea how to help you with that but I’m glad you and all these other fine people are thinking about it already this morning.) Thanks for making Damus I use it everyday now and it’s exciting to know there will always be something to innovate!
I still feel like name + number is rainbow-tableable. Maybe a deterministic salt beacon (like hash of bitcoin block at height (N mod 4096 == 0) to resalt the hash every 4 weeks? but then old hashes could be cracked eventually 🤔 will ponder further
I think a problem with any one of these schemes based on public info is that a specific user could be brute forced. Will have to calculate how long it would take to do argon2(name + number) for all numbers with a known area code.
Double sha256 hash? That's what Satoshi would do
Too weak, easy to compute a large table 🐶🐾🫡
I'm starting to think phone numbers don't have enough entropy. Without another source, it's hard.
Argon2 is hard to generate with the modern hardware and CPU. What you could do is, to make it even harder, is generate each phone number hash with deterministic hash (0-16 prepended) and when looking up, generate all 16 hashes and then match against stored hashes. Not fool proof approach, since you literally need a rainbow table, but would take years for anyone determined to bruit force it. And storing such table would take a few TB of storage (not much but). One more way, that will make lookup a pain, is to salt it and then compute for each comparison, but then it will take ages in normal use.
Short of finding a compromise (spend 10 seconds to compute n hashes that will be used for lookup), I see no other way. 🐶🐾🤔
If i’m reading this right:
argon2(
argon2(
argon2(bob + first digit), bob + second digit) … ) ?
Argon2 per digit of the number appending hash in each iteration. Not sure how complex that will be, but all you need is to have it compute in 1-3 second on the modern phone 🐶🐾🤔
Something like that? 🐶🐾🤔
function hashPhoneNumber(phoneNumber):
hashedValue = ""
for each digit in phoneNumber:
if hashedValue is empty:
hashedValue = argon2Hash(digit)
else:
concatenated = hashedValue + digit
hashedValue = argon2Hash(concatenated)
return hashedValue
I like this. It would be crazy hard to build a table for this, maybe by setting a large iteration security parameter? I would just have an initial salt value just in case this approach gets standardized, but otherwise I’m leaning toward this. I don’t think you would even need the name to simplify it.
I think so! And we hardest iteration number you’ll make it close to impossible to table for the next 5-10 years, which is sufficient for this level of information 🐶🐾🫡
You all are so fucking smart it’s ridiculous
But as usual, we're solving problems from a technical perspective. 😞
What if the phone number just gets associated with an independent keypair, and instead of a "search" occurring, a sort of encrypted inbox is utilized?
Someone adds their phone number to nostr by having this new and unique keypair that is indexed by the phone number. When a user tries to find their friend, they receive the associated keypair. This user then deposits an encrypted payload identifying themselves, perhaps even including a phone number if it exists. This would allow many different UI flows, including contact matching.
The original user who added their number in this way can check on the status of their "encrypted inbox" by requesting new events to the inbox. Only the user controlling the private key would be capable of decrypting the inbox, so it likely wouldn't need to even verify they own the inbox.
Once the user sees that their friend is there (a suggested connection, or some way to vet users requesting access?), they just add them and a connection is made.
There would be some things to work out, like proving you own the number, and protecting anonymity of the person registering this encrypted inbox (built in, multi-hop, encrypted relaying?), but seems like this sort of flow could function.
Problem statement: you have a friend’s phone number. You want to find your friend’s npub in such a way that the number is never revealed to anyone who doesn’t already know it. That is phone->npub, but NOT npub->phone.
One approach is to treat it as a two-step process. The first step is to find candidate npubs who credibly could have the number. The second step is to challenge candidates to prove it.
Step 1. Find candidates. Suppose each npub publishes a Bloom filter which has been fed their phone number AND other random garbage. By fetching these filters, you can test them to see if any at least credibly claim to have the number.
Step 2. Challenge candidates. For each candidate, send a DM which has a random payload symmetrically encrypted with a key containing their phone number and a nonce. Tell them the nonce and encryption scheme in the DM. If they have the phone number, they can decrypt the random payload and return it as proof. If they don’t, they’ll have to search. But they can’t have pre-computed the rainbow table because your nonce was random, and also you control the encryption scheme.
If a candidate doesn’t respond in a timely manner, this could be because they didn’t receive the DM, or they’re trying to crack the number. So you’ll have to use some judgement, perhaps starting a DM conversation first, then issuing the challenge.
Note: Because the search space of phone numbers is so small, the Bloom filters must contain a bunch of random phone numbers as false positives to prevent an attacker from determining an npub’s number by brute force tests. A few tens of thousands ought to be enough.
There are probably holes in this approach, I just thought it up now in response to your query.
On further thought, the entries in the bloom filter probably also have to be salted and re-hashed. That way an attacker has to salt and hash every phone number and test it against the filter even to get the 10k bogus ones.
Otherwise, there’s little value in putting it in a bloom filter aside from space saving and computation. Salting and rehashing makes the attacker’s job harder.
Sending an encrypted phone number is unnecessary in Step 2. This is the interactive protocol you want: https://en.wikipedia.org/wiki/Socialist_millionaire_problem
perhaps un-adopt first name in favour of “shared secret” as a safer way to think about this?
If you need a shared secret you might as well ask your friend if they are on nostr. The point of this is finding all your friend npubs without having to ask them all individually
a cryptographic hash would make it only possible to know what hash to look for to find the number
you are simply going to need more entropy than that, that's the short answer
there might be some potential to use proximity hashes like used in AI neural networks to get a partial match based on the combination of pubkey and phone number, and thin down candidates, but simply, the phone number is too small to be resistant to a white pages index attack
nostr:npub1xtscya34g58tk0z605fvr788k263gsu6cy9x0mhnm87echrgufzsevkk5s https://pypi.org/project/proximityhash/ geohashes are one form of proximity hash scheme, there probably is others
i think if you look closely at this you may find a way whereby you can get a close match that massively thins down from a set of publicly visible hashes that are partially generated from phone numbers, such that they don't facilitate reverse indexing but enable those with the number to find them
there is probably a heap of papers you can find and maybe someone has already posed the question of how to make a forward matching phone number search resistant to reverse matching
it's a problem that is very old, i remember in the old days when there started to be reverse phone number databases, which is the problem you are trying to solve, so probably someone has already got a scheme that is data-type-specific
I was trying to google this, I would be surprised if this hasn’t been thought of before.
Yeah, first thought is to add pubkey.
Is the idea that the phone number would be posted to relays with your profile?
yes, well a special hash of your number that is very hard to reverse
If you ARE NOT adding a random salt, you get the rainbow table problem. If you ARE adding a random salt, you can't match the result up with others for a lookup that will be using a different random salt.
I randomly add pink Himalayan to everything
Me likey dis.
Note that this doesn't protect against impersonation. I can just compute the argon2 of "Alice" + "555-4939" and publish it in my profile.
And then you'll contact me and I'll pretend I'm your friend Alice and hilarity ensues or something more sinister.
Let's not encourage the use of phone numbers, period
It's time to move away
Doesn’t have to be numbers, thats just the common thing in phone contact lists
Accumulators might help here
I had this thought! But didn’t see anyone doing this with accumulators 🤔
i think accumulators combined with web of trust might be the magic you are looking for
if you know a few people in common and there is a good chance they are linked one or two steps from this other person it acts as a big set of extra set members to get a result that approaches true/false
what that would probably mean in practise is you would fill in several numbers in one search and when the keys of several hit then it narrows down the identities to a small set, and uses web of trust traversal following the memberships
i'm sure you are gonna bake your noodle trying to figure out a credibly secure method to do this on an async data system like nostr but i just want to keep encouraging you to look into it
i'm sure i'm not the only person with a passing curiosity about such things... cryptography applications are nowhere near a complete set of principles as you see in many other sciences, there is a heap of new combinations of tools that haven't happened yet, combinatorial possibilities in cryptography, so, LFG!
Too much risk for less profit. I would never reveal my phone to a nostr client. Phone's are registered by governments.
this is why cryptography exists, to do things like this without having to reveal your number to anyone. If you are paranoid about it you shouldn’t use a sim card. Also this doesn’t have to be a phone number, it could be any piece of data shared among friends. Phone # is just the obvious one that is in contact lists.
Sure for cryptography, but still I have to trust the piece of software that creates the hash.
Yes, there is something called “salt” in hashing. You add a value (salt), to the end of the phone number and when you hash it with the salt it is completely different than the phone number without the salt. This makes it extremely resilient to rainbow lists. I hope that is helpful.
Our code is free and fully open source