Nostr Web Client

david 11mo ago 💬 29

PageRank: who’s the most popular

GrapeRank: who’s the most qualified in a given context

These are not the same.


Discussion

protocolSociety 11mo ago

NoteRank: Which notes have the most popular reply tree. (I invented this years ago… no-one else seems to have thought about it… but now You do!)

franzap 11mo ago

Pagerank inputs are public follows/mutes, what is the input for GrapeRank?

david 11mo ago

GrapeRank uses something called "interpretation" to translate whatever input you have access to & find valuable into a standardized format that is ready for digestion by the GrapeRank algo. GrapeRank is also contextual. So far, I have implemented only one GrapeRank context: verified real nostr user. And yes, the input is follows and mutes. Other GrapeRank contexts will take other inputs.

But when I say PageRank is about popularity, I'm not referring to the input. I'm referring to the score itself. It's designed to be a measure of popularity. The more followers you attract, the higher your PageRank score gets. It incentivizes you to become an Influencoor.

The GrapeRank "real nostr user" score is not a popularity contest because your score does not continue to grow unbounded. Instead, it levels out at unity. If you have 0 followers, your score is 0. And then it increases with each follower, depending on that follower's GrapeRank score. If you get 50 high quality followers, your score might be something like 0.95. If you boost your follower count to 500, your GrapeRank score may be something like 0.995. For this particular GrapeRank context, your score never goes above 1.
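A minimal sketch of the "levels out at unity" behaviour described above. The thread does not give the actual GrapeRank formula, so the curve `1 - exp(-rigor * total_weight)`, the function name `saturating_score`, and the `rigor` value are all assumptions, chosen only to show a score that is 0 with no followers and never goes above 1.

```python
import math

def saturating_score(follower_scores, rigor=0.06):
    """Hypothetical bounded score: 0 with no followers, approaching 1 as followers accumulate.

    follower_scores: scores (each in [0, 1]) of the accounts following this pubkey.
    rigor: assumed steepness parameter, not taken from the thread.
    """
    total_weight = sum(follower_scores)
    return 1.0 - math.exp(-rigor * total_weight)

no_followers = saturating_score([])            # nobody follows this pubkey
fifty = saturating_score([1.0] * 50)           # 50 high quality followers
five_hundred = saturating_score([1.0] * 500)   # ten times as many followers
```

With this assumed curve, going from 50 to 500 followers moves the score only slightly closer to 1, which is the qualitative point of the post; the exact numbers quoted there (0.95, 0.995) come from the author's own implementation, not from this sketch.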

Now, I'm not saying we shouldn't use PageRank. It's great for filtering out spam and it's (relatively) easy to calculate. It's well-known and there are lots of relevant tools out there. I use neo4j to calculate PageRank as one of three WoT scores currently available on my site. But it's not the final word in graph-based recommendation engines. It's barely the first word. A popularity score makes sense for Google, bc influencers drive traffic and increase ad revenue. For freedom tech, we need to think deeply and differently about how and why we use these algos.

franzap 11mo ago

I asked about inputs because that defines if it's usable or not.

What kind of interpretation is performed, and how exactly is that fed to the algo?

And also what is the goal of Graperank? Personalized Pagerank gives you a very good idea of the trusted people. You can then apply context, at least we are not suggesting it as an automatic curation tool.

david 11mo ago

Interpretation means take whatever data is available to you and translate it into a ratings format that is ready to be digested by the GrapeRank algo.

For example: if Alice follows Bob, I INTERPRET the follow AS IF she had issued a rating in the GrapeRank format. Which she didn’t, of course, but that’s why I call it “interpretation.”

The format requires 5 fields:

- context (string)

- rater (string)

- ratee (string)

- rating (number)

- confidence (number between 0 and 1)

At my site right now, every follow and every mute is INTERPRETED as a rating, issued by one pubkey to another pubkey. The context is something like: Real Nostr User. The rating field is a 1 or 0 for follow or mute, respectively. The confidence is 0.03 or 0.5 for a follow or a mute, respectively.
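The interpretation step described above can be sketched as follows. The five field names and the follow/mute values (rating 1 or 0, confidence 0.03 or 0.5) come from the post; the `Rating` type, the `interpret` function, and the sample pubkeys are hypothetical names used for illustration.

```python
from typing import NamedTuple

class Rating(NamedTuple):
    context: str       # e.g. "Real Nostr User"
    rater: str         # pubkey the rating is attributed to
    ratee: str         # pubkey being rated
    rating: float
    confidence: float  # between 0 and 1

CONTEXT = "Real Nostr User"

def interpret(kind, rater, ratee):
    """Translate a follow or mute into a Rating, AS IF the rater had issued it."""
    if kind == "follow":
        return Rating(CONTEXT, rater, ratee, rating=1.0, confidence=0.03)
    if kind == "mute":
        return Rating(CONTEXT, rater, ratee, rating=0.0, confidence=0.5)
    raise ValueError(f"unsupported event kind: {kind}")

# Placeholder names stand in for real pubkeys.
events = [("follow", "alice", "bob"), ("mute", "alice", "mallory")]
ratings = [interpret(kind, rater, ratee) for kind, rater, ratee in events]
```

Other contexts would presumably plug in their own `interpret` function while keeping the same five-field output.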

The final GR Real Nostr User influence score is a number that is suitable to be a weight in a weighted average (eg, to calculate ratings at Yelpstr if such an app were to exist). It is a number between 0 and 1, where 1 means “verified Real Nostr User.”

If we were to use PageRank to calculate average scores of businesses at Yelpstr, the opinions of the K Kardashians of the world would dominate. If we were to use the GrapeRank “Real Nostr User” score, the opinions of K Kardashian and A Einstein would carry roughly equal weight.
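To make the Yelpstr example concrete, here is a small sketch of a weighted average under two sets of weights. All numbers are made up for illustration: the popularity-style weights only mimic a score that grows with follower count, and the bounded weights reuse the 0.995 and 0.95 figures mentioned earlier in the thread.

```python
def weighted_average(reviews, weights):
    """reviews: {pubkey: star rating}; weights: {pubkey: weight given to that pubkey}."""
    total = sum(weights[p] for p in reviews)
    if total == 0:
        raise ValueError("no reviewer has non-zero weight")
    return sum(reviews[p] * weights[p] for p in reviews) / total

# Two reviewers disagree about the same business.
reviews = {"kardashian": 1.0, "einstein": 5.0}

# Illustrative popularity-style scores: the influencer dwarfs the pleb.
popularity_weights = {"kardashian": 0.20, "einstein": 0.0002}
# Illustrative bounded "real user" scores: both are close to 1.
bounded_weights = {"kardashian": 0.995, "einstein": 0.95}

popularity_avg = weighted_average(reviews, popularity_weights)
bounded_avg = weighted_average(reviews, bounded_weights)
```

Under the popularity-style weights the average sits almost exactly on the influencer's rating; under the bounded weights it lands near the midpoint of the two opinions.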

franzap 11mo ago

To your last point, not if we used Personalized Pagerank.

So PP in your model would be one application of GR.

In practical terms how do you foresee users choosing their contexts (free form or taxonomy?), ratings and confidence levels? Or will this mostly rely on interpretation?

Pip the WoT guy 11mo ago

> But when I say PageRank is about popularity, I'm not referring to the input. I'm referring to the score itself. It's designed to be a measure of popularity. The more followers you attract, the higher your PageRank score gets.

I disagree. If you swap follows for zaps, then it's about who gets the most sats, which are more closely related to subjective value (if we gloss over the complications of having weighted relationships and self zaps...)

> The GrapeRank "real nostr user" score is not a popularity contest because your score does not continue to grow unbounded.

I don't get this point at all. The same argument applies for pagerank, where all scores are bounded (at least by 1.0).

david 11mo ago

They’re bounded after you normalize it, sure. That doesn’t change the fact that it’s effectively a popularity contest. It’s the relative scores that matter, and ratios don’t change with normalization.
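A quick arithmetic check of the claim that normalization bounds the scores without changing their ratios. The raw scores below are invented for illustration.

```python
# Hypothetical unnormalized popularity scores.
raw = {"kardashian": 100_000.0, "einstein": 50.0, "bot": 1.0}

# Normalize so that all scores sum to 1 (each score is now bounded by 1).
total = sum(raw.values())
normalized = {name: score / total for name, score in raw.items()}

ratio_before = raw["kardashian"] / raw["einstein"]
ratio_after = normalized["kardashian"] / normalized["einstein"]
```

Both ratios come out the same (up to floating-point rounding), because dividing every score by the same constant cancels in the quotient.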

Suppose A. Einstein is a pleb with 20 or 50 or 100 high quality followers. Enough to prove his profile is probably not a bot. K. Kardashian is an influencer whose follower list is 10k or 100k or 1M and continues to grow by leaps and bounds.

What is the ratio of Kardashian’s PageRank score divided by Einstein’s PageRank score? It’s really high, that’s what it is.

PageRank is a great measure of popularity, which is a valid thing to measure and has its uses, but a poor measure of contextual merit or qualification.

Pip the WoT guy 11mo ago

If we are using follows as the relationships on this graph, then yeah GLOBAL pagerank is correlated to followers count. NOT Personalized pagerank.

And GrapeRank uses follows too, so I don't see the improvement.
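For readers unfamiliar with the global vs personalized distinction, here is a minimal power-iteration sketch. It is a textbook-style implementation written for this thread, not the code used by any participant; the `pagerank` function, its parameters, and the toy follow graph are assumptions. The only difference between the two variants is the teleport distribution: uniform for global PageRank, concentrated on the observer for personalized PageRank.

```python
def pagerank(follows, personalization=None, damping=0.85, iterations=100):
    """Power-iteration PageRank on a follow graph.

    follows: {follower: [followed, ...]}.
    personalization: {node: weight} teleport distribution; uniform (global) if None.
    """
    nodes = set(follows) | {v for targets in follows.values() for v in targets}
    if personalization is None:
        personalization = {n: 1.0 for n in nodes}
    norm = sum(personalization.values())
    teleport = {n: personalization.get(n, 0.0) / norm for n in nodes}

    rank = dict(teleport)
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) * teleport[n] for n in nodes}
        for node in nodes:
            targets = follows.get(node, [])
            if targets:
                # Split this node's rank evenly among the accounts it follows.
                share = damping * rank[node] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling node: return its rank through the teleport distribution.
                for n in nodes:
                    new_rank[n] += damping * rank[node] * teleport[n]
        rank = new_rank
    return rank

# Toy graph: "me" follows einstein; five unrelated fans follow kardashian.
follows = {"me": ["einstein"]}
follows.update({f"fan{i}": ["kardashian"] for i in range(5)})

global_rank = pagerank(follows)
personal_rank = pagerank(follows, personalization={"me": 1.0})
```

In this toy graph the global score favours the account with more followers, while the personalized score from the point of view of "me" only credits accounts reachable from "me". It does not settle the disagreement in the thread about how the two scores behave once both accounts are reachable from the observer.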

david 11mo ago

My critique of PageRank is not that it uses follows, as I stated a few posts earlier in this thread.

The ratio of K Kardashian’s PageRank score divided by A Einstein’s PageRank score (the example from my prior post) is very high. I suspect it’s unbounded as Einstein’s follower count remains fixed and Kim’s score continues to grow. This is true of global as well as personalized PR.

Pip the WoT guy 11mo ago

> The ratio of K Kardashian’s PageRank score divided by A Einstein’s PageRank score (the example from my prior post) is very high. I suspect it’s unbounded as Einstein’s follower count remains fixed and Kim’s score continues to grow. This is true of global as well as personalized PR.

I am really not getting what you are saying. The ratio of their two GrapeRanks will be unbounded too.

ᴛʜᴇ ᴅᴇᴀᴛʜ ᴏꜰ ᴍʟᴇᴋᴜ 11mo ago

it is bounded by the number of nodes in the network, but yes

Pip the WoT guy 11mo ago

not really. The maximum rank for both graperank and pagerank score is asymptotically 1, and the min is 0. So asymptotically 1/0

ᴛʜᴇ ᴅᴇᴀᴛʜ ᴏꜰ ᴍʟᴇᴋᴜ 11mo ago

changing infinity to one doesn't change the semantics

ᴛʜᴇ ᴅᴇᴀᴛʜ ᴏꜰ ᴍʟᴇᴋᴜ 11mo ago

"asymptotic" means "up to a limit" so infinity and one are the same, both are limits, just sayin

ᴛʜᴇ ᴅᴇᴀᴛʜ ᴏꜰ ᴍʟᴇᴋᴜ 11mo ago

just thinking about it, infinity is and exactly is a limit

you can set other things to be limits but infinity is a limit only

david 11mo ago

Given the stated assumptions about numbers of followers, if we’re comparing baseline personalized GR and personalized PR scores (not using a contextualized GR score):

Personalized PageRank tells you that Kardashian’s opinion is substantially more valuable than Einstein’s because she’s an influencooor; GrapeRank tells you they are roughly equal because they are both real nostr users as opposed to bots. GR will give MARGINALLY more weight to KK, but this is only because AE’s limited follower count means we are MARGINALLY more confident that KK is not a spambot, compared to AE.

Hopefully that makes sense.

Pip the WoT guy 11mo ago

okay, now I understand what you are saying. Still, you can get the same result using a threshold function on PP.

if pp[node] > threshold

then set it to 1.0

or using a logistic function.

That's why the scores aren't that important, what is important is the order. Scores can be rescaled arbitrarily without changing the order.
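The two rescalings mentioned above can be sketched like this. The personalized PageRank values in `pp`, the `threshold`, and the logistic `steepness` are made-up numbers, and the function names are hypothetical.

```python
import math

def step_weight(score, threshold):
    """Weight 1.0 above the cutoff, 0.0 at or below it."""
    return 1.0 if score > threshold else 0.0

def logistic_weight(score, midpoint, steepness):
    """Smooth alternative: rises from near 0 to near 1 around `midpoint`."""
    return 1.0 / (1.0 + math.exp(-steepness * (score - midpoint)))

# Illustrative personalized PageRank scores.
pp = {"kardashian": 0.20, "einstein": 0.004, "maybe_real": 0.001, "bot": 0.00001}
threshold = 0.001

step = {k: step_weight(v, threshold) for k, v in pp.items()}
smooth = {k: logistic_weight(v, midpoint=threshold, steepness=2000.0) for k, v in pp.items()}

# A strictly increasing rescaling such as the logistic keeps the ranking intact.
order_before = sorted(pp, key=pp.get, reverse=True)
order_after = sorted(smooth, key=smooth.get, reverse=True)
```

The logistic curve preserves the order exactly, while the step function preserves it only weakly: everyone above the cutoff ties at 1.0 and everyone else ties at 0.0.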

david 11mo ago

Yes, you could do personalized PageRank, set an arbitrary cutoff, weight everyone above the cutoff equally, and ignore everyone below the cutoff. Crude but easy.

Here’s what I want us to be thinking about: using scores as ranking vs using scores as a weight. These are two very different purposes. Ranking means placing items in order. Using scores as a weight means you’re tallying votes or calculating weighted averages and you want to decide how much weight to give to each pubkey.

How much weight you want to give to each pubkey will depend on the context. Sometimes, I may want to give each “verified real nostr user” a weight of 1, and a weight of 0 to everyone else. In this case you could calculate pPR (personalized PageRank) and select an arbitrary cutoff as discussed above. A step function, in other words. Of course the problem here is that the cutoff is arbitrary. What do you do with the 100k pubkeys who have a handful of high quality followers? They’re *probably* not bots, but you’re not 100 pct sure? They’re in-between. GrapeRank allows you to scale up the weight as your level of confidence increases that the pubkey in question is not a bot. You could do that with PageRank except replace the step function with some sort of curve. In which case, you’re halfway to reinventing GrapeRank.

What about the more complicated case where you want a context specific weight which is proportional to someone’s skill? Not set to unity for each normal user? Not clear to me how pPR could accomplish this. Unless you modify it and probably end up inventing GrapeRank.

I should reiterate: my point is not to say that PageRank is worthless. I’ve said many times I’m excited to see it being put to use, like at nostr:npub10r8xl2njyepcw2zwv3a6dyufj4e4ajx86hz6v4ehu4gnpupxxp7stjt2p8. I think it will add value to the nostr ecosystem.

I’m ALSO looking beyond PageRank, bc I believe we need to optimize it for freedom technology, not just for Google. Measures of popularity are useful but sometimes we’d prefer to measure quality and merit, not popularity.

Pip the WoT guy 11mo ago

> How much weight you want to give to each pubkey will depend on the context.

Sure, but then we go back to the data problem. What data do u use? If follows and mutes, you are already narrowing down the context to something loosely related to popularity.

> What about the more complicated case where you want a context specific weight which is proportional to someone’s skill?

I don't see grapevine solving this anytime soon, as I don't see anyone producing high-quality low-ambiguity data about that. Maybe a school can start posting skills-attestations for its students. Maybe. Now no one is doing it, the UX is terrible, and if only a few do it then the rating is completely skewed because there is not enough data.

david 11mo ago

GrapeRank is designed to use whatever data is available, for whatever context you wish to calculate. You can use multiple data sources simultaneously for one context. If the data quality is low, you can weight it accordingly, bc GrapeRank explicitly keeps track of confidence which is a factor used in calculating weights.
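One way to picture "weighting by confidence" across several data sources is the sketch below. The post only says that confidence is a factor used in calculating weights; the specific rule used here (weight = rater influence times confidence), the `combine_ratings` function, and the 0.9 confidence for an explicit attestation are assumptions for illustration.

```python
def combine_ratings(ratings, influence):
    """ratings: list of (rater, rating, confidence); influence: {rater: score in [0, 1]}.

    Assumed rule: each rating counts in proportion to rater influence times confidence.
    Returns (weighted average rating, total weight), or (None, 0.0) if nothing carries weight.
    """
    weighted = [(rating, influence.get(rater, 0.0) * conf) for rater, rating, conf in ratings]
    total = sum(w for _, w in weighted)
    if total == 0:
        return None, 0.0
    return sum(r * w for r, w in weighted) / total, total

# Two low-confidence signals (interpreted follows) and one hypothetical
# high-confidence explicit rating, all from fully trusted raters.
ratings = [
    ("alice", 1.0, 0.03),
    ("bob", 1.0, 0.03),
    ("carol", 0.0, 0.9),
]
influence = {"alice": 1.0, "bob": 1.0, "carol": 1.0}

average, total_weight = combine_ratings(ratings, influence)
```

Here the single high-confidence rating outweighs the two low-confidence ones, so abundant but ambiguous data still contributes without drowning out sparse, unambiguous data.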

Low quality, highly ambiguous data doesn’t mean useless data. For many potential contexts, high quality data may be sparse, but low quality data is abundant. I’d put follows and mutes in the “low quality but abundant” category.

For some contexts, high quality, unambiguous data does exist. For other contexts, people would be willing to issue high quality data if devs built the tools to do so and if the data could be put to good use. GrapeRank is one such good use.

High quality, low ambiguity data can be useful even if it’s not highly abundant. I don’t need every pleb to issue an opinion on topic X if I’m only going to listen to the experts and if there’s only a small handful of experts.

Suppose I want my grapevine to curate a list of NIPs. I might want one list curated by users and another list curated by nostr devs. GrapeRank can (in principle - I haven’t yet coded this up) curate a list of nostr devs right now using NIP-51 lists, filtered by Real Nostr Users. Is there currently a way for devs to indicate their preferences or their approval of NIPs? I don’t know — but if there were, GrapeRank could synthesize their preferences into a curated list, which would be useful even if we only had a handful of devs contributing to it. And if people started to rely on this system, it would incentivize more devs to use it to contribute their preferences, thus increasing the utility of the list. Positive feedback loop.

ᴛʜᴇ ᴅᴇᴀᴛʜ ᴏꜰ ᴍʟᴇᴋᴜ 11mo ago

the value of authenticity is very high, so it seems like you agree it has a value to compute it

Pip the WoT guy 11mo ago

infinity is the cousin of zero, not one

ᴛʜᴇ ᴅᴇᴀᴛʜ ᴏꜰ ᴍʟᴇᴋᴜ 11mo ago

ok but is it not true that infinity is the ultimate limit, you can't limit any less than infinity, but it is also a limit because you can't practically count it?

Pip the WoT guy 11mo ago

oh boy, this is a nice rabbit hole.

There are countable infinities, meaning sets with the same cardinality as the natural numbers. Then there are strictly greater infinities, like the real numbers (pi, e, ...).

Aaaaand, given an infinite set, you can always make one that's bigger O.o

https://en.wikipedia.org/wiki/Cantor's_theorem

Chris Liss 11mo ago

been deep down it!

ᴛʜᴇ ᴅᴇᴀᴛʜ ᴏꜰ ᴍʟᴇᴋᴜ 11mo ago

yeah, i have seen this stuff but "fin" means "end" and if you can count it then it has an end. the end.

Pip the WoT guy 11mo ago

yes I agree that the name is bad. An alternative is "listable". A set is listable if you can enumerate all its elements in a list (which can be infinite). You can't do it for the real numbers, as shown with Cantor's diagonal argument.

ᴛʜᴇ ᴅᴇᴀᴛʜ ᴏꜰ ᴍʟᴇᴋᴜ 11mo ago

yeah, so not real numbers, but integers, and all the various kinds of arithmetic groups

i remember, but not what, some insight a while back i had about how numbers have ways of "unfolding" something inside them quite a bit... i believe that this is what happens with discrete cosine transforms and cryptography too, there is almost always many other things that unpack out of a single integer, which does lead to an interesting other form of infinity, i believe that is what Cantor's theorem relates to - probably also hamming codes have this property being used in the opposite direction, also