Nostr

⚡📰 [The Math Behind SN's Web of Trust](https://stacker.news/items/916035/r/botlab)

SN's trust algorithm is based onTrustRank, which is itself a modifiedPageRankalgorithm. The purpose of TrustRank is to calculate overall trust scores for a vector of nodes (i.e. users).

The TrustRank algorithm requires two objects as inputs: thetrust graphM, and theseed vector\mathbf{v_0}. It returns a vector of trust scores as output,\mathbf{v}.

The trust graph,M, is a matrix in which each elementm_{ij}describes how much nodeitrusts nodej. The rows ofMadd up to 1, so it's as if each nodeihas one unit of trust to distribute among the other nodes.

The seed vector,\mathbf{v_0}, describes a baseline level of initial trust for each node. Typically, the seed trust for most nodes is set to zero, but a handful of known, trustworthy nodes are seeded with a trust of 1.

The seed vector influences the final trust scores by pulling the scores closer to itself. The strength of the seed vector's influence is determined by a parameter,\alpha, called theseed weight.

Given the trust graph,M, the seed vector\mathbf{v_0}, and the seed weight\alpha, the final trust scores are given by the vector\mathbf{v}that solves:

In SN's case, the nodes represent users, the trust graph represents the users' degree of similarity in preferences, and the seed vector representsa-prioritrusted users--mainly@k00b,@ek, and territory founders (when calculating trust inside a territory.)

The TrustRank algorithm is best understood as a diffusion process. Imagine the nodes to be three countries, and the trust scores to be the stable distribution of population in the three countries.

The algorithm describes the following population diffusion process:

Let\mathbf{v_t}be the population distribution at timet. Based on this diffusion process, the population distribution at timet+1is:

Repeated iteration of this diffusion process results in a stable distribution of population across countries. This stable distribution\mathbf{v}satisfies:

This is the solution to the TrustRank algorithm. Thus, one can think of the TrustRank algorithm as giving us the stable distribution of trusts when trust is allowed to repeatedly diffuse across nodes according to the trust graph and seed vector.

Let the initial population of countries be\mathbf{v_1} = (0.3, 0.3, 0.4)and let the seed vector be\mathbf{v_0} = (1, 0, 0)(thus, when people teleport they always teleport to country 1). Let the seed weight be\alpha = 0.85.

Finally, letMbe:

This says that people in country 1, when getting a chance to migrate, will choose country 2 with 50% probability and country 3 with 50% probability. People in country 2 are 50/50 with countries 1 and 3. And people in country 3, when migrating, always move to country 2.

After the first iteration:

After the second iteration:

If you kept going, or if you solved for the exact solution, you would get the stable population distribution:

SN does the trust calculations separately for each territory. For territories with at least 50 users who have ever zapped, downzapped, or posted, the seed vector is 1 for the territory founder and 0 for everyone else. For territories with less than 50 users who ever zapped, downzapped, or posted, the seed vector is 1 for@k00b,@ek, and the territory founder, and 0 for everyone else.

The seed weight is always\alpha = 0.83.

This is the most complicated part to explain. The trust graph is calculated on a per-territory basis. It is also calculated separately for posts and for comments.

To calculate the trust graph, first define the following metrics for each useriand itemk:

Now define the following second order metrics for usersiandj:

before_{ij}is a measure of the correlation betweeniandj's zaps for items in which userjzappedbeforeuseri. It is used as measure for how muchitrustsj.

after_{ij}is a measure of the correlation betweeniandj's zaps for items in which userjzappedafteruseri. It is removed from the denominator for trust calculations.

disagree_{ij}is a measure of how many itemsiandjhad disagreements on (one up-zapped while the other down-zapped).

total_{j}is simply the total number of times userjzapped any items.

Roughly speaking, the idea is to measure the trust that userihas for userjas:

That is, out of all the times userivoted on items after userj, in what fraction did they agree vs. disagree?1

The actual calculation is a fair bit more complicated. Trust fromitojis actually calculated using the equation below, then normalized to sum up to 1 row-wise:

Here,H(x,y,z)is the lower bound of thebinomial proportion confidence interval, for an experiment withytrials andxsuccesses, for a confidence level given by the Gaussian z-scorez. Thezscore is chosen for a confidence level of 99.9999999%.

To explain why the lower bound of a binomial proportion confidence interval is used, we can think of each item in which userizapped after userjas a trial: did useridisagree or agree with userj's zaps? The estimated probability that they agree is simply the number of agrees divided by the number of such items.

However, if the number of items is small, we can't be confident that useriand userjare actually similar in preferences. For example, if userionly zapped two items after userj, and both are agreements, then the number of trials is 2 and the number of successes is 2. The best estimate of the similarity betweeniandjis 100%, but because the sample size is so small, we can't actually be very confident of that estimate. The lower bound of the confidence interval would be much lower than 100%. Therefore we use that lower bound to calculate trust, in order to shade conservative in the trust scores.

The above presentation is greatly simplified. There are many technical details that are glossed over, like thresholds for which zaps and downzaps are included in the calculation, and nuances in the code for how before and after are calculated. Moreover, calculating the exact TrustRank solution by matrix inversion is computationally expensive, so an iterative approximate solution is used instead. For the actual code, one can take a look atworker/trust.jsin theSN Github Repo.

There is no straightforward way to boost your own trust score. This is because other users' trust of you is calculated based on their behaviorafteryou already zapped an item. Thus, zapping items that other people already zapped doesn't directly boost your own trust score, it actually increases your trust inthem.

The best way to increase your own trust score is to zap content early that you think other users will also zap in the future. Zapping content that you think@k00b,@ek, and territory founders will like is especially helpful, since these users are seeded with higher trust--though@k00band@ek's influence only extends to the less active territories. In more active territories, only the territory founder is seeded with initial trust. So, if you want@Undisciplinedto trust you, consider zapping libertarian opinions early in the~econterritory!

Now that you have a deeper understanding of how SN's Web of Trust works, calibrate your behavior accordingly and build that trust! Happy stacking and happy zapping.

p.s.@k00b,@ek, let me know if I got anything wrong. I'll also follow up with more technical feedback in a GitHub issue.

This is an over-simplification. In practice, it's not just the number of agreements that's counted, but the number of agreements weighted by the ratio of the zap amounts. The number of disagreements is also subtracted, which could result in a negative numerator. The denominator also cannot be strictly interpreted as the number of times userivoted after userj, both due to the weighting of the agreements by zap ratio, but also due to nuances in the code about how before/after is calculated.↩

By @SimpleStacker (24000 sats, 11 zappers) | [Stacker News](https://stacker.news/items/916035/r/botlab)

Reply to this note

Please Login to reply.

Discussion

No replies yet.