Wikipedia on Nostr? What might that look like?

Right off the bat we’d have to think about spam. To combat it we’d probably need to introduce proof of work, and the work can’t be too easy to complete.
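As a rough illustration, a NIP-13-style check counts the leading zero bits of an event ID; the threshold below is an arbitrary assumption, and tuning it is the hard part.

```typescript
// Sketch of a NIP-13-style proof-of-work check: the “difficulty” of a nostr
// event is the number of leading zero bits in its 32-byte hex event id.
// A relay or client could reject page edits below some minimum difficulty.

function leadingZeroBits(eventIdHex: string): number {
  let bits = 0;
  for (const char of eventIdHex) {
    const nibble = parseInt(char, 16);
    if (nibble === 0) {
      bits += 4;
    } else {
      // Count the leading zeros inside the first nonzero nibble, then stop.
      bits += Math.clz32(nibble) - 28; // clz32 counts over 32 bits; a nibble uses the low 4
      break;
    }
  }
  return bits;
}

const MIN_EDIT_DIFFICULTY = 20; // assumed threshold, not a spec value

function acceptEdit(eventIdHex: string): boolean {
  return leadingZeroBits(eventIdHex) >= MIN_EDIT_DIFFICULTY;
}
```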

To allow for a wide variety of perspectives we’d probably want to show the end user various “views” of the same page, based on user-specified criteria. Much like algorithm choice, the user would need control over how many opinions, and which ones, they’d like to consider while viewing that page.

For example, a page about the CCP might be heavily edited by the CCP itself, and it is entirely possible that the page would be manipulated via PoW, likes, zaps, whatever… So what is the end user to do? They would have to be able to use the social graph to determine how many opinions they value most, and whose. But even that comes with a bunch of challenges and possible manipulation.

I imagine Atlas, sorry Pablo haha, would resemble a version-controlled UI where you can quickly broaden or narrow the scope of opinions you wish to consider, which then updates the document version based on how many people in your criteria edited it or found it valuable.

Challenge 1: Most people will not interact with the millions upon millions of pages, so there may be weak signal from your personal or even extended social graph. How do we keep the pages balanced in perspective? Do we provide geofenced views? “I don’t care what China has to say about itself.” What if they just use US IPs?

Challenge 2: Even if you were able to source enough social graph signal, that signal can also be manipulated by someone who is keen to manipulate via likes, zaps, or PoW.

Finding signal with NIP5

One thing we could consider is designing the UI in such a way that it surfaces which entity a user is affiliated with via NIP5.

For example, it is not inconceivable that lists could be curated that include all known government and university websites. Then, anyone with a NIP5 from any of those domains would be classified as a “government official” or “associated with X university”. Combined with the information about who made the last change, you’d have a clearer picture of the potential biases involved.
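A minimal sketch of what that classification might look like, assuming hypothetical curated domain lists and the standard NIP5 well-known lookup (the domain entries below are only examples):

```typescript
// Classify an editor by the domain in their NIP5 identifier (name@domain).
// The curated domain lists are hypothetical; building and maintaining them
// is the real work.

const GOV_DOMAINS = new Set(["state.gov", "gov.uk"]);        // example entries
const UNIVERSITY_DOMAINS = new Set(["mit.edu", "ox.ac.uk"]); // example entries

type EditorClass = "government" | "university" | "pleb";

async function classifyEditor(nip05: string, pubkey: string): Promise<EditorClass> {
  const [name, domain] = nip05.split("@");
  // Verify the identifier actually maps to this pubkey, per NIP-05.
  const res = await fetch(`https://${domain}/.well-known/nostr.json?name=${name}`);
  const json = await res.json();
  if (json.names?.[name] !== pubkey) return "pleb"; // unverified: treat as unaffiliated
  if (GOV_DOMAINS.has(domain)) return "government";
  if (UNIVERSITY_DOMAINS.has(domain)) return "university";
  return "pleb";
}
```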

Of course, this doesn’t mean only users from govts or universities provide signal (the govt domains are mostly there to monitor potential abuse), so we’d need a UX that helps jump between various sources of edits, including the plebs. The UI has to be obvious and easy enough to navigate that you can see the page edits from these various sources without clicking too much.

The question that remains: how do you know which version to present to the user if pages are constantly edited by hundreds or thousands of people in near real time?

Perhaps the answer is: You don’t.

What if we use DVMs as a means of querying the information to find what we consider to be most fair and bias-free?

For this, we could provide some pre-built prompts that show results excluding govt-domain edits, university-domain edits, or bulk geofenced populations. This would probably require a smarter relay though… I have to think that one through. But the idea is that you could see which country is making the edits and offer a DVM that excludes a country which may be biased in its edits. The pro of this approach is that you can eliminate some of the manual IP manipulation by highly interested actors.
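To make this concrete, here is a hedged sketch of what such a NIP-90-style DVM job request could look like. The kind number, topic identifier, and param names are all made up for illustration; a real NIP would have to define them.

```typescript
// Hypothetical DVM job request asking for a filtered view of a topic page.

const jobRequest = {
  kind: 5300, // hypothetical “knowledge query” job kind, for illustration only
  created_at: Math.floor(Date.now() / 1000),
  tags: [
    ["i", "topic:ccp", "text"],            // topic being queried (made-up identifier)
    ["param", "exclude_nip05_tld", "gov"], // drop edits from government domains
    ["param", "exclude_region", "CN"],     // drop edits geolocated to a region
    ["param", "min_pow", "20"],            // only consider edits carrying real PoW
    ["output", "text/markdown"],
  ],
  content: "",
};
```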

For the final UI, I imagine a topic page that provides the following:

- Topic name
- A clearly visible indication of which types of entities have edited it, and how many times
- A query window like any of the AI chat models.

- Perhaps a “most popular” edited-versions view in an easy-to-hover navigation (see below for what I mean by “most popular”)

- Pre-created prompts to help filter without much thinking (exclude govt. edits, show university edits, exclude countries, etc.)

- A highly selectable set of filters that are easy to check and apply - which then triggers a very fast DVM query.

- Name and bio of the editor returned by the query

- Comments on the most-commented version (perhaps also filtered by NIP5 criteria, specifiable by the end user)

We know we can’t rank anything by zaps, as that is easily gamed. We also can’t fully rely on likes, because it’s trivial to create keys in an automated fashion. A state actor can easily influence all of these metrics. (This is where filtering by NIP5 may actually help.)

These are just some starting thoughts. I think the end result would give a lot of querying options to the end user and allow them to decide for themselves which versions of the “truth” they believe in. We can show them many versions, and perhaps find some ways to curate the top versions - whatever that means or whatever that looks like in the end, but ultimately the user can decide which sources to trust or ignore.


Discussion

Thanks for your thoughts! Sounds challenging to avoid both spam and censorship, and we’ve already seen how much nostr:npub12vkcxr0luzwp8e673v29eqjhrr7p9vqq8asav85swaepclllj09sylpugg has wrestled with these challenges in something with much less complexity (IMO), the #nostr feed.

Nostr + Wikipedia = Wikipedia with Boobs 😎

Another thing we could consider is introducing “trust scores” to get a high-level view of the trustworthiness of the editor whose edits are being shown.

This may sound shady, but we could in theory devise a score that looks at the overall attributes of the user to try to understand if they have good intentions or not. The more factors you consider in a trust score, the better. What might that include?

- Date of npub creation (older is better. Of course, this can be manipulated with “aged npubs” too, but it’s a factor nonetheless to discourage obvious spammers)

- Number of high quality interactions (notes with reactions or zaps)

- Number of zaps on profile

- Total quantity of zaps

- Their NIP5

- Possibly reviews of the user (think community notes on the npub)

- Number of times they were muted

- Posting IP location (just the region not the actual IP)

- Frequency of notes (are they active or mostly dormant?)

- Frequency of reactions (do they interact with others?)

The more criteria we can think of, the harder to manipulate.

This can then all be tallied into a numeric score (maybe as a percentage) and color-coded for quick reference, to see whether this may be a bad actor or someone with a solid history on nostr (whatever that ends up meaning). A rough sketch of how that tally might work is below; every weight and threshold in it is a placeholder.
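```typescript
// Tally the factors above into a 0–100 score. Each factor is squashed into
// 0..1, then weighted; all weights and thresholds are assumed, not tuned.

interface EditorProfile {
  npubAgeDays: number;
  qualityInteractions: number; // notes that earned reactions or zaps
  profileZaps: number;
  totalZaps: number;
  hasVerifiedNip05: boolean;
  muteCount: number;
  notesPerWeek: number;
  reactionsPerWeek: number;
}

function trustScore(p: EditorProfile): number {
  const factors: Array<[number, number]> = [ // [value in 0..1, weight]
    [Math.min(p.npubAgeDays / 365, 1), 0.15],
    [Math.min(p.qualityInteractions / 100, 1), 0.2],
    [Math.min(p.profileZaps / 50, 1), 0.1],
    [Math.min(p.totalZaps / 500, 1), 0.1],
    [p.hasVerifiedNip05 ? 1 : 0, 0.15],
    [1 - Math.min(p.muteCount / 10, 1), 0.15], // mutes count against you
    [Math.min(p.notesPerWeek / 10, 1), 0.075],
    [Math.min(p.reactionsPerWeek / 20, 1), 0.075],
  ];
  const score = factors.reduce((sum, [value, weight]) => sum + value * weight, 0);
  return Math.round(score * 100); // ready for color-coding in the UI
}
```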

So, Wikipedia with anonymous reputations?

Are we at the point of “just because we can, doesn’t mean we should”? 😉

I’m not sure what you mean

Micro knowledge bases on different relays. Users connect to relays that they believe provide signal, and the client aggregates notes into a single view.
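A minimal sketch of the read/aggregation side, using the plain NIP-01 relay protocol. The relay URLs are placeholders, and the event kinds match the article kinds described further down.

```typescript
// Subscribe to the same filter on several user-chosen relays and merge
// results, deduplicating by event id (NIP-01).
import WebSocket from "ws"; // Node; browsers have WebSocket built in

const relays = ["wss://relay.example-kb.com", "wss://knowledge.example.org"];
const seen = new Map<string, unknown>(); // event id -> event

for (const url of relays) {
  const ws = new WebSocket(url);
  ws.on("open", () => {
    // Ask each relay for article events (kinds described below).
    ws.send(JSON.stringify(["REQ", "kb-sub", { kinds: [30040, 30041], limit: 100 }]));
  });
  ws.on("message", (raw: Buffer) => {
    const msg = JSON.parse(raw.toString());
    if (msg[0] === "EVENT" && !seen.has(msg[2].id)) {
      seen.set(msg[2].id, msg[2]); // first relay to deliver wins; merge is trivial
    }
  });
}
```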

Now for writing. Maybe ask the relay operators for permission to write? Do they know you? Any other type of payment/service model as a barrier to entry could also work. This alone will reduce spam significantly. I don’t really think much work needs to be done on that end, other than a bit more usability in terms of relay management and discovery. Sure, a knowledge base on Damus where everything goes can exist, but the signal is likely to be low.

If a user can’t convince anyone that their writing is worth hosting on an exclusive relay, maybe that’s a good thing. Regardless, they can still put up their own relay for others to read from and write to.

Organization of content:

Parameterized replaceable notes, and modular articles composed out of smaller notes.

At a minimum, think about kind 0 (metadata) and kind 1 (text notes).

Now make an analogy to articles:

Article header and article notes (kinds 30040 and 30041 in my client, indextr).

The article header lists the metadata of the article (title, authors, etc.), but also the list of event IDs that compose the article.

Article notes: the textual content of the article, divided into sections (or even paragraphs or sentences, it really doesn’t matter). A tag dictates the functionality of the note (flashcard, article, Jupyter notebook, recipe, etc.), and the client can receive pull requests on how to display specific notes (like a Svelte component or a CSS file).

Replaceable because knowledge is always updating: if the author wants to update the article they can, but the version history will remain for both article headers and notes.

Modular because you can compose new articles from existing notes. The list in the article header can also contain other article headers, or any other type of note.
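Roughly, the two kinds described above might look like this; the tag layout is inferred from the description, so check the indextr repo for the actual format.

```typescript
// Hypothetical shapes for an article header and an article note.

const articleHeader = {
  kind: 30040, // parameterized replaceable: same pubkey + “d” tag replaces prior versions
  tags: [
    ["d", "building-an-llm"], // stable identifier, so edits replace in place
    ["title", "Building an LLM from scratch"],
    ["author", "<pubkey>"],
    // Ordered list of the events that compose the article; entries may point
    // to 30041 sections or to other 30040 headers (modularity).
    ["e", "<event-id-of-section-1>"],
    ["e", "<event-id-of-section-2>"],
  ],
  content: "",
};

const articleSection = {
  kind: 30041,
  tags: [
    ["d", "building-an-llm-section-1"],
    ["type", "jupyter-notebook"], // hypothetical functionality tag
  ],
  content: "…section text…",
};
```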

Now, an article is not just linear, but it can be branching or whatever other weird structure.

Modularity of articles also creates heterogeneous articles. Take a YouTube tutorial about building an LLM: take the transcript, split it into sections, and for every section link the documentation of the library you are using (as another note) along with a Jupyter notebook in each section.

Library documentation: every function is a note with as many varied examples as you want. Interoperability also allows for commenting on those notes, so now it’s documentation crossed with Stack Overflow. Interoperability also means individuals can comment on articles or specific sections.

Still very early, but that’s the basic outline.

https://github.com/limina1/indextr-client/tree/main

You opened my mind to how challenging this project would be.

Aren’t these problems already addressed robustly — nay, solved? — by Wikipedia?

Good points.

Some thoughts:

I'd like to be able to see and search all the edits.

Upvotes and downvotes are cheap and astroturfable, but they can still give you some info on non-controversial topics.

Would it be possible to use proof of work somehow to "upvote" the edits with real world resource expenditure? Astroturfable too, but costly.

Not sure on the last point, but any topic, if controversial enough, could be manipulated if only one criterion determines ranking. That’s why I think it’s best to start off with a big criteria set and let users add to or remove from it.

Ultimately you don’t want rankings, but you could have an option for simplified ranking if you didn’t care too much.

Very interesting thoughts! My two cents:

1) Nostr should embrace a permissionless version of Wikipedia rather than copy its current model, where people fight over editing a single page on a given topic. In #nostrpedia, anyone should be able to create their own version of any page; however, it may not gather any support. Maybe more like GitHub than Wikipedia.

2) Since liking and zapping can be gamed, one idea is to use the burning of sats as a metric, i.e. how many sats people have burned/sacrificed to the page; this can’t be gamed without cost. I’m sure there are Web of Trust approaches to this as well, but at the moment I don’t know enough about them.