Nostr Web Client

There are lots of topics (millions?) in the replies. Most people want a mix, from very general things like "science" when they don't know much about a field, all the way to specific things like "sha256" when they work in it. But we can make it work with whatever the bot outputs.

Maybe the bot can add the 5 most representative labels for each post?
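A rough sketch of what "the 5 most representative labels" could look like, assuming the bot already produces a relevance score per label. The score format, the `top_labels` name and the sample values are all made up for illustration:

```python
import heapq

def top_labels(scores, k=5):
    """Return the k highest-scoring labels, best first."""
    # scores: dict mapping label -> relevance score from any model (hypothetical format)
    best = heapq.nlargest(k, scores.items(), key=lambda item: item[1])
    return [label for label, _ in best]

# Made-up scores for a single post
post_scores = {
    "science": 0.91, "cryptography": 0.88, "sha256": 0.75,
    "bitcoin": 0.40, "math": 0.35, "cooking": 0.02, "sports": 0.01,
}
print(top_labels(post_scores))
```

This keeps both the general labels and the specific ones whenever the model scores them highly, so a client can show whichever granularity the reader prefers.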

Low Information Voter 1y ago

Offering many very specific labels: a vector store like Weaviate or FAISS coupled with an LLM can do it and do it well, but man... we are talking datacenter-level resources here.

Not many organisations can offer that, and I'm not sure I want them curating my feed.

Offering a couple of dozen general categories, though, is something we could run in the client or on a modest VPS, with maybe a BM25 full-text search for specific terms (will be slow).
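For what it's worth, a minimal pure-Python sketch of BM25 over notes the client already has. The `BM25Index` class, the tokenizer and the `k1`/`b` defaults are my own assumptions, not an existing library. It scores every note on every query, which is exactly why it will be slow on a large cache:

```python
import math
import re
from collections import Counter

def tokenize(text):
    # Lowercase alphanumeric tokens; a real client would want something smarter
    return re.findall(r"[a-z0-9]+", text.lower())

class BM25Index:
    def __init__(self, docs, k1=1.5, b=0.75):
        self.k1, self.b = k1, b
        self.n = len(docs)
        self.doc_tf = [Counter(tokenize(d)) for d in docs]
        self.doc_len = [sum(tf.values()) for tf in self.doc_tf]
        self.avg_len = sum(self.doc_len) / self.n if self.n else 0.0
        # Document frequency: number of notes containing each term
        self.df = Counter()
        for tf in self.doc_tf:
            self.df.update(tf.keys())

    def idf(self, term):
        df = self.df.get(term, 0)
        return math.log(1 + (self.n - df + 0.5) / (df + 0.5))

    def score(self, query, i):
        tf, length = self.doc_tf[i], self.doc_len[i]
        total = 0.0
        for term in tokenize(query):
            f = tf.get(term, 0)
            if f == 0:
                continue
            # Term frequency saturation with document length normalization
            norm = f + self.k1 * (1 - self.b + self.b * length / self.avg_len)
            total += self.idf(term) * f * (self.k1 + 1) / norm
        return total

    def search(self, query, limit=10):
        # Linear scan over all notes: simple, but slow for big caches
        hits = [(self.score(query, i), i) for i in range(self.n)]
        hits = [(s, i) for s, i in hits if s > 0]
        hits.sort(key=lambda pair: -pair[0])
        return [i for _, i in hits[:limit]]

notes = [
    "sha256 is a hash function used in bitcoin mining",
    "I baked bread today",
    "hash collisions and sha256 preimage resistance",
]
index = BM25Index(notes)
print(index.search("sha256"))
```

An inverted index (term -> list of note indices) would avoid scanning notes that share no term with the query, at the cost of more memory in the client.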

Discussion

Low Information Voter 1y ago

General categories we can do with a BoW filter fed forward into a modest CNN.
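To make the bag-of-words part concrete, here is a small sketch of the featurization step only. The vocabulary and categories are invented, and a hand-written weight table stands in for the trained network: it is a plain linear scorer, not a CNN, and a real model would replace `WEIGHTS` with learned parameters.

```python
import re
from collections import Counter

# Hypothetical fixed vocabulary; a real one would have thousands of words
VOCAB = ["bitcoin", "hash", "protein", "galaxy", "recipe", "oven"]

# Hypothetical hand-set weights (category -> one weight per vocab word).
# Stand-in for a trained network, NOT a CNN.
WEIGHTS = {
    "technology": [1.0, 1.0, 0.0, 0.0, 0.0, 0.0],
    "science":    [0.0, 0.0, 1.0, 1.0, 0.0, 0.0],
    "food":       [0.0, 0.0, 0.0, 0.0, 1.0, 1.0],
}

def bow_vector(text, vocab=VOCAB):
    """Count how often each vocabulary word occurs in the text."""
    counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    return [counts[word] for word in vocab]

def categorize(text):
    """Return the best-scoring general category, or None if nothing matches."""
    x = bow_vector(text)
    scores = {cat: sum(w * v for w, v in zip(ws, x)) for cat, ws in WEIGHTS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

print(bow_vector("Hash the hash with bitcoin"))
print(categorize("new galaxy survey and protein folding"))
```

The input vector has a fixed size no matter how long the note is, which is what keeps the downstream model small enough for a client or a modest VPS.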

Specific categories need a serious LLM and a vector database of context, or else the accuracy will be hilarious.

Vitor Pamplona 1y ago

We can run multiple bots, each using a different pubkey. People will decide to follow whatever works best for them. We could have multiple algorithms running in parallel.
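One way this could look, assuming the bots publish NIP-32 style label events (kind 1985 with `L`/`l` tags). Nothing above says that is the plan, so treat the event shape as my assumption. The namespace and the function names are hypothetical, and the events are left unsigned (no `id` or `sig`):

```python
import time

LABEL_KIND = 1985             # NIP-32 label event kind (assumed, not confirmed above)
NAMESPACE = "topics.example"  # hypothetical label namespace

def build_label_event(bot_pubkey, note_id, labels, created_at=None):
    """Build an unsigned label event attaching topic labels to one note."""
    tags = [["L", NAMESPACE]]
    tags += [["l", label, NAMESPACE] for label in labels]
    tags.append(["e", note_id])
    return {
        "pubkey": bot_pubkey,
        "kind": LABEL_KIND,
        "created_at": int(time.time()) if created_at is None else created_at,
        "tags": tags,
        "content": "",
    }

def labels_by_bot(events, note_id, followed_bots):
    """Collect labels for a note, keeping only bots the user follows."""
    result = {}
    for ev in events:
        if ev["kind"] != LABEL_KIND or ev["pubkey"] not in followed_bots:
            continue
        targets = {t[1] for t in ev["tags"] if len(t) > 1 and t[0] == "e"}
        if note_id not in targets:
            continue
        found = [t[1] for t in ev["tags"] if len(t) > 1 and t[0] == "l"]
        result.setdefault(ev["pubkey"], []).extend(found)
    return result

events = [
    build_label_event("bot_a", "note1", ["science", "sha256"], created_at=1),
    build_label_event("bot_b", "note1", ["memes"], created_at=2),
    build_label_event("bot_a", "note2", ["food"], created_at=3),
]
print(labels_by_bot(events, "note1", {"bot_a"}))
```

Because each algorithm has its own pubkey, following or unfollowing a bot is all it takes to switch which labels shape the feed.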

Low Information Voter 1y ago

We could. There will be a limited number of actors able to finance such a service, however.

If I may make a suggestion, we could run the model in the client, using notes already downloaded.

A default topic model could be downloaded on first run or bundled with the app (~50 MB).

Menu of topic models somewhere in settings:

- L.I.V's Mad Science Topic Model

- Leserin's Overthinking Everyday Topic Model

- Onyx's I Know What Boys Like Topic Model

Etc.

Building a model is less of a commitment than hosting one, and the processing is offloaded to the client instead of relying on angel funding or whatever.
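The in-client setup could be sketched roughly like this. Everything here is hypothetical: the `TopicModel` and `Client` types, the registry keys, and the toy keyword matcher that stands in for a real ~50 MB model file.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class TopicModel:
    name: str
    classify: Callable[[str], List[str]]  # note text -> topic labels

def keyword_model(keywords):
    """Toy stand-in for a real model: a topic matches if any keyword appears."""
    def classify(text):
        words = set(text.lower().split())
        return sorted(topic for topic, kws in keywords.items() if words & kws)
    return classify

# Hypothetical settings menu: key -> installed topic model
MODELS = {
    "mad-science": TopicModel(
        "L.I.V's Mad Science Topic Model",
        keyword_model({"science": {"physics", "chemistry"}, "crypto": {"sha256", "bitcoin"}}),
    ),
    "everyday": TopicModel(
        "Leserin's Overthinking Everyday Topic Model",
        keyword_model({"food": {"bread", "coffee"}}),
    ),
}

@dataclass
class Client:
    models: Dict[str, TopicModel]
    selected: str
    cached_notes: Dict[str, str] = field(default_factory=dict)  # note id -> text

    def select_model(self, key):
        if key not in self.models:
            raise KeyError(f"unknown topic model: {key}")
        self.selected = key

    def label_cached_notes(self):
        # Runs entirely on notes already downloaded; nothing leaves the client
        model = self.models[self.selected]
        return {nid: model.classify(text) for nid, text in self.cached_notes.items()}

client = Client(MODELS, "mad-science", {"n1": "learning sha256 and physics", "n2": "fresh bread"})
print(client.label_cached_notes())
client.select_model("everyday")
print(client.label_cached_notes())
```

Switching models only changes which classifier runs over the local cache, so trying someone else's model costs a download rather than a hosted service.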

Vitor Pamplona 1y ago

You can also put the labels behind a private relay that only paying customers can access.

Low Information Voter 1y ago

That works, too.