Hey! Yes, you could follow what nostr:npub12262qa4uhw7u8gdwlgmntqtv7aye8vdcmvszkqwgs0zchel6mz7s6cgrkj is recommending. Basically, create a synthetic dataset with two columns: one for questions and one for answers. You can use an LLM to generate this dataset, then embed the answers. I would also recommend an embedding model like Nomic ( https://ollama.com/library/nomic-embed-text ), since it has an interesting prefix system that generally improves the performance and accuracy of queries ( https://huggingface.co/nomic-ai/nomic-embed-text-v1.5#usage ). I can also share the code for Beating Heart, a RAG system that ingests Markdown documents, chunks them semantically, and then embeds them: https://github.com/gzuuus/beating-heart-nostr . Additionally, I find the videos by Matt Williams very instructive: https://www.youtube.com/watch?v=76EIC_RaDNw . Finally, I would say that generating a synthetic dataset isn't strictly necessary if you embed the data smartly.
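To make the prefix idea concrete, here's a minimal sketch of how it could look in Python. The prefix strings come from the Nomic model card linked above (`search_document:` for texts you index, `search_query:` for questions); the helper names and the cosine-similarity function are my own illustration, and the commented-out Ollama call assumes a local Ollama server with `nomic-embed-text` pulled:

```python
import math

# Nomic's prefix system: documents and queries get different task
# prefixes so the model embeds them into compatible spaces.
DOC_PREFIX = "search_document: "
QUERY_PREFIX = "search_query: "

def prefix_document(text: str) -> str:
    # Prepend the document prefix before embedding an answer/chunk.
    return DOC_PREFIX + text

def prefix_query(text: str) -> str:
    # Prepend the query prefix before embedding a user question.
    return QUERY_PREFIX + text

def cosine_similarity(a: list[float], b: list[float]) -> float:
    # Standard cosine similarity for ranking retrieved chunks.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# With a local Ollama server you would embed like this (requires
# `ollama pull nomic-embed-text` and the `ollama` Python package):
#
#   import ollama
#   vec = ollama.embeddings(
#       model="nomic-embed-text",
#       prompt=prefix_document("Some answer text to index"),
#   )["embedding"]
```

The key point is that the same question embedded with and without the `search_query:` prefix lands in different places, so you should prefix consistently at both index time and query time.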
Lots of things to study, I will take a look and experiment!