Either use an LLM with a context window big enough to fit the whole document, or use RAG to find the relevant parts of the guidebook and pass that context to the LLM for each question.
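
In case it helps, here's a rough sketch of the first option (long context): just read the whole guidebook into the prompt and ask. This assumes an OpenAI-style client; the model name and the guidebook.txt path are only placeholders, not a specific recommendation.

```python
# Long-context approach: put the entire guidebook into the prompt.
# Assumes an OpenAI-style client; model name and path are placeholders.
from openai import OpenAI

client = OpenAI()

def answer_from_guidebook(question: str, path: str = "guidebook.txt") -> str:
    with open(path, encoding="utf-8") as f:
        guidebook = f.read()  # must fit inside the model's context window

    response = client.chat.completions.create(
        model="gpt-4o",  # any long-context model works here
        messages=[
            {"role": "system",
             "content": "Answer using only the guidebook below.\n\n" + guidebook},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content

print(answer_from_guidebook("What does the guidebook say about refunds?"))
```

The RAG option is the same idea, except you retrieve only the relevant chunks instead of sending the whole document (see the sketch further down the thread).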


Discussion

Thanks! Doesn't sound so easy yet.. 🤣

It's kind of a bit of both. Finding a model with a large enough context window isn't too hard; Gemini 1.5 Pro, for instance, can take whole movie scripts or huge codebases and find the right answer in them.

I implemented a RAG LLM app that runs locally on my machine with a local model; it scans my source code directories and answers questions about them. It was less than 100 lines of Python.
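
To give a feel for how small something like that can be, here's a minimal sketch in that spirit (not my exact app). It assumes the sentence-transformers and ollama Python packages and a locally pulled model; the model names, file extensions, and project path are just illustrative.

```python
# Minimal local RAG sketch: embed source files, retrieve the closest ones
# to the question, and let a local model answer from that context.
# Assumes sentence-transformers + ollama are installed and "llama3" is pulled.
from pathlib import Path

import numpy as np
import ollama
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")

def load_chunks(src_dir: str, exts=(".py", ".md")) -> list[str]:
    # Treat each source file as one chunk; a real app would split large files.
    return [p.read_text(errors="ignore")
            for p in Path(src_dir).rglob("*")
            if p.is_file() and p.suffix in exts]

def top_k(question: str, chunks: list[str], k: int = 3) -> list[str]:
    # Cosine similarity between the question and every chunk embedding.
    q = embedder.encode([question], normalize_embeddings=True)
    c = embedder.encode(chunks, normalize_embeddings=True)
    scores = (c @ q.T).ravel()
    return [chunks[i] for i in np.argsort(scores)[::-1][:k]]

def ask(question: str, src_dir: str = "./my_project") -> str:
    context = "\n\n---\n\n".join(top_k(question, load_chunks(src_dir)))
    reply = ollama.chat(
        model="llama3",
        messages=[
            {"role": "system",
             "content": "Answer using only this context:\n" + context},
            {"role": "user", "content": question},
        ],
    )
    return reply["message"]["content"]

print(ask("Where is the retry logic implemented?"))
```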

But productionising it and making it run at scale is a whole other topic that I'm not too familiar with yet.