If you just want your LLMs to read the content, then something like MongoDB would work well since it’s a document store, and you can attach your own metadata to each document.
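A minimal sketch of that side with pymongo, assuming a local MongoDB instance; the database/collection names and metadata fields here are just placeholders:

```python
from pymongo import MongoClient

# Assumed local MongoDB; adjust the URI to your deployment.
client = MongoClient("mongodb://localhost:27017")
collection = client["llm_content"]["documents"]

# Store the raw content plus whatever metadata you want your LLM to see.
doc_id = collection.insert_one({
    "title": "notes-2024-01.md",        # example metadata fields (assumed)
    "source": "obsidian-vault",
    "tags": ["research", "llm"],
    "content": "Full text of the file goes here...",
}).inserted_id

print(doc_id)  # Mongo's _id, handy as a pointer from other systems
```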

If you want to query against the data without iterating through the documents each time, then you might want to embed your content (turn it into vectors) and store the embeddings in a vector database like Qdrant or similar.
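Rough sketch of that ingestion step, assuming qdrant-client plus sentence-transformers for the embeddings; collection name, model, and payload fields are all placeholders. The Mongo `_id` goes into the payload so each vector points back at its source document:

```python
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams, PointStruct
from sentence_transformers import SentenceTransformer

qdrant = QdrantClient(url="http://localhost:6333")      # assumed local Qdrant
model = SentenceTransformer("all-MiniLM-L6-v2")          # 384-dim embeddings

# Drops and recreates the collection; fine for experimenting, not for prod.
qdrant.recreate_collection(
    collection_name="documents",
    vectors_config=VectorParams(size=384, distance=Distance.COSINE),
)

content = "Full text of the file goes here..."
vector = model.encode(content).tolist()

qdrant.upsert(
    collection_name="documents",
    points=[PointStruct(
        id=1,                                    # any int or UUID works
        vector=vector,
        payload={"mongo_id": "65a1f2c3d4e5f6a7b8c9d0e1"},  # pointer to Mongo doc
    )],
)
```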


Discussion

Yeah, that's more what I had in mind: vectorizing alongside the actual database of files, with pointers back to the documents so I can recall them later. I'll look into Mongo + Qdrant and start playing around, thanks 🙏
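For reference, the recall direction could look roughly like this, again assuming the `mongo_id` pointer is stored in each Qdrant payload as in the ingestion sketch above:

```python
from pymongo import MongoClient
from bson import ObjectId
from qdrant_client import QdrantClient
from sentence_transformers import SentenceTransformer

qdrant = QdrantClient(url="http://localhost:6333")
mongo = MongoClient("mongodb://localhost:27017")["llm_content"]["documents"]
model = SentenceTransformer("all-MiniLM-L6-v2")

# Embed the question, find the nearest vectors, then follow each payload's
# mongo_id pointer back to the full document in MongoDB.
hits = qdrant.search(
    collection_name="documents",
    query_vector=model.encode("what did I write about retrieval?").tolist(),
    limit=3,
)

for hit in hits:
    full_doc = mongo.find_one({"_id": ObjectId(hit.payload["mongo_id"])})
    print(hit.score, full_doc["title"] if full_doc else "missing")
```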