Should I go with https://github.com/ollama/ollama trying to import https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B?

There are too many models to pick one at first. I guess I have to start simple.

Where are the local LLM experts here? #askNostr

nostr:nevent1qqs82a0pzaz7elnhctt98wu5sqkt2pcd5da7zqw9c46ew7483ggpqpczyqrx8x3cdjwpq9ppwc3ve085pyyvfudqcvlz87xk668540m9t78hzqcyqqqqqqgpz3mhxue69uhhyetvv9ujuerpd46hxtnfdug9k2qx
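Before picking a runner, it's worth sanity-checking whether a 32B model fits in memory at all. A rough weight-only estimate (parameter count and bits-per-weight are approximations; KV cache and runtime overhead come on top):

```shell
# Approximate weight footprint of a ~32.8B-parameter model at common
# GGUF quantization levels. Weights only -- KV cache and overhead extra.
params=32800000000
for q in "Q4_K_M 4.8" "Q8_0 8.5" "F16 16"; do
    set -- $q
    awk -v p="$params" -v b="$2" -v n="$1" \
        'BEGIN { printf "%s: ~%.0f GiB\n", n, p * b / 8 / 2^30 }'
done
```

At 4-bit that works out to roughly 18 GiB of weights alone, so even quantized, a 32B model is a tight fit on most single-board hardware.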


Discussion

Try a bunch of them. I like Gemma 3

ollama or llama.cpp, see: https://redlib.catsarch.com/r/OrangePI/s/TAEvKJAK4d

Don't know if it'll use the NPUs out of the box; possibly not. That thread might provide some crumbs to follow up on. I have no experience with that hardware myself. Curious, keep us posted.
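For the Ollama route specifically: the R1 distills are already in the Ollama library, so a manual import of the Hugging Face repo shouldn't be needed. A hedged sketch (model tag assumed from the Ollama library listing; the run is guarded so it's a no-op if ollama isn't installed):

```shell
# Hypothetical quick start, assuming Ollama is installed. The Ollama
# library carries the R1 distills under the "deepseek-r1" tag:
model="deepseek-r1:32b"

# Guarded so this does nothing when ollama isn't on PATH
# (the 32B pull is a ~20 GB download at the default quantization):
if command -v ollama >/dev/null 2>&1; then
    ollama run "$model" "Say hi in one sentence."
else
    echo "ollama not installed; would run: ollama run $model"
fi

# For models that only live on Hugging Face, Ollama can also pull GGUF
# repos directly (placeholder names -- the repo must ship GGUF files):
#   ollama run hf.co/<user>/<repo>:<quant>
```

llama.cpp consumes the same GGUF files if you'd rather skip Ollama entirely.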

I found this for using the NPU

https://blog.tomeuvizoso.net/2025/05/rockchip-npu-update-5-progress-on.html

but it seems very hacky, and I guess the road to getting it to work will be bumpy