What program do you use? I use ollama, but it doesn't allow using models from Hugging Face without modification, which I haven't done yet.


Discussion

Ah, but it does! Once you download the gguf file from Hugging Face, you can use ollama’s create command, passing in a Modelfile that specifies the path to the gguf. Then you can use ollama run to start up the model.

It’s kinda annoying but there are instructions online: https://www.markhneedham.com/blog/2023/10/18/ollama-hugging-face-gguf-models/

I used this technique to run mradermacher/dolphin-2.9.2-mixtral-8x22b-GGUF: https://huggingface.co/mradermacher/dolphin-2.9.2-mixtral-8x22b-GGUF
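For anyone following along, here’s a minimal sketch of that workflow. The gguf file name and the model name are placeholders — substitute whatever you downloaded and whatever you want to call it:

```shell
# Write a minimal Modelfile whose FROM line points at the downloaded gguf
# (the file name below is a placeholder for your actual download)
cat > Modelfile <<'EOF'
FROM ./dolphin-2.9.2-mixtral-8x22b.Q4_K_M.gguf
EOF

# Register the model with ollama under a name of your choosing,
# then start an interactive session with it
ollama create dolphin-mixtral -f Modelfile
ollama run dolphin-mixtral
```

The Modelfile can also set things like the prompt template and sampling parameters, but a bare FROM line is enough to get the model imported.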

Is a gguf the big model file that ends with .safetensors? Sorry I am new to this

No, sorry — gguf is its own file extension, different from .safetensors. For example, this page has some large *.gguf files split into parts (because Hugging Face has a max upload size of 50GB): https://huggingface.co/mradermacher/dolphin-2.9.2-mixtral-8x22b-GGUF

Once you download the two parts, you can combine them into the single *.gguf file that ollama is able to import. Instructions for combining the part files can be found here: https://huggingface.co/TheBloke/KafkaLM-70B-German-V0.1-GGUF
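For the older split style those pages describe (raw byte splits — newer splits made with llama.cpp’s gguf-split tool need its merge mode instead), combining is just concatenation. File names here are placeholders for whatever you downloaded:

```shell
# Raw-split parts are plain byte chunks, so concatenating them in order
# reproduces the original file (part names are placeholders)
cat model.gguf.part1of2 model.gguf.part2of2 > model.gguf

# Sanity check: the combined size should equal the sum of the parts
ls -lh model.gguf*
```

Make sure the parts are listed in order, or the resulting file will be corrupt.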

Well, I guess that's fine for models with gguf files... but I've never seen those. Take this model for example (Satoshi) — there's no gguf, just safetensors files:

https://huggingface.co/LaierTwoLabsInc/Satoshi-7B/tree/main

Yeah, I believe there are tools that can convert them, but I haven’t tried. Once I found that there were already gguf files for the models I wanted to run, I just used those.
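If anyone wants to attempt it, llama.cpp ships a conversion script. I haven't run this end to end, so treat it as a sketch — the script name varies between llama.cpp versions, and it only works if the converter supports the model's architecture:

```shell
# Unverified sketch: convert a directory of .safetensors weights to gguf
# using llama.cpp's converter (script name may differ by version)
git clone https://github.com/ggerganov/llama.cpp
pip install -r llama.cpp/requirements.txt

# Point the converter at the downloaded model directory;
# the output path and quantization type are up to you
python llama.cpp/convert_hf_to_gguf.py /path/to/Satoshi-7B \
  --outfile satoshi-7b.gguf --outtype q8_0
```

The resulting gguf could then be imported into ollama with the Modelfile approach described above.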

If you try the conversion tools, let me know how it goes!