Yeah, I'm down the LLM rabbit hole lately, but check this out. You can now run LLaMA on a low-end DigitalOcean droplet.
Discussion
Same, still getting a handle on what they can do, with LoRAs and all. It's tough to keep up with everything.
Now I have to try this 30B LLaMA model
Oracle Cloud used to give you 4 Ampere A1 ARM cores with 24 GB of RAM for free; might be worth a look.
I use them for running BTCPay Server, but this might be a better use :)