Did you train it on that M2 it runs on or on some heavyweight desktop or GPU rig? 🤔
GPU rig, but a Mac with more RAM is also doable using llama.cpp; I've tried that too. It's maybe 10x slower, so if you already have a Mac, expect one day instead of two hours. Of course, if you have a few bucks to spare, cloud GPU time for QLoRA is very cheap, but I decided against the cloud, partly to learn and partly for proper cypherpunk.