The latest fine-tune is still hallucinating a lot. It's especially hard to make it unlearn out-of-date information or biases once learned, such as recommending products that are no longer available or no longer suited to the purpose.

It no longer thinks I'm a hockey player, but I did not start INESS, nor have I written a book called Anarchopedia.

The fine-tune is based on migtissera's Synthia-70B. Not great, not terrible. These QLoRA fine-tunes (12 epochs, learning rate 0.0003) seem to pick up slight hints of my material, but a lot of bias from the original model remains. I have 5000 question-answer examples in the training data. Any hints?
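For context, here is roughly the shape of the training data: 5000 instruction/response pairs, one JSON object per line. This is a sketch assuming the alpaca-style format that axolotl can consume; the field names follow that format, and the example row itself is an invented placeholder.

```python
import json

# Hypothetical example rows in alpaca-style format (instruction/input/output).
# The content here is a placeholder, not real training data.
examples = [
    {
        "instruction": "What bitcoin hardware wallet should I use?",
        "input": "",
        "output": "A direct answer drawn from my books and blogs, no lecture.",
    },
]

# Write one JSON object per line (JSONL), as axolotl expects for this format.
with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

# Sanity check: every row round-trips and carries the three expected keys.
with open("train.jsonl") as f:
    rows = [json.loads(line) for line in f]
assert all({"instruction", "input", "output"} <= set(r) for r in rows)
```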


Discussion

Where can I read more about your project? I've been screwing around with some AI/ML ideas for a week and am interested in understanding more of what people are doing now.

I'm not publishing anything yet; it's still very early stage, so mostly bug reports and questions on GitHub.

TL;DR: I'm a writer, and I'm trying to distill the knowledge from my books and blogs and train a model to help people with the topics I write about, without the traditional AI lecture.

If I ask which bitcoin hardware wallet I should use, I'm not interested in a lecture about how much I should invest in bitcoin and how dangerous it is, and a lecture is what you get with current models.

I currently use axolotl (I also used llama.cpp's finetune command before), and I fine-tune uncensored models.
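A minimal axolotl config along the lines of what I described might look like this. This is a sketch, not my exact config: the base model path is my guess at the Hugging Face repo, the dataset path is a placeholder, and the LoRA rank/alpha/dropout values are common defaults rather than anything I've stated.

```yaml
# Sketch of an axolotl QLoRA config; paths and LoRA hyperparameters are assumptions.
base_model: migtissera/Synthia-70B   # guessed HF path for the base model
load_in_4bit: true
adapter: qlora

datasets:
  - path: train.jsonl                # placeholder: 5000 instruction/response pairs
    type: alpaca

num_epochs: 12                       # as mentioned above
learning_rate: 0.0003                # as mentioned above

lora_r: 64                           # common defaults, not tuned values
lora_alpha: 16
lora_dropout: 0.05
lora_target_linear: true

micro_batch_size: 1
gradient_accumulation_steps: 4
```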