Yes, it is dangerous to fully trust a model generated by big corps!

Training is two steps: 1) curation of the data, and 2) the actual changing of the weights. The second step is pretty automated; there are tools like llama-factory for it.
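For a sense of what step 2 consumes, here's a minimal sketch of preparing a dataset file in the Alpaca-style instruction format that SFT tools like llama-factory can ingest. The record content and file name are just illustrative, not my actual data:

```python
import json

# Hypothetical records in the Alpaca-style format (instruction /
# input / output) that tools like llama-factory commonly accept.
# The real content would come from the curation step.
samples = [
    {
        "instruction": "Summarize my note on local model training.",
        "input": "",
        "output": "Training splits into data curation and weight updates...",
    },
]

# Write one JSON file the training tool can be pointed at.
with open("my_dataset.json", "w", encoding="utf-8") as f:
    json.dump(samples, f, ensure_ascii=False, indent=2)
```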

The first step is Python scripts that go through my notes and decide what is knowledge and what is chat, removing things like news and LLM-generated content. I don't want other LLM-generated content to influence my model. Roughly like the sketch below.
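A simplified sketch of that curation pass; the marker phrases, the `notes/` directory layout, and the word-count threshold are all illustrative stand-ins, not my real heuristics:

```python
import pathlib

# Phrases that often signal LLM-generated or news content;
# purely illustrative lists, not an exhaustive filter.
LLM_MARKERS = ["as an ai language model", "i hope this helps"]
NEWS_MARKERS = ["breaking:", "press release"]

def classify(text: str) -> str:
    """Rough triage: drop suspected LLM output and news,
    then split what's left into knowledge vs chat."""
    lower = text.lower()
    if any(m in lower for m in LLM_MARKERS):
        return "drop_llm_generated"
    if any(m in lower for m in NEWS_MARKERS):
        return "drop_news"
    # Crude heuristic: short conversational snippets count as
    # chat, longer prose counts as knowledge.
    return "chat" if len(text.split()) < 40 else "knowledge"

buckets = {"knowledge": [], "chat": []}
for path in pathlib.Path("notes").glob("**/*.txt"):
    label = classify(path.read_text(encoding="utf-8"))
    if label in buckets:
        buckets[label].append(path.name)

print({k: len(v) for k, v in buckets.items()})
```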

That's another danger: the big corps' LLMs are kind of accepted as ground truth when training little models. That's very scary.
