Perhaps the full model is cheaper than o1 etc., but the distills are pointless. They cost exactly the same to run, per token, as the models they are based on, yet they perform worse than those models.