That's not really DeepSeek R1; it's a distilled version of Alibaba's Qwen-32B architecture, enhanced using synthetic outputs from the larger DeepSeek R1 model.
Quite useful, but not the same thing.
It *is* R1, which is the name for the distilled version you describe. The bigger model is called V3.