Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling.

https://arxiv.org/abs/2502.06703
