DeepSeek unveils new AI reasoning method as anticipation for its next-gen model rises. In collaboration with researchers from Tsinghua University, DeepSeek developed a technique that combines methods referred to as generative reward modelling (GRM) and self-principled critique tuning, according to a paper published on Friday. The dual approach aims to enable LLMs to deliver better and faster results to general queries. The resulting DeepSeek-GRM models outperformed existing methods, having “achieved competitive performance” with strong public reward models, the researchers wrote. Reward modeling is a process that guides an LLM towards human preferences.

https://amp.scmp.com/tech/tech-trends/article/3305259/deepseek-unveils-new-ai-reasoning-method-anticipation-its-next-gen-model-rises

Discussion

These guys have flipped the industry on its head... I know innovation when I see it... everything from filesystem innovation to compressing the data NICs send between each other during operations... and the open source is the icing on the cake. 💯

nostr:nevent1qqstq6de2yg7npywjw0qgdv07uxsw2k3fmpzpqz3gk80jyjs8ncnp5spz4mhxue69uhhyetvv9ujuerpd46hxtnfduhsygxkzvekwftqg2n8wuccv4dlruud3nwegn6t4fkqku80fvp9n990qspsgqqqqqqs748te5