Hacker News
Deepseek R1 Zero learns to reason using reinforcement learning on base model [pdf] (github.com/deepseek-ai)
6 points by virde on Jan 20, 2025



