Erik Jonker<p>Train your own R1 reasoning model with Unsloth.<br>"We've enhanced the entire GRPO process, making it use 80% less VRAM than Hugging Face + FA2. This allows you to reproduce R1-Zero's 'aha moment' on just 7GB of VRAM using Qwen2.5 (1.5B)"<br><a href="https://mastodon.social/tags/ai" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>ai</span></a> <a href="https://mastodon.social/tags/reasoning" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>reasoning</span></a> <a href="https://mastodon.social/tags/unsloth" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>unsloth</span></a> <a href="https://mastodon.social/tags/opensource" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>opensource</span></a> <a href="https://mastodon.social/tags/locally" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>locally</span></a> <a href="https://mastodon.social/tags/grpo" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>grpo</span></a><br><a href="https://unsloth.ai/blog/r1-reasoning" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="">unsloth.ai/blog/r1-reasoning</span><span class="invisible"></span></a></p>