Agentica

Welcome to the Agentica Project! 👋

We are an open-source initiative from the Berkeley Sky Computing Lab, working to democratize reinforcement learning (RL) techniques and develop scalable systems for large language models (LLMs) and agents.

rLLM: Reinforcement Learning for Language Agents

We release rLLM, an open-source framework for post-training language agents via reinforcement learning. With rLLM, you can easily build your own custom agents and environments, train them with reinforcement learning, and deploy them for real-world workloads.

Date: July 1, 2025 | Estimated Reading Time: 10 min | Author: Sijun Tan, Michael Luo, Colin Cai

DeepSWE: Training a Fully Open-sourced, State-of-the-Art Coding Agent by Scaling RL

We release DeepSWE-Preview, a 32B software engineering (SWE) agent trained purely with RL that achieves 59% on SWEBench-Verified with test-time scaling (42.2% Pass@1), topping the SWEBench leaderboard for open-weight models.

Date: July 1, 2025 | Estimated Reading Time: 20 min | Author: Michael Luo, Naman Jain, Jaskirat Singh

DeepCoder: A Fully Open-Source 14B Coder at O3-mini Level

We release DeepCoder-14B-Preview, a code reasoning model finetuned from Deepseek-R1-Distilled-Qwen-14B via distributed RL. It achieves an impressive 60.6% Pass@1 accuracy on LiveCodeBench (+8% improvement), matching the performance of o3-mini-2025-01-31 (Low) and o1-2024-12-17 with just 14B parameters.

Date: July 1, 2025 | Estimated Reading Time: 15 min | Author: Michael Luo, Sijun Tan, Roy Huang

DeepScaleR: Surpassing O1-Preview with a 1.5B Model by Scaling RL

DeepScaleR is an open-source effort to fully democratize reinforcement learning (RL) for LLMs and reproduce DeepSeek R1 and OpenAI O1/O3 at scale on real tasks. We introduce DeepScaleR-1.5B-Preview, a language model finetuned from Deepseek-R1-Distilled-Qwen-1.5B using distributed RL. It achieves an impressive 43.1% Pass@1 accuracy on AIME 2024, surpassing the performance of OpenAI's o1-preview with just 1.5B parameters...

Date: February 10, 2025 | Estimated Reading Time: 10 min | Author: Michael Luo, Sijun Tan