LLM · Tags

Scalable Chain of Thoughts via Elastic Reasoning

Salesforce AI Research developed 'Elastic Reasoning,' a framework that enables Large Reasoning Models (LRMs) to operate effectively under strict output length constraints.

Feb 11, 2026

AI LLM Research

L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning

Researchers at Carnegie Mellon University introduced Length Controlled Policy Optimization (LCPO), an RL-based method that trains large language models to precisely control the length of their reasoning steps.

Feb 4, 2026

AI LLM Research

s1: Simple test-time scaling

Researchers at Stanford, UW, and AI2 developed `s1-32B`, an open-source model that achieves state-of-the-art reasoning performance and clear test-time scaling on challenging benchmarks.

Feb 4, 2026

AI LLM Research

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

DeepSeek-AI developed DeepSeek-R1, an LLM demonstrating that sophisticated reasoning capabilities can emerge through pure outcome-based reinforcement learning without reliance on human-annotated reasoning trajectories.

Jan 29, 2026

AI LLM Research