Selected Publications
View Google Scholar
DISC: Dynamic Decomposition Improves LLM Inference Scaling
Jonathan Light*, Wei Cheng, Benjamin Riviere, Yue Wu, Masafumi Oyamada, Mengdi Wang, Yisong Yue, Santiago Paternain, Haifeng Chen

Adaptive, compute-aware reasoning that dynamically expands or contracts thought steps for efficient problem-solving. Delivers +33% accuracy versus DeepSeek-R1 at equal token budgets and 5-10% lower error on APPS, MATH, and LiveCodeBench.

Strategist: Learning Strategic Skills by LLMs via Bi-Level Tree Search
Jonathan Light*, Min Cai, Weiqin Chen, Guanzhi Wang, Xiusi Chen, Wei Cheng, Yisong Yue, Ziniu Hu

An LLM + MCTS hybrid that learns to optimize its own decision process through self-play and strategy refinement. Outperforms RL baselines and existing LLM agents, and achieves human-level performance in complex games. Highlighted in the State of AI Report 2024 as a breakthrough in autonomous agent design.

Scattered Forest Search: Smarter Code Space Optimization Improves LLM Inference Scaling
Jonathan Light*, Yue Wu, Yiyou Sun, Wenchao Yu, Yanchi Liu, Xujiang Zhao, Ziniu Hu, Haifeng Chen, Wei Cheng

Reframes code generation as black-box optimization, using Scattering, Foresting, and Scouting to improve exploration and feedback exploitation. Achieves state-of-the-art results on HumanEval, MBPP, APPS, CodeContests, and LeetCode (e.g., 87.2% pass@1 on HumanEval) while halving the number of iterations required by prior methods.

PIANIST: Learning Partially Observable World Models with LLMs for Multi-Agent Decision Making
Jonathan Light, Sixue Xing, Yuanzhe Liu, Weiqin Chen, Min Cai, Xiusi Chen, Guanzhi Wang, Wei Cheng, Yisong Yue, Ziniu Hu

Decomposes language-described environments into intuitive components, enabling zero-shot LLM world models for efficient MCTS in multi-agent settings.

Reasoning in Reasoning: A Hierarchical Framework for Better and Faster Neural Theorem Proving
Ziyu Ye, Jiacheng Chen, Jonathan Light, Yifei Wang, Jiankai Sun, Mac Schwager, Philip Torr, Guohao Li, Yuxin Chen, Kaiyu Yang, Yisong Yue, Ziniu Hu

Introduces Reasoning in Reasoning (RiR), a hierarchical planner-actor formulation that unifies decomposition and search to accelerate neural theorem proving across LeanDojo and miniF2F benchmarks.

From Text to Tactic: Evaluating LLMs Playing the Game of Avalon
Jonathan Light*, Min Cai, Sheng Shen, Ziniu Hu

Introduces AvalonBench, a benchmark for LLM agents in strategic social-deduction games, and highlights the gap between current agents and engineered bots.

Dataset Distillation for Offline Reinforcement Learning
Jonathan Light*, Yuanzhe Liu, Ziniu Hu

Synthesizes compact, high-quality offline RL datasets that yield policy performance comparable to training on the full datasets.