This article explores **how much further the computing power used to train reasoning models can scale** and what that means for future AI progress. Reasoning models, like OpenAI's o3, are trained after their initial development with methods such as reinforcement learning on difficult problems. o3 reportedly used 10 times the training compute of its predecessor, o1, but the author argues this rapid scaling pace is **unlikely to continue for many more orders of magnitude** before it runs into the limits of total available training compute. Public information on the exact compute scale of models like o1 and o3 is limited, but data on other models such as DeepSeek-R1, together with expert opinion, suggests the compute spent on this reasoning training stage is still small relative to total AI training costs and is growing quickly, indicating **potential for significant near-term improvements**. Continued scaling of reasoning training could, however, face hurdles such as finding enough high-quality training data and uncertainty about how well reasoning generalizes beyond domains like math and coding. Despite these challenges, researchers are generally optimistic that further rapid scaling, and the capability gains that come with it, remain likely in the near future.
https://epoch.ai/gradient-updates/how-far-can-reasoning-models-scale
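To make the scaling argument concrete, here is a minimal back-of-envelope sketch in Python. The specific FLOP figures below are illustrative assumptions, not values from the article; only the roughly 10x per-generation growth in reasoning training compute is reported. The point is just that a stage growing 10x per generation from a small base catches up to the total training budget within a handful of generations.

```python
import math

# All numbers below are illustrative assumptions, not figures from the article.
total_training_compute = 1e26   # assumed total FLOP budget of a frontier training run
reasoning_compute_now = 1e23    # assumed FLOP currently spent on the reasoning/RL stage
growth_per_generation = 10      # o3 reportedly used ~10x the reasoning compute of o1

# How many orders of magnitude separate the reasoning stage from the total budget,
# and how many 10x jumps fit in that gap?
gap_in_orders_of_magnitude = math.log10(total_training_compute / reasoning_compute_now)
generations_until_parity = gap_in_orders_of_magnitude / math.log10(growth_per_generation)

print(f"Gap: {gap_in_orders_of_magnitude:.1f} orders of magnitude")
print(f"Generations of 10x growth until parity: {generations_until_parity:.1f}")
# With these assumed numbers, only about three more 10x jumps fit before the reasoning
# stage rivals the entire training budget, after which its growth must slow to the
# overall rate at which frontier training compute expands.
```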