AI companies, including OpenAI, are reportedly moving away from the traditional approach of scaling large language models by increasing data and computing power. Instead, they are developing new training methods that emulate more human-like thinking processes. This shift comes as AI firms face delays and growing challenges in their pursuit of ever-larger models.
AI researchers have reported problems with this scaling approach: despite ever-larger datasets and greater computing capacity, performance gains have slowed. Training runs for large models can cost tens of millions of dollars and are further complicated by hardware failures and power shortages. In addition, the models' appetite for training data is outpacing what is readily available.
In response, companies are exploring 'test-time compute', a technique that improves model output during the inference phase by having the model evaluate multiple candidate responses in real time and select the most promising one. OpenAI's new o1 model uses this approach, applying multi-step reasoning and incorporating expert feedback to improve performance. Other top AI labs, such as Anthropic, xAI, and Google DeepMind, are reportedly working on similar strategies.
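To illustrate the general idea, the sketch below shows one simple form of test-time compute, best-of-N sampling: the model produces several candidate answers and a scoring function keeps the highest-rated one. The function names (generate_candidates, score_candidate) and their stubbed logic are hypothetical placeholders, not OpenAI's implementation or the method behind o1.

```python
import random

# Hypothetical stand-ins for a real model's sampling and scoring calls;
# the names and the stubbed behaviour are illustrative only.
def generate_candidates(prompt: str, n: int) -> list[str]:
    """Sample n candidate answers from a model (stubbed with canned text here)."""
    return [f"candidate answer {i} to: {prompt}" for i in range(n)]

def score_candidate(prompt: str, answer: str) -> float:
    """Rate an answer, e.g. via a verifier or reward model (stubbed with randomness)."""
    return random.random()

def best_of_n(prompt: str, n: int = 8) -> str:
    """Spend extra compute at inference: sample several answers, return the best-scoring one."""
    candidates = generate_candidates(prompt, n)
    return max(candidates, key=lambda answer: score_candidate(prompt, answer))

if __name__ == "__main__":
    print(best_of_n("Summarise the shift from pre-training scale to test-time compute."))
```

The trade-off this sketch captures is the one driving the hardware discussion: more useful work happens per query at inference time rather than once, up front, during training.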
A shift in focus from large-scale pre-training to inference, served from distributed, cloud-based systems, could disrupt the AI hardware market. Nvidia's dominance in the chip market, built on soaring demand for training chips, may be tested as attention moves towards inference hardware, where new competition could emerge. AI investors are watching these developments closely, as they may reshape the industry's hardware requirements.