Can we make AI less power-hungry?

In November 2024, the U.S. Federal Energy Regulatory Commission (FERC) rejected Amazon’s request to buy an additional 180 megawatts of power directly from the Susquehanna nuclear plant for a nearby data center. The decision underscored concerns about fair grid usage as data centers increasingly drive electricity demand. After two decades of flat electricity demand, consumption forecasts are now rising sharply, fueled in large part by AI-driven computing.

This growing demand traces back to the 2012 AlexNet breakthrough, which demonstrated the potential of deep learning models trained on GPUs. In the years that followed, AI models scaled dramatically, from tens of GPUs to hundreds, and eventually to clusters of thousands. For a time, efficiency improvements kept overall data center power consumption stable, but the arrival of large transformer-based models, popularized by ChatGPT in 2022, marked a turning point. AI energy consumption has since surged, pushing the limits of data center efficiency.

According to Lawrence Berkeley National Laboratory, U.S. data center electricity usage grew from 76 TWh in 2018 to 176 TWh in 2023. AI model training, particularly for large language models (LLMs), is a key driver, because a training run keeps thousands of GPUs operating at full capacity for extended periods. OpenAI, for instance, reportedly used over 25,000 Nvidia Ampere GPUs for 100 days to train GPT-4, consuming an estimated 50 GWh, enough to power a small town for a year. Even so, training accounts for only roughly 40% of an AI model’s total energy use; the inference phase, in which the model processes day-to-day queries, consumes the remaining 60%.
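The GPT-4 estimate can be sanity-checked with a little arithmetic. The sketch below uses only the figures quoted above (25,000 GPUs, 100 days, 50 GWh) to work out the GPU-hours involved and the average power draw per GPU that such a total would imply. The function name is made up for illustration, and the reading of the result as a whole-system figure (servers, networking, cooling) is an assumption, not something the source reports.

```python
def implied_power_per_gpu_kw(num_gpus, days, total_energy_gwh):
    """Average power per GPU (in kW) implied by a total energy estimate."""
    gpu_hours = num_gpus * days * 24          # every GPU runs around the clock
    total_kwh = total_energy_gwh * 1_000_000  # 1 GWh = 1,000,000 kWh
    return total_kwh / gpu_hours

gpu_hours = 25_000 * 100 * 24
kw = implied_power_per_gpu_kw(25_000, 100, 50)
print(f"{gpu_hours:,} GPU-hours, about {kw:.2f} kW per GPU")
```

The result, a little over 0.8 kW per GPU, is plausible only if the 50 GWh estimate covers more than the GPU chips themselves, which is one reason third-party figures like this carry wide error bars.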

To counteract rising AI energy consumption, researchers are exploring optimization strategies. Pruning trims neural network parameters that contribute little to the output, and quantization stores the remaining values at lower numerical precision; both reduce the computation and memory a model needs. Nvidia’s quantization-aware training has demonstrated up to a 51% reduction in memory needs while maintaining performance. Another approach, pioneered by researcher Jae-Won Chung, optimizes how work is spread across GPUs through a tool called Perseus, which slows down GPUs whose work is not on the critical path so that energy is saved without lengthening overall training time.
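To make pruning and quantization concrete, here is a minimal sketch in plain Python. It applies magnitude pruning (zeroing the smallest weights) followed by simple symmetric 8-bit post-training quantization. This is not Nvidia’s quantization-aware training, which adjusts the model during training; the function names and the sample weights are invented for illustration.

```python
def prune_by_magnitude(weights, fraction):
    """Zero out the given fraction of weights with the smallest absolute value."""
    k = int(len(weights) * fraction)
    if k == 0:
        return list(weights)
    # Indices of the k weights closest to zero.
    drop = set(sorted(range(len(weights)), key=lambda i: abs(weights[i]))[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]

def quantize_int8(weights):
    """Symmetric quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    return [round(w / scale) for w in weights], scale

def dequantize(quantized, scale):
    """Recover approximate float values from the stored integers."""
    return [v * scale for v in quantized]

weights = [0.82, -0.03, 0.41, 0.007, -1.27, 0.15]  # hypothetical layer weights
pruned = prune_by_magnitude(weights, 0.5)
q, scale = quantize_int8(pruned)
restored = dequantize(q, scale)
print(pruned)
print(q)
```

Each stored value now fits in one byte instead of the four bytes of a 32-bit float, and the zeros left by pruning can be skipped or compressed. The cost is a small rounding error, at most half of `scale` per weight in this scheme.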

Even so, the overall trajectory of AI energy consumption remains a concern, and estimates of future data center electricity demand vary widely. A Lawrence Berkeley Lab study predicts that by 2028, U.S. data centers could require between 325 and 580 TWh annually, representing up to 12% of the nation’s electricity consumption. Goldman Sachs projects an 8% share by 2030, while EPRI puts the figure between 4.6% and 9.1%. Some regions, such as Virginia and Ireland, already feel a disproportionate impact, with data centers consuming as much as a quarter of total electricity production.
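The width of the Lawrence Berkeley range is easier to appreciate as a growth rate. The sketch below takes the 176 TWh figure for 2023 cited earlier and the 325 to 580 TWh range for 2028, and computes the compound annual growth each end of the range would imply. It also backs out the total U.S. consumption implied by the statement that 580 TWh corresponds to 12%. The function name is hypothetical, and the calculation assumes smooth year-over-year growth, which real demand will not follow exactly.

```python
def annual_growth_rate(start, end, years):
    """Compound annual growth rate needed to go from start to end."""
    return (end / start) ** (1 / years) - 1

usage_2023 = 176                 # TWh, U.S. data centers in 2023
low_2028, high_2028 = 325, 580   # TWh, projected range for 2028
years = 2028 - 2023

low_rate = annual_growth_rate(usage_2023, low_2028, years)
high_rate = annual_growth_rate(usage_2023, high_2028, years)

# If 580 TWh is 12% of national consumption, the implied total is:
implied_total = high_2028 / 0.12

print(f"Implied growth: {low_rate:.1%} to {high_rate:.1%} per year")
print(f"Implied U.S. total in 2028: {implied_total:,.0f} TWh")
```

Under these assumptions the range works out to roughly 13% to 27% growth per year, so the high scenario implies demand growing about twice as fast as the low one.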

A key issue is transparency. Companies like OpenAI and Google have not disclosed precise power consumption figures, which leaves third-party estimates uncertain. Researchers argue that without clear data, effective energy efficiency strategies are difficult to develop. Hardware advances and software optimizations continue, but the fundamental question remains: can efficiency improvements keep pace with ever-expanding AI workloads? If not, the industry may soon face serious energy constraints. As Mosharaf Chowdhury of the ML Energy Initiative puts it, the real challenge is not just managing AI energy consumption, but ensuring it remains sustainable in the long term.

https://arstechnica.com/ai/2025/03/can-we-make-ai-less-power-hungry-these-researchers-are-working-on-it