What are ways to reduce the environmental impact of using AI?

If you're committed to AI and to reducing your carbon emissions, you face something of a conundrum. All the major cloud providers have reported significant increases in their Scope 3 emissions due to data center expansion and the inherent inefficiency of current AI processors and models. Just recently, DeepSeek claimed that its model was trained at far lower cost and with fewer resources, but the claims are very difficult to verify and, in any case, the model was reportedly trained on the output of existing LLMs such as GPT-4, so that power had already been consumed elsewhere.
In this blog we'll explore:

- the new Hugging Face AI Energy Score leaderboard
- small language models (SLMs) and moving AI workloads to the edge
- composite AI, metadata tagging and the importance of citations
There is some hope, though. Until recently it was extremely difficult to get reliable analysis of the relative efficiency of LLMs, but Hugging Face has now created its AI Energy Score leaderboard (a Hugging Face Space by AIEnergyScore) that allows direct comparison between models.
The scoring methodology is currently limited in that there's no obvious analysis of efficiency versus accuracy. If a model is efficient but you have to query it multiple times to elicit a reliable response, then its base efficiency is somewhat irrelevant.
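To make that concrete, here's a back-of-the-envelope sketch in Python, with purely illustrative numbers: the metric that matters is energy per reliable answer, not energy per query.

```python
# Effective energy per *reliable* answer: a model that looks efficient
# per query can still cost more overall if it needs several attempts.
# All figures below are illustrative, not measured values.

def effective_energy(wh_per_query: float, queries_per_reliable_answer: float) -> float:
    """Energy consumed to obtain one answer you can actually trust."""
    return wh_per_query * queries_per_reliable_answer

model_a = effective_energy(wh_per_query=0.5, queries_per_reliable_answer=3)  # 1.5 Wh
model_b = effective_energy(wh_per_query=1.0, queries_per_reliable_answer=1)  # 1.0 Wh

print(f"Model A: {model_a} Wh per reliable answer")
print(f"Model B: {model_b} Wh per reliable answer")
# The "less efficient" Model B wins once reliability is priced in.
```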
Nevertheless, this is a solid step forward and, as it evolves, it will give Chief Sustainability Officers a way to begin steering IT decisions.
Small language models (SLMs) can be extremely effective for specific tasks. If you want your AI to interrogate a particular type of document, say legal documents, there may be an SLM that has been trained specifically for that document type. Such SLMs can be hosted locally or on your preferred cloud platform and can greatly reduce the environmental and monetary cost per transaction.
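As a sketch of how simple local hosting can be, here's a minimal example using the Hugging Face transformers library. The model name is just one illustrative small model; a legal-domain SLM would slot in the same way.

```python
# Minimal local SLM inference via the Hugging Face transformers pipeline.
# TinyLlama is used purely as an example of a small, locally runnable model.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="TinyLlama/TinyLlama-1.1B-Chat-v1.0",  # illustrative small model
)

prompt = "Summarise the termination clause below in plain English:\n..."
result = generator(prompt, max_new_tokens=200)
print(result[0]["generated_text"])
```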
Linked to the use of SLMs, it's becoming increasingly likely that we can move some of the AI workload to the edge, that is, to laptops, local servers or even mobile devices, so long as the models are small and reliable enough to run locally. This has the potential to dramatically reduce the data center expansion required to power the AI revolution, and it could also provide a more immediate user experience. A new breed of 'AI PCs' has recently arrived on the market, pairing AI neural processors with the CPUs we've come to know and love. There aren't many compelling use cases for these AI PCs yet, but as their capabilities expand and prices come down (they're currently significantly more expensive than regular PCs), software providers are bound to start exploiting them.
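Here's a minimal sketch of fully on-device inference, assuming the llama-cpp-python bindings and a locally downloaded quantized GGUF model (the file path is hypothetical): everything runs on the local machine, with no data center round trip.

```python
# On-device inference with llama-cpp-python and a quantized model file.
# The model path is a placeholder; any small GGUF model would work.
from llama_cpp import Llama

llm = Llama(model_path="./models/slm-q4.gguf", n_ctx=2048)  # hypothetical path

out = llm(
    "Q: Which clause governs early termination? A:",
    max_tokens=64,
    stop=["Q:"],  # stop before the model invents a follow-up question
)
print(out["choices"][0]["text"])
```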
As I mentioned in my last blog post, we have found composite AI to be a powerful tool for improving the quality of AI output. By targeting the AI much more accurately at specific content, and by employing metadata tagging to focus it on relevant data, we not only reduce the processing involved but achieve a hallucination rate below 1% when working on corporate documents of known provenance. This is orders of magnitude better than the performance of generic AIs such as Microsoft Copilot. A comparison of the output from Atlas Fuse and Copilot is striking: Copilot produces different responses to the same question asked at different times and fabricates important data such as financial values, whereas Atlas Fuse is consistent and provides accurate responses.
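The metadata-tagging idea can be sketched in a few lines of plain Python (the field names here are hypothetical): filter the corpus down to vetted, relevant documents before the model ever sees any text, which cuts both compute and hallucination risk.

```python
from dataclasses import dataclass

@dataclass
class Document:
    title: str
    doc_type: str      # e.g. "contract", "policy", "invoice"
    department: str
    provenance: str    # e.g. "approved", "draft", "external"
    text: str

def select_context(corpus: list[Document], doc_type: str, department: str) -> list[Document]:
    """Metadata filtering: only documents of known type, owner and
    provenance are passed to the model, rather than the whole corpus."""
    return [
        d for d in corpus
        if d.doc_type == doc_type
        and d.department == department
        and d.provenance == "approved"
    ]

# The (much smaller) filtered set is what gets embedded and sent to the LLM.
```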
Equally important is that your AI provides citations that let you delve into the source of its response. Without clear hyperlinks to the sections of text used to create the response, your AI is a black box: you'll waste many hours trying to understand its rationale and consume even more energy-inefficient compute time for no benefit.
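As a sketch, a response object can carry its citations alongside the answer text, so every claim stays traceable to its source (the types and field names here are illustrative, not a specific product's API):

```python
from dataclasses import dataclass

@dataclass
class Citation:
    document: str
    section: str
    url: str  # hyperlink to the exact passage used

@dataclass
class Answer:
    text: str
    citations: list[Citation]

def render(answer: Answer) -> str:
    """Render the answer with its source links, so a reader can verify
    the response instead of treating the AI as a black box."""
    lines = [answer.text, "", "Sources:"]
    lines += [f"- {c.document}, {c.section}: {c.url}" for c in answer.citations]
    return "\n".join(lines)
```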