Machine learning is everywhere. It now drives everything from tumor-detecting algorithms to facial recognition programs and, for those who remember the TV show Silicon Valley, an app that tells you whether or not something is a hot dog.
This type of artificial intelligence is now so ubiquitous that we don’t think twice about it. It’s foundational technology. But as people have gotten used to “ML everywhere,” they’ve overlooked one of the challenges with machine learning: its energy impact. We’re working to mitigate that impact, looking toward sustainable, energy-efficient machine learning.
Today, training highly complex models often requires staggering amounts of energy. Researchers who reviewed a prominent architecture used for natural language processing found that training the model once consumed more than 650,000 kWh of energy over 84 hours. This generated roughly the same estimated CO2 emissions as 57 people produce over the course of an entire year.
Without new approaches, machine learning’s energy impact could quickly become unsustainable. So how can we use it responsibly?
To start, we need a good understanding of the relationship between the approaches used to train a model and the energy they require. We conducted several experiments to measure energy consumption for model training, tweaking different parameters of the architecture.
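We don't detail our measurement setup here, but the basic idea can be sketched: estimate energy as average power draw multiplied by training time. The snippet below is a minimal illustration, not our actual instrumentation - the 65 W figure is a placeholder, and real experiments would read hardware energy counters (e.g. Intel RAPL) or use a measurement tool such as CodeCarbon.

```python
import time

# ASSUMPTION: 65 W is a placeholder for the machine's average power draw
# during training; real experiments would read hardware energy counters.
AVG_POWER_WATTS = 65.0

def estimate_energy_joules(train_fn):
    """Run train_fn and return (its result, estimated energy in joules)."""
    start = time.perf_counter()
    result = train_fn()
    elapsed_s = time.perf_counter() - start
    return result, AVG_POWER_WATTS * elapsed_s

# Stand-in "training" workload for illustration.
_, joules = estimate_energy_joules(lambda: sum(i * i for i in range(500_000)))
```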
We used a publicly available data set introduced by the British statistician and biologist Ronald Fisher in 1936. For each of three species of iris flowers, it contains 50 samples. We decided to investigate what happens when we train a small neural network on this (tiny!) data set.
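For readers who want to reproduce the setup, here is a minimal sketch in Python. The exact network we trained isn't specified above, so the architecture below (a single 8-unit hidden layer via scikit-learn) is an illustrative stand-in:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# The 1936 Fisher set: 150 samples (50 per species), 4 features each.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

# A deliberately tiny network; the exact architecture we used is not
# specified above, so this single 8-unit hidden layer is a stand-in.
model = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0)
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
```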
Iris virginica. Photo by Frank Mayfield.
A machine learning algorithm makes a number of passes, or “epochs,” over a data set while being trained. In our research, there was always a threshold number of epochs at which the model’s accuracy quickly plateaued, even as energy consumption continued to climb. For example, early in training the model consumed only 964 joules of energy to reach a training accuracy of 96.17%. But gaining a further 2.5 percentage points of accuracy required more than 15 times that energy in additional training – 15,077 additional joules!
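This diminishing-returns pattern suggests a simple mitigation: stop training once the per-epoch accuracy gain drops below a threshold. A minimal sketch, with a simulated accuracy curve standing in for a real training loop (the threshold and curve are illustrative assumptions):

```python
def train_with_early_stop(epoch_accuracy, max_epochs=100, min_gain=0.001):
    """Stop once accuracy improves by less than min_gain in an epoch.

    epoch_accuracy(epoch) stands in for running one real training
    epoch and returning the model's current accuracy.
    """
    prev_acc = 0.0
    for epoch in range(1, max_epochs + 1):
        acc = epoch_accuracy(epoch)
        if acc - prev_acc < min_gain:
            return epoch, acc  # plateau reached: stop spending energy
        prev_acc = acc
    return max_epochs, prev_acc

# Simulated curve with the shape described above: fast gains, then a plateau.
curve = lambda e: 1.0 - 0.5 * (0.7 ** e)
epochs_run, final_acc = train_with_early_stop(curve)
```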
To put that in practical terms, that amount of energy could light a 7W LED lightbulb in a rural household for almost 40 minutes. If that doesn’t seem like much of an impact at first glance, remember what we said about machine learning being ubiquitous. If we can save even small amounts of energy each time machine learning models are trained, we could have a significant impact on energy use and sustainability.
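The arithmetic behind that comparison is a quick sanity check, using the 7W bulb and the 15,077 extra joules from above:

```python
# Back-of-the-envelope check of the LED figure above.
extra_joules = 15_077        # extra energy spent on the last accuracy gains
led_power_watts = 7          # power draw of the LED bulb

seconds = extra_joules / led_power_watts  # energy (J) / power (W) = time (s)
minutes = seconds / 60                    # roughly 36 minutes -- "almost 40"
```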
We also found that larger training data sets require significantly more energy to train models on (as you might expect) but don’t necessarily deliver a proportional gain in accuracy. In one experiment with a small convolutional neural network (CNN) model, we compared training on just 70% of the data with training on the entire set. Using the whole set consumed 47% more energy, yet the resulting model’s accuracy improved by less than 1%.
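The same kind of comparison can be sketched in a few lines. Here a small fully connected network on scikit-learn’s digits set stands in for our CNN experiment (the architecture, data set, and the samples-per-epoch energy proxy are assumptions for illustration, not our original setup):

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

def fit_on_fraction(fraction):
    """Train on a fraction of the training data; report accuracy and a
    crude energy proxy (number of training samples seen per epoch)."""
    n = int(len(X_train) * fraction)
    model = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500,
                          random_state=0)
    model.fit(X_train[:n], y_train[:n])
    return model.score(X_test, y_test), n

acc_70, cost_70 = fit_on_fraction(0.70)
acc_full, cost_full = fit_on_fraction(1.00)
# The full set costs ~43% more per epoch, usually for a small accuracy delta.
```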
In short: there are viable paths today to training machine learning models in a sustainable, energy-efficient way. You might start by thinking about your use case: just how accurate does your model need to be? If you’re classifying medical imaging to help doctors diagnose patients, maximizing accuracy via more training epochs or a larger data set may be worth the extra energy impact. If you’re using the technology for a less critical purpose, there might be a lower accuracy target that would meet your needs and save energy in training.
There are other technological options to consider as well. Do you even need to create and train a new model from scratch? Transfer learning, where an existing model is repurposed for a different task, may be another option to save energy as well as time.
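As a loose sketch of the idea: pretrain a small network on a source task, freeze its hidden layer, and train only a lightweight head on the target task. The data set, architecture, and the trick of reading the frozen layer through scikit-learn’s `coefs_` attribute are illustrative assumptions, not a recipe we used:

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
source, target = y < 8, y >= 8  # source task: digits 0-7; target task: 8-9

# Pretrain once on the (larger) source task.
pretrained = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500,
                           random_state=0)
pretrained.fit(X[source], y[source])

def features(X):
    # Frozen hidden layer of the pretrained network (ReLU activation).
    return np.maximum(0.0, X @ pretrained.coefs_[0] + pretrained.intercepts_[0])

# Train only a small head on the target task -- far cheaper than retraining.
Xt, yt = features(X[target]), y[target]
head = LogisticRegression(max_iter=1000)
head.fit(Xt[:100], yt[:100])
accuracy = head.score(Xt[100:], yt[100:])
```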
We’re proposing an overall approach for machine learning that’s akin to the triage done in final software testing to deal with any remaining bugs in a system. There, the software’s overall reliability is balanced against the effort that would be required to find and remove further bugs without introducing new ones. If that effort would be highly intensive and the software is already acceptably reliable, the software is released.
We need a similar approach to make informed decisions about training and model accuracy while staying energy efficient with ML. We’re creating an advisor that highlights the implications of ML design, development, and testing choices for energy efficiency and sustainability. And don’t forget the ongoing rapid advances in specialized hardware and computing frameworks for machine learning. Traditional computing architectures need a lot of power to perform machine learning tasks, but approaches like neuromorphic computing (which our colleagues in the Future Technologies group are working on) are a better match for machine learning workloads; as they mature, they’ll provide another path toward energy-efficient ML.
There’s a growing community and effort around creating more efficient and sustainable machine learning, and for good reason. Stay tuned to learn more about our research here at Labs! Want more information about our work, or interested in collaborating? Contact Vibhu Sharma and Vikrant Kaulgud.