Human-driven climate change is a monumental threat to our future – and analyzing data on a global scale will be key to understanding how to mitigate it.
Sitting high above us, the vast network of satellites circling Earth collects data that offers a big-picture view of the landscape over time, a perspective crucial to understanding climate change’s widespread and long-term effects. Data on emissions, on natural disasters, on new construction impacting the land – these types of insights help us understand how and where we must put our efforts. Yet parsing these enormous quantities of data is a challenge in and of itself.
Hundreds of terabytes of data are generated each year from Earth-observing satellites; AI and machine learning are needed to accelerate the processing and analysis of these images. Only with automation can we generate insights at the speed and scale needed to keep the data relevant.
For example, The Washington Post reported recently on an incident in which a European Space Agency satellite 520 miles above the Earth was able to identify a potentially catastrophic leak in a Russian pipeline, which was spewing the greenhouse gas methane into the atmosphere at a rate of approximately 395 metric tons an hour. Previously, the article notes, “the massive leak might have gone unnoticed.”
However, depending on the task, AI can require incredibly large volumes of data and computing power to “learn.” Novel machine learning techniques that are optimized for geospatial data, or designed around it from the start, are needed to fully transform geospatial image analysis and meet changing needs.
The challenge of analyzing GIS data at scale
AI and computer vision have significantly advanced in the last few years, in part due to the increased availability of large image datasets to train AI algorithms. This has improved what is considered state-of-the-art in visual perception and our understanding in certain domains like autonomous driving. In contrast, large datasets have, in general, not included the GIS and remote sensing domains, thus slowing the progress of AI in these areas. Lack of satellite imagery accessibility, complex data formats, and large computational needs have contributed to the lack of publicly available datasets in these domains.
However, the creation of large datasets is not necessarily the optimal solution to advance AI at the scale needed for geospatial analysis. Collecting and manually annotating imagery is a very costly process. Even if an organization can scale up dataset curation to this level, AI algorithms still need to be trained on many visual examples, which requires many compute cycles and, thus, power.
This challenge can directly contribute to climate change – the compute power required to train increasingly complex AI algorithms for GIS data in traditional ways can consume incredible amounts of electricity. To provide context, one study found that training a single natural language processing model can emit more than 5x the lifetime emissions of the average American car.
Advancements in AI analysis create new opportunities for GIS data
By transforming AI processes, we can achieve greater efficiency with fewer resources, helping advance what is possible with AI algorithms and GIS data.
For example, under a contract with IARPA, Accenture is teaming with seven academic institutions and other private companies, leveraging AI and advanced machine learning to significantly reduce the time and cost of analyzing large volumes of geospatial imagery. The effort is part of IARPA’s Space-based Machine Automated Recognition Techniques (SMART) program.
As part of the project, we are working toward more efficient use of satellite imagery, engineering solutions that will enable the AI to build a coherent picture from data collected by multiple, heterogeneous satellite platforms, which can better help us identify Earth surface change signatures over time.
These workstreams create more streamlined and effective capabilities to detect, monitor, and characterize the progression of anthropogenic or natural processes, such as heavy construction, natural disasters, or crop growth. “Current manual exploitation methods do not scale well with the data volumes we’re receiving, and there’s the problem of simultaneously analyzing data from past, current and future space-based systems,” said IARPA Program Manager Jack Cooper. “SMART innovations in data fusion and machine learning techniques will enable automated broad area search at unprecedented temporal resolution and area coverage.”
One area our team is focusing on is domain adaptation to reduce dependence on geographically diverse datasets. Domain adaptation seeks to extract domain-invariant information from satellite images regardless of which geographic region is being monitored or which activity of interest is being studied. During training, AI algorithms focus on the core characteristics the images have in common, regardless of the “where” and “what.”
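To make the idea of domain-invariant information more concrete, here is a minimal sketch in Python. It is not the method used on the SMART program; it swaps in per-domain feature standardization, one of the simplest statistical ways to reduce differences between regions. The feature values, the two regions, and the function name standardize_per_domain are hypothetical and chosen only for illustration.

```python
from statistics import mean, pstdev


def standardize_per_domain(features):
    """Rescale each feature column to zero mean and unit variance within one domain."""
    columns = list(zip(*features))
    means = [mean(column) for column in columns]
    # Fall back to 1.0 so a constant column does not cause division by zero.
    stds = [pstdev(column) or 1.0 for column in columns]
    return [
        [(value - m) / s for value, m, s in zip(row, means, stds)]
        for row in features
    ]


# Hypothetical (brightness, vegetation index) features for image tiles
# from two regions whose raw values sit in very different ranges.
desert_tiles = [[0.80, 0.10], [0.90, 0.20], [1.00, 0.30]]
forest_tiles = [[0.20, 0.60], [0.30, 0.70], [0.40, 0.80]]

desert_aligned = standardize_per_domain(desert_tiles)
forest_aligned = standardize_per_domain(forest_tiles)
```

After this step, both regions share the same per-feature mean and spread, so a downstream model sees relative patterns within each region rather than region-specific offsets. Learned domain adaptation methods go much further, but the goal of removing the "where" from the representation is the same.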
Another innovative approach is few-shot or zero-shot learning. By mimicking how humans learn to solve new tasks, these approaches train algorithms to reach similar performance with lower volumes of labeled training data. For example, one approach to few-shot learning for object recognition consists of first training an algorithm to tell whether objects in different images are the same kind of object, and later showing it very few labeled examples of the object of interest so it can learn that object's unique characteristics. This allows an algorithm to learn general similarities and differences between objects without needing to see thousands of labeled example images of the same object to understand its particularities. This can bring significant savings in resources; when we can train algorithms on less data, there are lower compute costs, lower storage costs on cloud platforms, and fewer human hours spent annotating data for training the AI algorithms.
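The second stage of that few-shot recipe can be sketched in a few lines of Python. This is an illustrative assumption rather than the project's implementation: it uses nearest-prototype classification, a common few-shot technique, and it assumes a similarity model has already been trained to map images to embedding vectors. The two-dimensional embeddings, the class names, and the function names below are hypothetical stand-ins.

```python
from collections import defaultdict
from math import dist


def class_prototypes(support):
    """Average the few labeled embeddings of each class into one prototype."""
    grouped = defaultdict(list)
    for embedding, label in support:
        grouped[label].append(embedding)
    return {
        label: [sum(values) / len(values) for values in zip(*embeddings)]
        for label, embeddings in grouped.items()
    }


def classify(query, prototypes):
    """Return the label whose prototype is nearest (Euclidean) to the query."""
    return min(prototypes, key=lambda label: dist(query, prototypes[label]))


# Two labeled examples per class: toy embeddings standing in for the
# output of a pretrained similarity model (an assumption of this sketch).
support = [
    ([0.9, 0.1], "construction"),
    ([1.1, -0.1], "construction"),
    ([-1.0, 0.2], "cropland"),
    ([-0.8, -0.2], "cropland"),
]

prototypes = class_prototypes(support)
prediction = classify([0.7, 0.0], prototypes)
print(prediction)
```

Only four labeled examples are needed here because the heavy lifting, learning what makes two objects alike or different, is assumed to have happened earlier when the embedding model was trained.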
Rationalizing data upfront so we can conduct analysis more efficiently is an underexplored area. Doing so will empower us with greater geospatial insights to combat the climate crisis and can also be used to advance other federal use cases, such as benefits delivery, fraud detection, or situational awareness for law enforcement safety, to name a few.