Understanding Climate Change Using High Performance Computing and Machine Learning


D Watson-Parris and NASA Worldview

As the COVID pandemic continues to confine many of us to our homes, our everyday behaviors have come mostly to a collective halt. The immediate effects are obvious, as cities, roads, and public spaces have emptied. Reports of nature intermingling with spaces once claimed by humans have amazed audiences worldwide. Coyotes casually strolling by the Golden Gate Bridge and through the streets of San Francisco, the canals of Venice running clear and teeming with fish, and the [Himalayas visible from India](https://www.insider.com/himalayas-seen-from-india-pollution-drop-coronavirus-lockdown-2020-4) for the first time in three decades are just a few of the examples made famous by popular culture.

At the same time, with tragic wildfires ravaging the Pacific Coast and an already record-setting 2020 Atlantic hurricane season underway, many are feeling a weighty pull towards action for the environment.

These are just a few examples which have made manifest the challenging and complex problem scientists have been working to understand for years: climate change. Now more than ever, technology is positioned to help scientists understand and untangle the complicated web of cause and effect unfolding across the planet.

The Science of Climate Change

Historically, the classical approaches to studying climate change required a lot of tedious manual labor. These methods typically involved differential equations, calculus, chaos theory, and the butterfly effect, all of which have been used to try to understand the changes in our environment and the possible causes of, or contributing factors to, those changes. Cellular automata methods have also been helpful in modeling complex systems like fluid dynamics.

All of these methods, especially when used in the context of climate science, require a massive amount of data. Gathering this information from myriad sources and labeling a high-quality dataset was elusive in some cases and overwhelming in others. Some of this data is relatively static, such as ocean surface temperatures, while other data is more dynamic, like changes in ocean currents, adding even more interesting and possibly valuable insights to the study.

However, storing this massive amount of data was prohibitively expensive for all but the most well-funded organizations and institutions, and that’s just the beginning of the process. Building on this foundation of high-quality data, the next step in climate science is incredibly computationally intensive.

Democratizing High Performance Computing in the Cloud

Solving large scientific and engineering problems, like predicting the weather or modeling ocean currents, requires researchers to harness massive computing power. Such huge quantities of compute are unattainably expensive for most organizations, and even for those with the means to afford them, running High Performance Computing (HPC) clusters on-premises requires considerable upfront capital expenditure, lengthy procurement cycles, and regular hardware refreshes to avoid obsolescence. Today, the ability to configure massive parallel computing clusters on demand in the cloud is making available to a broader audience what was once confined to government labs and select academic organizations.

However, even just a few years ago the cloud was thought to be fit only for “embarrassingly parallel” workloads and, as such, unsuitable for the many “tightly coupled” codes, like weather modeling and other areas of climate research, that depend on fast, efficient communication between compute nodes while jobs are running. These applications generally require locally computed data to be distributed globally (over the HPC cluster interconnect) for frequent recalculation until some convergence criterion is met.
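To make the “tightly coupled” pattern concrete, here is a minimal Python sketch. It is not taken from any weather code: it assumes a toy one-dimensional problem (a Jacobi iteration whose converged answer is a straight line from 0 to 1), and the function name `jacobi_decomposed` and its parameters are hypothetical. The grid is split across simulated “ranks” held in ordinary lists, so the neighbor exchange and the global convergence check that would cross the cluster interconnect are plain assignments here.

```python
def jacobi_decomposed(n_points=16, n_ranks=4, tol=1e-8, max_iters=100_000):
    """Jacobi iteration on a 1-D grid split across simulated ranks.

    Toy problem (an assumption for illustration): boundary values 0 and 1,
    so the converged solution is a straight line between them.
    """
    if n_points % n_ranks != 0:
        raise ValueError("n_points must be divisible by n_ranks")
    chunk = n_points // n_ranks

    # Each rank stores its interior points plus one halo cell on each side.
    local = [[0.0] * (chunk + 2) for _ in range(n_ranks)]
    local[-1][-1] = 1.0  # right boundary value; the left boundary stays 0.0

    iterations = 0
    for iterations in range(1, max_iters + 1):
        # Halo exchange: neighbouring ranks swap their edge values.
        # On a real cluster this is traffic over the interconnect.
        for r in range(n_ranks - 1):
            local[r][-1] = local[r + 1][1]
            local[r + 1][0] = local[r][-2]

        # Local compute: every rank updates its own interior points.
        local_changes = []
        for r in range(n_ranks):
            old = local[r]
            new = old[:]
            for i in range(1, chunk + 1):
                new[i] = 0.5 * (old[i - 1] + old[i + 1])
            local_changes.append(
                max(abs(new[i] - old[i]) for i in range(1, chunk + 1))
            )
            local[r] = new

        # Global reduction: all ranks must agree that the largest change
        # anywhere in the domain is below the convergence criterion.
        if max(local_changes) < tol:
            break

    solution = [u for block in local for u in block[1:-1]]
    return solution, iterations
```

Every iteration needs one exchange between neighbors and one global check, so the slowest link sets the pace for the whole job. On a real cluster these two steps would typically be message-passing calls between nodes, which is why interconnect latency matters so much for this class of workload.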

Today, that is no longer the case. With distributed high performance computing in the cloud, researchers are able to accelerate the pace of climate science using a broad range of compute optimized and accelerated computing EC2 instances (e.g. C5n and P3dn instances) that can scale to thousands of cores with network interfaces like the Elastic Fabric Adapter. Offerings such as EC2 Spot instances, which let customers take advantage of unused EC2 capacity in the AWS cloud at up to a 90% discount, make the ability to do climate science more accessible to innovators and more economical for folks already doing this.

For example, Maxar Technologies – a space technology company specializing in manufacturing communication, Earth observation, and on-orbit servicing satellites, satellite products, and related services – uses AWS to deliver weather forecasts 58 percent faster than NOAA’s supercomputer. While weather prediction models traditionally run on large, on-premises, high performance computers, Maxar developed a suite of architectures that resides in the AWS Cloud and allows scientists to run weather forecasting models in a much more nimble, scalable manner.

Maxar runs a numerical weather prediction (NWP) application on AWS cloud computing resources. The success of the program relies on Maxar being able to run its NWP application faster than NOAA does on its supercomputer because that is what will allow Maxar to deliver the weather forecast generated by the NWP application to clients with greater lead times, guiding more informed and timely decisions. Maxar also took advantage of AWS ParallelCluster, an open source cluster management tool that made it easy to deploy HPC clusters with a simple text file that automatically models and provisions resources. This program contributes to how Maxar monitors climate change, joining efforts such as measuring air pollution, understanding the destruction of hurricanes, assisting wildfire response efforts, and more.

In the process of optimization, Maxar built 37 different HPC cluster configurations to test the Finite Volume Cubed Sphere Global Forecast System (FV3GFS) application on as few as 252 cores to over 11,000 cores. Each of these clusters was built using AWS ParallelCluster to ensure consistent configuration and deployment. The entire workflow is automated from cluster spin-up to cluster spin-down. HPC cluster resources (EC2 instances, EBS volumes, FSx Lustre) are re-allocated each day. Execution workflow is automated using a series of AWS Step Functions and Lambda functions with SNS notifications. CloudFormation stacks build the AWS resources for the workflow, which can be deployed in any AWS region with minimal modification.

Maxar currently runs the application on both c5.18xlarge and c5n.18xlarge AWS EC2 instances. Twenty-four of the case study cluster configurations utilize c5n.18xlarge with the EFA networking adapter, while the remaining 13 configurations use c5.18xlarge instances with TCP networking. All configurations leverage a 14TB FSx for Lustre file system with a progressive file layout (PFL) across the 12 object storage targets (OSTs).

AWS provides the most elastic and scalable cloud infrastructure to run HPC applications. With virtually unlimited capacity, engineers and researchers can innovate beyond the limitations of on-premises HPC infrastructure.

Machine Learning is Key to Combating Climate Change

Paired with HPC, machine learning (ML) enables scientists to look at climate data flexibly, adapting analysis of data based on past events to more accurately model the future. This approach can help researchers grapple with the tremendous complexity of climate systems, and help them better understand the connections between the many subtle interactions that influence weather.

ML models can also help fill in noisy spots or holes in the data, an approach called multiple imputation, and create similar or synthetic data that accelerates climate science even further when some pieces of information are too difficult or impossible to retrieve. In short, ML can make predictions about things that are unknown, accelerating our understanding of climate science and producing more accurate models.
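As a rough illustration of the multiple imputation idea mentioned above, the Python sketch below fills the gaps in a short series several times with random draws, analyzes each completed copy, and pools the results. It is a deliberately simple stand-in, not the method used by any of the researchers named here: the draws come from a normal distribution fitted to the observed values rather than from an ML model, and the function name and the sample readings are made up for the example.

```python
import random
import statistics


def multiple_imputation_mean(values, n_imputations=20, seed=0):
    """Estimate the mean of a series with gaps (None) by multiple imputation.

    Assumption for illustration: missing values are drawn from a normal
    distribution fitted to the observed values.
    """
    rng = random.Random(seed)
    observed = [v for v in values if v is not None]
    mu = statistics.mean(observed)
    sigma = statistics.stdev(observed)

    estimates = []
    for _ in range(n_imputations):
        # Build one completed copy of the data by filling each gap
        # with a fresh random draw.
        completed = [v if v is not None else rng.gauss(mu, sigma) for v in values]
        # Run the analysis of interest (here, simply the mean) on that copy.
        estimates.append(statistics.mean(completed))

    # Pool the analyses: the average is the point estimate, and the spread
    # between copies reflects the uncertainty added by the missing data.
    pooled_estimate = statistics.mean(estimates)
    between_variance = statistics.variance(estimates)
    return pooled_estimate, between_variance


# Hypothetical temperature readings in degrees Celsius with two gaps.
readings = [14.1, None, 14.4, 14.3, None, 14.8, 14.6]
pooled, spread = multiple_imputation_mean(readings)
```

The point of repeating the fill-in step is that a single imputed value hides how uncertain it is, while the variation across many completed copies makes that uncertainty visible.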

The burgeoning marriage of ML and climate science is evident in the research efforts of those at the University of Oxford, including Philip Stier, a professor of atmospheric physics, and Duncan Watson-Parris, a post-doctoral researcher. Stier and Watson-Parris are focused on understanding how aerosols affect clouds: what kind of clouds they affect, in which regions these changes occur (and, just as importantly, in which regions they don’t), and how prevalent they are.


D Watson-Parris et al. (doi.org/10.1002/essoar.10501877.1)

One way they are quantifying this is through ship tracks: cloud brightening due to the aerosols emitted by a ship while passing underneath a cloud deck. While there is normally a lot of variability in the effect of aerosols, ship tracks form due to a well-defined pollution source in a space where there is little other pollution, allowing them to better isolate how a certain amount of pollution causes certain cloud changes to develop and evolve over time.

Using satellite imagery and thousands of hand-logged instances of ship tracks, they are training ML models to find ship tracks in other satellite imagery, and, down the line, will use this well-defined scenario to create a global mean effect and scale up beyond ship tracks. To train these models at scale, they rely on AWS Deep Learning AMIs, which allow them to easily spin up a VM on Amazon EC2, activate a pre-configured deep learning environment, and run frameworks like TensorFlow.

Stier and Watson-Parris are also using machine learning techniques to detect and understand the effects of Pockets of Open Cells (POCs) in clouds. They developed a two-step machine learning model (after some more hand labeling of satellite imagery) using a pre-trained ResNet-152 for an initial mask, and then a Res-UNet pass to refine the edges of the POCs. From there, they trained the model using Amazon EC2 p3.8xlarge instances and ran inference with their data stored on S3.


D Watson-Parris et al. (doi.org/10.1002/essoar.10501877.1)

And even beyond their use of services like AWS Deep Learning AMIs and Amazon SageMaker, Oxford and Amazon are partnering through iMIRACLI (innovative MachIne leaRning to constrain Aerosol-cloud CLimate Impacts), an EU H2020 funded graduate program that brings together leading climate and machine learning scientists across Europe with non-academic partners to educate a new generation of climate data scientists — with 15 participating PhD students.

The Future of Climate Change Science is in the Cloud

In comparing the climate impact of data centers, the advantages of cloud providers versus on-premises are clear across resource utilization, energy efficiency, and power mix.

A typical large-scale cloud provider achieves approximately 65 percent server utilization versus 15 percent on-premises, which means that when companies move to the cloud, they typically provision fewer than a quarter of the servers they would on-premises. Further, a typical on-premises data center is 29 percent less efficient in its use of power than a typical large-scale cloud provider that uses world-class facility designs, cooling systems, and workload-optimized equipment. On top of needing fewer, more power-efficient servers, large-scale cloud providers like AWS use a power mix that is 28 percent less carbon intense than the global average, so cloud customers can end up with a reduction in carbon emissions of 88 percent.
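The figures in this paragraph can be chained together in a few lines of Python. This is only a back-of-the-envelope sketch under two assumptions that the text does not spell out: that the three factors simply multiply, and that “29 percent less efficient” means the cloud draws 29 percent less power for the same work. The function name `relative_footprint` is hypothetical.

```python
def relative_footprint(cloud_util=0.65, onprem_util=0.15,
                       power_saving=0.29, carbon_saving=0.28):
    """Cloud footprint relative to on-premises (1.0 means no change).

    Assumption: the three effects are independent and multiply.
    """
    # Higher utilization means fewer servers are needed for the same work.
    servers = onprem_util / cloud_util
    # Each remaining server is assumed to draw 29 percent less power.
    power = servers * (1 - power_saving)
    # The power that is used is 28 percent less carbon intense.
    carbon = power * (1 - carbon_saving)
    return servers, power, carbon


servers, power, carbon = relative_footprint()
print(f"servers needed:     {servers:.0%} of on-premises")
print(f"power drawn:        {power:.0%} of on-premises")
print(f"carbon reduction:   {1 - carbon:.0%}")
```

Under these assumptions the server count drops to roughly 23 percent of the on-premises figure, which matches “fewer than a quarter,” and the combined reduction in carbon emissions rounds to the 88 percent quoted above.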

Climate change is one of the most difficult issues of our time, and if we don’t find meaningful solutions, the consequences will have repercussions for our future and our children’s. Delivering change in this arena will require a collective effort across academia, government, industry, nonprofits, and society. It will require ingenuity, innovation, and scale. High performance computing and machine learning in the cloud will be key to unlocking scientific insights into understanding and combating climate change.

At Amazon, we stand ready to support customers like Oxford and others to tackle this evolving challenge. And we are committed to addressing climate change through renewable energy, and continue to invest in energy projects around the world, including solar farms and wind farms in places like Sweden, Ireland and Virginia. These projects will help supply clean energy to our data centers, which power Amazon and millions of AWS customers globally.