How WindBorne’s Newest AI Forecasting Model is Primed to Tackle the Global Weather Resolution Gap

Clay Malott, Machine Learning

Takeaways

Most global weather models run at resolutions between 0.1 and 0.5 degrees, which is too coarse to offer useful forecasts in areas with highly variable terrain, such as coastlines and mountains.
Physics-based models aren’t well-positioned to close this global resolution gap. They are computationally expensive, which makes running them at high resolution over large domains infeasible.
With our new in-house AI model, internally known as “Pointy,” WindBorne has developed a method to solve this gap by distilling low-resolution grids into hyper-localized point forecasts.
Pointy is a transformer-based model that downscales a 0.25 degree resolution model to produce point forecasts at arbitrarily high resolution, using elevation and soil-type data from around the globe. Higher-resolution models have the potential to improve lives, communities, and economies globally.

Weather forecasting has come a long way since its inception, evolving from simple observations to complex computer models that simulate the atmosphere's behavior. At the forefront of this evolution is the challenge of balancing broad-scale predictions with the need for highly localized, accurate forecasts. In this post, I’ll delve into the intricacies of modern weather modeling, the limitations of traditional grid-based systems, and how WindBorne is pushing the boundaries of high-resolution weather prediction.

The role of structured grids

Weather models work by taking initial conditions and then integrating those conditions forward in time using a set of a few dozen equations known in atmospheric science as the “fundamental equations.” These equations govern the behavior of our atmosphere, from large-scale phenomena like how storms develop and propagate around the globe, to how water vapor nucleates around small particles of dust to form clouds and rain.

However, to perform this integration and step the model forward in time, the state of the atmosphere needs to be mapped onto a structured grid that remains the same for every output from the model. This consistency in grid structure ensures that the model's calculations are coherent and comparable across different time steps, allowing for accurate predictions and analysis of evolving atmospheric conditions.
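
To make the idea of stepping a fixed grid forward in time concrete, here is a minimal sketch in Python. It is not taken from any real weather model: it advects a single made-up variable along a one-dimensional periodic grid with a first-order upwind scheme, and the function names, grid size, wind speed, and time step are all assumptions chosen for illustration.

```python
def step_upwind(field, wind, dx, dt):
    """Advance a 1D field by one time step with first-order upwind advection.

    Illustrative assumptions: a constant, non-negative wind (m/s), a uniform
    grid spacing dx (m), and a periodic domain.
    """
    courant = wind * dt / dx  # fraction of a grid cell travelled per step
    if not 0.0 <= courant <= 1.0:
        raise ValueError("time step too large for this grid spacing")
    # field[i - 1] with i == 0 wraps to the last cell, giving periodicity.
    return [field[i] - courant * (field[i] - field[i - 1])
            for i in range(len(field))]


def integrate(initial, wind, dx, dt, steps):
    """Step the same grid forward repeatedly; the grid never changes shape."""
    state = list(initial)
    for _ in range(steps):
        state = step_upwind(state, wind, dx, dt)
    return state


# A single "blob" on a 10-cell grid with 25 km spacing (made-up values).
initial = [0.0] * 10
initial[2] = 1.0

# With wind * dt == dx the blob moves exactly one cell per step,
# so after 3 steps it sits in cell 5.
final = integrate(initial, wind=10.0, dx=25_000.0, dt=2_500.0, steps=3)
print(final)
```

Every call to `step_upwind` returns a list of the same length as its input, which is the property the paragraph above describes: the grid stays fixed while the values on it evolve.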

The challenge with high-resolution forecasts

One major limitation of the grid approach is small-scale forecasting. The value of a forecasted variable in a given grid box (whether it be wind, precipitation, temperature, etc.) represents the average over that grid box. In other words, if the model’s forecast were perfect and you took observations on every square foot of the grid box and then averaged them, the model output and the mean of the observations would match.

For areas like the ocean or the plains of the Midwest United States, this approach works well; within grids over regions like these, there is low variability in the surface type (water/soil type), elevation, and other characteristics that influence the hyper-local weather.

However, in places where a grid box has high sub-grid-scale elevation variability, like in the mountains, or where a grid box sits over a coastline so that half is over water and half is over land, the grid approach presents some issues. Say you’re forecasting for a region that has a large mountain and a large valley right next to each other within the same grid box. The weather model will output the average value for the grid box, which will be very far off from both the value in the valley and the value on the mountain. Most global weather models run at between 0.1 and 0.5 degree resolution, meaning individual grid boxes are roughly 10 to 50 km across. Over scales this large, sub-grid-scale variability in surface characteristics can be extremely consequential.
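
A quick worked example shows how large this averaging error can be. The sketch below assumes temperature falls off with height at the standard-atmosphere lapse rate of 6.5 °C per km from 15 °C at sea level, which is only a rough approximation of real conditions, and the elevations are hypothetical samples inside one grid box.

```python
LAPSE_RATE_C_PER_KM = 6.5  # standard-atmosphere approximation, not a forecast


def temperature_at(elevation_m, sea_level_temp_c=15.0):
    """Idealized temperature (deg C) at a given elevation."""
    return sea_level_temp_c - LAPSE_RATE_C_PER_KM * elevation_m / 1000.0


def mean(values):
    return sum(values) / len(values)


# Hypothetical elevations (m) sampled across a single grid box that
# contains both a valley floor (500 m) and a summit (2500 m).
elevations_m = [500.0, 500.0, 1500.0, 2500.0, 2500.0]

point_temps = [temperature_at(z) for z in elevations_m]
grid_box_temp = mean(point_temps)  # what a perfect grid model would report

valley_error = grid_box_temp - temperature_at(500.0)
summit_error = grid_box_temp - temperature_at(2500.0)

print(f"grid box mean: {grid_box_temp:.2f} C")
print(f"error in valley: {valley_error:+.2f} C")
print(f"error on summit: {summit_error:+.2f} C")
```

Under these assumptions, a grid value that is exactly right on average is still 6.5 °C too cold for the valley and 6.5 °C too warm for the summit.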

In principle, the fix is simply higher-resolution models. However, traditional physics-based models are extremely computationally expensive, and running them at high resolution over large domains quickly becomes infeasible with current computing infrastructure. Many weather centers around the world do run high-resolution “mesoscale models” with grid spacings of 2 to 10 km, but these models are restricted to small, local domains over each nation’s area of interest. This means that billions of people around the world are left without quality high-resolution forecasts.
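
Some back-of-the-envelope arithmetic illustrates why global high-resolution grids get expensive. The sketch below assumes a regular latitude-longitude grid that includes both poles and a spherical Earth with a 6371 km mean radius; the helper names are made up for this example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius, spherical-Earth approximation


def grid_spacing_km(resolution_deg, latitude_deg=0.0):
    """Return (north-south, east-west) grid spacing in km at a latitude."""
    km_per_degree = math.pi * EARTH_RADIUS_KM / 180.0  # about 111 km
    north_south = resolution_deg * km_per_degree
    # Meridians converge toward the poles, so east-west spacing shrinks.
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


def global_grid_points(resolution_deg):
    """Number of points on a regular lat-lon grid that includes both poles."""
    n_lat = round(180.0 / resolution_deg) + 1
    n_lon = round(360.0 / resolution_deg)
    return n_lat * n_lon


for res in (0.5, 0.25, 0.1, 0.025):
    ns_km, _ = grid_spacing_km(res)
    print(f"{res:>6} deg  ~{ns_km:5.1f} km  {global_grid_points(res):>12,} points")
```

At 0.25 degrees this gives 721 x 1440 = 1,038,240 points per vertical level. Going ten times finer multiplies that by roughly 100, and physics-based models typically also need shorter time steps as the grid spacing shrinks, so the cost grows faster than the point count alone.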

Meet Pointy: WindBorne’s high-resolution AI model

WindBorne’s own machine learning-based global weather model runs at 0.25 degree resolution, meaning that it faces the same challenges as traditional weather models working on a coarse grid. (Here’s an overview of our model, WeatherMesh.) At WindBorne, we’re tackling this problem through a new machine learning model: Pointy.

Pointy turns these low-resolution grids into hyper-localized point forecasts. It takes a latitude and longitude, processes the surrounding low-resolution gridded forecast data, combines that with hyper-localized information about surrounding elevation, soil type, vegetation cover, etc., and generates a single forecast with seven variables (a number we are working to expand quickly). Since the model distills low-resolution gridded data into a single, hyper-local point, we aptly named it “Pointy.”
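
As a rough illustration of the first step, gathering the coarse grid data around a point, here is a minimal sketch. It is not Pointy's actual input pipeline. It assumes a 0.25 degree grid whose first row is at 90°N and whose first column is at 0°E, and the function names, patch size, and static feature values are all hypothetical.

```python
def nearest_index(lat, lon, res=0.25):
    """Row/column of the grid point nearest to (lat, lon).

    Assumed layout: row 0 at 90N going south, column 0 at 0E going east.
    """
    n_lon = round(360.0 / res)
    row = round((90.0 - lat) / res)
    col = round((lon % 360.0) / res) % n_lon  # wrap 360E back to 0E
    return row, col


def patch_indices(lat, lon, res=0.25, radius=1):
    """Indices of the (2*radius+1)^2 grid points surrounding a location."""
    n_lat = round(180.0 / res) + 1
    n_lon = round(360.0 / res)
    row0, col0 = nearest_index(lat, lon, res)
    patch = []
    for d_row in range(-radius, radius + 1):
        # Simplification: clamp at the poles instead of crossing over them.
        row = min(max(row0 + d_row, 0), n_lat - 1)
        patch.append([(row, (col0 + d_col) % n_lon)
                      for d_col in range(-radius, radius + 1)])
    return patch


def build_point_input(lat, lon, static_features, res=0.25, radius=1):
    """Bundle the coarse-grid neighbourhood with local surface information."""
    return {
        "lat": lat,
        "lon": lon,
        "grid_patch": patch_indices(lat, lon, res, radius),
        "static": dict(static_features),  # e.g. elevation, soil type
    }


# Hypothetical point with made-up static features.
example = build_point_input(37.77, -122.42,
                            {"elevation_m": 16.0, "soil_type": "loam"})
print(example["grid_patch"][1][1])  # index of the nearest coarse grid point
```

The longitude wrap matters for points near the prime meridian, where the neighbouring columns sit at opposite ends of the stored array.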

Pointy is trained using decades of directly observed surface weather conditions from weather stations around the globe. The input data to predict these surface conditions comes from the European Centre for Medium-Range Weather Forecasts’ ERA5 gridded reanalysis dataset, which is our best guess of global gridded atmospheric conditions at any given time over the last 45 years. The model essentially learns how to go from low-resolution gridded ERA5 reanalysis to station observations.
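
To give a feel for what "learning to go from gridded data to station observations" means, here is a deliberately tiny stand-in. It is not the transformer described in this post: it fits a single elevation-correction coefficient by least squares, and the training pairs are synthetic numbers invented for the example rather than real ERA5 or station data.

```python
def fit_elevation_correction(pairs):
    """Least-squares slope (deg C per m) relating elevation offset to error.

    Each pair is (grid_temp_c, elevation_diff_m, station_temp_c), where
    elevation_diff_m is station elevation minus grid-box mean elevation.
    """
    numerator = sum(dz * (obs - grid) for grid, dz, obs in pairs)
    denominator = sum(dz * dz for _, dz, _ in pairs)
    return numerator / denominator


def downscale(grid_temp_c, elevation_diff_m, slope):
    """Apply the fitted correction to a coarse grid value."""
    return grid_temp_c + slope * elevation_diff_m


# Synthetic training pairs (made up for illustration only).
pairs = [
    (10.0, -1000.0, 16.5),   # station well below the grid-box mean height
    (10.0, 0.0, 10.0),       # station at the grid-box mean height
    (10.0, 1000.0, 3.5),     # station well above the grid-box mean height
    (20.0, 500.0, 16.75),
]

slope = fit_elevation_correction(pairs)
print(f"fitted slope: {slope * 1000:.2f} C per km")
print(f"downscaled: {downscale(12.0, 200.0, slope):.2f} C")
```

A real model like the one described here works with far richer inputs than a single elevation offset, but the training signal has the same shape: coarse gridded values in, observed station values out.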

Pointy is trained on direct surface observations from high-quality ground stations, as well as a larger set of lower-quality measurements, totalling 110k stations and 6 trillion observations.

Pointy can be run to generate hyper-localized point forecasts from low-resolution WeatherMesh input out to 16 days. However, it can also be used to efficiently generate high-resolution gridded domains. Given domain bounds and a target resolution, we pre-generate latitude and longitude pairs at that resolution within the domain to create a mesh of individual points. These points are then run through Pointy, which takes the low-resolution WeatherMesh data and turns it into a series of high-resolution point forecasts. We then restructure Pointy’s hyper-local point forecasts back into the target grid to assemble a high-resolution gridded forecast. Since it’s all point-based and not restricted by any sort of pre-structured grid, the output grid spacing can be made arbitrarily fine; if we wanted to, we could use Pointy to generate a forecast at 1 meter resolution.
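
The mesh-then-reassemble procedure described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than our production code: `point_model` is a hypothetical placeholder that stands in for a real point-forecast model, and the domain bounds are arbitrary.

```python
def make_mesh(lat_min, lat_max, lon_min, lon_max, res):
    """Pre-generate (lat, lon) pairs covering a domain at a given resolution.

    Returns the flat list of points plus the (rows, cols) shape needed to
    fold per-point results back into a grid.
    """
    n_rows = round((lat_max - lat_min) / res) + 1
    n_cols = round((lon_max - lon_min) / res) + 1
    points = [(lat_min + i * res, lon_min + j * res)
              for i in range(n_rows) for j in range(n_cols)]
    return points, (n_rows, n_cols)


def point_model(lat, lon):
    """Hypothetical placeholder for a point-forecast model.

    It returns a dummy number so the example runs end to end.
    """
    return lat + lon


def forecast_grid(lat_min, lat_max, lon_min, lon_max, res, model=point_model):
    """Run a point model over a mesh and reshape the output into rows."""
    points, (n_rows, n_cols) = make_mesh(lat_min, lat_max, lon_min, lon_max, res)
    flat = [model(lat, lon) for lat, lon in points]  # one forecast per point
    return [flat[r * n_cols:(r + 1) * n_cols] for r in range(n_rows)]


# Arbitrary half-degree domain, first at 0.25 degrees and then at 0.05.
coarse = forecast_grid(40.0, 40.5, -105.5, -105.0, 0.25)
fine = forecast_grid(40.0, 40.5, -105.5, -105.0, 0.05)
print(len(coarse), "x", len(coarse[0]))
print(len(fine), "x", len(fine[0]))
```

Because each point is forecast independently, changing the resolution only changes how many points are generated; nothing about the model itself has to be rebuilt for a finer grid.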

Pointy’s Progress

Major improvements are coming to Pointy in the coming weeks and months. We’re experimenting with the model architecture to make Pointy faster and more accurate. Additionally, in the past month, we ingested and performed quality control on new data that grew our training dataset from 1.2 billion observations to 5.5 billion observations. Once Pointy has finished retraining on that new data, it will be even more accurate and should rival the accuracy of traditional high-resolution physics-based models at just a fraction of the cost and computing time. Below you can see the new training set compared to the old one by observations per year:

WindBorne is wholly committed to delivering precise, localized weather predictions. By merging cutting-edge machine learning techniques with extensive meteorological data, we are setting new standards for accuracy and resolution that transcend traditional modeling constraints.

Democratizing High-Quality Forecasts

As Pointy evolves, not only does it promise to enhance the granularity of weather predictions, but it also aims to democratize high-resolution forecasts, making them accessible to regions and communities previously underserved by conventional physics-based models.

Stay tuned as we continue to push the boundaries of what's possible in AI-based weather forecasting, ensuring that every point on the map has access to reliable, hyper-local weather information that can profoundly improve lives and economies around the globe.
