Understanding Reanalyses: Reconstructing the Past to Better Predict the Future
I recently participated in the production of DANRA, a high-resolution regional reanalysis for Denmark that was just released. Working on this project gave me a deeper appreciation for what reanalyses are and why they matter. DANRA covers 34 years of Danish weather at 2.5 km resolution, capturing everything from the devastating December 1999 hurricane-force storm to the July 2022 national temperature record. It's been quite a journey seeing this dataset come together, and it got me thinking that it might be worth explaining what reanalyses actually are and why we spend so much effort producing them.
What is a reanalysis?
A reanalysis is essentially a retrospective weather analysis: the analysis step of a modern forecasting system, rerun over the past. More precisely, it's a comprehensive reconstruction of past weather and climate conditions that combines historical observations with modern numerical weather prediction models. Think of it as taking all the weather observations we've collected over the years (from weather stations, satellites, ships, aircraft, and radiosondes) and feeding them into a consistent, state-of-the-art weather model to create a complete, gridded picture of the atmosphere over time.
The key word here is "consistent." Unlike operational weather forecasts, which are constantly being updated with new model versions and improved physics, a reanalysis uses a fixed model and data assimilation system throughout the entire period. This means that trends or changes you see in the data are not caused by model upgrades. Changes in the observing system can still introduce artifacts, though, so an apparent trend is not automatically a real one.
Why do we need reanalyses?
The observational record of weather is patchy. Weather stations are unevenly distributed across the globe, with large gaps over oceans and remote regions. Satellite data only goes back a few decades. And even where we have observations, they're point measurements: we don't directly observe what's happening between stations or at different heights in the atmosphere.
Reanalyses fill in these gaps. By using a numerical weather model constrained by observations, we can estimate the state of the entire atmosphere at regular intervals (typically every hour or every few hours) on a regular grid. This gives us a complete, four-dimensional picture of the atmosphere that's invaluable for:
- Climate research: Understanding long-term trends and variability
- Model validation: Testing and improving weather and climate models
- Impact studies: Assessing how weather affects agriculture, energy, infrastructure, etc.
- Training AI models: The recent explosion in data-driven weather forecasting models (like GraphCast, Pangu-Weather, and FourCastNet) has been largely enabled by reanalysis datasets, particularly ERA5
- Climate adaptation: Planning for future climate risks based on historical patterns
The reanalysis landscape
The most widely used global reanalysis is probably ERA5, produced by ECMWF. It covers the period from 1940 to present at roughly 31 km resolution with hourly output. ERA5 has become the de facto standard for many applications, and it's freely available through the Copernicus Climate Data Store.
But global reanalyses can't capture everything. Regional reanalyses like DANRA, CARRA (Copernicus Arctic Regional Reanalysis), and CERRA (for Europe) use higher-resolution models to better represent local features like coastlines, mountains, and land-sea contrasts. For Denmark, with its 400+ islands and 7,400 km of coastline, this matters a lot. The 2.5 km resolution of DANRA can resolve features that the 31 km ERA5 simply can't see.
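To put the two resolutions side by side, here is a back-of-the-envelope calculation. It assumes idealized square grid cells with the spacings quoted above, which is only an approximation of the actual model grids; the function name is just for illustration.

```python
# Grid spacings quoted in the text, in kilometres.
ERA5_SPACING_KM = 31.0
DANRA_SPACING_KM = 2.5


def cells_per_coarse_cell(coarse_km: float, fine_km: float) -> float:
    """Number of fine grid cells that fit inside one coarse cell, by area.

    Assumes both grids are made of square cells, which is a simplification.
    """
    return (coarse_km / fine_km) ** 2


ratio = cells_per_coarse_cell(ERA5_SPACING_KM, DANRA_SPACING_KM)
print(f"One ERA5 cell covers roughly {ratio:.0f} DANRA cells")
```

In other words, where ERA5 has a single value, a 2.5 km grid has on the order of 150, which is why small islands and narrow straits only show up in the regional product.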
Other notable reanalyses include:
- NCEP/NCAR Reanalysis: One of the earliest, with coverage starting in 1948
- JRA-55: The Japanese 55-year reanalysis
- MERRA-2: NASA's Modern-Era Retrospective analysis for Research and Applications
How are reanalyses made?
The production of a reanalysis is a massive undertaking. It involves:
- Collecting observations: Gathering decades of data from multiple sources and formats
- Quality control: Checking for errors and inconsistencies in the observations
- Data assimilation: Using sophisticated mathematical techniques to blend observations with model forecasts
- Model integration: Running the weather model forward in time, constrained by observations
- Post-processing: Computing derived variables and statistics
- Validation: Comparing the reanalysis against independent observations
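The data assimilation step is the least intuitive one, so here is a minimal sketch of the core idea for a single value and a single observation. This is the textbook scalar update, not the scheme used in DANRA or any other production system, which solve a vastly larger version of the same problem. The function name and all numbers are made up for illustration.

```python
def analysis_update(background, obs, background_var, obs_var):
    """Blend a model background value with one observation.

    The gain weights the two sources by their error variances: the less
    we trust the background relative to the observation, the closer the
    analysis moves toward the observation.
    """
    gain = background_var / (background_var + obs_var)
    analysis = background + gain * (obs - background)
    # The analysis is more certain than the background alone.
    analysis_var = (1.0 - gain) * background_var
    return analysis, analysis_var


# Illustrative numbers only: a model forecast of 10.0 degrees C and a
# station reading of 12.0 degrees C, both with error variance 1.0.
analysis, analysis_var = analysis_update(10.0, 12.0, 1.0, 1.0)
print(analysis, analysis_var)  # 11.0 0.5
```

With equal error variances the analysis lands halfway between forecast and observation, and its error variance is halved. Making the observation less trustworthy (a larger obs_var) pulls the analysis back toward the model.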
For DANRA, this meant running the HARMONIE-AROME model, the same model used for operational weather forecasting in Denmark, over 34 years of data. The computational cost is substantial, but the result is a dataset that outperforms ERA5 for near-surface weather parameters over Denmark, particularly during extreme events.
The difference between reanalysis and other climate datasets
It's worth distinguishing reanalyses from other types of climate data:
Climate atlases typically provide long-term averages (monthly or annual means) derived from observations. They're useful for understanding climatology but don't capture short-term variability or provide the spatial completeness of reanalyses.
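Climate projections use climate models to simulate future conditions under different greenhouse gas scenarios. Unlike reanalyses, they're not constrained by weather observations. Even their simulations of the historical period are driven only by observed forcings such as greenhouse gas concentrations, so they reproduce the statistics of past weather rather than the actual sequence of events.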
Operational forecasts are real-time predictions of future weather. They use the latest model versions and observations but aren't consistent over time, making them unsuitable for trend analysis.
Limitations and caveats
Reanalyses aren't perfect. Some things to keep in mind:
- Observation biases: If the input observations are biased, the reanalysis will be too
- Model limitations: The reanalysis can only be as good as the underlying model
- Changing observing systems: The introduction of satellites in the 1970s, for example, can create artificial jumps or trends in the reanalysis fields
- Spin-up effects: Some variables (like soil moisture) take time to adjust to observations
- Computational constraints: Even modern reanalyses make compromises on resolution and physics
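The observing-system caveat is the one that most often trips people up in trend analysis. The sketch below uses entirely synthetic numbers: a perfectly flat "climate" with a hypothetical 0.5-unit shift halfway through, standing in for the kind of jump a new observing system might cause. It is an illustration of the effect, not a model of any real reanalysis.

```python
def linear_trend(values):
    """Ordinary least-squares slope of values against their index."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    cov = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    var = sum((i - mean_x) ** 2 for i in range(n))
    return cov / var


# Synthetic annual means: a flat climate of 8.0 (arbitrary units), 40 years.
truth = [8.0] * 40

# Hypothetical artifact: from year 20 onward the values shift by 0.5,
# as if a new data source changed the bias of the analysis.
shifted = [v + (0.5 if year >= 20 else 0.0) for year, v in enumerate(truth)]

print(linear_trend(truth))    # 0.0
print(linear_trend(shifted))  # positive, although the true climate is flat
```

A straight-line fit cannot tell a step change from a gradual trend, which is why it pays to check when major data sources entered a reanalysis before interpreting long-term changes.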
Despite these limitations, reanalyses remain one of our most valuable tools for understanding past weather and climate. They're the foundation for countless research studies and increasingly for practical applications in climate services and risk assessment.
Looking ahead
The field of reanalysis is evolving. ECMWF is working on ERA6, which will push to even higher resolution. Regional reanalyses are becoming more common as computing power increases. And there's growing interest in using machine learning techniques to improve data assimilation and post-processing.
For me, working on DANRA has been a reminder of how much effort goes into creating these datasets and how valuable they are for understanding our changing climate. If you're working with weather or climate data, chances are you're using a reanalysis, even if you don't realize it. And that's a good thing.
DANRA is freely available and distributed through the Copernicus Climate Data Store. If you're interested in high-resolution weather data for Denmark, check out the paper for more details.
A more thorough description of the dataset, including example notebooks and details on how to access the data in zarr format, can be found in DANRA's official documentation.