weatherlinguist

Storytelling with weather data

A user of weather data is probably more interested in a picture from a weather app like this one.

image-1 Figure 1. Example forecast from Weather Underground.

My work is entirely on the backend part of such a weather app. I process data from the weather models that produce the forecast and I try to improve them. I am more interested in plots that show the performance of a model against actual observations. These kinds of plots are called verification plots. They usually show some statistical score, like bias or standard deviation, that tells you how good the model was at predicting a given weather situation. An example of a verification plot can be seen below. This is a verification of wave models for September 2020.

image-3 Figure 2. Verification of weather models from an ECMWF report.

This shows the mean error (bias) of several wave models as a function of forecast day (a wave model is an ocean model that predicts sea heights and wave patterns for the oceans of the world). This particular example includes an intercomparison of 10 different models.
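To make the score concrete, here is a minimal sketch of how a bias per forecast day could be computed from forecast and observation pairs. The function name `verification_scores` and the sample wave heights are made up for illustration, and I am assuming that "standard deviation" refers to the spread of the forecast errors. This is not how the ECMWF report computes its scores.

```python
from collections import defaultdict
from statistics import mean, pstdev


def verification_scores(pairs):
    """Group forecast errors by forecast day and summarise them.

    `pairs` is an iterable of (forecast_day, forecast, observed) tuples.
    Returns {forecast_day: {"bias": ..., "std": ...}}, where the bias is
    the mean error (forecast minus observation) and std is the population
    standard deviation of the errors.
    """
    errors = defaultdict(list)
    for day, forecast, observed in pairs:
        errors[day].append(forecast - observed)
    return {
        day: {"bias": mean(errs), "std": pstdev(errs)}
        for day, errs in sorted(errors.items())
    }


# Hypothetical wave heights in metres: (forecast day, forecast, observed).
sample = [
    (1, 2.1, 2.0),
    (1, 1.4, 1.5),
    (2, 3.0, 2.6),
    (2, 2.2, 2.0),
]

scores = verification_scores(sample)
for day, s in scores.items():
    print(f"day {day}: bias={s['bias']:+.2f} m, std={s['std']:.2f} m")
```

A positive bias means the model overpredicts on average for that forecast day, and a negative bias means it underpredicts. Each line in a plot like Figure 2 is one such bias curve for one model.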

Comparing the two plots, the weather app gives a clear and comprehensive view of the data despite its complexity, thanks to the use of icons and a segmented layout. The verification plot, on the other hand, is very cluttered, with many lines packed together, which makes it difficult to follow individual trends. It is not easy to see at first sight which model is which.

Scientists who code, like me, are usually not very skilled at presenting their results in an entertaining and visually pleasing way that might reach a broader audience. I have been spending some time on improving my frontend skills, in particular in data visualization. I started attending meetups and listening to talks by great visual storytellers like Alberto Cairo, or YouTube talks by Edward Tufte. I recently read a short but very useful book by Ali Fenwick, Jose Berengueres, and Marybeth Sandell called "Introduction to data visualization and story telling". One of the main takeaways I got from this book is the importance of balancing information and meaning in a good chart. In the slightly edited version of one of their plots, originally created by Hugh McLeod, you can see that too much information decreases the amount of meaning in a chart.

information_meaning Information versus meaning. Slightly modified version of a plot from Hugh McLeod's book: Ignore Everybody

There is a well-balanced amount of information and meaning in the chart in Figure 1. There is plenty of information in Figure 2, but the meaning is not as clear. The plot is overcrowded. There are too many colors, and in some places it is not even clear which model is which (i.e., there are two models with almost the same shade of green).

The second takeaway from Fenwick et al. is the importance of storytelling in a well-organized chart to provide some actual knowledge. There is information in the scores plot above, but there is no knowledge. Which model is best? Are all models equally bad? Is one model better during the first days of the forecast? It takes valuable time to extract this information by just looking at the plot. Following some basic principles of good data visualization, I would maybe reorganize the verification plot in Figure 2 like this:

image Reorganizing the data in the verification plot in a hopefully more understandable way.

I think in this case it is a bit clearer which model over- or underpredicts for a given forecast day. Given the big mix of models in the plot above, I took the data only from the lines I could identify (i.e., the top two and bottom two models).

I list below a few books on data visualization I would recommend. Note that I tend to leave out books which I find too artistic or too focused on storytelling. I do like the awesome visualizations you find in places like Bloomberg or The New York Times, but at some point they become ridiculously overcomplicated. I don't want to spend 10 minutes reading an explanation to be able to understand what the heck the plot is supposed to be showing.

A nice website to help you figure out which chart to choose.