Estimation of wind speeds with very high return periods from large datasets generated by weather prediction models : statistical aspects

C.F. de Valk, H.W. van den Brink

To assess the reliability of flood protection in the Netherlands, return values of wind speed and coastal water level for return periods up to several million years are needed. This is a major challenge, given that records of reliable wind measurements do not go back further than about 70 years.

Several ideas are currently being explored to tackle this problem. One idea is to increase data volume by utilizing large datasets of simulations by numerical weather prediction models; see van den Brink (2018). However, even large datasets such as the archived ECMWF seasonal ensemble forecasts leave a considerable gap in return period to be overcome. Another idea is to use models of the tails of distribution functions which are specifically designed for extrapolation over a wide range of return periods: the Generalized Weibull (GW) tail and the more widely applicable log-Generalized Weibull (log-GW) tail, with the 1-parameter Weibull tail as a special case of both.

For these models and for two classical tail models, the Generalized Pareto (GP) tail and the exponential tail, we compared estimates of the tail of the wind speed distribution derived from subsets of the ECMWF System-4 seasonal ensemble forecast wind speeds for a location in the central North Sea. One check concerns the statistics of maxima over subsamples, as proposed in van den Brink and Können (2008). In addition, we checked the accuracy of each of these models in estimating the 10^7-year wind speed from 72-year subsets of the data, using estimates from the full dataset as reference. Based on the results of this check, we estimated the worst-case bias in extrapolations from the full dataset as well as from a 72-year subset. These analyses were repeated on the more recent SEAS5 seasonal forecast data as well as on annual maxima of wind speed from a large number of runs over 1981-2009 of the climate model Speedy.
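As a rough illustration of the kind of extrapolation involved, the sketch below estimates a return value with the exponential tail, the simplest of the models mentioned above. It is not the estimator used in the report: the function name, the peaks-over-threshold setup and the synthetic data are assumptions made for this example only, and the GW and log-GW estimators are not reproduced. With exceedance rate `rate` per year above `threshold` and mean excess `scale`, the return value for a return period of T years is taken as threshold + scale * ln(rate * T).

```python
import math
import random


def exponential_tail_return_value(sample, years, threshold, return_period):
    """Return value for `return_period` years under an exponential tail.

    Hypothetical helper for illustration; `sample` holds independent peak
    wind speeds collected over `years` years of data.
    """
    excesses = [x - threshold for x in sample if x > threshold]
    if not excesses:
        raise ValueError("no observations above the threshold")
    # Maximum-likelihood estimate of the exponential scale: the mean excess.
    scale = sum(excesses) / len(excesses)
    # Average number of threshold exceedances per year.
    rate = len(excesses) / years
    return threshold + scale * math.log(rate * return_period)


if __name__ == "__main__":
    random.seed(1)
    # Synthetic stand-in data, NOT model output: 10 peaks per year for 72 years.
    years = 72
    peaks = [20.0 + random.expovariate(1 / 2.5) for _ in range(10 * years)]
    estimate = exponential_tail_return_value(
        peaks, years, threshold=25.0, return_period=1e7
    )
    print(f"Estimated 10^7-year value: {estimate:.1f} m/s")
```

Repeating such an estimate over many 72-year subsets and comparing against the estimate from the full dataset gives the kind of accuracy check described above, under the assumption that the full-dataset estimate is a usable reference.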

The three datasets give starkly different results. In particular, the wind speed distribution of SEAS5 differs considerably from that of System-4: wind speeds are lower overall, but the tail is heavier, resulting in much higher estimates of return values. Moreover, the SEAS5 tail is less regular than the System-4 tail, resulting in larger estimation errors. This calls for a further investigation of the cause of this difference: the uncertainty appears to be dominated by potential bias related to the model formulation.

Comparison of estimates based on different models of the tail reveals that the classical GP tail, the exponential tail and the 1-parameter Weibull tail can be severely biased, depending on the dataset. Overall, the GW tail performs best; the 1-parameter Weibull tail can give better estimates if these are stable as a function of threshold.

Taken together, the results indicate that accurate estimation of the wind speed for return periods of up to 10^7 years from large model datasets such as System-4 and SEAS5 is feasible (with RMS error below 2 m/s), provided that systematic differences between these datasets can be resolved.

The outcomes of the present study can already be helpful to assess and possibly improve the return values of wind speed currently in use to assess flood safety in the Netherlands. It is recommended that GW-tail based estimates from measurements at different sites are compared to estimates based on the GP and exponential tails, addressing in particular the uncertainty of estimates. The analysis of uncertainty should also address the effects of interannual variability, which has been largely ignored until now.

This research was carried out for Rijkswaterstaat and the KNMI MSO project “Towards future climate proof statistical methods for KNMI products on extremes”. We thank Marcel Bottema and Pieter van Gelder for their reviews of this document.

Bibliographic data

C.F. de Valk, H.W. van den Brink. Estimation of wind speeds with very high return periods from large datasets generated by weather prediction models : statistical aspects
KNMI number: WR-20-01, Year: 2020, Pages: 55
