Hi ECMWF team, @Joshua_Talib,
Our team has noticed a discrepancy between the quintiles computed with AI_WQ_package.compute_20yr_quintile_climatology.complete_20yr_quintiles and those downloaded from the server with AI_WQ_package.retrieve_evaluation_data.retrieve_20yr_quintile_clim. It affects dates at the end of the calendar year (confirmed for Dec 22 and Dec 29, 2025) but not earlier dates (Dec 15 is normal). Here is an example for the 0.4 quintile of temperature for Dec 22:
Specifically, it appears that the 2025 quintiles returned by retrieve_20yr_quintile_clim were formed by processing reanalysis data through Dec. 31, 2024 and computing a 7-day rolling mean (plus a 4-day window offset). However, the quintiles at the very end of the calendar year also require January 2025 data for the rolling means to be computed accurately. As a result, the quintiles for Dec. 22, 2025 and later appear to be incorrect (i.e., they deviate significantly from what you obtain when January 2025 data are also included in the computation).
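To illustrate why only the last days of the year would be affected, here is a minimal sketch. It assumes that the window for a given date can reach up to 10 days past it (a 7-day mean whose start is shifted by up to 4 days), which is only our reading of the processing, not the package's actual code. The helper names `last_day_needed` and `window_is_complete` are hypothetical.

```python
from datetime import date, timedelta

def last_day_needed(valid_date, window=7, max_offset=4):
    """Last calendar day touched by a `window`-day mean that starts up to
    `max_offset` days after `valid_date` (assumed layout, not confirmed)."""
    return valid_date + timedelta(days=max_offset + window - 1)

def window_is_complete(valid_date, data_end, window=7, max_offset=4):
    """True if every day the window needs falls on or before `data_end`."""
    return last_day_needed(valid_date, window, max_offset) <= data_end

data_end = date(2024, 12, 31)  # reanalysis assumed to stop at year end
for day in (15, 21, 22, 29):
    d = date(2024, 12, day)
    print(d, last_day_needed(d), window_is_complete(d, data_end))
```

Under these assumptions the window for Dec 21 ends exactly on Dec 31, while the window for Dec 22 already needs Jan 1 of the following year, which would be consistent with the dates where we see the discrepancy.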
This week's forecast submission will rely on the Dec 22 quintiles, so we want to confirm which quintile data are “correct” and will be used during verification. Could you please check whether there is an issue in the generated quintiles, and confirm whether the quintiles used for verification will be the ones currently downloaded with the package?
Thanks!
With AI-WQ-package 3.0.3, I was able to recreate the discrepancy by (1) downloading and concatenating all years of the AI-WQ training set using retrieve_training_data.retrieve_annual_training_data; (2) computing aiwq_quintiles = complete_20yr_quintiles(training_set, rolling_operation='none'); and (3) comparing these computed quintiles to those retrieved via retrieve_evaluation_data.retrieve_20yr_quintile_clim. The two sources of quintiles match perfectly for dates >= 2026-01-01 and dates <= 2025-12-21 but differ for all dates in the range 2025-12-22 through 2025-12-31 inclusive.
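The comparison in step (3) can be sketched as below. This is a minimal, self-contained illustration, not the code I actually ran: it uses plain dicts keyed by date as a stand-in for the gridded quintile datasets, the values are synthetic, and `differing_dates` is a hypothetical helper rather than part of AI-WQ-package.

```python
import math
from datetime import date, timedelta

def differing_dates(computed, retrieved, rel_tol=1e-6, abs_tol=1e-9):
    """Return the sorted dates whose quintile bounds differ between sources.

    Both arguments map a date to a sequence of quintile bounds. Only dates
    present in both mappings are compared.
    """
    bad = []
    for d in sorted(computed.keys() & retrieved.keys()):
        a, b = computed[d], retrieved[d]
        same = len(a) == len(b) and all(
            math.isclose(x, y, rel_tol=rel_tol, abs_tol=abs_tol)
            for x, y in zip(a, b)
        )
        if not same:
            bad.append(d)
    return bad

# Synthetic stand-in data (hypothetical values, not real quintiles).
start = date(2025, 12, 15)
computed = {start + timedelta(days=i): [270.0, 272.0, 274.0, 276.0]
            for i in range(30)}
retrieved = {d: list(v) for d, v in computed.items()}
for day in range(22, 32):
    retrieved[date(2025, 12, day)][1] += 0.5  # inject a mismatch

bad = differing_dates(computed, retrieved)
print(bad[0], bad[-1], len(bad))
```

Reporting the first and last differing date, plus the count, makes it easy to see whether the mismatch is one contiguous block of days.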
Thank you to the Microsoft team for highlighting this discrepancy.
There was an error in my code which meant the quintiles were not automatically updated as new ERA5 data became available. This has now been corrected, and you should find that the quintiles retrieved through the Python package are consistent with those you compute yourself.
Forecasts will always be evaluated against an ERA5 quintile climatology based on a 20-year sample with a +/- 4-day window (at two-day intervals).
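As a rough illustration of what such a sampling could look like, here is a minimal sketch. It is not the evaluation code: it assumes the 20 years are those immediately preceding the target year, that the offsets are -4, -2, 0, +2 and +4 days, and that 29 February falls back to 28 February in non-leap years. The function name `climatology_sample_dates` is hypothetical.

```python
from datetime import date, timedelta

def climatology_sample_dates(target, n_years=20, half_width=4, step=2):
    """List the dates sampled for `target` under the assumptions above."""
    offsets = range(-half_width, half_width + 1, step)  # -4, -2, 0, 2, 4
    samples = []
    for back in range(1, n_years + 1):
        year = target.year - back
        try:
            centre = target.replace(year=year)
        except ValueError:  # 29 Feb in a non-leap year (assumed fallback)
            centre = date(year, 2, 28)
        samples.extend(centre + timedelta(days=o) for o in offsets)
    return samples

samples = climatology_sample_dates(date(2025, 12, 29))
print(len(samples), min(samples), max(samples))
```

With these assumptions each target date draws on 100 samples, and for late-December targets the latest sample falls in January of the following year, which is why the climatology needs the newest ERA5 data.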
Sorry for the issue and please let me know if you still find an error.
Thanks,
Josh
Thank you Josh for the prompt fix!
@Joshua_Talib, with the new fix, I now observe discrepancies between computed and downloaded quintiles on the following days: 2025-12-28 through 2026-01-04 inclusive.
Dear Lester,
Thanks for noting this. I wasn’t copying across the new 2026 files. They have now been copied and you should have access to the correct quintile bounds.
I will write a new forum post shortly.
Kind regards,
Josh