Question about RPSS Evaluation

Dear Josh
When I simulate using the first-week forecast data initialized on 2025-03-06 to calculate the RPSS value, both the observation data and the climatology data are set to 2025-03-24.
I test Probabilities predicted using [0,0,0,0,1] and [1,0,0,0,0], but the RPSS is the same value.
Best wishes
Peng

from AI_WQ_package import forecast_submission
tas_p1_fc = forecast_submission.AI_WQ_create_empty_dataarray('tas', '20250306', '1', 'XXX', 'XXX', 'XXX')

tas_p1_fc[0,:,:]=0.0
tas_p1_fc[1,:,:]=0.0
tas_p1_fc[2,:,:]=0.0
tas_p1_fc[3,:,:]=0.0
tas_p1_fc[4,:,:]=1.0

from AI_WQ_package import retrieve_evaluation_data
from AI_WQ_package import forecast_evaluation
obs = retrieve_evaluation_data.retrieve_weekly_obs('20250324','tas','XXX')
quintile_clim = retrieve_evaluation_data.retrieve_20yr_quintile_clim('20250324','tas','XXX')
land_sea_mask = retrieve_evaluation_data.retrieve_land_sea_mask('XXX')

from AI_WQ_package import forecast_evaluation

# compute observed probabilities
obs_pbs = forecast_evaluation.conditional_obs_probs(obs,quintile_clim)

# compute global RPSS
global_RPSS = forecast_evaluation.work_out_RPSS(tas_p1_fc,obs_pbs,'tas',land_sea_mask)

print(global_RPSS.compute())

Can you try your code again but this time set the values component of the empty data array to 0 and 1.

For example,
tas_p1_fc.values[2,:,:]=0.0

Thanks,
Josh

tas_p1_fc.values[0,:,:]=0.0
tas_p1_fc.values[1,:,:]=0.0
tas_p1_fc.values[2,:,:]=0.0
tas_p1_fc.values[3,:,:]=0.0
tas_p1_fc.values[4,:,:]=1.0
tas_p1_fc.values[0,:,:]=1.0
tas_p1_fc.values[1,:,:]=0.0
tas_p1_fc.values[2,:,:]=0.0
tas_p1_fc.values[3,:,:]=0.0
tas_p1_fc.values[4,:,:]=0.0

I replaced the original code with the code above, but the value of global_RPSS remains the same as before.

Hi @Peng_Lu ,

Thanks for investigating this a bit further. You were correct to raise a concern.

For reasons I don’t fully understand, the following line in forecast_evaluation.calculate_RPS wasn’t working correctly:

    # RPS score for forecast
    RPS_score = ((fc_pbs_cumsum-obs_pbs_cumsum)**2.0).sum(dim=quantile_dim)

I have replaced the line with the following.

    # RPS score for forecast
    # work out squared value of cumulative difference
    cum_pbs_diff = fc_pbs_cumsum.copy()
    cum_pbs_diff.values = ((fc_pbs_cumsum.values-obs_pbs_cumsum.values)**2.0) # square the cumulative difference between forecast prob and obs prob. Call the actual data within the xarray.
    RPS_score = cum_pbs_diff.sum(dim=quantile_dim)

This updated version now calls the values within each cumulative summed array, and then takes the sum across the quintile axis.

The AI Weather Quest Python package has already been updated. I recommend you running the following on your linux environment:

python3 -m pip install --upgrade AI_WQ_package

Sorry for the inconvenience caused. I had been regularly checking RPSSs from dynamical predictions, and this did not seem to be an issue :thinking:. Hopefully this fix will solve the issue!

Thanks once again,
Josh

P.S. To be 100% transparent, here’s a copy of my test code + output:

import xarray as xr
import numpy as np
import matplotlib.pyplot as plt

from AI_WQ_package import forecast_submission
tas_p1_fc_top = forecast_submission.AI_WQ_create_empty_dataarray('tas', '20250306', '1', 'ECMWFtest', 'webinartest', password)

tas_p1_fc_bot = forecast_submission.AI_WQ_create_empty_dataarray('tas', '20250306', '1', 'ECMWFtest', 'webinartest', password)

zero_array = np.zeros([5,121,240])
zero_top = zero_array.copy()
zero_bot = zero_array.copy()
zero_top[-1,:,:] = 1.0
zero_bot[0,:,:] = 1.0

tas_p1_fc_top.values = zero_top
tas_p1_fc_bot.values = zero_bot

from AI_WQ_package import retrieve_evaluation_data
obs = retrieve_evaluation_data.retrieve_weekly_obs('20250324','tas',password)
quintile_clim = retrieve_evaluation_data.retrieve_20yr_quintile_clim('20250324','tas',password)
land_sea_mask = retrieve_evaluation_data.retrieve_land_sea_mask(password)

from AI_WQ_package import forecast_evaluation

# compute observed probabilities
obs_pbs = forecast_evaluation.conditional_obs_probs(obs,quintile_clim)

# compute global RPSS
global_RPSS_top = forecast_evaluation.work_out_RPSS(tas_p1_fc_top,obs_pbs,'tas',land_sea_mask)
print(global_RPSS_top.compute())

global_RPSS_bot = forecast_evaluation.work_out_RPSS(tas_p1_fc_bot,obs_pbs,'tas',land_sea_mask)
print(global_RPSS_bot.compute())
<xarray.DataArray ()> Size: 8B
array(-1.18782196)
Coordinates:
    forecast_issue_date    datetime64[ns] 8B 2025-03-06
    forecast_period_start  datetime64[ns] 8B 2025-03-24
    forecast_period_end    datetime64[ns] 8B 2025-03-30T18:00:00
    height                 int64 8B 2
    time                   datetime64[ns] 8B 2025-03-27T09:00:00
    variable               <U3 12B 't2m'
applying land sea mask
applying land sea mask
<xarray.DataArray ()> Size: 8B
array(-1.62055832)
Coordinates:
    forecast_issue_date    datetime64[ns] 8B 2025-03-06
    forecast_period_start  datetime64[ns] 8B 2025-03-24
    forecast_period_end    datetime64[ns] 8B 2025-03-30T18:00:00
    height                 int64 8B 2
    time                   datetime64[ns] 8B 2025-03-27T09:00:00
    variable               <U3 12B 't2m'

Thank you for your hard work. After updating the module to version 2.2, the output matches yours

1 Like