C3S multi-system methodology for verification

I would like to verify the most recent C3S multi-system monthly forecasts of 2 m temperature (t2m) against station observations in the Netherlands. For this purpose I developed a methodology, but I am not entirely sure whether it is correct. Here is the approach:

  1. Get hindcast data from all institutes contributing to the C3S multi-system. This includes the most recent version of each model, with data between 1993 and 2016 (the hindcast period). The dataset is "Seasonal forecast monthly statistics on single levels".
  2. Scale the variance of each model to the mean variance across models, per month. (Should this be done per month and per lead time? In other words, how is this done for the C3S multi-system?)
  3. Compute the average scaled ensemble-mean anomaly per month and lead time.
  4. Detrend the computed anomalies and the station data to account for global warming.
  5. Compare the model data to station data covering the same time period.
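
To make steps 2 to 5 concrete, here is a minimal sketch in Python (NumPy) for a single start month and lead time, using synthetic data. It is only an illustration of how I read my own steps, not the procedure C3S uses. Several things are assumptions: each model is a hypothetical array of shape (years, members), the variance is scaled per start month and lead time, every model gets equal weight in the multi-model mean, the trend is linear, and the comparison score is the Pearson correlation. The function names `scale_to_mean_variance` and `detrend_linear` are made up for this example.

```python
import numpy as np

def scale_to_mean_variance(anoms):
    """Rescale each model's anomalies so its variance equals the
    mean variance across models (assumption: one start month and
    one lead time at a time)."""
    variances = [a.var() for a in anoms]
    target = np.mean(variances)
    return [a * np.sqrt(target / v) for a, v in zip(anoms, variances)]

def detrend_linear(series, years):
    """Remove a least-squares linear trend (assumed trend shape)."""
    t = years - years.mean()          # centre time for a well-conditioned fit
    slope, intercept = np.polyfit(t, series, 1)
    return series - (slope * t + intercept)

# Synthetic stand-ins for the real data (hindcast period 1993-2016).
years = np.arange(1993, 2017, dtype=float)
rng = np.random.default_rng(0)
signal = rng.normal(size=years.size) + 0.03 * (years - years[0])
station = signal + rng.normal(scale=0.5, size=years.size)

# Three hypothetical models with different spread and ensemble size.
hindcasts = [
    0.5 * signal[:, None] + rng.normal(scale=s, size=(years.size, m))
    for s, m in [(0.8, 25), (1.5, 40), (1.0, 10)]
]

# Anomalies relative to each model's own hindcast climatology.
anoms = [h - h.mean() for h in hindcasts]

# Step 2: variance scaling.
scaled = scale_to_mean_variance(anoms)

# Step 3: ensemble mean per model, then equal-weight multi-model mean.
mm = np.mean([a.mean(axis=1) for a in scaled], axis=0)

# Step 4: detrend forecast anomalies and station anomalies.
fc = detrend_linear(mm, years)
ob = detrend_linear(station - station.mean(), years)

# Step 5: compare over the common period (here: Pearson correlation).
r = np.corrcoef(fc, ob)[0, 1]
print(f"anomaly correlation: {r:.2f}")
```

With real data the same steps would be repeated for every start month and lead time, which is exactly where my question in step 2 comes in.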

Could someone comment on my approach, and in particular on whether I have missed something?