Hi everybody,
I tried to have a look at the newly integrated seasonal forecasts of the Bureau of Meteorology, but upon download I noticed that the dimension names differ slightly from the other seasonal forecasting systems. Specifically, the dimension for the forecast start time (referred to as ‘forecast_reference_time’ in the other datasets) is called ‘indexing_time’ in the NetCDF file.
Secondly, the GRIB file is remarkably large (3 GB) compared to the other seasonal forecasts (e.g. DWD, around 50 MB). It contains a data variable with the dimensions ‘time’ and ‘step’, whose array is mostly NaN. I was wondering whether I am misunderstanding some element of the staggered hindcast of the BOM.
For details, see the request code down below:
import cdsapi

dataset = "seasonal-monthly-single-levels"
request = {
    "originating_centre": "bom",
    "system": "2",
    "variable": ["total_precipitation"],
    "year": ["2007"],
    "month": ["05"],
    "leadtime_month": ["1", "2", "3", "4", "5", "6"],
    "data_format": "grib",
    "product_type": ["monthly_mean"],
}

client = cdsapi.Client()
client.retrieve(dataset, request).download()
I’d be happy to provide further details if that could help clarify things!
Kind regards,
Maarten
Hi @Maarten_Verbrugge
Regarding the naming issue, there are two elements that might be helpful:
- Some additional documentation will be published in the coming days, closer to the integration of BOM in the C3S real-time forecast processing (May 2025). Once available, it will make it easier to see that BOM ACCESS-S2 is a forecast system that uses lagged start dates to build its ensemble, similar to other forecast systems already available at C3S (e.g. UK Met Office, NCEP and JMA).
- For such systems the encoding of monthly aggregated data can be confusing, so the concept of a “nominal start date” is useful. Regarding the time coordinates, the details on the following page should also help: Guidelines to decode monthly C3S seasonal forecast data - Copernicus Knowledge Base - ECMWF Confluence Wiki
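To illustrate the lagged-ensemble idea, here is a minimal sketch using only the standard library. The dates and number of lagged members are made up for illustration and are not the actual ACCESS-S2 configuration:

```python
from datetime import date, timedelta

# A lagged ensemble groups members initialised on several real start
# dates under a single "nominal start date" (illustrative values).
nominal_start = date(2007, 5, 1)

# Hypothetical lagged real start dates contributing to the May 2007
# nominal start: 1 May plus the last few days of April.
real_starts = [nominal_start - timedelta(days=d) for d in range(4)]

for start in real_starts:
    # Lead time is counted from the nominal start date, so members
    # initialised earlier carry a longer true integration for the
    # same nominal leadtime_month.
    lag = (nominal_start - start).days
    print(f"member start {start}, lag {lag} day(s) behind nominal {nominal_start}")
```

The key point is that all members share the same nominal start date in the metadata, even though their real initialisation dates differ.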
Regarding the issue you reported about data sizes in GRIB, I haven’t been able to reproduce it. Running the request you sent as an example returns a GRIB file of ~22 MB from the CDS, whilst adapting the Python CDSAPI retrieval to get DWD GCFS2.2 data for the same variables/dates/leadtimes yields a GRIB file of ~34 MB.
- In any case, if you are reading those GRIB files with xarray’s cfgrib engine, it may be useful to look at the section “Change representation of forecast lead time” in one of our training Jupyter Notebooks for C3S seasonal forecast datasets.
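As a side note on the mostly-NaN arrays you observed: when lagged start dates are laid out on a dense (time, step) grid, each start date only has values at its own forecast steps, so most grid cells are empty. A small NumPy sketch (the shapes and fill pattern below are illustrative, not the real BOM encoding):

```python
import numpy as np

# 4 hypothetical lagged start dates ("time") and 10 possible
# forecast steps ("step"); start with an all-NaN dense grid.
n_times, n_steps = 4, 10
data = np.full((n_times, n_steps), np.nan)

for t in range(n_times):
    # Each start date populates a shifted subset of steps: later
    # starts reach a given valid time at a shorter step.
    data[t, t:t + 6] = 1.0  # 6 monthly leads per start date

filled = np.count_nonzero(~np.isnan(data))
print(f"{filled} of {data.size} cells hold data; the rest are NaN")  # 24 of 40
```

The in-memory size of such a dense array can therefore be much larger than the GRIB file itself, which only stores the populated fields.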
I hope these comments are useful.
Regards,
Edu
Hi Edu,
Thanks for your quick reply. You are right regarding the file size: I jumped to conclusions too quickly when I saw the size of the dataset after opening it with xarray using cfgrib.
I have worked with the other lagged ensembles before, but I clearly need to take a second look at them, especially when using the GRIB files. Your last link will help me do so.
Thanks again for your help!
Kind regards,
Maarten