Changes to grib to netCDF converter on CDS-Beta/ADS-Beta

Hi Michela,

Thank you for the new support page on the GRIB to netCDF converter, and for adding a ‘netcdf_legacy’ option for the data format.

Thanks,

  • Kris R.

Hi Eduardo,

We are working hard to process the GRIB files of the SEAS5 multi-member data, but we are running into several issues. When we try to manipulate them with cdo, it seems unable to handle the levels of the members in the GRIB:
“cdo -b F32 -sub SEAS5_T2M_GLOBAL_ABS_M202406_51ENS.nc SEAS5_T2M_GLOBAL_ABS_M202406_51ENS.grib SEAS5_T2M_GLOBAL_ABS_M202406_51ENS_NC_GRIB.nc
cdo sub: Filling up stream2 >SEAS5_T2M_GLOBAL_ABS_M202406_51ENS.grib< by copying the first variable of each timestep.
cdo sub: Processed 2 variables over 12 timesteps [0.92s 28MB].”

I also tried to open the GRIB using GrADS and wgrib, but in this case GrADS returns the message “Cannot contour grid - all undefined values”. Here is the output of the test:
ga-> open SEAS5_PRATE_GLOBAL_ABS_M202409_51ENS.ctl
Scanning description file: SEAS5_PRATE_GLOBAL_ABS_M202409_51ENS.ctl
Data file SEAS5_PRATE_GLOBAL_ABS_M202409_51ENS.grib is open as file 1
LON set to 0 360
LAT set to -89.5 89.5
LEV set to 1 1
Time values set: 2024:8:1:0 2024:8:1:0
E set to 1 1
ga-> q file
File 1 : SEAS5_PRATE_GLOBAL_ABS_M202409_51ENS.grib
Descriptor: SEAS5_PRATE_GLOBAL_ABS_M202409_51ENS.ctl
Binary: SEAS5_PRATE_GLOBAL_ABS_M202409_51ENS.grib
Type = Gridded
Xsize = 360 Ysize = 180 Zsize = 1 Tsize = 1 Esize = 1
Number of Variables = 1
tpsfc 0 228 ** surface Total precipitation m s**-1
ga-> d tpsfc
Cannot contour grid - all undefined values

We have these issues only with the GRIB files of the SEAS5 multi-member data, and currently the only way we can use them is to download them with the “netcdf_legacy” option.
I would like to ask which software we can use to manipulate these GRIB files.

Thank you in advance.

Best,
Marco

Hi Marco,
thanks for your additional details.

It is still a bit unclear to me whether you are having issues with the netCDF files produced with the new conversion tool or with the GRIB files (which haven't changed at all during this CDS migration process, and which are completely outside the scope of this discussion topic in the Forum).

In any case, to better assist you, I will create a ticket on your behalf on the ECMWF User Support platform so we can keep track of your issues. Please check your email inbox, as you should receive notifications from JIRA about this topic in the coming minutes.

Best regards,
Eduardo Penabad

Hi again Marco,

To help speed things up while we set up the JIRA ticket to follow up on your questions, I will give you some additional details here.

If I understand correctly, one of your questions is what software you can use to manipulate SEAS GRIB files coming from C3S.

My suggestion would be for you to investigate if any of the following can be helpful:

Hi Eduardo,

Thank you very much for your prompt answer. I will look into them thoroughly.

Best,
Marco

@Matthew_Wiggins I think that is an iris issue. I came across this because I was getting the same error. I put in a bug report (SciTools/iris issue #6149, “Iris won't load netCDF files from the new CDS-beta”). If you change that line

total_bytes = cf_var.size * cf_var.dtype.itemsize

to

total_bytes = cf_var.size * np.dtype(cf_var.dtype).itemsize

it loads fine.

The difference arises because the new files have variables that look like the listing below (from ncdump -h), whereas the old files did not have the “string” in front of the string attributes. I don’t know if that is something you want to change in the converter or not.

double latitude(latitude) ;
    latitude:_FillValue = NaN ;
    string latitude:units = "degrees_north" ;
    string latitude:standard_name = "latitude" ;
    string latitude:long_name = "latitude" ;
    string latitude:stored_direction = "decreasing" ;
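
For anyone wondering why the one-line change above helps, here is a minimal sketch. It assumes the failing variable reports the built-in Python type `str` as its dtype (as in `<class 'str'> expver(valid_time)` in ncinfo output) rather than a numpy dtype; the helper name `itemsize_of` is made up for illustration and is not part of iris.

```python
import numpy as np

def itemsize_of(dtype):
    """Return the per-element size in bytes for a numpy dtype or a Python type."""
    # np.dtype() accepts numpy dtypes as well as plain Python types such as str,
    # so the attribute lookup no longer depends on what the variable reports.
    return np.dtype(dtype).itemsize

# The built-in str type has no itemsize attribute, so `str.itemsize` raises
# AttributeError, which is what the original expression ran into.
print(hasattr(str, "itemsize"))

# Wrapped in np.dtype(), both cases give a number instead of raising.
print(itemsize_of(np.dtype("float64")))
print(itemsize_of(str))
```

The size reported for `str` is not meaningful for variable-length strings; the point is only that the expression no longer fails.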

Thank you for this documentation.
What does the number dimension represent in the netCDF file? Could you add details about this field to the documentation, please?

Hi Kevin,
Thank you for the info! This does seem to be the problem in my case: my request was for July and August data of this year, including July 1st, which triggers the issue because it combines ERA5 and ERA5T data. When I download the period before that, or just August and September, all variables are present (although all files still seem to have the expver dimension). Your suggested workaround makes sense in this case. However, I would also like to know whether there is a plan to change the converter so that ERA5 and ERA5T data can be downloaded together, thereby reducing the number of requests being sent.

Hi Michela,

Thank you very much for the documentation and the legacy converter option. I would just like some clarification regarding the legacy option: is it intended to stay for the longer term, or has a decommissioning date been set?

Hi Jelena,
We have made the CDS team aware of the issue and they are investigating possible solutions. We do not have an exact timescale, but I suspect there will not be any changes in the next 2-3 months.
Thanks
Kevin

Until the “no discernable x coordinate” issue is solved, one might as well declare the nc format in the download tab unusable.

Thanks, I appreciate the reply. I’ll await the next iris version in the hope of a fix. Until then, netcdf_legacy conversion it is.

The time coordinate (now called date for no reason) has no units.

    int64 date(date) ;
            date:long_name = "original GRIB coordinate for key: date(date)" ;
            date:units = "1" ;

Units are good.
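
Until the coordinate carries proper time units, a possible workaround is to decode it by hand. This is only a sketch, and it assumes the `date` values are integers in YYYYMMDD form, which is how the GRIB `date` key is normally encoded; the header above does not show the values, so check them first. The function name and the sample values are made up for illustration.

```python
from datetime import datetime

def decode_grib_date(value):
    """Convert an integer such as 20240928 into a datetime.date.

    Assumes YYYYMMDD encoding (an assumption, not confirmed by the file header).
    """
    return datetime.strptime(str(int(value)), "%Y%m%d").date()

# Hypothetical values standing in for the contents of the date variable.
raw_dates = [20240928, 20240929]
decoded = [decode_grib_date(v) for v in raw_dates]
print(decoded)
```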

Hi, all. I saw an additional change a few days ago that was unexpected (to me).

We’ve been using the new CDS for a while, downloading netCDF files for ERA5 single-level and on-pressure-level data sets, one variable per request, for one day of data at a time. E.g., the single-level 10m_u_component_of_wind variable.
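
For context, our requests look roughly like the sketch below. The dataset name and the request keys follow what the CDS download form generates, but treat them as assumptions and check them against the form for your own dataset; `build_request` is just an illustrative helper, and the actual download call is left commented out so the snippet runs without credentials.

```python
from datetime import date

def build_request(variable, day, data_format="netcdf"):
    """Build a one-variable, one-day request dict (key names are assumptions)."""
    return {
        "product_type": ["reanalysis"],
        "variable": [variable],
        "year": [f"{day.year:04d}"],
        "month": [f"{day.month:02d}"],
        "day": [f"{day.day:02d}"],
        "time": [f"{hour:02d}:00" for hour in range(24)],
        # "netcdf" uses the new converter; "netcdf_legacy" asks for the old one.
        "data_format": data_format,
        "download_format": "unarchived",
    }

request = build_request("10m_u_component_of_wind", date(2024, 10, 4))
print(request)

# To submit the request (needs the cdsapi package and a configured API key):
# import cdsapi
# cdsapi.Client().retrieve("reanalysis-era5-single-levels", request, "out.nc")
```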

Some time around Friday October 12, the netCDF files we were getting back changed format. It looks like the expver variable was no longer present, and a new scalar surface variable appeared. The first place I noticed this was in a download of the 2024-10-04 data for 10m_u_component_of_wind.

Was this expected? I don’t recall seeing a surface variable mentioned anywhere. And is that expver metadata variable not going to reliably be present?

Here’s what I see in a couple example files, shown with ncinfo from the Python/Anaconda package netcdf4 version 1.6.4.

An old file from early last week, downloaded about Wednesday Oct 9 19:00 UTC, which has expver and not surface:

$ ncinfo single_level/daily/10m_u_component_of_wind/10m_u_component_of_wind\ -\ SL\ -\ 2024-10-03.nc
<class 'netCDF4._netCDF4.Dataset'>
root group (NETCDF4 data model, file format HDF5):
    GRIB_centre: ecmf
    GRIB_centreDescription: European Centre for Medium-Range Weather Forecasts
    GRIB_subCentre: 0
    Conventions: CF-1.7
    institution: European Centre for Medium-Range Weather Forecasts
    history: 2024-10-09T20:08 GRIB to CDM+CF via cfgrib-0.9.14.1/ecCodes-2.36.0 with {"source": "data.grib", "filter_by_keys": {"stream": ["oper"]}, "encode_cf": ["parameter", "time", "geography", "vertical"]}
    dimensions(sizes): valid_time(24), latitude(721), longitude(1440)
    variables(dimensions): int64 number(), int64 valid_time(valid_time), float64 latitude(latitude), float64 longitude(longitude), <class 'str'> expver(valid_time), float32 u10(valid_time, latitude, longitude)
    groups:

A new file from late last week, downloaded about Thursday Oct 10 20:15 UTC:

$ ncinfo single_level/daily/10m_u_component_of_wind/10m_u_component_of_wind\ -\ SL\ -\ 2024-10-04.nc
<class 'netCDF4._netCDF4.Dataset'>
root group (NETCDF4 data model, file format HDF5):
    GRIB_centre: ecmf
    GRIB_centreDescription: European Centre for Medium-Range Weather Forecasts
    GRIB_subCentre: 0
    Conventions: CF-1.7
    institution: European Centre for Medium-Range Weather Forecasts
    history: 2024-10-10T21:22 GRIB to CDM+CF via cfgrib-0.9.14.1/ecCodes-2.36.0 with {"source": "data.grib", "filter_by_keys": {}, "encode_cf": ["parameter", "time", "geography", "vertical"]}
    dimensions(sizes): valid_time(24), latitude(721), longitude(1440)
    variables(dimensions): int64 number(), int64 valid_time(valid_time), float64 surface(), float64 latitude(latitude), float64 longitude(longitude), float32 u10(valid_time, latitude, longitude)
    groups:
$ ls -l single_level/daily/10m_u_component_of_wind/10m_u_component_of_wind\ -\ SL\ -\ 2024-10-03.nc
-rwxrwxrwx 1 *** *** 48290304 Oct  9 15:09 'single_level/daily/10m_u_component_of_wind/10m_u_component_of_wind - SL - 2024-10-03.nc'
$ ls -l $MYSTASH/data/10m_u_component_of_wind\ -\ SL\ -\ 2024-10-04.nc
-rwxrwxrwx 1 *** *** 48626294 Oct 10 16:23 '[...]/data/10m_u_component_of_wind - SL - 2024-10-04.nc'

These files are the data returned from CDS API requests, and saved unmodified. We did the requests using the cdsapi Python package, version 0.7.3 from conda-forge, on Ubuntu Linux. User, group, and full paths above redacted for privacy.

Easy workaround, I think. Just wondering if I can expect further format changes, whether this change might indicate a problem, and whether we can rely on expver being present if we want to make use of it.
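
In case it helps anyone compare files, here is a small sketch that diffs the converter options recorded in the history attribute of the two files above. It assumes the attribute always has the form "<timestamp and tool info> with <JSON options>", which is what these two files show; the function names are my own.

```python
import json

def parse_history(history):
    """Split a history attribute into its text prefix and its JSON options."""
    prefix, _, options = history.partition(" with ")
    return prefix, json.loads(options)

def changed_options(old_history, new_history):
    """Return {key: (old_value, new_value)} for options that differ."""
    _, old = parse_history(old_history)
    _, new = parse_history(new_history)
    keys = sorted(set(old) | set(new))
    return {k: (old.get(k), new.get(k)) for k in keys if old.get(k) != new.get(k)}

# History attributes copied from the two ncinfo listings above.
OLD = ('2024-10-09T20:08 GRIB to CDM+CF via cfgrib-0.9.14.1/ecCodes-2.36.0 with '
       '{"source": "data.grib", "filter_by_keys": {"stream": ["oper"]}, '
       '"encode_cf": ["parameter", "time", "geography", "vertical"]}')
NEW = ('2024-10-10T21:22 GRIB to CDM+CF via cfgrib-0.9.14.1/ecCodes-2.36.0 with '
       '{"source": "data.grib", "filter_by_keys": {}, '
       '"encode_cf": ["parameter", "time", "geography", "vertical"]}')

print(changed_options(OLD, NEW))
```

For these two files the only option that differs is filter_by_keys, which went from a stream filter to empty. Whether that is what caused expver to disappear is a guess on my part.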

Cheers,
Andrew