CMIP6 - NetCDF file with Inconsistent variable definition for lat_bnds!

I'm downloading daily CMIP6 data for maximum air temperature, and I asked for a spatial subset covering Brazil. When I open the NetCDF file in Panoply, I get a very strange image: the values from the easternmost border are interpolated across to the western border, wrapping around the globe.

And if I use CDO to query the file information, I get an error:

cdo sinfo
Warning (cdf_set_var): Inconsistent variable definition for lat_bnds!
Warning (cdf_set_var): Inconsistent variable definition for lon_bnds!
Warning (cdf_set_var): Inconsistent variable definition for time_bnds!
Segmentation fault (core dumped)

I tried both MIROC and GFDL models, downloading from the site or using the Python API.

I just checked, and if I don't do a spatial subset, the CDO command works fine. So I believe the spatial subsetting is generating a malformed NetCDF file.

Has this happened to anyone else?

I just found out that ncinfo works on the file. But when I use ncdump to look at the lat_bnds variable, I get what appears to be an infinite dump of the latitude values. Meanwhile, the same ncdump for the lat_bnds variable in the NON-subsetted dataset works fine.

Here is the ncinfo output on the offending file

m330625@desk7802:~/geodb/cmip6$ ncinfo
<class 'netCDF4._netCDF4.Dataset'>
root group (NETCDF4 data model, file format HDF5):
external_variables: areacella
history: File was processed by fremetar (GFDL analog of CMOR). TripleID: [exper_id_FlJGh4Wo6W,realiz_id_kt2pvOSbWt,run_id_1S546GMKbs]
table_id: day
activity_id: ScenarioMIP
branch_method: standard
branch_time_in_child: 60225.0
branch_time_in_parent: 60225.0
comment: <null ref>
Conventions: CF-1.7 CMIP-6.0 UGRID-1.0
creation_date: 2019-06-19T01:17:16Z
data_specs_version: 01.00.27
experiment: update of RCP8.5 based on SSP5
experiment_id: ssp585
forcing_index: 1
frequency: day
grid: atmos data regridded from Cubed-sphere (c96) to 180,288; interpolation method: conserve_order2
grid_label: gr1
initialization_index: 1
institution: National Oceanic and Atmospheric Administration, Geophysical Fluid Dynamics Laboratory, Princeton, NJ 08540, USA
institution_id: NOAA-GFDL
license: CMIP6 model data produced by NOAA-GFDL is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License ( Consult for terms of use governing CMIP6 output, including citation requirements and proper acknowledgment. Further information about this data, including some limitations, can be found via the further_info_url (recorded as a global attribute in this file). The data producers and data providers make no warranty, either express or implied, including, but not limited to, warranties of merchantability and fitness for a particular purpose. All liabilities arising from the supply of the information (including any liability arising in negligence) are excluded to the fullest extent permitted by law.
mip_era: CMIP6
nominal_resolution: 100 km
parent_activity_id: CMIP
parent_experiment_id: historical
parent_mip_era: CMIP6
parent_source_id: GFDL-ESM4
parent_time_units: days since 1850-1-1
parent_variant_label: r1i1p1f1
physics_index: 1
product: model-output
realization_index: 1
realm: atmos
source: GFDL-ESM4 (2018):
atmos: GFDL-AM4.1 (Cubed-sphere (c96) - 1 degree nominal horizontal resolution; 360 x 180 longitude/latitude; 49 levels; top level 1 Pa)
ocean: GFDL-OM4p5 (GFDL-MOM6, tripolar - nominal 0.5 deg; 720 x 576 longitude/latitude; 75 levels; top grid cell 0-2 m)
seaIce: GFDL-SIM4p5 (GFDL-SIS2.0, tripolar - nominal 0.5 deg; 720 x 576 longitude/latitude; 5 layers; 5 thickness categories)
land: GFDL-LM4.1
aerosol: interactive
atmosChem: GFDL-ATMCHEM4.1 (full atmospheric chemistry)
ocnBgchem: GFDL-COBALTv2
landIce: GFDL-LM4.1
(GFDL ID: 2019_0301)
source_id: GFDL-ESM4
source_type: AOGCM AER CHEM BGC
sub_experiment: none
sub_experiment_id: none
title: NOAA GFDL GFDL-ESM4 model output prepared for CMIP6 update of RCP8.5 based on SSP5
tracking_id: hdl:21.14100/67242666-b902-45dd-96fa-64875db6a5bb
variable_id: tasmax
variant_info: N/A
references: see further_info_url attribute
variant_label: r1i1p1f1
dimensions(sizes): bnds(2), lat(39), time(31390), lon(32)
variables(dimensions): float64 bnds(bnds), float64 height(), float64 lat(lat), float64 lat_bnds(time,lat,bnds), float64 lon(lon), float64 lon_bnds(time,lon,bnds), float32 tasmax(time,lat,lon), int64 time(time), float64 time_bnds(time,bnds)

And here is the ncinfo output of the variable with inconsistent definition

m330625@desk7802:~/geodb/cmip6$ ncinfo -v lat_bnds
<class 'netCDF4._netCDF4.Variable'>
float64 lat_bnds(time, lat, bnds)
_FillValue: nan
long_name: latitude bounds
coordinates: height
unlimited dimensions:
current shape = (31390, 39, 2)
filling on
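The listing above shows lat_bnds with shape (time, lat, bnds), while the CF conventions expect the bounds of a 1-D coordinate to have shape (coordinate, bnds). Below is a minimal sketch of that check in plain Python, using the dimension tuples copied from the ncinfo output. The function name unexpected_bounds is hypothetical, and the script does not read the NetCDF file itself.

```python
# Dimension tuples copied from the ncinfo listing above.
variables = {
    "lat": ("lat",),
    "lat_bnds": ("time", "lat", "bnds"),
    "lon": ("lon",),
    "lon_bnds": ("time", "lon", "bnds"),
    "tasmax": ("time", "lat", "lon"),
    "time": ("time",),
    "time_bnds": ("time", "bnds"),
}


def unexpected_bounds(variables, suffix="_bnds", vertex_dim="bnds"):
    """Return bounds variables whose dims differ from (coordinate dims, vertex_dim)."""
    bad = {}
    for name, dims in variables.items():
        if not name.endswith(suffix):
            continue
        coord = name[: -len(suffix)]
        # Expected layout: the coordinate's own dims followed by the vertex dim.
        expected = variables.get(coord, (coord,)) + (vertex_dim,)
        if dims != expected:
            bad[name] = {"found": dims, "expected": expected}
    return bad


flagged = unexpected_bounds(variables)
for name, info in flagged.items():
    print(name, "found", info["found"], "expected", info["expected"])
```

With these dimensions only lat_bnds and lon_bnds are flagged. time_bnds already has the expected (time, bnds) layout, so this check does not explain the CDO warning for it.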

I found out that I can clean the file with the nccopy command by extracting just the lat, lon, time and tasmax variables to a new file.

nccopy -v lat,lon,time,tasmax

But I'm not sure what I lose when I throw away the lat_bnds, lon_bnds and time_bnds variables.
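An alternative to discarding the bounds would be to keep a single time slice of them. This only works if the extra time axis just repeats the same values at every step, which is an assumption and not something confirmed in this thread. Here is a sketch in plain Python with nested lists standing in for the lat_bnds array. The function name collapse_repeated_bounds and the sample values are hypothetical.

```python
def collapse_repeated_bounds(bounds_by_time):
    """Reduce bounds shaped [time][cell][2] to [cell][2] if every time slice is identical."""
    if not bounds_by_time:
        raise ValueError("no time slices to collapse")
    first = bounds_by_time[0]
    for t, current in enumerate(bounds_by_time):
        if current != first:
            # The assumption does not hold, so refuse to drop the time axis.
            raise ValueError(f"bounds differ at time index {t}")
    return first


# Hypothetical example: three time steps that all carry the same two latitude cells.
one_slice = [[-34.5, -33.5], [-33.5, -32.5]]
lat_bnds_with_time = [[list(pair) for pair in one_slice] for _ in range(3)]

lat_bnds = collapse_repeated_bounds(lat_bnds_with_time)
print(lat_bnds)
```

If the slices differ, the function raises instead of silently picking one, so the original file should be inspected before going further.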

Glad to find this thread. I thought someone else must have wondered why the time dimension is in lat_bnds and lon_bnds, and how to correct it. I took a similar approach to remove and replace the bounds so cdo functions work:

ncks -C -O -x -v lat_bnds,lon_bnds,time_bnds

ncap2 -O -s 'defdim("bnds",2); lon_bnds=make_bounds(lon,$bnds); lat_bnds=make_bounds(lat,$bnds); time_bnds=make_bounds(time,$bnds)'
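The ncap2 line above rebuilds the bounds from the coordinate centres. The general idea can be sketched in plain Python as follows. This is not NCO's implementation of make_bounds, the function name bounds_from_centers is hypothetical, and it assumes a 1-D coordinate with at least two cells.

```python
def bounds_from_centers(centers):
    """Infer (lower, upper) cell bounds from 1-D cell centres."""
    n = len(centers)
    if n < 2:
        raise ValueError("need at least two cell centres to infer spacing")
    # Interior edges sit halfway between neighbouring centres.
    edges = [(centers[i] + centers[i + 1]) / 2 for i in range(n - 1)]
    # Outer edges are extrapolated by mirroring the nearest interior edge.
    first = centers[0] - (edges[0] - centers[0])
    last = centers[-1] + (centers[-1] - edges[-1])
    edges = [first] + edges + [last]
    return [(edges[i], edges[i + 1]) for i in range(n)]


# Hypothetical latitude centres on a 1-degree grid.
lat = [-33.5, -32.5, -31.5]
lat_bnds = bounds_from_centers(lat)
print(lat_bnds)
```

Bounds rebuilt this way are only as good as the midpoint assumption. For the time axis in particular, the regenerated intervals may not match the averaging periods recorded in the original time_bnds.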

My attempt to use nccopy did not work. Something was left behind. I wish I knew how to use ncks and ncap2 so I could fix the file more easily. I ended up writing an R script that rewrites the NetCDF file without the bnds variables.

If anyone is interested, here goes the script:

I deleted the dimension "bnds" using the following code:

$ ncwa -a bnds

And CDO worked perfectly fine afterwards, since the file then has regular time,lev,lat,lon dimensions.

However, "nccopy" did not work for me.