Has the atmospheric data in ERA5 hourly data on pressure levels changed between the old CDS, which served netCDF data, and the new CDS after the migration?
I have two files for identical regions and variables: one downloaded in 2023 and one downloaded this week.
To make the file formats match, I ran this script to convert the new ERA5 pressure-level file to the old format:
import xarray as xr

# Open the slow (newly downloaded) file
ds = xr.open_dataset('ERA5-2022-WH.nc')

# Drop the 'expver' and 'number' coordinates, which the old-format file does not have
if 'expver' in ds.coords:
    print("Expver and number present, removing")
    ds = ds.drop_vars('expver')
    ds = ds.drop_vars('number')

# Rename to the old coordinate names and reverse the order of the pressure levels
ds = ds.rename({'valid_time': 'time', 'pressure_level': 'level'})
ds = ds.reindex(level=ds.level[::-1])

# Save to a new file
ds.to_netcdf('/mnt/d/FORECASTS/optimized_ERA5-2022-WH.nc')

# Finally, run this from the command line. It converts from netCDF4 to netCDF3,
# which is faster to process for some reason:
# ncks -6 optimized_ERA5-2022-WH.nc optimized_ERA5-2022-WH.nc
Now the two datasets look like this (the 2023 download first, then the converted new download):
Dimensions: (longitude: 301, latitude: 201, level: 7, time: 730)
Coordinates:
* longitude (longitude) float32 -125.0 -124.8 -124.5 ... -50.5 -50.25 -50.0
* latitude (latitude) float32 0.0 0.25 0.5 0.75 ... 49.25 49.5 49.75 50.0
* level (level) int32 20 30 50 70 100 125 150
* time (time) datetime64[ns] 2022-01-01 ... 2022-12-31T12:00:00
Data variables:
z (time, level, latitude, longitude) float32 ...
t (time, level, latitude, longitude) float32 ...
u (time, level, latitude, longitude) float32 ...
v (time, level, latitude, longitude) float32 ...
Attributes:
Conventions: CF-1.6
history: 2023-12-14 19:03:35 GMT by grib_to_netcdf-2.25.1: /opt/ecmw...
<xarray.Dataset>
Dimensions: (latitude: 201, level: 7, longitude: 301, time: 730)
Coordinates:
* latitude (latitude) float64 0.0 0.25 0.5 0.75 ... 49.25 49.5 49.75 50.0
* level (level) float64 20.0 30.0 50.0 70.0 100.0 125.0 150.0
* longitude (longitude) float64 -125.0 -124.8 -124.5 ... -50.5 -50.25 -50.0
* time (time) datetime64[ns] 2022-01-01 ... 2022-12-31T12:00:00
Data variables:
u (time, level, latitude, longitude) float32 ...
v (time, level, latitude, longitude) float32 ...
z (time, level, latitude, longitude) float32 ...
Attributes:
GRIB_centre: ecmf
GRIB_centreDescription: European Centre for Medium-Range Weather Forecasts
GRIB_subCentre: 0
Conventions: CF-1.7
institution: European Centre for Medium-Range Weather Forecasts
history: Wed Jan 29 16:19:33 2025: ncks -6 /mnt/d/FORECAS...
NCO: netCDF Operators version 5.0.6 (Homepage = http:...
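The two printouts show different coordinate dtypes (float32 and int32 in the 2023 file, float64 in the converted file). A minimal sketch of a check that this alone does not break the coordinate matching, using synthetic arrays that mimic the printed longitude and level values. The helper name `coords_identical` is hypothetical, and the arrays are stand-ins, not the real file contents:

```python
import numpy as np

def coords_identical(a, b):
    """True if two numeric 1-D coordinate arrays hold the same values in the same order."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return a.shape == b.shape and bool(np.array_equal(a, b))

# Synthetic stand-ins for the printed coordinates (assumed 0.25 degree grid)
lon_old = np.linspace(-125.0, -50.0, 301).astype(np.float32)  # float32, as in the 2023 file
lon_new = np.linspace(-125.0, -50.0, 301)                     # float64, as in the converted file
lev_old = np.array([20, 30, 50, 70, 100, 125, 150], dtype=np.int32)
lev_new = lev_old.astype(np.float64)

lon_ok = coords_identical(lon_old, lon_new)
lev_ok = coords_identical(lev_old, lev_new)
n_common_lon = np.intersect1d(lon_old, lon_new).size

print(lon_ok, lev_ok, n_common_lon)
```

With the real files, the same helper could be called on the numeric coordinates, for example `coords_identical(ds1.longitude.values, ds2.longitude.values)`. Multiples of 0.25 are exactly representable in float32, so a dtype difference on such a grid should not by itself drop any points from the overlap.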
But if I run this script, none of z, u, or v match between the two datasets:
import numpy as np

# ds1 and ds2 are the two datasets printed above
# Step 1: Find the overlapping coordinate ranges for longitude, latitude, level, time
overlap_lon = np.intersect1d(ds1.longitude.values, ds2.longitude.values)
overlap_lat = np.intersect1d(ds1.latitude.values, ds2.latitude.values)
overlap_level = np.intersect1d(ds1.level.values, ds2.level.values)
overlap_time = np.intersect1d(ds1.time.values, ds2.time.values)
print(overlap_lon)
print(overlap_lat)
print(overlap_time)
print(overlap_level)
# Step 2: Subset both datasets to the overlapping region
ds1_overlap = ds1.sel(longitude=overlap_lon, latitude=overlap_lat, level=overlap_level, time=overlap_time)
ds2_overlap = ds2.sel(longitude=overlap_lon, latitude=overlap_lat, level=overlap_level, time=overlap_time)
print(ds1_overlap)
print(ds2_overlap)
# Step 3: Compare variables (e.g., "z", "u", "v", etc.)
# For example, comparing variable 'z' from both datasets
z_equal = np.allclose(ds1_overlap.z.values, ds2_overlap.z.values, atol=1e-5)
print(f"Are the 'z' variable values the same in the overlapping region? {z_equal}")
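A single True/False from `np.allclose` does not say how far apart the values are. A minimal sketch of a more informative comparison follows, run here on synthetic data. The helper name `summarize_difference` and the `scale` value are hypothetical, and the idea that one file may hold values rounded to a fixed packing step is an assumption, not something confirmed for these files:

```python
import numpy as np

def summarize_difference(a, b):
    """Return basic statistics of the element-wise difference a - b."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    if a.shape != b.shape:
        raise ValueError(f"shape mismatch: {a.shape} vs {b.shape}")
    diff = a - b
    return {
        "max_abs": float(np.nanmax(np.abs(diff))),      # largest absolute difference
        "mean": float(np.nanmean(diff)),                # systematic offset, if any
        "rms": float(np.sqrt(np.nanmean(diff ** 2))),   # typical size of the difference
        "nan_mismatch": int(np.sum(np.isnan(a) != np.isnan(b))),  # NaN in one array only
    }

# Synthetic stand-in: one array, and a copy rounded to a hypothetical packing step
rng = np.random.default_rng(0)
truth = rng.normal(250.0, 10.0, size=(4, 3, 5, 6))  # (time, level, latitude, longitude)
scale = 0.002                                       # hypothetical packing step
packed_like = np.round(truth / scale) * scale

stats = summarize_difference(truth, packed_like)
print(stats)
```

With the real data this could be called as `summarize_difference(ds1_overlap.z.values, ds2_overlap.z.values)`, and likewise for u and v. If `ds1.z.encoding` lists a `scale_factor`, differences of up to roughly half that value would be expected from packing alone, which is much larger than `atol=1e-5`.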