Retrieve daily ERA5/ERA5-Land data using the CDS API

Hi Niclas,

I suspect this is mainly due to the data being re-interpolated. CDS ERA5 atmospheric data are stored on a 0.25 degree grid, so when you request them at a lower resolution the data have to be re-interpolated to the 1.0 degree grid, whereas the 0.25 degree data are delivered 'direct' from the CDS data store.

Thanks,

Kevin

I have a problem with getting the reanalysis daily mean of Surface solar radiation downwards (1-hourly, 0.25/0.25). For January 2021 I get a variable called rsds in W m-2, whilst for the rest of the year I get a variable called rsds_accumulated in J m-2. From February 2021 the variable is "integral_wrt_time_of_surface_downwelling_shortwave_flux_in_air", which is the sum of the values over a day. Why does the January data differ? I checked other years and found the same problem.

Any feedback would be appreciated.

Thank you,

Cristian

Hi Cristian Gudasz,

Apologies for this. We recently changed how accumulated variables are handled in the toolbox and applications. Previously they were converted to rate values (rsds), but this proved problematic because the conversions were not consistent across datasets and resulted in data differing from what was documented on the catalogue entry page. We have removed all the conversions, so you now get the data as described on the catalogue entry, which in this case is the accumulated radiation (rsds_accumulated) during the model time step.

The problem you experienced here was that you were receiving an old, cached, result from before we made this change for the January request. I have removed this result (and others like it) from our cache such that when you submit the request again you will get the accumulated data. Note that they are equivalent, just different ways of representing the same quantity, and I apologise again for any inconvenience/confusion caused.

Also, if you want to continue using the rate type data (which I personally prefer), you should request the variable:

'Mean surface downward short-wave radiation flux'

All the variable descriptions can be found here, and the daily stats app now serves the data as described in this table:

https://cds.climate.copernicus.eu/cdsapp#!/dataset/reanalysis-era5-single-levels?tab=overview

Thanks,

Eddy

Hi Eddy,

Thank you for the clarification! That really helps, but after downloading the January file again the problem is still there.

In the daily widget you can choose to download the 1-hourly daily mean of the Mean surface downward short-wave radiation flux as well as the 1-hourly daily mean of the surface downward short-wave radiation flux. Since a mean here signifies an accumulated quantity over a period (in this case a day), both produce the same results. Did I get that right?

Looking at the same year and month in the Land dataset on the 0.1/0.1 grid and downloading the daily mean for the same rsds parameter, the variable is rsds_accumulated in J m-2. This would be the sum over the whole day. Dividing by 86400 to get W m-2 should give about the same values as in the example above, but it does not: they are about 50% lower. Any thoughts why this is?

Thank you again!
Cristian

For the first question regarding the ERA5-single-levels data:

The "Mean surface downward short-wave radiation flux" is in units of W m-2 (= J s-1 m-2) therefore this is a rate variable, so the daily mean of this will be the mean rate for the day.

The "Surface solar radiation downwards" is in units of J m-2 therefore this is an accumulated variable. It is the accumulation during the model time-step, therefore the daily mean will be the mean hourly accumulation of radiation... which could also be considered a rate variable with units J hour-1 m-2. This potential for confusion is why I prefer the mean rate variables, these accumulated variables are very useful in some instances (e.g. estimating energy transfer), but for a general understanding of the state of the climate they are a bit confusing.
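As a rough illustration of the units involved, the sketch below converts a daily mean of hourly accumulations (J m-2) into a mean flux (W m-2) and into a daily total (J m-2). The function names and the sample values are hypothetical and only serve to show the arithmetic; it assumes 24 hourly accumulation steps per day, as described above for ERA5 single levels.

```python
SECONDS_PER_HOUR = 3600
HOURS_PER_DAY = 24


def hourly_accumulation_to_flux(joules_per_m2):
    """Convert an accumulation over one hour (J m-2) to a mean flux (W m-2)."""
    return joules_per_m2 / SECONDS_PER_HOUR


def daily_total_from_hourly_mean(mean_hourly_accumulation):
    """Daily total (J m-2) from the daily mean of 24 hourly accumulations."""
    return mean_hourly_accumulation * HOURS_PER_DAY


# Hypothetical hourly accumulations for one day in J m-2 (illustrative only):
# 6 dark hours, 12 sunlit hours, 6 dark hours.
hourly_ssrd = [0.0] * 6 + [720_000.0] * 12 + [0.0] * 6

# Daily mean of the hourly accumulations, in J m-2 per hour.
daily_mean_accumulation = sum(hourly_ssrd) / len(hourly_ssrd)

# The same quantity expressed as a mean rate (W m-2) and as a daily total (J m-2).
daily_mean_flux = hourly_accumulation_to_flux(daily_mean_accumulation)
daily_total = daily_total_from_hourly_mean(daily_mean_accumulation)

print(daily_mean_accumulation, daily_mean_flux, daily_total)
```

Dividing the daily total by 86400 gives the same mean flux as dividing the mean hourly accumulation by 3600, which is why the two representations are equivalent.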

For the second question regarding the ERA5-Land data: this is a bug in the application, which does not correctly handle the daily accumulated variables that ERA5-Land produces. As a temporary solution I have disabled the variables which are not being handled correctly. I have informed the application developers, and correct handling of the accumulated variables will be made available soon. Note that this only affects the ERA5-Land data; the ERA5 single levels and ERA5 pressure levels datasets only accumulate over the model time step, so the daily aggregation is correct for them.

Thanks,

Eddy

I'm having trouble trying to get Daily Maximum 2m temperature data for February 1958, with both ERA5 and ERA5-Land at any geographical coordinates (using the daily application). It won't download; it remains queued permanently. I didn't have this problem getting Daily Minimum data for February 1958, or any variable for any other time period.

So I tried getting Daily Mean 2m temperature data from the 'ERA5-Land hourly data from 1950 to present' dataset, and while it did download, data for February 27th, 1958 is missing, shown as 'NaN'.

Is it an error that Daily Maximum/Mean temperature data from this particular day is missing on both the application and the dataset?


Thanks



Hi Fred,

I can see data for 27th February 1958 everywhere except the sea, as expected in ERA5-Land.

Thanks

Michela

Hi all,

I am trying to download ERA5-Land data (mean, max, min daily t2m) using the code above.
I get an error when I change the time_zone parameter to UTC+01:00 instead of UTC+00:00 as default.

I started to download data from January 2001 onward, and when my script reaches October 2002 I get the following error:


2022-03-17 09:33:10,559 INFO Sending request to https://cds.climate.copernicus.eu/api/v2/tasks/services/tool/toolbox/orchestrator/workflow/clientid-1cc7f1fe6b8e46bc8d663ba4f83ac462
2022-03-17 09:33:11,243 INFO Request is queued
2022-03-17 09:33:12,432 INFO Request is running
2022-03-17 09:33:14,126 INFO Request is failed
2022-03-17 09:33:14,128 ERROR Message: 
2022-03-17 09:33:14,130 ERROR Reason:  Traceback (most recent call last):
  File "/opt/cdstoolbox/cdscompute/cdscompute/cdshandlers/services/handler.py", line 59, in handle_request
    result = cached(context.method, proc, context, context.args, context.kwargs)
  File "/opt/cdstoolbox/cdscompute/cdscompute/caching.py", line 108, in cached
    result = proc(context, *context.args, **context.kwargs)
  File "/opt/cdstoolbox/cdscompute/cdscompute/services.py", line 124, in __call__
    return p(*args, **kwargs)
  File "/opt/cdstoolbox/cdscompute/cdscompute/services.py", line 60, in __call__
    return self.proc(context, *args, **kwargs)
  File "/home/cds/cdsservices/services/python_service.py", line 38, in execute
    raise exceptions.InternalError(logging + traceback, '')
cdsclient.exceptions.InternalError: Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/xarray/core/concat.py", line 460, in _dataset_concat
    vars = ensure_common_dims([ds.variables[k] for ds in datasets])
  File "/usr/local/lib/python3.6/site-packages/xarray/core/concat.py", line 460, in <listcomp>
    vars = ensure_common_dims([ds.variables[k] for ds in datasets])
  File "/usr/local/lib/python3.6/site-packages/xarray/core/utils.py", line 426, in __getitem__
    return self.mapping[key]
KeyError: 'experimentVersionNumber'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/cdstoolbox/jsonrequest/jsonrequest/requests.py", line 71, in jsonrequestcall
    resp = coding.encode(req.callable(*req.args, **req.kwargs), register=encoders, **context)
  File "/opt/cdstoolbox/cdstools/cdstools/util.py", line 854, in concat
    return xr.concat(data, dim, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/xarray/core/concat.py", line 192, in concat
    objs, dim, data_vars, coords, compat, positions, fill_value, join, combine_attrs
  File "/usr/local/lib/python3.6/site-packages/xarray/core/concat.py", line 527, in _dataarray_concat
    combine_attrs="drop",
  File "/usr/local/lib/python3.6/site-packages/xarray/core/concat.py", line 462, in _dataset_concat
    raise ValueError("%r is not present in all datasets." % k)
ValueError: 'experimentVersionNumber' is not present in all datasets.
2022-03-17 09:33:14,132 ERROR   Traceback (most recent call last):
2022-03-17 09:33:14,134 ERROR     File "/opt/cdstoolbox/cdscompute/cdscompute/cdshandlers/services/handler.py", line 59, in handle_request
2022-03-17 09:33:14,136 ERROR       result = cached(context.method, proc, context, context.args, context.kwargs)
2022-03-17 09:33:14,137 ERROR     File "/opt/cdstoolbox/cdscompute/cdscompute/caching.py", line 108, in cached
2022-03-17 09:33:14,138 ERROR       result = proc(context, *context.args, **context.kwargs)
2022-03-17 09:33:14,139 ERROR     File "/opt/cdstoolbox/cdscompute/cdscompute/services.py", line 124, in __call__
2022-03-17 09:33:14,140 ERROR       return p(*args, **kwargs)
2022-03-17 09:33:14,140 ERROR     File "/opt/cdstoolbox/cdscompute/cdscompute/services.py", line 60, in __call__
2022-03-17 09:33:14,141 ERROR       return self.proc(context, *args, **kwargs)
2022-03-17 09:33:14,142 ERROR     File "/home/cds/cdsservices/services/workflow.py", line 35, in execute
2022-03-17 09:33:14,143 ERROR       raise exceptions.CDSException(True, True, logging + traceback, '', uri)
2022-03-17 09:33:14,143 ERROR   cdsclient.exceptions.CDSException: Traceback (most recent call last):
2022-03-17 09:33:14,144 ERROR     File "/opt/cdstoolbox/cdscompute/cdscompute/cdshandlers/services/handler.py", line 59, in handle_request
2022-03-17 09:33:14,145 ERROR       result = cached(context.method, proc, context, context.args, context.kwargs)
2022-03-17 09:33:14,145 ERROR     File "/opt/cdstoolbox/cdscompute/cdscompute/caching.py", line 108, in cached
2022-03-17 09:33:14,146 ERROR       result = proc(context, *context.args, **context.kwargs)
2022-03-17 09:33:14,147 ERROR     File "/opt/cdstoolbox/cdscompute/cdscompute/services.py", line 124, in __call__
2022-03-17 09:33:14,147 ERROR       return p(*args, **kwargs)
2022-03-17 09:33:14,148 ERROR     File "/opt/cdstoolbox/cdscompute/cdscompute/services.py", line 60, in __call__
2022-03-17 09:33:14,149 ERROR       return self.proc(context, *args, **kwargs)
2022-03-17 09:33:14,150 ERROR     File "/home/cds/cdsservices/services/python_service.py", line 38, in execute
2022-03-17 09:33:14,150 ERROR       raise exceptions.InternalError(logging + traceback, '')
2022-03-17 09:33:14,151 ERROR   cdsclient.exceptions.InternalError: Traceback (most recent call last):
2022-03-17 09:33:14,152 ERROR     File "/usr/local/lib/python3.6/site-packages/xarray/core/concat.py", line 460, in _dataset_concat
2022-03-17 09:33:14,152 ERROR       vars = ensure_common_dims([ds.variables[k] for ds in datasets])
2022-03-17 09:33:14,160 ERROR     File "/usr/local/lib/python3.6/site-packages/xarray/core/concat.py", line 460, in <listcomp>
2022-03-17 09:33:14,161 ERROR       vars = ensure_common_dims([ds.variables[k] for ds in datasets])
2022-03-17 09:33:14,161 ERROR     File "/usr/local/lib/python3.6/site-packages/xarray/core/utils.py", line 426, in __getitem__
2022-03-17 09:33:14,162 ERROR       return self.mapping[key]
2022-03-17 09:33:14,163 ERROR   KeyError: 'experimentVersionNumber'

Traceback (most recent call last):

  File "<ipython-input-5-e88eba735741>", line 1, in <module>
    runfile('/path/to/my/script.py', wdir='/path/to/my/wdir/')

  File "/home/guido/anaconda3/lib/python3.7/site-packages/spyder_kernels/customize/spydercustomize.py", line 827, in runfile
    execfile(filename, namespace)

  File "/home/guido/anaconda3/lib/python3.7/site-packages/spyder_kernels/customize/spydercustomize.py", line 110, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)

  File "/path/to/my/script.py", line 60, in <module>
    "workflow_name": "application"

  File "/home/guido/anaconda3/lib/python3.7/site-packages/cdsapi/api.py", line 319, in service
    result = self._api('%s/tasks/services/%s/clientid-%s' % (self.url, name, uuid.uuid4().hex), request, 'PUT')

  File "/home/guido/anaconda3/lib/python3.7/site-packages/cdsapi/api.py", line 420, in _api
    raise Exception("%s. %s." % (reply['error'].get('message'), reply['error'].get('reason')))

Exception: . Traceback (most recent call last):
  File "/opt/cdstoolbox/cdscompute/cdscompute/cdshandlers/services/handler.py", line 59, in handle_request
    result = cached(context.method, proc, context, context.args, context.kwargs)
  File "/opt/cdstoolbox/cdscompute/cdscompute/caching.py", line 108, in cached
    result = proc(context, *context.args, **context.kwargs)
  File "/opt/cdstoolbox/cdscompute/cdscompute/services.py", line 124, in __call__
    return p(*args, **kwargs)
  File "/opt/cdstoolbox/cdscompute/cdscompute/services.py", line 60, in __call__
    return self.proc(context, *args, **kwargs)
  File "/home/cds/cdsservices/services/python_service.py", line 38, in execute
    raise exceptions.InternalError(logging + traceback, '')
cdsclient.exceptions.InternalError: Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/xarray/core/concat.py", line 460, in _dataset_concat
    vars = ensure_common_dims([ds.variables[k] for ds in datasets])
  File "/usr/local/lib/python3.6/site-packages/xarray/core/concat.py", line 460, in <listcomp>
    vars = ensure_common_dims([ds.variables[k] for ds in datasets])
  File "/usr/local/lib/python3.6/site-packages/xarray/core/utils.py", line 426, in __getitem__
    return self.mapping[key]
KeyError: 'experimentVersionNumber'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/cdstoolbox/jsonrequest/jsonrequest/requests.py", line 71, in jsonrequestcall
    resp = coding.encode(req.callable(*req.args, **req.kwargs), register=encoders, **context)
  File "/opt/cdstoolbox/cdstools/cdstools/util.py", line 854, in concat
    return xr.concat(data, dim, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/xarray/core/concat.py", line 192, in concat
    objs, dim, data_vars, coords, compat, positions, fill_value, join, combine_attrs
  File "/usr/local/lib/python3.6/site-packages/xarray/core/concat.py", line 527, in _dataarray_concat
    combine_attrs="drop",
  File "/usr/local/lib/python3.6/site-packages/xarray/core/concat.py", line 462, in _dataset_concat
    raise ValueError("%r is not present in all datasets." % k)
ValueError: 'experimentVersionNumber' is not present in all datasets..


Thank you for your support


Hi Guido,

It looks like this may have been an issue with the internal representation of these data in the CDS, but hopefully it is resolved now. Are you still seeing this error message?

Kevin

Dear Kevin (or others),

I have been using this very useful script to retrieve daily ERA5 data at 0.5 degree resolution, but I have a few questions.

Whatever I fill in for the extent (either "area": {"lat": [-89.75, 89.75], "lon": [-179.75, 179.75]} or "area": {"lat": [-90, 90], "lon": [-180, 180]}), I only manage to get grid cells with coordinates ending in .0 and .5. Is there a way to get lon and lat coordinates ending in .25 and .75 (representing the cell midpoints, and resulting in a grid with exactly 720 lon and 360 lat values)?

Do I understand correctly that to get daily precipitation values, I need to multiply the final result by 24?

Does the script include the time shift of -1 hour?

Thanks! Hester

This script is extremely useful and fills a serious gap in acquiring daily-average ERA5 programmatically! I have one issue though. When I specify an area that isn't global, the region that was cropped out is simply filled with NaNs, which take up a lot of memory for nothing if you're cropping to a small region.

I see that L712-718 of the app-c3s-daily-era5-statistics code implement a ct.cube.select call to crop to the desired region, but I'm not sure whether that is being called or what is happening: https://cds.climate.copernicus.eu/cdsapp#!/software/app-c3s-daily-era5-statistics?tab=appcode

I'll implement a manual workaround to crop out the NaNs, but keen to hear if this can be avoided.
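A manual crop of this kind could look roughly like the sketch below. It assumes the downloaded field has already been loaded as a 2-D NumPy array with matching 1-D latitude and longitude arrays; the function name crop_nan_border and the sample values are hypothetical and not part of the application.

```python
import numpy as np


def crop_nan_border(field, lat, lon):
    """Trim the field to the bounding box of its non-NaN values.

    field is a 2-D (lat, lon) array; lat and lon are the matching 1-D axes.
    NaNs inside the bounding box (e.g. sea points) are kept.
    """
    valid = ~np.isnan(field)
    rows = np.where(valid.any(axis=1))[0]  # latitude rows with any data
    cols = np.where(valid.any(axis=0))[0]  # longitude columns with any data
    if rows.size == 0:
        raise ValueError("field contains only NaN values")
    r0, r1 = rows[0], rows[-1] + 1
    c0, c1 = cols[0], cols[-1] + 1
    return field[r0:r1, c0:c1], lat[r0:r1], lon[c0:c1]


# Hypothetical 5 x 6 grid with data only in a 2 x 3 window (illustrative only).
lat = np.array([20.0, 10.0, 0.0, -10.0, -20.0])
lon = np.array([0.0, 10.0, 20.0, 30.0, 40.0, 50.0])
field = np.full((5, 6), np.nan)
field[1:3, 1:4] = 280.0

cropped, cropped_lat, cropped_lon = crop_nan_border(field, lat, lon)
print(cropped.shape, cropped_lat, cropped_lon)
```

Cropping to a bounding box rather than dropping every NaN row keeps the grid regular, so the cropped latitude and longitude axes still describe the data.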

Hi Tom, I tried a request for

"area": {"lat": [0, 10], "lon": [10, 20]}

and it worked as expected: the returned files covered just the selected area.

Can you give an example of a request which gives NaNs, please?

Kevin

Hi Kevin, thank you for your swift response. Given that it worked for you, I decided to test this again this morning. I'm now observing the correct behaviour that you've reported here. I can't think of anything that's changed with my code, so this might just remain a mystery!

Cheers

Glad it is working for you, Tom!

Kevin

Hi Kevin, I'm finding that pressure level requests will queue for hours and then fail with the error `cdsworkflows.error.ClientError: None`. I attach an MWE script (requesting daily average geopotential height at 250 hPa) below that should trigger the error. My hypothesis is that the tool will attempt to compute the daily average over all pressure levels before slicing out the pressure level requested, but I'm not sure. Am I missing anything or doing anything wrong?

import cdsapi
import requests
import time
import numpy as np

def download_era5_daily_avg_month(download_fpath):
    # Generous timeout and retry settings, since requests can queue for a long time.
    c = cdsapi.Client(timeout=600, retry_max=1000, quiet=False, debug=True)

    print("\n\n\nDownloading reanalysis data...\n\n")
    tic = time.time()

    # Run the daily statistics application as a Toolbox workflow.
    result = c.service(
        "tool.toolbox.orchestrator.workflow",
        params={
            "realm": "user-apps",
            "project": "app-c3s-daily-era5-statistics",
            "version": "master",
            "kwargs": {
                "dataset": 'reanalysis-era5-pressure-levels',
                "product_type": "reanalysis",
                "variable": "geopotential",
                "statistic": "daily_mean",
                "year": 1959,
                "month": '01',
                "time_zone": "UTC+00:0",
                "frequency": "1-hourly",
                "grid": "0.25/0.25",
                "pressure_level": "250",
            },
            "workflow_name": "application"
        })

    # The service returns a list of results; fetch the first file by its URL.
    location = result[0]['location']
    res = requests.get(location, stream=True)
    print("Writing data to " + download_fpath)
    # The with-block closes the file, so no explicit close() is needed.
    with open(download_fpath, 'wb') as fh:
        for r in res.iter_content(chunk_size=1024):
            fh.write(r)

    dur = time.time() - tic
    print(f"\n\nFile downloaded to {download_fpath}\nDone in {np.floor(dur / 60)}m:{dur % 60:.0f}s.\n\n")

download_era5_daily_avg_month('foo.nc')

Hi Tom,

Sorry for not getting back to you sooner. It looks like there was an issue with the daily app for pre-1979 ERA5 pressure level data, but this should be fixed now (I did some tests this morning and it worked OK!).

Thanks,

Kevin

Seems to be working fine now, thanks very much Kevin et al!

Hi Kevin,

I am using "area": {"lat": [-90, 90], "lon": [-180, 180]} and "grid": "0.25/0.25" as parameters for downloading high cloud cover values. The latitude in the output file ranges from -90 to +90, whereas the longitude ranges from -180 to 179.75 only.

What might be the reason why +180 longitude is not included in the data? Am I missing something here?

Hi Rabin,

The point at -180 degrees is the same location as +180 degrees, hence only one of these is needed, i.e. -180 to +179.75 gives a complete global grid at 0.25 degree resolution.
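The point count can be checked with a short sketch. The helper regular_longitudes below is hypothetical and simply enumerates the longitudes of a regular global grid that starts at -180 and lists the antimeridian once.

```python
def regular_longitudes(step_deg, start_deg=-180.0):
    """Longitudes of a global regular grid, listing the antimeridian once."""
    n_points = round(360.0 / step_deg)  # full circle divided by the grid step
    return [start_deg + i * step_deg for i in range(n_points)]


lons = regular_longitudes(0.25)

# First and last longitude, and the number of grid columns.
print(len(lons), lons[0], lons[-1])

# One more step past the last point wraps around to the first point.
wrapped = (lons[-1] + 0.25) - 360.0
print(wrapped == lons[0])
```

Adding +180 as well would duplicate the -180 column, giving 1441 columns for the same 1440 distinct locations.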

Thanks,

Kevin

Hi all. I used the script (thanks to Kevin Marsh) and made only one modification:

"realm": "c3s"

It works when I download "2m_temperature".

But when I download "surface_net_solar_radiation", the script cannot continue, and the error is as below:

"NameError: name 'reduce' is not defined"

I only changed the name of the variable.


Environment: Anaconda 3 + Python 3.9.2


As a test, I added the line "from functools import reduce" to the script, but the problem is still there.
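For reference, reduce is not a builtin in Python 3 and has to be imported from functools wherever it is called, as in the minimal sketch below. That the local import makes no difference may mean the NameError is raised somewhere other than the local script, but that is an assumption rather than something confirmed here.

```python
from functools import reduce
import operator

# reduce folds a sequence into a single value using a two-argument function.
total = reduce(operator.add, [1, 2, 3, 4])
print(total)
```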


Similar problems have also been reported in recent days: Re: "NameError: name 'reduce' is not defined" occurred when download ERA5-Land daily statistics


Can anyone give some guidance?

ERA5_daily.py

Thanks!