Retrieve daily ERA5/ERA5-Land data using the CDS API

Hi Leo,

I think it was a temporary issue with the daily application on the CDS. The "realm" line should be

"realm": "user-apps" as in the example above.


This script works for me:


% more workflow.py

import cdsapi
import requests

# CDS API script to use the CDS service to retrieve daily ERA5* variables and iterate over
# all months in the specified years.

# Requires:
# 1) the CDS API to be installed and working on your system
# 2) you have agreed to the ERA5 Licence (via the CDS web page)
# 3) selection of the required variable, daily statistic, etc.

# Output:
# 1) a separate netCDF file for the chosen daily statistic/variable for each month

c = cdsapi.Client(timeout=300)

# Uncomment years as required
years = [
            '1979'
#           ,'1980', '1981',
#            '1982', '1983', '1984',
#            '1985', '1986', '1987',
#            '1988', '1989', '1990',
#            '1991', '1992', '1993',
#            '1994', '1995', '1996',
#            '1997', '1998', '1999',
#            '2000', '2001', '2002',
#            '2003', '2004', '2005',
#            '2006', '2007', '2008',
#            '2009', '2010', '2011',
#            '2012', '2013', '2014',
#            '2015', '2016', '2017',
#            '2018', '2019', '2020',
#            '2021'
]

# Retrieve all months for a given year.
months = ['01', '02', '03',
          '04', '05', '06',
          '07', '08', '09',
          '10', '11', '12']

# For valid keywords, see Table 2 of:
# https://datastore.copernicus-climate.eu/documents/app-c3s-daily-era5-statistics/C3S_Application-Documentation_ERA5-daily-statistics-v2.pdf

# Select your variable; the name must be a valid ERA5 CDS API name.
var = "surface_net_solar_radiation"

# Select the required statistic; valid names are given in the link above.
stat = "daily_mean"

# Loop over years and months
for yr in years:
    for mn in months:
        result = c.service(
            "tool.toolbox.orchestrator.workflow",
            params={
                "realm": "user-apps",
                "project": "app-c3s-daily-era5-statistics",
                "version": "master",
                "kwargs": {
                    "dataset": "reanalysis-era5-single-levels",
                    "product_type": "reanalysis",
                    "variable": var,
                    "statistic": stat,
                    "year": yr,
                    "month": mn,
                    "time_zone": "UTC+00:0",
                    "frequency": "1-hourly",
                    # Users can change the output grid resolution and selected area:
                    # "grid": "1.0/1.0",
                    # "area": {"lat": [10, 60], "lon": [65, 140]}
                },
                "workflow_name": "application"
            })

        # Set the name of the output file for each month (statistic, variable, year, month)
        file_name = "download_" + stat + "_" + var + "_" + yr + "_" + mn + ".nc"

        location = result[0]['location']
        res = requests.get(location, stream=True)
        print("Writing data to " + file_name)
        with open(file_name, 'wb') as fh:
            for r in res.iter_content(chunk_size=1024):
                fh.write(r)


which gives:

% python3 workflow.py
2023-03-29 14:39:23,123 INFO Welcome to the CDS
2023-03-29 14:39:23,123 INFO Sending request to https://cds.climate.copernicus.eu/api/v2/tasks/services/tool/toolbox/orchestrator/workflow/clientid-ca92e3febe0b49d18d70984583f3a282
2023-03-29 14:39:23,194 INFO Request is queued
2023-03-29 14:53:44,946 INFO Request is running
2023-03-29 14:55:45,209 INFO Request is completed
Writing data to download_daily_mean_surface_net_solar_radiation_1979_01.nc
2023-03-29 14:56:08,137 INFO Welcome to the CDS
2023-03-29 14:56:08,137 INFO Sending request to https://cds.climate.copernicus.eu/api/v2/tasks/services/tool/toolbox/orchestrator/workflow/clientid-0d0746c3abb8414894ce6abf2d263c53
2023-03-29 14:56:08,209 INFO Request is queued
2023-03-29 15:22:29,376 INFO Request is running
2023-03-29 15:24:29,630 INFO Request is completed
Writing data to download_daily_mean_surface_net_solar_radiation_1979_02.nc
..etc


(The ERA5-Land accumulated parameters have been removed from the app, as it is fairly straightforward to derive these from the dataset itself; a sketch of one way to do this is given below.)
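A minimal sketch of that derivation with xarray, assuming the usual ERA5-Land convention that accumulations run from 00 UTC and reset each day, so the value stamped 00 UTC on the following day holds the previous day's total. The file and variable names here are only placeholders, and the first/last days of the file are not handled:

import numpy as np
import xarray as xr

# Hypothetical hourly ERA5-Land file containing total precipitation ("tp").
ds = xr.open_dataset("era5_land_tp_hourly.nc")
tp = ds["tp"]

# Keep only the 00 UTC steps; each one holds the accumulation over the previous day.
tp_00utc = tp.isel(time=(tp["time"].dt.hour == 0))

# Re-label each value with the day it actually covers (the previous day).
daily_tp = tp_00utc.assign_coords(time=tp_00utc["time"] - np.timedelta64(1, "D"))

daily_tp.to_netcdf("era5_land_tp_daily.nc")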

Hope that helps,

Kevin

Hi Kevin

I'm curious why I cannot directly download the daily statistics for "total precipitation" in the ERA5-Land dataset from "https://cds.climate.copernicus.eu/apps/user-apps/app-c3s-daily-era5-statistics". Would it be possible to add this feature in the future? It would greatly benefit users like me who are not proficient in Python.

Thanks! 

Hi!

Thanks a lot for sharing the API: it is very helpful for research in different fields.

In my case, I am trying to retrieve the 'daily_maximum' value for the variable '2m_temperature' from 1950 onwards at the 0.1/0.1 grid level. Even though I manage to download the data slowly, I often get the error "[Errno 10054] An existing connection was forcibly closed by the remote host", which forces me to wait 120 seconds before the next request (the error does not pop up at the 0.25/0.25 grid level). I have tried some workarounds, like inserting a sleep timer, but the requests still seem to run for a very long time (a retry sketch is included after this post). So my questions are:

- Is there indeed an issue with the 0.1 grid level?
- Are the data of the 0.1 grid level resampled from the 0.25 grid or are they source data?

Thanks!
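Not an official answer, but one way to make the download loop more robust against dropped connections like the one described above is to wrap each request in a simple retry loop. This is only a sketch: it assumes the same c.service(...) call used in the script earlier in the thread, that the failure surfaces as a requests connection error (or ConnectionResetError), and the wait time and attempt count are arbitrary.

import time
import requests

def retrieve_with_retries(make_request, max_attempts=5, wait_seconds=120):
    """Call a zero-argument request function, retrying after connection errors."""
    for attempt in range(1, max_attempts + 1):
        try:
            return make_request()
        except (requests.exceptions.ConnectionError, ConnectionResetError) as err:
            print("Attempt %d failed: %s" % (attempt, err))
            if attempt == max_attempts:
                raise
            time.sleep(wait_seconds)

# Hypothetical usage with the call from the script above:
# result = retrieve_with_retries(lambda: c.service("tool.toolbox.orchestrator.workflow", params={...}))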

Hi! I am having trouble running even the example as it is. It finishes downloading the first month but then fails before moving on to the second month. That was yesterday; today it can't even download the first month of data.

These are some of the error messages it threw. Can you help?
Thanks!

2023-11-16 18:26:37,983 INFO Request is failed
2023-11-16 18:26:37,983 ERROR Message: an internal error occurred processing your request
2023-11-16 18:26:37,984 ERROR Reason:  Cmd('git') failed due to: exit code(128)
  cmdline: git clone git@gitrepo:user-apps/app-c3s-daily-era5-statistics.git /home/cds/compute_workflows/user-apps/app-c3s-daily-era5-statistics/master
  stdout: 'Cloning into '/home/cds/compute_workflows/user-apps/app-c3s-daily-era5-statistics/master'...'
  stderr: 'Warning: Permanently added the RSA host key for IP address '192.168.0.248' to the list of known hosts.
GitLab: Failed to authorize your Git request: internal API unreachable
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.'

Hi Michelle,

are you still having issues when using the daily application?

Thanks,

Kevin

Hi @Kevin_Marsh ,
Is it possible to disable the cache when using the daily application?

Hi @Joao_Macalos,
No, it's not possible to disable the cache in this case. (Note that requests for daily data from the application should not have been affected by the recent issues with the GRIB to netCDF conversion.)
Thanks,
Kevin

Hi @Kevin_Marsh,
Many thanks for the reply.

Do you know if it is possible to select the days we want to download from each month?

And how long do the files remain in the cache?

Thanks again,

Hi @Joao_Macalos,
No, currently it is only possible to download daily data for a complete month from the daily application.
The length of time the files remain in the cache depends on how many requests are being processed by the CDS; typically, it is of the order of a few days, so users are advised to download the data as soon as their request completes.
Thanks,
Kevin
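Since only complete months can be requested from the application, one option is to subset the downloaded monthly file afterwards, for example with xarray. A minimal sketch, assuming a file produced by the script above (the file name, variable, and dates are placeholders):

import xarray as xr

# Hypothetical monthly file produced by the daily-statistics script.
ds = xr.open_dataset("download_daily_mean_2m_temperature_2021_01.nc")

# Keep only the days of interest (here 10-15 January 2021) and save them.
subset = ds.sel(time=slice("2021-01-10", "2021-01-15"))
subset.to_netcdf("subset_daily_mean_2m_temperature_2021_01_10-15.nc")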


Hi,

I have modified the above code for my purposes. But, while I can see “INFO Request is completed” and the request is shown as completed on the website, no file is saved. May I ask what the reason could be?

Best regards,

Younjung

import cdsapi
import requests

# CDS API script to use CDS service to retrieve daily ERA5 variables and iterate over
# all months in the specified years.

# Requires
# 1) the CDS API to be installed and working on your system
# 2) You have agreed to the ERA5 Licence (via the CDS web page)
# 3) Selection of required variable, daily statistic, etc

# Output
# 1) separate netCDF file for chosen daily statistic variable for each month

c = cdsapi.Client()

# Uncomment years as required
years = ['2018', '2019', '2020', '2021', '2022', '2023']

# Retrieve all months for a given year.
months = ['01', '02', '03',
          '04', '05', '06',
          '07', '08', '09',
          '10', '11', '12']

# For valid keywords, see Table 2 of
# https://datastore.copernicus-climate.eu/documents/app-c3s-daily-era5-statistics/C3S_Application-Documentation_ERA5-daily-statistics-v2.pdf

##################################################################################
# select your variable; name must be a valid ERA5 CDS API name.
var = "2m_temperature"

# Select the required statistic, valid names given in link above
stat = "daily_mean"

# List of locations with their latitude and longitude ranges
sites = [
  {"name": "Ganghwa", "lat": [37.63+0.1-1.1, 37.83-0.1+1.1], "lon": [126.30+0.1-1.1, 126.50-0.1+1.1]},
  {"name": "Pocheon", "lat": [37.81+0.1-1.1, 38.01-0.1+1.1], "lon": [127.09+0.1-1.1, 127.29-0.1+1.1]},
  {"name": "Gyeonggi_Gwangju", "lat": [37.33+0.1-1.1, 37.53-0.1+1.1], "lon": [127.23+0.1-1.1, 127.43-0.1+1.1]},
  {"name": "Inje", "lat": [37.98+0.1-1.1, 38.18-0.1+1.1], "lon": [128.06+0.1-1.1, 128.26-0.1+1.1]},
  {"name": "Samcheok", "lat": [37.35+0.1-1.1, 37.55-0.1+1.1], "lon": [128.96+0.1-1.1, 129.16-0.1+1.1]},
  {"name": "Chungju", "lat": [36.84+0.1-1.1, 37.04-0.1+1.1], "lon": [127.81+0.1-1.1, 128.01-0.1+1.1]},
  {"name": "Boeun", "lat": [36.48+0.1-1.1, 36.68-0.1+1.1], "lon": [127.69+0.1-1.1, 127.89-0.1+1.1]},
  {"name": "Boryeong", "lat": [36.35+0.1-1.1, 36.55-0.1+1.1], "lon": [126.55+0.1-1.1, 126.75-0.1+1.1]},
  {"name": "Nonsan", "lat": [36.12+0.1-1.1, 36.32-0.1+1.1], "lon": [127.19+0.1-1.1, 127.39-0.1+1.1]},
  {"name": "Dangjin", "lat": [36.74+0.1-1.1, 36.94-0.1+1.1], "lon": [126.57+0.1-1.1, 126.77-0.1+1.1]},
  {"name": "Gochang", "lat": [35.40+0.1-1.1, 35.60-0.1+1.1], "lon": [126.46+0.1-1.1, 126.66-0.1+1.1]},
  {"name": "Gokseong", "lat": [35.05+0.1-1.1, 35.25-0.1+1.1], "lon": [127.19+0.1-1.1, 127.39-0.1+1.1]},
  {"name": "Bosung", "lat": [34.81+0.1-1.1, 35.01-0.1+1.1], "lon": [127.04+0.1-1.1, 127.24-0.1+1.1]},
  {"name": "Gimcheon", "lat": [35.96+0.1-1.1, 36.16-0.1+1.1], "lon": [128.17+0.1-1.1, 128.37-0.1+1.1]},
  {"name": "Sangju", "lat": [36.49+0.1-1.1, 36.69-0.1+1.1], "lon": [127.81+0.1-1.1, 128.01-0.1+1.1]},
  {"name": "Andong", "lat": [36.53+0.1-1.1, 36.73-0.1+1.1], "lon": [128.51+0.1-1.1, 128.71-0.1+1.1]},
  {"name": "Ulju", "lat": [35.52+0.1-1.1, 35.72-0.1+1.1], "lon": [128.98+0.1-1.1, 129.18-0.1+1.1]},
  {"name": "Jinju", "lat": [35.20+0.1-1.1, 35.40-0.1+1.1], "lon": [127.93+0.1-1.1, 128.13-0.1+1.1]},
  {"name": "Jeju", "lat": [33.38+0.1-1.1, 33.58-0.1+1.1], "lon": [126.46+0.1-1.1, 126.66-0.1+1.1]}]


# Loop over sites, years, and months
for loc in sites:
    for yr in years:
        for mn in months:
            result = c.service(
                "tool.toolbox.orchestrator.workflow",
                params={
                    "realm": "user-apps",
                    "project": "app-c3s-daily-era5-statistics",
                    "version": "master",
                    "kwargs": {
                        "dataset": "reanalysis-era5-land",
                        "product_type": "reanalysis",
                        "variable": var,
                        "statistic": stat,
                        "year": yr,
                        "month": mn,
                        "time_zone": "UTC+00:0",
                        "frequency": "1-hourly",
                        "grid": "0.1/0.1",
                        "area": {"lat": loc["lat"], "lon": loc["lon"]}
                    },
                    "workflow_name": "application"
                }
            )

            # Set name of output file 
            file_name = "download_" + stat + "_" + var + "_" + loc["name"] + "_" + yr + "_" + mn + ".nc"

             location=result[0]['location']
        res = requests.get(location, stream = True)
        print("Writing data to " + file_name)
        with open(file_name,'wb') as fh:
            for r in res.iter_content(chunk_size = 1024):
                fh.write(r)
        fh.close()

I think the indentation at the end of the script is wrong (and the trailing fh.close() is not needed, since the with block already closes the file). Should be:

            # Set name of output file
            file_name = "download_" + stat + "_" + var + "_" + loc["name"] + "_" + yr + "_" + mn + ".nc"

            location = result[0]['location']
            res = requests.get(location, stream=True)
            print("Writing data to " + file_name)
            with open(file_name, 'wb') as fh:
                for r in res.iter_content(chunk_size=1024):
                    fh.write(r)


Hello,
I am trying to retrieve Geopotential for a specific time and location. Other parameters are correctly retrieved on the 137 levels, but for the parameter Geopotential there are only 3 coordinates within the netCDF file, and level is not one of them. I am sure that the request is correct. Is there an issue with the database? Please let me know.

Will the script for downloading daily statistics also work with the new CDS-beta API?
