Retrievals freeze several times per day

I am trying to retrieve ERA5-Land data. However, the retrievals either freeze or crash (see my previous post on the random error) several times per day. For example, the retrieval logged below was stuck for two hours until I interrupted it.

When will there be any improvements or bug fixes? Is there a way to contact the service provider? It seems that only other users are active in the forum. Apologies for being a bit negative, but it’s quite annoying to have to babysit the retrievals day and night.

2024-08-22 18:13:41,011 INFO Request ID is 080625dd-942c-46e8-bd3c-8378f03c84f9
2024-08-22 18:13:41,079 INFO status has been updated to accepted

Dear Leif_Bjarne_Backman,

Sorry to hear about the problems you describe. So that we can support you better, could you raise a ticket at https://support.ecmwf.int? It would be useful if you could share your script with us.

Thank you,
Xiaobo

Thanks, I will do that.
The script:

import cdsapi
import numpy as np

vars_are = [
    '2m_temperature',
    'total_precipitation',
    'surface_solar_radiation_downwards',
    '2m_dewpoint_temperature',
    'surface_thermal_radiation_downwards',
    '10m_u_component_of_wind',
    '10m_v_component_of_wind'
]

yrs_are  = [
    #1950, 1951, 1952, 1953, 1954, 1955, 1956, 1957, 1958, 1959,
    #1960, 1961, 1962, 1963, 1964, 1965, 1966, 1967, 1968, 1969,
    #1970, 1971, 1972, 1973, 1974, 1975, 1976, 1977, 1978, 1979,
    #1980, 1981, 1982, 1983, 1984, 1985, 1986, 1987, 1988, 1989, 
    #1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999,
    #2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009,
    #2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019,
    2020, 2021, 2022, 2023
]

months_are = [
    '01', '02', '03', '04',
    '05', '06', '07', '08',
    '09', '10', '11', '12'
]

days_are = [
    '01', '02', '03',
    '04', '05', '06',
    '07', '08', '09',
    '10', '11', '12',
    '13', '14', '15',
    '16', '17', '18',
    '19', '20', '21',
    '22', '23', '24',
    '25', '26', '27',
    '28', '29', '30',
    '31'
]

times_are = [
    '00:00', '01:00', '02:00',
    '03:00', '04:00', '05:00',
    '06:00', '07:00', '08:00',
    '09:00', '10:00', '11:00',
    '12:00', '13:00', '14:00',
    '15:00', '16:00', '17:00',
    '18:00', '19:00', '20:00',
    '21:00', '22:00', '23:00'
]

lon_is=1.9
lat_is=48.5

dataset = "reanalysis-era5-land"
client = cdsapi.Client()

# retrieve data: one request per variable, year and month
for var_is in vars_are:
    for yr in yrs_are:
        year_is = str(yr)
        for month_is in months_are:

            request = {
                'variable': var_is,
                'year': year_is,
                'month': month_is,
                'day': days_are,
                'time': times_are,
                'data_format': 'netcdf',
                'download_format': 'unarchived',
                'area': [
                    lat_is+0.6, lon_is, lat_is, lon_is+0.9
                ]
            }
            target = 'meteo/ERA5_{var}_{year}_{month}.nc'.format(var=var_is, year=year_is, month=month_is)
            client.retrieve(dataset, request, target)
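
Because a loop like the one above has to be restarted after every freeze, one option is to skip targets that were already downloaded. The following is only a minimal sketch: `needs_download` is a hypothetical helper, not part of cdsapi, and it assumes that a non-empty file on disk is a complete download. A file left behind by an interrupted retrieval would pass the check, so it should be deleted before restarting.

```python
import os


def needs_download(target):
    """Return True if the target file is missing or empty.

    Assumption: any non-empty file is a complete download. This helper
    is hypothetical and only illustrates the idea.
    """
    return not (os.path.exists(target) and os.path.getsize(target) > 0)


# Intended use inside the loop of the script above:
#     if needs_download(target):
#         client.retrieve(dataset, request, target)
```

With this check in place, rerunning the script only submits requests for the months that are still missing.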

A quick look at the script triggered some suggestions for efficiency improvement:

  • You should not loop over parameters; all parameters should be retrieved in one request
  • For daily data, our recommendation is to retrieve one month of data with one request; for monthly averaged data, one year per request
  • I also noticed that you did not specify ‘grid’
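
The first two suggestions could be sketched as follows: one request per year and month, with all variables in a single list. This is a sketch under assumptions, not a confirmed recipe. `build_requests` and the target file name pattern are hypothetical, the request keys are copied from the script above, and it is assumed that the service accepts a list under 'variable' for a request of this size. 'grid' is left out because its expected value is not given in this thread.

```python
from itertools import product

# Variable names copied from the script posted above.
VARIABLES = [
    "2m_temperature",
    "total_precipitation",
    "surface_solar_radiation_downwards",
    "2m_dewpoint_temperature",
    "surface_thermal_radiation_downwards",
    "10m_u_component_of_wind",
    "10m_v_component_of_wind",
]


def build_requests(years, months, area):
    """Yield (request, target) pairs, one per year and month.

    Hypothetical helper: every request asks for all variables at once
    instead of looping over them.
    """
    days = [f"{d:02d}" for d in range(1, 32)]
    times = [f"{h:02d}:00" for h in range(24)]
    for year, month in product(years, months):
        request = {
            "variable": VARIABLES,
            "year": str(year),
            "month": f"{month:02d}",
            "day": days,
            "time": times,
            "data_format": "netcdf",
            "download_format": "unarchived",
            "area": area,  # [north, west, south, east], as in the script
        }
        yield request, f"meteo/ERA5_land_{year}_{month:02d}.nc"


# Intended use with the client from the script above:
#     for request, target in build_requests(yrs_are, range(1, 13), [49.1, 1.9, 48.5, 2.8]):
#         client.retrieve(dataset, request, target)
```

For the four years in the script this gives 48 requests instead of 336 (7 variables x 4 years x 12 months).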

Thanks!

  • Good to know that it’s possible to retrieve several parameters in one request. For a long time there was a limitation of 1000 fields per request, which meant that only one parameter could be retrieved in one request (with hourly data and one month of data).
  • Your recommendation covers only daily and monthly data. What about hourly data?
  • The API request produced by the download form (ERA5-Land, Copernicus Climate Data Store)
    doesn’t include any definition of the grid. How should it be defined? I define the area.

import cdsapi

c = cdsapi.Client()

c.retrieve(
    'reanalysis-era5-land',
    {
        'variable': '2m_temperature',
        'year': '1970',
        'month': '01',
        'day': [
            '01', '02', '03',
            '04', '05', '06',
            '07', '08', '09',
            '10', '11', '12',
            '13', '14', '15',
            '16', '17', '18',
            '19', '20', '21',
            '22', '23', '24',
            '25', '26', '27',
            '28', '29', '30',
            '31',
        ],
        'time': [
            '00:00', '01:00', '02:00',
            '03:00', '04:00', '05:00',
            '06:00', '07:00', '08:00',
            '09:00', '10:00', '11:00',
            '12:00', '13:00', '14:00',
            '15:00', '16:00', '17:00',
            '18:00', '19:00', '20:00',
            '21:00', '22:00', '23:00',
        ],
        'area': [
            49.1, 1.9, 48.5,
            2.8,
        ],
        'format': 'netcdf',
    },
    'download.nc')
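
The arithmetic behind the old 1000-field limit mentioned above can be checked with a few lines. This sketch assumes that one field is one variable at one time step, which seems to be how the limit was counted; `fields_per_request` is a hypothetical helper.

```python
HOURS_PER_DAY = 24
MAX_DAYS_PER_MONTH = 31
FIELD_LIMIT = 1000  # old per-request limit quoted in this thread


def fields_per_request(n_variables, n_days=MAX_DAYS_PER_MONTH, n_hours=HOURS_PER_DAY):
    """Count fields, assuming one field per variable and time step."""
    return n_variables * n_days * n_hours


one_variable = fields_per_request(1)
two_variables = fields_per_request(2)
print(one_variable, one_variable <= FIELD_LIMIT)    # 744 True
print(two_variables, two_variables <= FIELD_LIMIT)  # 1488 False
```

One variable for a full month of hourly data stays under the limit with 744 fields, while two variables already exceed it, which is why the script loops over variables.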