Hello,
I am attempting to download the 2D wave spectrum for a single lat/long point. I need ~5 years of data, and my query sends requests for one month at a time (as instructed). Each month, an NC file of nearly 1 MB, takes nearly 24 hours to download. The request status changes from accepted to running within seconds; it is the download itself that takes ages, and it can be difficult to maintain a connection for that long.
This has become a serious obstacle to our work: if there is a way to expedite this sort of download, including paying for an expedited service, please let me know.
Hi, here’s something that helps us download data faster. I don’t know how a 1 MB file can take 24 hours to download, but if the download is taking much longer than CDS needs to generate the file, you can try parallelizing the download with a tool such as “aria2” (https://aria2.github.io/) instead of relying on cdsapi to do it.
Something like this:
import cdsapi
import subprocess
client = cdsapi.Client()  # note the capital C: Client is a class
DATASET = "some-cds-dataset"
request = {"request": "stuff"}
result = client.retrieve(DATASET, request) # <- w/o specifying download filename
url = result.location # get the file url separately
out_dir = "some_directory"
file_name = "filename.nc"
# Download in parallel (8 connections, file split into 8 segments)
subprocess.run([
    "aria2c",
    "-x", "8",
    "-s", "8",
    "--continue=true",
    "--max-tries=5",
    "--retry-wait=10",
    "--file-allocation=none",
    "--dir", out_dir,
    "-o", file_name,
    url,
], check=True)
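Since the requests go out one month at a time, the same pattern could be wrapped in a loop over the ~5 years. Below is a rough sketch of the bookkeeping only, not a tested recipe. The helper names (month_range, monthly_file_name, aria2c_command) and the "wave_spectra" file prefix are made up for illustration, and the aria2c options are simply the ones from the snippet above.

```python
def month_range(start_year, start_month, n_months):
    """Yield (year, month) pairs for n_months consecutive months."""
    year, month = start_year, start_month
    for _ in range(n_months):
        yield year, month
        month += 1
        if month > 12:
            year, month = year + 1, 1


def monthly_file_name(year, month, prefix="wave_spectra"):
    """Hypothetical naming scheme: one NetCDF file per month."""
    return f"{prefix}_{year:04d}-{month:02d}.nc"


def aria2c_command(url, out_dir, file_name, connections=8):
    """Build the aria2c argument list (same options as in the snippet above)."""
    return [
        "aria2c",
        "-x", str(connections),   # max connections per server
        "-s", str(connections),   # number of segments to split the file into
        "--continue=true",        # resume a partially downloaded file
        "--max-tries=5",
        "--retry-wait=10",
        "--file-allocation=none",
        "--dir", out_dir,
        "-o", file_name,
        url,
    ]
```

Each iteration would then submit the request for that month, read result.location as before, and pass aria2c_command(url, out_dir, monthly_file_name(year, month)) to subprocess.run(..., check=True). How the year and month go into the request dict depends on the dataset, so that part is left out here. Because of --continue=true, re-running the same command after a dropped connection should pick up the partial file rather than start over.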