Reading and regridding from cloud-based datasets

I am trying to build an xarray-zarr cloud-based dataset from ERA5 for training a 1° model, and I am finding that the regridding step of the creation is extremely slow. My understanding is that the anemoi dataset .zarr lives in the cloud, so anemoi-datasets just wraps the steps that handle the data as it is loaded. My question is: is this the correct approach, and if so, is there a way to speed it up, e.g., by using GPUs, since regridding is essentially a matrix multiplication? Thank you!
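
To make the matrix-multiplication point concrete, here is a minimal sketch of what I mean. It is not how anemoi-datasets implements its regrid filter: `linear_weights` is a hypothetical helper, the target points are random stand-ins rather than a real O96 grid, and it assumes increasing source coordinates with no longitude wrap-around.

```python
import numpy as np
from scipy.sparse import csr_matrix


def linear_weights(src_lat, src_lon, dst_lat, dst_lon):
    """Sparse bilinear weights from a regular (lat, lon) grid to scattered points.

    Assumes src_lat and src_lon are strictly increasing and that every
    target point lies inside the source grid (no longitude wrap-around).
    """
    n_lat, n_lon = len(src_lat), len(src_lon)
    # Index of the lower corner of the source cell containing each target point.
    i = np.clip(np.searchsorted(src_lat, dst_lat) - 1, 0, n_lat - 2)
    j = np.clip(np.searchsorted(src_lon, dst_lon) - 1, 0, n_lon - 2)
    # Fractional position of each target point inside its cell.
    u = (dst_lat - src_lat[i]) / (src_lat[i + 1] - src_lat[i])
    v = (dst_lon - src_lon[j]) / (src_lon[j + 1] - src_lon[j])
    # Four corner weights per target point, stored as one sparse row each.
    rows = np.tile(np.arange(len(dst_lat)), 4)
    cols = np.concatenate([
        i * n_lon + j, i * n_lon + j + 1,
        (i + 1) * n_lon + j, (i + 1) * n_lon + j + 1,
    ])
    vals = np.concatenate([(1 - u) * (1 - v), (1 - u) * v, u * (1 - v), u * v])
    return csr_matrix((vals, (rows, cols)), shape=(len(dst_lat), n_lat * n_lon))


# Source grid shaped like a 0.25 degree lat-lon grid: 721 x 1440 points.
src_lat = np.linspace(-90.0, 90.0, 721)
src_lon = np.arange(1440) * 0.25

# Stand-in target points (random, NOT a real O96 grid).
rng = np.random.default_rng(0)
dst_lat = rng.uniform(-90.0, 90.0, 40320)
dst_lon = rng.uniform(0.0, 359.75, 40320)

# Build the weights once, then reuse them for every field and every date.
W = linear_weights(src_lat, src_lon, dst_lat, dst_lon)

# Regrid a batch of 4 fields with a single sparse matrix product.
fields = rng.standard_normal((4, src_lat.size * src_lon.size))
regridded = (W @ fields.T).T  # shape (4, 40320)
```

Once the weights exist, each field costs only four multiply-adds per target point, which is why I would expect the regridding itself to be cheap compared with reading the data, on CPU or GPU.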

As an MWE, the regridding step of the following is expected to take 3 hours to process a single month of data for only a small subset of the variables:

anemoi-datasets create recipe.yaml gcp_era5.zarr

where recipe.yaml is the following anemoi-datasets configuration:

dates:
  start: 2020-01-01T00:00
  end: 2020-01-31T23:00
  frequency: 6h

input:
  join:
    - pipe:
      - xarray-zarr:
          url: gs://gcp-public-data-arco-era5/ar/1959-2022-full_37-6h-0p25deg_derived.zarr/
          param:
            - 2m_temperature
      - rename:
          param:
            2m_temperature: 2t
      - regrid:
          method: linear
          in_grid: [0.25, 0.25]
          out_grid: O96
    - pipe:
      - xarray-zarr:
          url: gs://gcp-public-data-arco-era5/ar/1959-2022-full_37-6h-0p25deg_derived.zarr/
          param:
            - temperature
      - rename:
          param:
            temperature: t
            level:
              - 1000
              - 850
              - 500
      - regrid:
          method: linear
          in_grid: [0.25, 0.25]
          out_grid: O96

    - forcings:
        template: ${input.join.0.pipe}
        param:
          - cos_latitude
          - cos_longitude
          - sin_latitude
          - sin_longitude
          - cos_julian_day
          - cos_local_time
          - sin_julian_day
          - sin_local_time
          - insolation

Any advice would be appreciated! For example, should I be downloading all the data first? What package or framework should I use to regrid TB-scale datasets? Should I be accessing already-regridded n320 data from MARS instead? Thank you!