We are experimenting with changing some variables from prognostic to forcing. In changing the model architecture like this, can we still fine-tune a current AIFS checkpoint, or do we necessarily need to start training from scratch? If fine-tuning is possible, how would we do it?
Hi Julian,
Thanks for your question.
We haven’t had a use-case for this ourselves, so it’s not a trivial action at the moment. A few options to consider:
- The transfer_learning and load_weights_only options can be used when changing the problem: anemoi should try, and then fail, to load the decoder weights if the number of output variables has changed. This would give you a pretrained encoder and processor. You could then freeze the pretrained parts and train only the decoder.
- If you want to reuse the decoder, rather than reload it, this would take a bit more manual work. You could edit the checkpoint directly, i.e. delete parts of the decoder tensors and update the metadata, but this would require quite a lot of surgery.
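For the first option, the freezing step can be sketched in plain PyTorch. This is a minimal illustration, not the real AIFS architecture: the submodule names `encoder`, `processor`, and `decoder` are assumptions standing in for the actual anemoi attribute paths, and the tiny linear layers stand in for the real graph components.

```python
import torch.nn as nn

class TinyGraphModel(nn.Module):
    """Stand-in for an encoder -> processor -> decoder model."""

    def __init__(self, n_in: int, n_hidden: int, n_out: int):
        super().__init__()
        self.encoder = nn.Linear(n_in, n_hidden)
        self.processor = nn.Linear(n_hidden, n_hidden)
        # Rebuilt decoder, sized for the new (smaller) output set.
        self.decoder = nn.Linear(n_hidden, n_out)

    def forward(self, x):
        return self.decoder(self.processor(self.encoder(x)))

model = TinyGraphModel(n_in=8, n_hidden=16, n_out=4)

# Freeze everything except the decoder, so fine-tuning only
# learns the new output mapping on top of the pretrained parts.
for name, param in model.named_parameters():
    param.requires_grad = name.startswith("decoder")

trainable = [n for n, p in model.named_parameters() if p.requires_grad]
```

After this, only `decoder.weight` and `decoder.bias` receive gradients; the optimizer can be built over just the trainable parameters.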
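The checkpoint-surgery route in the second option might look roughly like the sketch below. The toy checkpoint, the `decoder.` key prefix, and the `output_variables` metadata field are all illustrative assumptions; a real anemoi checkpoint carries more metadata (variable lists, normalisation statistics) that would also need to be kept consistent.

```python
import torch

# Toy checkpoint standing in for a pretrained AIFS checkpoint.
state = {
    "encoder.weight": torch.zeros(16, 8),
    "processor.weight": torch.zeros(16, 16),
    "decoder.weight": torch.zeros(4, 16),  # old output size
    "decoder.bias": torch.zeros(4),
}
ckpt = {"state_dict": state, "metadata": {"output_variables": 4}}

# Surgery: drop the decoder tensors whose shapes no longer match,
# and update the metadata to agree with the newly sized decoder.
for key in [k for k in list(ckpt["state_dict"]) if k.startswith("decoder.")]:
    del ckpt["state_dict"][key]
ckpt["metadata"]["output_variables"] = 3  # e.g. one variable moved to forcing

kept = sorted(ckpt["state_dict"])
```

The pruned checkpoint could then be saved with `torch.save` and loaded non-strictly, with the freshly initialised decoder trained from scratch.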
We will keep this in mind as a use-case for future developments to help make it easier.
Hi Matthew,
Thanks for the response. We’ve been using the transfer_learning and load_weights_only options for fine-tuning rollouts after pretraining, and those flags have been extremely helpful. Option 1 you suggested makes a lot of sense, and we’ll try it in the coming weeks when we get there!
Thank you,
Julian