Checkpointing
ParGenes provides a checkpointing system. This means that if your analysis stops for any reason (user interruption, cluster wall-time limit reached, etc.), you can restart it from the last saved state.
ParGenes will not rerun jobs (raxml, modeltest, etc.) that have already finished.
In addition, raxml and modeltest implement their own checkpointing, which allows ParGenes to restart them from their last saved state if they did not finish analysing an MSA.
To restart ParGenes, rerun the original command with --continue added.
If you don't remember the command you originally used to run ParGenes, you can find it in the logs.
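As a minimal sketch of the restart step, the snippet below takes the original command as a string and appends --continue if it is not already there. The helper name build_restart_command is hypothetical, and the sample command (script path, the -a and -o options, directory names) is a placeholder for illustration only; substitute the exact command recorded in your logs.

```python
import shlex
from typing import List


def build_restart_command(original_command: str) -> List[str]:
    """Split the original command into arguments and append --continue once."""
    args = shlex.split(original_command)
    if "--continue" not in args:
        args.append("--continue")
    return args


# Hypothetical original invocation: the -a/-o options and paths are assumed
# placeholders, not taken from this page. Only -c and --continue are.
original = "python pargenes.py -a msa_dir -o output_dir -c 256"

# Print the command to run for the restart.
print(shlex.join(build_restart_command(original)))
```

Calling the helper on a command that already contains --continue leaves it unchanged, so it is safe to apply to a command copied from a previous restart.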
When restarting ParGenes, we strongly recommend that you do NOT change the initial command, unless you know what you are doing.
Some exceptions are:
- The number of cores (-c): it is ok to first run ParGenes with 256 cores, and then with 16 or 1024 cores. ParGenes will reschedule the jobs accordingly.
- Adding an astral step at the end (--use-astral).
Some examples that will definitely not work with checkpointing:
- Adding some MSAs to the analysis (by adding them to the input directory)
- Changing raxml/modeltest parameters
- Changing the output directory (ParGenes will run, but the existing checkpoint won't be used)
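The rules above can be sketched as a small pre-flight check that compares the original command with the restart command and accepts only the differences this page lists as safe. This is an illustrative helper, not part of ParGenes: the function names are hypothetical, and it assumes that -c is followed by a single value and that --use-astral and --continue take no value. The comparison is order-sensitive, so reordering options is reported as a change.

```python
import shlex
from typing import List

# Options that may differ between the first run and a restart.
# Assumption: -c takes one value; --use-astral and --continue take none.
SAFE_WITH_VALUE = {"-c"}
SAFE_FLAGS = {"--use-astral", "--continue"}


def strip_safe_options(command: str) -> List[str]:
    """Return the command's arguments with the safe-to-change options removed."""
    kept = []
    skip_next = False
    for arg in shlex.split(command):
        if skip_next:
            # This token is the value of a safe option such as -c.
            skip_next = False
            continue
        if arg in SAFE_WITH_VALUE:
            skip_next = True
            continue
        if arg in SAFE_FLAGS:
            continue
        kept.append(arg)
    return kept


def restart_is_safe(original: str, restart: str) -> bool:
    """True if the two commands differ only in safe-to-change options."""
    return strip_safe_options(original) == strip_safe_options(restart)


# Hypothetical commands: the -a/-o options and paths are placeholders.
first_run = "python pargenes.py -a msa_dir -o output_dir -c 256"
restart = "python pargenes.py -a msa_dir -o output_dir -c 16 --use-astral --continue"
print(restart_is_safe(first_run, restart))
```

A restart that points to a different output directory, or that changes any other argument, makes restart_is_safe return False.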