inria-thoth/SinkFast


Challenges in Non-Polymeric Crystal Structure Prediction: Why a Geometric, Permutation-Invariant Loss is Needed

Authors: E. Jehanno, R. Menegaux, J. Mairal and S. Grudinin

0. Original Framework

This code is adapted from the AssembleFlow framework under the MIT License.

1. Environment

micromamba env create -n sinkfast -f env.gpu.yml
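Once the environment is created and activated (for example with `micromamba activate sinkfast`), a quick import check can confirm that the expected packages are visible. The snippet below is a minimal sketch: the package list (`torch`, `numpy`) is an assumption, not taken from `env.gpu.yml`, and should be adjusted to match that file.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed package list; edit it to match what env.gpu.yml actually installs.
    expected = ["torch", "numpy"]
    missing = missing_packages(expected)
    print("missing packages:", missing if missing else "none")
```

An empty result only shows that the packages can be located; it does not verify versions or GPU support.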

2. Dataset: COD-Cluster17

COD-Cluster17 is obtained from the CrystalFlow project and is available on HuggingFace at https://huggingface.co/datasets/chao1224/CrystalFlow.

However, this is not the data in the form AssembleFlow expects, so it must be reprocessed. To do so, follow these steps:

  1. First download CrystalFlow data from https://huggingface.co/datasets/chao1224/CrystalFlow.

  2. Then revert it to raw:

python AssembleFlow/datasets/revert_to_raw.py --cod_main_dir=<path/to/CrystalFlow_COD> --cod_sub_dir=<subset_name/either/processed_or_processed_5000>

This will create a folder ./AssembleFlow_data/COD/raw.

  3. The ./AssembleFlow_data/COD/raw folder can then be reprocessed directly when running AssembleFlow: if the ./AssembleFlow_data/COD/<processed_subset> folder does not exist, it is built automatically, using the processing code taken directly from the AssembleFlow framework.
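The download and revert steps above can be scripted. The following is a minimal sketch, not part of the repository: the helper names `download_crystalflow` and `revert_command` are hypothetical, the download relies on `snapshot_download` from the `huggingface_hub` package (assumed to be installed), and where the `CrystalFlow_COD` folder sits inside the downloaded repository is not stated in this README, so that path must be checked by hand.

```python
import subprocess
import sys

HF_REPO_ID = "chao1224/CrystalFlow"  # dataset repository from the URL in step 1
VALID_SUBSETS = ("processed", "processed_5000")  # the two subsets named in step 2


def download_crystalflow(local_dir):
    """Download the CrystalFlow dataset repository into `local_dir`.

    Imported lazily so the rest of the module works without huggingface_hub.
    """
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=HF_REPO_ID, repo_type="dataset", local_dir=str(local_dir)
    )


def revert_command(cod_main_dir, cod_sub_dir):
    """Build the revert_to_raw.py command from step 2 as an argument list."""
    if cod_sub_dir not in VALID_SUBSETS:
        raise ValueError(f"cod_sub_dir must be one of {VALID_SUBSETS}")
    return [
        sys.executable,
        "AssembleFlow/datasets/revert_to_raw.py",
        f"--cod_main_dir={cod_main_dir}",
        f"--cod_sub_dir={cod_sub_dir}",
    ]


def revert_to_raw(cod_main_dir, cod_sub_dir):
    """Run the revert step; assumes the current directory is the repository root."""
    subprocess.run(revert_command(cod_main_dir, cod_sub_dir), check=True)
```

A typical use would call `download_crystalflow("data/CrystalFlow")`, locate the `CrystalFlow_COD` folder inside the download, and pass that path to `revert_to_raw` together with `"processed_5000"`.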

3. Scripts

A demo command that runs the main Python script is:

python main.py --dataset=COD_5000 --seed=42 --model=AssembleFlow_Atom --data_root=AssembleFlow_data/COD --output_model_dir=COD_5000/Atom_LossRMSD_DiffAssign_DR_42/ --num_timesteps=1 --inference_interval=1

To use the pretrained checkpoints, add the following flag:

...... --load_pretrained=1

For instance:

python main.py --dataset=COD_5000 --seed=42 --model=AssembleFlow_Atom --data_root=AssembleFlow_data/COD --output_model_dir=model_weights/COD17_5000/direct_regression/Atom_LossRMSD_DiffAssign_DR_seed42/ --num_timesteps=1 --inference_interval=1 --load_pretrained=1
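To avoid retyping these long commands, they can be assembled programmatically. The sketch below is an illustration only: `build_command` and `run` are hypothetical helpers, not part of the repository, and they use only the flags shown in the two commands above. The output directory is passed explicitly because the README does not document how directory names relate to seeds.

```python
import subprocess


def build_command(output_model_dir, seed=42, load_pretrained=False):
    """Return the main.py invocation as an argument list.

    Flag names and fixed values are copied from the demo commands above.
    """
    cmd = [
        "python",
        "main.py",
        "--dataset=COD_5000",
        f"--seed={seed}",
        "--model=AssembleFlow_Atom",
        "--data_root=AssembleFlow_data/COD",
        f"--output_model_dir={output_model_dir}",
        "--num_timesteps=1",
        "--inference_interval=1",
    ]
    if load_pretrained:
        # Same flag as in the pretrained-checkpoint example above.
        cmd.append("--load_pretrained=1")
    return cmd


def run(output_model_dir, seed=42, load_pretrained=False):
    """Launch the command; assumes the working directory contains main.py."""
    subprocess.run(build_command(output_model_dir, seed, load_pretrained), check=True)


if __name__ == "__main__":
    # Print the demo command rather than launching a run.
    print(" ".join(build_command("COD_5000/Atom_LossRMSD_DiffAssign_DR_42/")))
```

Joining the list with spaces reproduces the demo command, and passing `load_pretrained=True` with the `model_weights/...` directory reproduces the pretrained one.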

4. Results

4.1. State-of-the-art

Our method reaches state-of-the-art (SOTA) results on COD-Cluster17:

With much faster execution time:

4.2. Ablation

4.3. Visualizations

Targets Predictions

5. Cite Us

@article{jehanno2025challenges,
  title={Challenges in Non-Polymeric Crystal Structure Prediction: Why a Geometric, Permutation-Invariant Loss is Needed},
  author={Jehanno, Emmanuel and Menegaux, Romain and Mairal, Julien and Grudinin, Sergei},
  journal={arXiv preprint arXiv:2509.00832},
  year={2025}
}
