The raw dataset (https://github.com/google-research/google-research/blob/master/android_control/README.md) is in TFRecord format. Judging from your [test script](https://github.com/lll6gg/UI-R1/blob/main/evaluation/test.sh), it seems you have preprocessed the original data. Could you share the preprocessing scripts?