MoLiNER: Motion-Language-based Instance Segmentation and Retrieval

MoLiNER is a model for motion analysis, language-based segmentation, and retrieval.

Documentation

More detailed documentation is available in the docs directory.

Installation

To get started with MoLiNER, you need to set up the environment and install the required dependencies.

1. Clone the repository:
git clone https://github.com/raideno/MoLiNER.git
cd MoLiNER
2. Set up the Python environment:
python -m venv .venv
# On macOS and Linux
source .venv/bin/activate
# On Windows
.venv\Scripts\activate
# Install the dependencies
pip install -r requirements.txt
3. Hugging Face Authentication (IMPORTANT):

Rename .env.example to .env and replace the xxx placeholder values with your own; in particular, set HUGGING_FACE_TOKEN to your Hugging Face access token (see Dataset Preparation below). A quick way to verify the token is sketched after step 4.

4. TMR Pretrained Weights:
bash scripts/download-tmr-pretrained-models.sh
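
If you want to verify that the token works before launching anything heavy, here is a hypothetical sanity check (not part of the repository) using huggingface_hub. It assumes HUGGING_FACE_TOKEN has been exported into your shell environment; the .env file is not loaded automatically outside the project's own scripts.

import os
from huggingface_hub import HfApi

# Hypothetical check: confirms the token authenticates against the Hugging Face Hub.
api = HfApi(token=os.environ["HUGGING_FACE_TOKEN"])
print(api.whoami()["name"])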

Dataset Preparation

This project supports multiple datasets: Babel, HumanML3D, and KIT-ML (upcoming). All data downloading, preprocessing, etc. is handled automatically. The only requirement is to set HUGGING_FACE_TOKEN inside the .env file as instructed in the previous step; this is needed because the raw data is hosted on Hugging Face in a private repository.

All data-related code is available inside src/data and the associated configuration inside configs/data.

A set of predefined data pipelines with filtering is already available and listed in the Data Pipelines section below. It is also possible to create a custom data pipeline with your own filtering; instructions are available in docs/create-data-pipeline.md.

Training

To train a new model, use the train-model.py script.

  1. Create the Model:

Duplicate the configs/model/moliner.base.yaml file and name it as you wish.

  2. Specify the Modules:

Replace all the ??? in the .yaml file with one of the possible values for each module.

  3. Start the Training:
HYDRA_FULL_ERROR=1 TOKENIZERS_PARALLELISM=false python train-model.py \
    model=<MODEL_NAME> \
    data=<DATA_PIPELINE_NAME> \
    trainer.accelerator=cuda \
    +trainer.devices=[0]

Notes:

  • For more control over the trainer, you can edit configs/trainer.yaml.
  • <MODEL_NAME> should be set to the name of the file you just created, without the .yaml extension.
  • <DATA_PIPELINE_NAME> possible values can be found in configs/data and are also listed in the Data Pipelines section below.

Data Pipelines

babel/base: Babel dataset for motion-language segmentation.
babel/separate: Frame and sequence annotations are put in different samples.
babel/20/base: Babel dataset with sequence-level annotations.
babel/20/standardized/chunking/16: Babel dataset with chunk-based annotations, 16 frames per span.
babel/20/standardized/windowing/16: Babel dataset with window-based annotations, 16 frames per span.
hml3d/base: HumanML3D dataset for 3D motion-language tasks.
mixed/base: A mix of the HML3D and Babel datasets.

RUN_DIR: Once training starts, a directory is created inside the out directory; model weights, logs, etc. are stored there. This directory is referred to as run_dir in the rest of the documentation.

Configurations

This project uses Hydra for configuration management. This allows for a flexible and composable way to configure experiments.

The main configuration files are located in the configs/ directory.

You can override any configuration setting from the command line. For example:

python train-model.py data=<data-name> model=<model-name> trainer.max_epochs=100

This command will train the specified model on the specified dataset for 100 epochs.

Refer to Hydra's documentation for more information about how to use Hydra.
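
As a side note, assuming Hydra's default output settings are kept (in which case the fully composed configuration of a run is written to <run_dir>/.hydra/config.yaml), the resolved configuration can be inspected programmatically. This is a minimal sketch, not project-specific tooling:

from omegaconf import OmegaConf

# Assumes the run directory is also Hydra's output directory and that the
# default job settings are used, which save the composed configuration
# under .hydra/config.yaml.
cfg = OmegaConf.load("<path_to_run_dir>/.hydra/config.yaml")
print(OmegaConf.to_yaml(cfg))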

Weights Extraction

Before running evaluation or inference, you might want to extract the model weights from the PyTorch Lightning checkpoint for easier loading.

python extract.py run_dir=<path_to_run_dir>

This will save the model weights in the run directory.
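
The extracted weights can then be loaded like a regular PyTorch state dict. Below is a minimal sketch; the filename weights.pt is only a placeholder, as the exact name written by extract.py may differ:

import torch

# NOTE: "weights.pt" is a placeholder; use the filename that extract.py
# actually writes inside your run directory.
state_dict = torch.load("<path_to_run_dir>/weights.pt", map_location="cpu")

# The state dict can then be loaded into an instantiated MoLiNER model
# with model.load_state_dict(state_dict).
print(list(state_dict.keys())[:5])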

Evaluation

Retrieval Evaluation

To evaluate the model's performance on in-motion retrieval tasks:

HYDRA_FULL_ERROR=1 python evaluate.mlp.py \
    run_dir=<path_to_run_dir> \
    device=cuda:1 \
    score=0.5

This supports HumanML3D, Babel sequence-level, and Babel frame-level datasets.

Segmentation Evaluation

To evaluate the model on segmentation tasks with the Babel frame-level dataset:

HYDRA_FULL_ERROR=1 python evaluate.locate.py \
    run_dir=<path_to_run_dir> \
    device=cuda:1 \
    score=0.5

Pre-trained Models

Upcoming...
