add VFRtok support #1

Open
rotem-nagish wants to merge 2 commits into main from VFRtok

@rotem-nagish
Collaborator

No description provided.


# Install Python dependencies from requirements.txt
# Note: Some packages may already be in the base image, skip torch/torchvision/torchaudio
RUN pip install --no-cache-dir \
Contributor

They already have a requirements.txt, so an opt-out approach is better: install everything from it and skip only the packages the base image already provides.

# Install requirements, skipping packages already in base image
# Base image includes: torch, torchvision, torchaudio, flash-attn, apex
RUN grep -v -E "^(torch|torchvision|torchaudio|apex)" requirements.txt > requirements_temp.txt && \
pip install --no-cache-dir -r requirements_temp.txt && \
rm requirements_temp.txt
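
One detail worth checking in the opt-out filter: a prefix pattern such as `^(torch|...)` also drops any package whose name merely starts with `torch` (for example `torchmetrics`, if the requirements file lists one). A minimal sketch of a stricter filter, assuming a hypothetical helper named `filter_requirements` that matches whole package names only:

```shell
# Hypothetical helper (not part of the PR): print a requirements file with the
# given packages removed, matching whole package names only.
# Usage: filter_requirements <requirements-file> <package>...
filter_requirements() (
  req_file="$1"
  shift
  # Join the package names into a regex alternation: "torch|torchvision|..."
  names="$(printf '%s|' "$@")"
  names="${names%|}"
  # A line is dropped only when the name is followed by a character that
  # cannot be part of a package name (e.g. "=", ">", "[", space) or by end of line.
  grep -v -i -E "^(${names})([^A-Za-z0-9._-]|\$)" "$req_file"
)
```

In the Dockerfile this would look like `filter_requirements requirements.txt torch torchvision torchaudio flash-attn apex > requirements_temp.txt`, where the package list is an assumption taken from the "Base image includes" comment above.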

Comment on lines 96 to 98
RUN wget -q --show-progress \
"https://huggingface.co/KwaiVGI/VFRTok/resolve/main/vfrtok-l.bin" \
-O vfrtok-l.bin
Contributor

It is generally better to download the weights on the host machine rather than inside the Dockerfile, so that the download happens only once:

hf download KwaiVGI/VFRTok

Then, when running the container, add:

  -v /shared/.cache:/root/.cache \

(in the README, like we do for Wan2.2)
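
Once the weights live in the mounted cache instead of the working directory, the `--ckpt` argument needs the path inside the cache. A minimal sketch, assuming the standard Hugging Face hub cache layout (`hub/models--<org>--<name>/snapshots/<revision>/<file>`) and a hypothetical helper named `find_cached_file`:

```shell
# Hypothetical helper (not part of the PR): print the path of a file inside the
# Hugging Face hub cache, or nothing if it has not been downloaded yet.
# Usage: find_cached_file <cache-root> <repo-id> <filename>
find_cached_file() (
  cache_root="$1"
  repo_id="$2"
  filename="$3"
  # The hub cache stores "org/name" as "models--org--name".
  repo_dir="${cache_root}/huggingface/hub/models--$(printf '%s' "$repo_id" | sed 's|/|--|g')"
  # Files sit under snapshots/<revision>/; take the first match.
  find "${repo_dir}/snapshots" -mindepth 2 -maxdepth 2 -name "$filename" 2>/dev/null | head -n 1
)

# Example inside the container, assuming the /root/.cache mount above:
#   ckpt="$(find_cached_file /root/.cache KwaiVGI/VFRTok vfrtok-l.bin)"
```

The file name `vfrtok-l.bin` comes from the `wget` URL in the diff; whether `inference.py` accepts a path outside the working directory is an assumption to verify.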

echo "/workspace/VFRTok/sample_data/sample.mp4" >> /workspace/VFRTok/test.csv

# Add metadata to CSV
RUN python scripts/add_metadata_to_csv.py -i /workspace/VFRTok/test.csv -o /workspace/VFRTok/test.csv --data_column video_path || true
Contributor

Test files should live outside the Dockerfile and be mounted at run time. See the example in Wan2.2.

deepspeed inference.py -i /workspace/VFRTok/test.csv -o /workspace/VFRTok/outputs \
--config configs/vfrtok-l.yaml --ckpt vfrtok-l.bin \
--enc_fps 24 --dec_fps 60
```
Contributor

The README can be more minimal overall. You can ask it to look at the other READMEs, which are all shorter; the notes section is not useful, etc.

@AmitMY
Contributor

AmitMY commented Feb 3, 2026

FYI, you can also tell Claude:

Look at the comments left on #1 and address them

image: vfrtok:latest
environment:
- NVIDIA_VISIBLE_DEVICES=all
- HF_HUB_DISABLE_XET=1
Contributor

Why disable Xet?

docker compose run --rm vfrtok \
deepspeed inference.py -i test.csv -o outputs \
--config configs/vfrtok-l.yaml --ckpt vfrtok-l.bin --enc_fps 24
```
Contributor

I would prefer a docker run command like the ones we have for other repositories, with all the steps. For example, I don't know where test.csv comes from.

So:

Build:

docker build -t vfrtok:latest -f repositories/KlingTeam/VFRTok/Dockerfile .

Run:

mkdir -p outputs
docker run --rm --gpus all \
  --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 \
  -v /shared/.cache/huggingface:/root/.cache/huggingface \
  -v "$(pwd)/outputs":/outputs \
  -it vfrtok:latest bash

Usage:
Missing: how was test.csv created, and where is the video?

deepspeed inference.py -i test.csv -o /outputs \
    --config configs/vfrtok-l.yaml --ckpt vfrtok-l.bin --enc_fps 24
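
One way to fill the missing step is to build test.csv from a mounted directory of videos. A minimal sketch, assuming a hypothetical helper named `make_video_csv` and a `video_path` header (inferred from the `--data_column video_path` flag in the original Dockerfile):

```shell
# Hypothetical helper (not part of the PR): write a CSV with one row per .mp4
# file found directly inside a directory.
# Usage: make_video_csv <video-dir> <csv-path>
make_video_csv() (
  video_dir="$1"
  csv_path="$2"
  # Header name is an assumption based on "--data_column video_path".
  printf 'video_path\n' > "$csv_path"
  for f in "$video_dir"/*.mp4; do
    # Skip the unexpanded glob when the directory has no .mp4 files.
    [ -e "$f" ] || continue
    printf '%s\n' "$f" >> "$csv_path"
  done
)

# Example inside the container, assuming videos are mounted at /inputs:
#   make_video_csv /inputs test.csv
```

The metadata step from the original Dockerfile (`scripts/add_metadata_to_csv.py`) would presumably still need to run on the resulting file.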

@@ -0,0 +1,27 @@
# LTX-2 Video VAE Tokenizer
FROM gogamza/unsloth-vllm-gb10:latest
Contributor

nice!

Comment on lines +6 to +17
# Install system dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
libgl1 \
&& rm -rf /var/lib/apt/lists/*

# Install ffmpeg 4 from source (required for decord)
COPY libraries/ffmpeg/install_from_source.sh /tmp/install_ffmpeg.sh
RUN bash /tmp/install_ffmpeg.sh

# Install decord from source
COPY libraries/decord/install_from_source.sh /tmp/install_decord.sh
RUN bash /tmp/install_decord.sh
Contributor

You are using the unsloth image. Are all of these necessary? Don't they already come with the unsloth image?
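
A quick way to answer this is to probe the base image before adding install steps. A minimal sketch, assuming hypothetical helpers named `check_cmd` and `check_pymod` run in a shell inside a container started from the base image:

```shell
# Hypothetical helpers (not part of the PR) for probing a base image.

# Print whether a command is available on PATH.
check_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    printf '%s: present\n' "$1"
  else
    printf '%s: missing\n' "$1"
  fi
}

# Print whether a Python module can be imported (assumes python3 is on PATH).
check_pymod() {
  if python3 -c "import $1" >/dev/null 2>&1; then
    printf '%s: present\n' "$1"
  else
    printf '%s: missing\n' "$1"
  fi
}

# Example, inside the base image:
#   check_cmd ffmpeg
#   check_pymod decord
```

Anything reported as present could be dropped from the Dockerfile, provided the version in the image is one that decord accepts.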

RUN pip install --no-cache-dir \
mediapy \
simple-video-utils \
git+https://github.com/huggingface/diffusers.git
Contributor

Do we have to install diffusers from git?


```bash
# Reconstruct video
docker compose run --rm ltx2 python -m video_tokenizer.bin \
Contributor

You have a run.sh script, so either the usage section should use ./run.sh encode <input> <output> etc., or it should be a docker run command, in which case run.sh is not needed.
