This guide covers how to set up and run OmniVideo for training and inference.
# Create and activate new conda environment
conda create -n omnivideo python=3.10
conda activate omnivideo
# Install from our curated requirements
pip install -r project_requirements.txt
# Or recreate exact environment
conda env create -f environment.yml
# Alternative to conda: create a virtual environment
python -m venv omnivideo_env
source omnivideo_env/bin/activate # Linux/Mac
# Install all dependencies
pip install -r pip_requirements.txt
The models should be organized in the omni_ckpts directory as follows:
omni_ckpts/
├── wan/
│   └── wanxiang1_3b/          # WAN model checkpoints
├── adapter/
│   └── model.pt               # Adapter model checkpoint
├── vision_head/
│   └── vision_head/           # Vision head checkpoints
├── transformer/
│   └── model.pt               # Transformer model checkpoint
├── ar_model/
│   └── checkpoint/            # AR model checkpoint
├── unconditioned_context/
│   └── context.pkl            # Unconditioned context for classifier-free guidance
└── special_tokens/
    └── tokens.pkl             # Special token embeddings
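Before launching inference or training, it can help to confirm that the checkpoint layout above is in place. The following is a minimal sketch, not part of the repository: the helper name `find_missing_checkpoints` and the `EXPECTED_ENTRIES` list are hypothetical, and the list simply mirrors the tree shown above.

```python
from pathlib import Path

# Relative paths copied from the directory tree above (illustrative list,
# not read from any project config).
EXPECTED_ENTRIES = [
    "wan/wanxiang1_3b",
    "adapter/model.pt",
    "vision_head/vision_head",
    "transformer/model.pt",
    "ar_model/checkpoint",
    "unconditioned_context/context.pkl",
    "special_tokens/tokens.pkl",
]


def find_missing_checkpoints(root):
    """Return the expected entries that do not exist under `root`."""
    root = Path(root)
    return [entry for entry in EXPECTED_ENTRIES if not (root / entry).exists()]


if __name__ == "__main__":
    missing = find_missing_checkpoints("omni_ckpts")
    if missing:
        print("Missing checkpoint entries:")
        for entry in missing:
            print(f"  omni_ckpts/{entry}")
    else:
        print("All expected checkpoint entries are present.")
```

The check only tests for existence; it does not validate the contents of any checkpoint file.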
git clone <repository-url>
cd omini_video
# For CUDA (adjust path as needed)
export CUDA_HOME="/usr/local/cuda"
export PATH="${CUDA_HOME}/bin:${PATH}"
export LD_LIBRARY_PATH="${CUDA_HOME}/lib64:${LD_LIBRARY_PATH}"
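A quick way to see whether the CUDA variables above took effect is to inspect the environment from Python. This is a sketch under assumptions: the function name `cuda_env_problems` is hypothetical, and it only checks that `CUDA_HOME` points at an existing directory and that `nvcc` is reachable through `PATH`.

```python
import os
import shutil
from pathlib import Path


def cuda_env_problems(environ=None):
    """Return a list of human-readable problems with the CUDA environment."""
    environ = os.environ if environ is None else environ
    problems = []

    cuda_home = environ.get("CUDA_HOME")
    if not cuda_home:
        problems.append("CUDA_HOME is not set")
    elif not Path(cuda_home).is_dir():
        problems.append(f"CUDA_HOME does not exist: {cuda_home}")

    # nvcc should be found once ${CUDA_HOME}/bin has been added to PATH.
    if shutil.which("nvcc", path=environ.get("PATH", "")) is None:
        problems.append("nvcc not found on PATH")

    return problems


if __name__ == "__main__":
    for problem in cuda_env_problems() or ["CUDA environment looks consistent"]:
        print(problem)
```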
# Add to inference or training script
export PYTHONPATH="${PWD}:${PWD}/nets/third_party:${PYTHONPATH}"
python -c "
import torch
import transformers
import deepspeed
print(f'✅ PyTorch: {torch.__version__}')
print(f'✅ Transformers: {transformers.__version__}')
print(f'✅ DeepSpeed: {deepspeed.__version__}')
print(f'✅ CUDA Available: {torch.cuda.is_available()}')
"
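The one-liner above stops at the first failed import. As an alternative sketch, the standard-library `importlib.metadata` module can report which of the three packages are installed without importing them. The helper name `installed_versions` is hypothetical, and this variant does not check CUDA availability.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    report = installed_versions(["torch", "transformers", "deepspeed"])
    for name, version in report.items():
        print(f"{name}: {version or 'not installed'}")
```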
# Run inference with sample data
bash tools/inference/inference.sh
# Quick training with sample data
bash finetune.sh
We provide sample data for quick testing:
# Sample data already included in:
examples/finetune_data/
├── t2i_sample/ # Text-to-Image samples (4 files)
├── i2i_sample/ # Image-to-Image samples (4 files)
├── t2v_sample/ # Text-to-Video samples (4 files)
├── t2i_sample_paths.txt
├── i2i_sample_paths.txt
└── t2v_sample_paths.txt
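To sanity-check the sample data before a test run, one could read a paths file and confirm that every listed entry exists. This sketch assumes each `*_sample_paths.txt` file holds one path per line, which the guide does not state; the helper names `read_sample_paths` and `missing_samples` are hypothetical.

```python
from pathlib import Path


def read_sample_paths(list_file):
    """Return the non-empty, stripped lines of a paths file."""
    lines = Path(list_file).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]


def missing_samples(list_file, base_dir="."):
    """Return listed paths that do not exist (relative paths resolve against base_dir)."""
    base = Path(base_dir)
    return [p for p in read_sample_paths(list_file) if not (base / p).exists()]


if __name__ == "__main__":
    list_file = Path("examples/finetune_data/t2i_sample_paths.txt")
    if list_file.exists():
        print("Missing samples:", missing_samples(list_file))
    else:
        print(f"Paths file not found: {list_file}")
```

If the paths files turn out to use a different format, only `read_sample_paths` would need to change.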
A detailed introduction to data preparation can be found in tools/data_prepare/DATA_PREPARE.md.
configs/foster/omnivideo_mixed_task_1_3B.yaml
omni_ckpts/
directory