A custom script for Video-LLaVA that sorts videos based on their visual content
This script uses Video-LLaVA to sort videos: it asks the same question about each video, then moves each video into a directory based on the model's response.
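Conceptually, the core of the script is a loop like this sketch (`ask_video_llava` is a hypothetical stand-in for the actual Video-LLaVA inference call; the real sortscript.py also handles model loading and prompt formatting):

```python
import shutil
from pathlib import Path

def sort_videos(folder: str, question: str) -> None:
    """Ask the same question about every video, then move each one into
    a subfolder named after the model's (normalized) answer."""
    for video in Path(folder).glob("*.mp4"):
        # Hypothetical helper standing in for the real inference call.
        answer = ask_video_llava(str(video), question)  # e.g. "beach"
        # Normalize the one-word answer into a safe folder name.
        label = "".join(c for c in answer.strip().lower() if c.isalnum()) or "unknown"
        dest = Path(folder) / label
        dest.mkdir(exist_ok=True)
        shutil.move(str(video), str(dest / video.name))
```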
I recently made CLIP-video-sorter but was unhappy with the results. I found an alternative in Video-LLaVA so I made this script.
Not the cleanest install process, still figuring out some kinks. The directions here should get you up and running.
Python >= 3.10
PyTorch == 2.0.1
CUDA >= 11.7 and < 12.4 (Video-LLaVA requires bitsandbytes, which is not compatible with CUDA 12.4; see the CUDA swap guide)
Not sure about RAM or VRAM requirements. I have 12GB VRAM and 32GB RAM. I'd assume at least 8GB and 16GB are needed, respectively.
~18GB disk space (mostly the models needed to analyze the videos)
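To sanity-check your setup, a quick snippet like this (run inside the environment once PyTorch is installed) prints the versions and VRAM that matter:

```python
import torch

print("PyTorch:", torch.__version__)   # want 2.0.1
print("CUDA:", torch.version.cuda)     # want >= 11.7 and < 12.4
if torch.cuda.is_available():
    vram_gb = torch.cuda.get_device_properties(0).total_memory / 1024**3
    print(f"VRAM: {vram_gb:.1f} GB")   # I have 12GB; I'd assume 8GB minimum
```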
git clone https://github.com/PKU-YuanGroup/Video-LLaVA
cd Video-LLaVA
conda create -n videollava python=3.10 -y
conda activate videollava
pip install --upgrade pip
pip install -e .
pip install decord opencv-python git+https://github.com/facebookresearch/pytorchvideo.git@28fe037d212663c6a24f373b94cc5d478c8c1a1d
Note: `pip install -e ".[train]"` and `pip install flash-attn --no-build-isolation` aren't needed just to run inference on a video in the CLI, so you can skip them.
Then copy my sortscript.py into the main Video-LLaVA directory.
- Activate the environment and run my script with `python sortscript.py`.
The first time it runs, it will download ~17GB of models.
When running on my system for the first time, after the models were downloaded, I got a warning about bitsandbytes and this error:
Some modules are dispatched on the CPU or the disk. Make sure you have enough GPU RAM to fit
the quantized model. If you want to dispatch the model on the CPU or the disk while keeping
these modules in 32-bit, you need to set `load_in_8bit_fp32_cpu_offload=True` and pass a custom
`device_map` to `from_pretrained`. Check
https://huggingface.co/docs/transformers/main/en/main_classes/quantization#offload-between-cpu-and-gpu
for more details.
- If you get this, just ignore it: close the command window, reopen it, activate the environment, and run the script again. I'm still trying to figure out how to prevent the error from happening in the first place, but no luck yet; a possible workaround is sketched after this list.
- It will ask you for a folder path. Copy and paste one from the file explorer.
- It will ask you for your sorting question. My questions are usually something like "What kind of X is this? 1 word answer", where X could be landscape, dog, car, etc. A question like "What kind of X does Y have? 1 word answer" is good too. I've found that letting the model pick how to describe things works better than giving it options to pick from, like "Is this X, Y, or Z?", because it can interpret your options incorrectly.
- Next, the script will loop through the folder of videos and provide its answer to the question for each video.
- At the end, the videos are sorted into folders based on the model's answers.
- It will ask for a new folder if you want to sort further.
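As for the bitsandbytes error itself: I haven't found a real fix yet, but going by the error message and the linked Hugging Face docs, a workaround might look something like the sketch below. This is untested with this repo, and `BitsAndBytesConfig` with `llm_int8_enable_fp32_cpu_offload` is the current transformers spelling of what the error calls `load_in_8bit_fp32_cpu_offload`.

```python
from transformers import BitsAndBytesConfig

# Untested idea: let 8-bit loading offload modules that don't fit in VRAM
# to the CPU in fp32 instead of erroring out. This would go wherever
# Video-LLaVA calls from_pretrained() on the model, not in sortscript.py.
quant_config = BitsAndBytesConfig(
    load_in_8bit=True,
    llm_int8_enable_fp32_cpu_offload=True,
)
# ...then pass quantization_config=quant_config, plus a custom device_map,
# to the model's from_pretrained() call.
```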
In my initial tests the script worked but produced 4 warnings. To fix 2 of them, I had to set `temperature` and `top_p` to 1 in ...Video-LLaVA\cache_dir\models--LanguageBind--Video-LLaVA-7B\snapshots\aecae02b7dee5c249e096dcb0ce546eb6f811806\generation_config.json. This file doesn't show up in my file explorer; I only found it with the project folder loaded in Cursor, after the script had run once and errored out. I would share the updated file, but I don't know how to integrate it from outside the environment. There are other warnings, but they don't seem critical; I welcome any suggestions for fixing them.
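Since the file is hard to reach through the file explorer, one option is to patch it programmatically after the first run has populated the cache. A small sketch, run from the Video-LLaVA directory (the snapshot hash is the one from my cache; adjust it to match yours):

```python
import json
from pathlib import Path

# Set temperature and top_p to 1 in the cached generation config.
# Run from the Video-LLaVA directory after the first run has downloaded
# the model; adjust the snapshot hash to match your own cache_dir.
cfg_path = Path(
    "cache_dir/models--LanguageBind--Video-LLaVA-7B/snapshots/"
    "aecae02b7dee5c249e096dcb0ce546eb6f811806/generation_config.json"
)
cfg = json.loads(cfg_path.read_text())
cfg["temperature"] = 1
cfg["top_p"] = 1
cfg_path.write_text(json.dumps(cfg, indent=2))
print("Patched", cfg_path)
```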