So many bugs!
The code can't run completely.
The code is fine. Can you specify your problem first? 😐
Why?
```
OSError: BAAI/Video-XL-2 does not appear to have a file named multimodal_encoder.builder.py. Checkout 'https://huggingface.co/BAAI/Video-XL-2/tree/main' for available files.
```
Thanks for your feedback; we will fix it today.
Hi Mr. Liu, we've fixed the problem. Please try again using the following steps:
- Update the inference code:
```bash
huggingface-cli download BAAI/Video-XL-2 --include "*.py" --local-dir /root/Models/Video-XL-2
```
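If you prefer to do this from Python, a small sketch using `huggingface_hub.snapshot_download` should be equivalent (the target directory matches the CLI command above):
```python
# Sketch: fetch only the updated *.py files into the local model directory,
# equivalent to the huggingface-cli command above.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="BAAI/Video-XL-2",
    allow_patterns="*.py",                # only the code files
    local_dir="/root/Models/Video-XL-2",  # same target as the CLI command
)
```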
- Run the updated demo code:
1. Inference w/o Efficiency Optimization
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# load model
model_path = '/root/Models/Video-XL-2'
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
device = 'cuda' if torch.cuda.is_available() else 'cpu'
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    trust_remote_code=True,
    device_map=device,
    quantization_config=None,
    attn_implementation="sdpa",
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
)

# near-greedy decoding settings
gen_kwargs = {
    "do_sample": False,
    "temperature": 0.01,
    "top_p": 0.001,
    "num_beams": 1,
    "use_cache": True,
    "max_new_tokens": 256,
}
model.config.enable_sparse = False  # run without the efficiency optimization

# input data
video_path = "/asset/demo.mp4"
question1 = "How many people are in the video? (A)3 people (B)6 people. Please respond with only the letter."

# frame-sampling params
max_num_frames = 150
sample_fps = 1      # extract frames at 1 fps
max_sample_fps = 4

with torch.inference_mode():
    response = model.chat(
        video_path, tokenizer, question1,
        chat_history=None, return_history=False,
        max_num_frames=max_num_frames, sample_fps=sample_fps,
        max_sample_fps=max_sample_fps, generation_config=gen_kwargs,
    )
print(response)
```
2. Inference w/ Chunk-based Pre-filling
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

torch.cuda.reset_peak_memory_stats()

# load model
model_path = '/root/Models/Video-XL-2'
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
device = 'cuda:0' if torch.cuda.is_available() else 'cpu'
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    trust_remote_code=True,
    device_map=device,
    quantization_config=None,
    attn_implementation="sdpa",
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
)

gen_kwargs = {
    "do_sample": False,
    "temperature": 0.01,
    "top_p": 0.001,
    "num_beams": 1,
    "use_cache": True,
    "max_new_tokens": 128,
}

# chunk-based pre-filling settings
model.config.enable_chunk_prefill = True
prefill_config = {
    'chunk_prefill_mode': 'streaming',
    'chunk_size': 4,
    'step_size': 1,
    'offload': True,
    'chunk_size_for_vision_tower': 24,
}
model.config.prefill_config = prefill_config

# input data
video_path = "/asset/demo.mp4"
question1 = "How many people are in the video? (A)3 people (B)6 people. Please respond with only the letter."

# frame-sampling params
max_num_frames = 1300
sample_fps = None      # None -> uniform sampling
max_sample_fps = None

with torch.inference_mode():
    response = model.chat(
        video_path, tokenizer, question1,
        chat_history=None, return_history=False,
        max_num_frames=max_num_frames, sample_fps=sample_fps,
        max_sample_fps=max_sample_fps, generation_config=gen_kwargs,
    )

peak_memory_allocated = torch.cuda.max_memory_allocated()
print(f"Memory Peak: {peak_memory_allocated / (1024**3):.2f} GB")
print(response)
```
New trouble.
My download code:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("BAAI/Video-XL-2", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("BAAI/Video-XL-2", trust_remote_code=True)
```
The download code works on my local machine, so I suspect the issue might be related to the transformers version. Please try these steps:
- Align the transformers version by running `pip install transformers==4.43.0`, then try the download again.
- If the issue persists, remove all `.py` files from the download cache (typically located at `/root/.cache/huggingface/hub/models--BAAI--Video-XL-2`) and resume the download; a sketch of this cleanup is below.
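A minimal Python sketch of that cleanup, assuming the default cache location shown above (adjust the path if your cache lives elsewhere):
```python
# Sketch: delete the stale remote-code files from the HF cache so that the
# next from_pretrained call re-downloads fresh ones; the large weight files
# are left untouched.
from pathlib import Path

cache_repo = Path("/root/.cache/huggingface/hub/models--BAAI--Video-XL-2")
for py_file in cache_repo.rglob("*.py"):
    print(f"removing {py_file}")
    py_file.unlink()
```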
Please let me know if anything new comes up, and I'll address it as soon as possible.
Thank you for your patience, but it seems that some files are missing again:
```
OSError: BAAI/Video-XL-2 does not appear to have a file named multimodal_resampler.builder.py. Checkout 'https://huggingface.co/BAAI/Video-XL-2/tree/main' for available files.
```
It feels a bit weird: the file `multimodal_resampler.builder.py` is no longer required after we updated the inference code, so this OSError should not occur at all. I guess this is still because the Hugging Face cache contains old `.py` files from a previous download. The issue can be solved in one of two ways (a sketch of the second follows after this list):
- Re-download Video-XL-2 into a new directory by specifying a new `cache_dir`:
```python
AutoModelForCausalLM.from_pretrained("BAAI/Video-XL-2", trust_remote_code=True, cache_dir=cache_dir)
```
This method ensures that both the model weights and the code are freshly downloaded. However, it may take some time, since the entire model has to be re-downloaded.
- Move the existing weights to a new cache directory. To avoid re-downloading the large model weights, you can manually move them from the old Hugging Face cache directory to your new cache directory. First, navigate to the current HF cache directory:
```bash
cd /root/.cache/huggingface/hub/models--BAAI--Video-XL-2
```
Then move or copy the weight files into your new `cache_dir`, and load the model using the new cache directory:
```python
AutoModelForCausalLM.from_pretrained("BAAI/Video-XL-2", trust_remote_code=True, cache_dir=cache_dir)
```
This second approach only downloads the latest version of the inference code, without re-downloading the full model weights.
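A rough Python sketch of that second option, assuming the default Hugging Face cache layout; the new cache path `/root/hf_cache_fresh` is just an example:
```python
# Sketch: reuse the already-downloaded weights under a new cache_dir while
# forcing the remote code (*.py) to be fetched fresh. Paths are examples.
import shutil
from pathlib import Path

old_repo = Path("/root/.cache/huggingface/hub/models--BAAI--Video-XL-2")
new_cache = Path("/root/hf_cache_fresh")
new_cache.mkdir(parents=True, exist_ok=True)

# Copy the whole cached repo (weights included), preserving the cache's
# internal symlink structure:
shutil.copytree(old_repo, new_cache / old_repo.name, symlinks=True)

# Drop the stale code files so from_pretrained re-downloads them:
for py_file in (new_cache / old_repo.name).rglob("*.py"):
    py_file.unlink()
```
Afterwards, load with `cache_dir="/root/hf_cache_fresh"` as in the snippet above.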
OK, I reset my server to make sure the environment is clean. The transformers version is also 4.43.0. I still encountered an error, and I don't know what to do.
My download code:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

download_path = "/root/model"
tokenizer = AutoTokenizer.from_pretrained("BAAI/Video-XL-2", trust_remote_code=True, cache_dir=download_path)
model = AutoModelForCausalLM.from_pretrained("BAAI/Video-XL-2", trust_remote_code=True, cache_dir=download_path)
```