Error when fine-tuning a model with FSDP auto_wrap: Could not find the transformer layer class LlavaOnevisionVisionAttention in the model.
#6
opened by liuzijing2014
For my fine-tuning arguments I set:
fsdp:
- full_shard
- auto_wrap
It errors out with: Could not find the transformer layer class LlavaOnevisionVisionAttention in the model.
The error is raised here: https://github.com/huggingface/accelerate/blob/main/src/accelerate/utils/dataclasses.py#L1754. It looks like the class named in _no_split_modules = ["LlavaOnevisionVisionAttention"]
has no corresponding module class defined anywhere in the transformers library?
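For reference, this is how I checked which module class names actually exist in the instantiated model (the checkpoint id is just an example; the auto_wrap policy can only match names that show up in this set):

```python
from transformers import LlavaOnevisionForConditionalGeneration

# example checkpoint; swap in the one actually being fine-tuned
model = LlavaOnevisionForConditionalGeneration.from_pretrained(
    "llava-hf/llava-onevision-qwen2-7b-ov-hf"
)

# every module class name present in the loaded model; FSDP auto_wrap can only
# target a class whose name appears here
present = {type(m).__name__ for m in model.modules()}
print("LlavaOnevisionVisionAttention" in present)
print(sorted(n for n in present if "Attention" in n or "Layer" in n))
```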
I took a look at another model (Llama). Its _no_split_modules = ["LlamaDecoderLayer"], and that same LlamaDecoderLayer is defined in the corresponding modeling_llama.py file.
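A quick way to see the contrast (module paths as they appear in the current transformers source; the second print is what I'd expect given the error):

```python
from transformers.models.llama import modeling_llama
from transformers.models.llava_onevision import modeling_llava_onevision

# Llama's _no_split_modules entry is defined in its own modeling file
print(hasattr(modeling_llama, "LlamaDecoderLayer"))
# the class LlavaOnevision lists does not seem to be defined in its modeling file,
# which would explain why auto_wrap cannot resolve it
print(hasattr(modeling_llava_onevision, "LlavaOnevisionVisionAttention"))
```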
Not sure what I am getting wrong. Any help is appreciated!
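One thing I'm considering as a workaround is naming the layer classes to wrap explicitly instead of relying on _no_split_modules. This is only a sketch, assuming a TrainingArguments-style setup and that the Qwen2/Siglip class names below are the right ones for this checkpoint:

```python
from transformers import TrainingArguments

# sketch of an explicit wrap-class override (untested); the class names are
# assumptions for the qwen2 + siglip variant of LLaVA-OneVision
training_args = TrainingArguments(
    output_dir="out",
    fsdp="full_shard auto_wrap",
    fsdp_config={
        "transformer_layer_cls_to_wrap": ["Qwen2DecoderLayer", "SiglipEncoderLayer"],
    },
)
```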