Met this error

#11 · opened by crm-ai

File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/transformers/configuration_utils.py", line 214, in getattribute
return super().getattribute(key)
AttributeError: 'Gemma3Config' object has no attribute 'vocab_size'

Here's some more detail

Description of Issue:

I am encountering an AttributeError: 'Gemma3Config' object has no attribute 'vocab_size' when attempting to load the google/gemma-3-27b-it model using the transformers library from source. This error occurs because the config.json file for this model is missing the vocab_size attribute, which the transformers library expects.
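
For what it's worth, the attribute doesn't seem to be missing entirely; it appears to live on the nested text config (Gemma3Config is the multimodal wrapper, and its Gemma3TextConfig carries the text-model fields). A minimal check along these lines, assuming a valid token, illustrates the difference:

    from transformers import AutoConfig

    config = AutoConfig.from_pretrained("google/gemma-3-27b-it", token="YOUR_HUGGINGFACE_TOKEN")

    # On the affected transformers builds the top-level config does not expose vocab_size...
    print(hasattr(config, "vocab_size"))        # False here
    # ...but the nested text config does (falling back to Gemma3TextConfig defaults).
    print(config.text_config.vocab_size)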

Steps to Reproduce:

  1. Install the latest transformers library from source: %pip install git+https://github.com/huggingface/transformers.git

  2. Attempt to load the model using the following code:

    from transformers import Gemma3ForCausalLM, Gemma3Config
    from huggingface_hub import hf_hub_download
    import json

    model_id = "google/gemma-3-27b-it"
    huggingface_token = "YOUR_HUGGINGFACE_TOKEN"  # replace with your token
    config = Gemma3Config.from_pretrained(model_id, token=huggingface_token)
    model = Gemma3ForCausalLM.from_pretrained(model_id, config=config, token=huggingface_token)
    
  3. The following error occurs:

    AttributeError: 'Gemma3Config' object has no attribute 'vocab_size'
    File .../transformers/configuration_utils.py:214, in PretrainedConfig.__getattribute__(self, key)
        212 if key != "attribute_map" and key in super().__getattribute__("attribute_map"):
        213     key = super().__getattribute__("attribute_map")[key]
    --> 214 return super().__getattribute__(key)
    

config.json Output:

{
  "architectures": ["Gemma3ForConditionalGeneration"],
  "boi_token_index": 255999,
  "eoi_token_index": 256000,
  "eos_token_id": [1, 106],
  "image_token_index": 262144,
  "initializer_range": 0.02,
  "mm_tokens_per_image": 256,
  "model_type": "gemma3",
  "text_config": {
    "head_dim": 128,
    "hidden_size": 5376,
    "intermediate_size": 21504,
    "model_type": "gemma3_text",
    "num_attention_heads": 32,
    "num_hidden_layers": 62,
    "num_key_value_heads": 16,
    "query_pre_attn_scalar": 168,
    "rope_scaling": {
      "factor": 8.0,
      "rope_type": "linear"
    },
    "sliding_window": 1024
  },
  "torch_dtype": "bfloat16",
  "transformers_version": "4.50.0.dev0",
  "vision_config": {
    "hidden_size": 1152,
    "image_size": 896,
    "intermediate_size": 4304,
    "model_type": "siglip_vision_model",
    "num_attention_heads": 16,
    "num_hidden_layers": 27,
    "patch_size": 14,
    "vision_use_head": false
  }
}

@Renu11 any thoughts on this one?

I'm getting the same error when trying to run from the terminal; I'm on Linux.

vllm serve "google/gemma-3-27b-it"

gives me a wall of text and this error:

INFO 03-12 11:46:46 cuda.py:229] Using Flash Attention backend.
INFO 03-12 11:46:47 model_runner.py:1110] Starting to load model google/gemma-3-27b-it...
WARNING 03-12 11:46:47 utils.py:78] Gemma3ForConditionalGeneration has no vLLM implementation, falling back to Transformers implementation. Some features may not be supported and performance may not be optimal.
INFO 03-12 11:46:47 transformers.py:129] Using Transformers backend.
ERROR 03-12 11:46:47 engine.py:400] 'Gemma3Config' object has no attribute 'vocab_size'
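
Since vLLM is falling back to its Transformers backend here, the failure depends on the transformers version installed in that environment, so it may be worth confirming what vLLM will actually import. A quick check, with whatever versions you happen to have:

    # Print the versions of the libraries this environment will use.
    import transformers
    import vllm

    print("transformers:", transformers.__version__)
    print("vllm:", vllm.__version__)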

The following works for me, though some other problems come up afterwards:

    import torch
    from transformers import AutoConfig, AutoModelForCausalLM

    # Copy every field from the nested text_config up to the top-level config
    # so that attributes like vocab_size resolve.
    config = AutoConfig.from_pretrained(base_model_name)
    for key, value in vars(config.text_config).items():
        setattr(config, key, value)

    base_model = AutoModelForCausalLM.from_pretrained(
        base_model_name,
        config=config,
        quantization_config=quantization_config,
        use_flash_attention_2=True,
        torch_dtype=torch.bfloat16,
        device_map="cpu",
    )

This should work:

    import torch
    from accelerate import Accelerator
    from accelerate.utils import is_xpu_available
    from transformers import Gemma3ForConditionalGeneration

    # Place the whole model on the local process's device (XPU if available, otherwise GPU/CPU index).
    device_map = (
        {"": f"xpu:{Accelerator().local_process_index}"}
        if is_xpu_available()
        else {"": Accelerator().local_process_index}
    )

    base_model = Gemma3ForConditionalGeneration.from_pretrained(
        base_model_name,
        quantization_config=quantization_config,
        use_flash_attention_2=True,
        torch_dtype=torch.bfloat16,
        device_map=device_map,
    )
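
Once the model is loaded as Gemma3ForConditionalGeneration, the usual way to query it is through the processor's chat template. A rough text-only sketch, reusing base_model from above (the prompt and generation settings are just placeholders):

    from transformers import AutoProcessor

    processor = AutoProcessor.from_pretrained(base_model_name)

    messages = [
        {"role": "user", "content": [{"type": "text", "text": "Why does the sky appear blue?"}]},
    ]

    # Build model inputs from the chat template and generate a short reply.
    inputs = processor.apply_chat_template(
        messages,
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    ).to(base_model.device)

    outputs = base_model.generate(**inputs, max_new_tokens=64)
    print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))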
Google org:

Apologies for the delayed response. We can confirm that this issue has been addressed in Transformers version 4.53.0. Please try again by installing the latest transformers version (4.53.0) using !pip install -U transformers and loading the gemma-3-27b-it model with the following code:

    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained("google/gemma-3-27b-it")

Please let us know if this resolves the issue or if you continue to experience the same problem. Thank you.
