I ran into this error:
File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/transformers/configuration_utils.py", line 214, in getattribute
return super().getattribute(key)
AttributeError: 'Gemma3Config' object has no attribute 'vocab_size'
Here's some more detail:

Description of Issue:
I am encountering an AttributeError: 'Gemma3Config' object has no attribute 'vocab_size' when attempting to load the google/gemma-3-27b-it model with the transformers library installed from source. The error occurs because this model's config.json nests the text-model settings under text_config, so the top-level Gemma3Config exposes no flat vocab_size attribute, which Gemma3ForCausalLM expects.
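A quick way to see this (a minimal sketch; google/gemma-3-27b-it is gated, so pass your HF token, and the exact behavior depends on which transformers commit is installed):

from transformers import AutoConfig

config = AutoConfig.from_pretrained("google/gemma-3-27b-it", token="YOUR_HUGGINGFACE_TOKEN")

# The top-level multimodal config has no flat vocab_size on the affected build...
print(hasattr(config, "vocab_size"))   # False here
# ...but the nested text config does (falling back to its built-in default).
print(config.text_config.vocab_size)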
Steps to Reproduce:

1. Install the latest transformers library from source:

%pip install git+https://github.com/huggingface/transformers.git

2. Attempt to load the model using the following code:
from transformers import Gemma3ForCausalLM, Gemma3Config
from huggingface_hub import hf_hub_download
import json

model_id = "google/gemma-3-27b-it"
huggingface_token = "YOUR_HUGGINGFACE_TOKEN"  # replace with your token

config = Gemma3Config.from_pretrained(model_id, token=huggingface_token)
model = Gemma3ForCausalLM.from_pretrained(model_id, config=config, token=huggingface_token)
3. The following error occurs:
AttributeError: 'Gemma3Config' object has no attribute 'vocab_size'

File .../transformers/configuration_utils.py:214, in PretrainedConfig.__getattribute__(self, key)
    212 if key != "attribute_map" and key in super().__getattribute__("attribute_map"):
    213     key = super().__getattribute__("attribute_map")[key]
--> 214 return super().__getattribute__(key)
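For context, those traceback lines are PretrainedConfig's attribute lookup: it first remaps the requested name through attribute_map, then falls through to a normal lookup, which is where the AttributeError surfaces. A simplified illustration of that mechanism (not the actual transformers source):

class ConfigLike:
    attribute_map = {"hidden_dim": "hidden_size"}  # alias -> real attribute name

    def __init__(self):
        self.hidden_size = 5376

    def __getattribute__(self, key):
        # Remap aliased names first, mirroring PretrainedConfig.__getattribute__.
        if key != "attribute_map" and key in super().__getattribute__("attribute_map"):
            key = super().__getattribute__("attribute_map")[key]
        return super().__getattribute__(key)  # raises AttributeError if absent

cfg = ConfigLike()
print(cfg.hidden_dim)  # 5376, resolved via the alias
print(cfg.vocab_size)  # AttributeError, just like Gemma3Config here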
config.json contents:
{
"architectures": ["Gemma3ForConditionalGeneration"],
"boi_token_index": 255999,
"eoi_token_index": 256000,
"eos_token_id": [1, 106],
"image_token_index": 262144,
"initializer_range": 0.02,
"mm_tokens_per_image": 256,
"model_type": "gemma3",
"text_config": {
"head_dim": 128,
"hidden_size": 5376,
"intermediate_size": 21504,
"model_type": "gemma3_text",
"num_attention_heads": 32,
"num_hidden_layers": 62,
"num_key_value_heads": 16,
"query_pre_attn_scalar": 168,
"rope_scaling": {
"factor": 8.0,
"rope_type": "linear"
},
"sliding_window": 1024
},
"torch_dtype": "bfloat16",
"transformers_version": "4.50.0.dev0",
"vision_config": {
"hidden_size": 1152,
"image_size": 896,
"intermediate_size": 4304,
"model_type": "siglip_vision_model",
"num_attention_heads": 16,
"num_hidden_layers": 27,
"patch_size": 14,
"vision_use_head": false
}
}
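Incidentally, you can confirm the missing key against the raw file itself; a small check reusing the hf_hub_download and json imports from the repro above:

from huggingface_hub import hf_hub_download
import json

path = hf_hub_download("google/gemma-3-27b-it", "config.json", token="YOUR_HUGGINGFACE_TOKEN")
with open(path) as f:
    raw = json.load(f)

print("vocab_size" in raw)                         # False: absent at the top level
print("vocab_size" in raw.get("text_config", {}))  # False: the text config relies on its default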
I'm getting the same error when trying to run it from the terminal (I'm on Linux):
vllm serve "google/gemma-3-27b-it"
This gives me a wall of text and this error:

INFO 03-12 11:46:46 cuda.py:229] Using Flash Attention backend.
INFO 03-12 11:46:47 model_runner.py:1110] Starting to load model google/gemma-3-27b-it...
WARNING 03-12 11:46:47 utils.py:78] Gemma3ForConditionalGeneration has no vLLM implementation, falling back to Transformers implementation. Some features may not be supported and performance may not be optimal.
INFO 03-12 11:46:47 transformers.py:129] Using Transformers backend.
ERROR 03-12 11:46:47 engine.py:400] 'Gemma3Config' object has no attribute 'vocab_size'
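The vLLM failure looks like the same root cause: with no native Gemma3ForConditionalGeneration implementation, vLLM falls back to the Transformers backend, which performs the same flat vocab_size lookup. If a newer vLLM release ships native Gemma 3 support, upgrading both packages may avoid the fallback entirely (an assumption on my part, not something I have verified):

pip install -U vllm transformers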
The following works for me, though it introduces some other problems:
from transformers import AutoConfig, AutoModelForCausalLM
import torch

# Promote the nested text_config attributes (vocab_size among them) to the
# top level so the flat Gemma3Config lookups stop failing.
config = AutoConfig.from_pretrained(base_model_name)
for key, value in vars(config.text_config).items():
    setattr(config, key, value)

base_model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    config=config,
    quantization_config=quantization_config,  # defined earlier in my script
    use_flash_attention_2=True,  # newer transformers prefer attn_implementation="flash_attention_2"
    torch_dtype=torch.bfloat16,
    device_map="cpu",
)
This should work:
from accelerate import Accelerator
from accelerate.utils import is_xpu_available
from transformers import Gemma3ForConditionalGeneration
import torch

# Map the whole model onto this process's local device (Intel XPU if
# available, otherwise the local GPU index).
device_map = (
    {"": f"xpu:{Accelerator().local_process_index}"}
    if is_xpu_available()
    else {"": Accelerator().local_process_index}
)

# Load through the multimodal class named in config.json's "architectures";
# it handles the nested text_config itself, so the flat vocab_size lookup
# should never be triggered.
base_model = Gemma3ForConditionalGeneration.from_pretrained(
    base_model_name,
    quantization_config=quantization_config,
    use_flash_attention_2=True,
    torch_dtype=torch.bfloat16,
    device_map=device_map,
)
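For completeness, Gemma3ForConditionalGeneration is the multimodal model, so inference typically goes through a processor rather than a bare tokenizer. A rough sketch (the prompt and generation parameters are my choices, not from the original post):

from transformers import AutoProcessor

processor = AutoProcessor.from_pretrained(base_model_name)
messages = [{"role": "user", "content": [{"type": "text", "text": "Hello!"}]}]
inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt",
).to(base_model.device)
outputs = base_model.generate(**inputs, max_new_tokens=20)
print(processor.decode(outputs[0], skip_special_tokens=True))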