google/gemma-3-1b-it · Transformers Pipeline Error: AttributeError: 'NoneType' object has no attribute 'apply_chat

Mar 13

Running on Databricks Runtime: 16.2 ML (includes Apache Spark 3.5.2, GPU, Scala 2.12)

! pip install git+https://github.com/huggingface/[email protected]
from transformers import pipeline

pipe = pipeline("text-generation", model="google/gemma-3-1b-it", device="cuda", torch_dtype=torch.bfloat16)

messages = [
[
{
"role": "system",
"content": [{"type": "text", "text": "You are a helpful assistant."},]
},
{
"role": "user",
"content": [{"type": "text", "text": "Write a poem on Hugging Face, the company"},]
},
],
]

output = pipe(messages, max_new_tokens=50)

AttributeError: 'NoneType' object has no attribute 'apply_chat_template'
File /local_disk0/.ephemeral_nfs/envs/pythonEnv-8af4f60d-73d0-4289-a5dd-fc5add3c30b8/lib/python3.12/site-packages/transformers/pipelines/text_generation.py:316, in TextGenerationPipeline.preprocess(self, prompt_text, prefix, handle_long_generation, add_special_tokens, truncation, padding, max_length, continue_final_message, **generate_kwargs)
314 if continue_final_message is None:
315 continue_final_message = prompt_text.messages[-1]["role"] == "assistant"
--> 316 inputs = self.tokenizer.apply_chat_template(
317 prompt_text.messages,
318 add_generation_prompt=not continue_final_message,
319 continue_final_message=continue_final_message,
320 return_dict=True,
321 return_tensors=self.framework,
322 **tokenizer_kwargs,
323 )
324 else:
325 inputs = self.tokenizer(prefix + prompt_text, return_tensors=self.framework, **tokenizer_kwargs)

AiSatan

Mar 13

•

edited Mar 13

pip install git+https://github.com/huggingface/transformers

    model_id = "google/gemma-3-1b-it"

    tokenizer = AutoTokenizer.from_pretrained(model_id)

    pipe = pipeline("text-generation", model=model_id, device="cuda", torch_dtype=torch.bfloat16, tokenizer=tokenizer)

steve122192

Mar 13

from transformers import pipeline, AutoTokenizer

model_id = "google/gemma-3-1b-it"
tokenizer = AutoTokenizer.from_pretrained(model_id)
pipe = pipeline("text-generation", model=model_id, device="cuda", torch_dtype=torch.bfloat16, tokenizer=tokenizer)

KeyError: <class 'transformers.models.gemma3.configuration_gemma3.Gemma3TextConfig'>
File , line 4
1 from transformers import pipeline, AutoTokenizer
2 model_id = "google/gemma-3-1b-it"
----> 4 tokenizer = AutoTokenizer.from_pretrained(model_id)
6 pipe = pipeline("text-generation", model=model_id, device="cuda", torch_dtype=torch.bfloat16, tokenizer=tokenizer, token='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')
9 messages = [
10 [
11 {
(...)
19 ],
20 ]
File /local_disk0/.ephemeral_nfs/envs/pythonEnv-8256526d-05d2-4a9a-a39f-c9fefa37f7df/lib/python3.12/site-packages/transformers/models/auto/tokenization_auto.py:975, in AutoTokenizer.from_pretrained(cls, pretrained_model_name_or_path, *inputs, **kwargs)
973 model_type = config_class_to_model_type(type(config).name)
974 if model_type is not None:
--> 975 tokenizer_class_py, tokenizer_class_fast = TOKENIZER_MAPPING[type(config)]
977 if tokenizer_class_fast and (use_fast or tokenizer_class_py is None):
978 return tokenizer_class_fast.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
File /local_disk0/.ephemeral_nfs/envs/pythonEnv-8256526d-05d2-4a9a-a39f-c9fefa37f7df/lib/python3.12/site-packages/transformers/models/auto/auto_factory.py:771, in _LazyAutoMapping.getitem(self, key)
769 model_name = self._model_mapping[mtype]
770 return self._load_attr_from_module(mtype, model_name)
--> 771 raise KeyError(key)

AiSatan

Mar 13

what version of transformers you have? mine says 4.50.0.dev0, I installed today:

pip install git+https://github.com/huggingface/transformers

steve122192

Mar 13

Same, I've tried it with both
pip install git+https://github.com/huggingface/transformers
and
pip install git+https://github.com/huggingface/[email protected]
:

"Successfully installed transformers-4.50.0.dev0"

AiSatan

Mar 13

Hm, strange, this is my code I ran today localy, not much but hope it helps, win11, python 3.12:

from transformers import pipeline, AutoTokenizer
import torch

class Helper:
def __init__(self, system_prompt):
    self.system_prompt = system_prompt

    model_id = "google/gemma-3-1b-it"

    tokenizer = AutoTokenizer.from_pretrained(model_id)

    self.pipe = pipeline("text-generation", model=model_id, device="cuda", torch_dtype=torch.bfloat16, tokenizer=tokenizer)


def generate(self, prompt):
    messages = [
        [
            {
                "role": "system",
                "content": [{"type": "text", "text": self.system_prompt}, ]
            },
            {
                "role": "user",
                "content": [{"type": "text", "text": prompt}, ]
            },
        ],
    ]

    output = self.pipe(messages, max_new_tokens=512)[0]

    return output[0]['generated_text'][-1]['content']

sc20001

Mar 14

Same issue here. Any update?

jingwang

Mar 16

pip uninstall transformers

pip install git+https://github.com/huggingface/transformers

works for me. it installed [email protected]

michaelsh

Mar 16

It seems like this might be a Python version issue.
I've tried using Python 2.13 along with all the variations here, but nothing seems to work.

aipsychologicalcounselor

Mar 16

•

edited Mar 16

Same error here is there any solutions? I'm using python 3.12
update: Official code snippet is broken use this instead
https://huggingface.co/google/gemma-3-1b-it/discussions/9#67d2f4acf1fd2d031480e92e

Smjn

Mar 16

•

edited Mar 16

Also getting the same issue with Python 3.12

Edit: As mentioned above, the code snippet is broken. Use the example provided earlier within this thread, calling the class with system and user prompt.

samuel2015

Mar 18

You could try this demo on the ModelCard page.

samuel2015

Mar 18

You could try this demo on the ModelCard page.

lysandre

Google org Mar 18

•

edited Mar 18

Hey all! We've updated the transformers code to fix this issue.
I recommend reinstalling from the gemma3 release (or from main).

This is how you would do it:

Start by uninstalling transformers

pip uninstall transformers

Install from the Gemma-3 tag or main:

pip install git+https://github.com/huggingface/[email protected]
# Or if you want all latest changes, install from main
pip install git+https://github.com/huggingface/transformers

GopiUppari

Google org Mar 19

Hi @steve122192 ,

Tested the code with both the Gemma-3 tag : pip install git+https://github.com/huggingface/[email protected] and main: pip install git+https://github.com/huggingface/transformers worked successfully without errors. Could you please refer to this gist file.

Thanks @lysandre for resolving it

LolaRoseHB

1 day ago

this works, see the messages variable:


    def generate_streaming(self, question, max_new_tokens=1024, temperature=0.7, top_k=50):
        """Generate with streaming output on MPS"""
        # Create streamer
        streamer = TextIteratorStreamer(
            self.tokenizer,
            timeout=10.0,
            skip_prompt=True,
            skip_special_tokens=True
        )

    # Format as chat message
    messages = [{"role": "user", "content": question}]
    inputs = self.tokenizer.apply_chat_template(
        messages,
        return_tensors="pt",
        add_generation_prompt=True
    ).to(self.device)

    # Create attention mask to fix the warning
    attention_mask = torch.ones_like(inputs).to(self.device)

    # MPS-optimized generation parameters
    generation_kwargs = {
        "input_ids": inputs,
        "attention_mask": attention_mask,
        "max_new_tokens": max_new_tokens,
        "temperature": temperature,
        "top_k": top_k,
        "do_sample": True,
        "pad_token_id": self.tokenizer.eos_token_id,
        "streamer": streamer,
        "use_cache": True,
        "num_beams": 1,  # Keep simple for MPS
    }

    # Start generation in separate thread
    thread = Thread(target=self.model.generate, kwargs=generation_kwargs)
    thread.start()

    # Stream the output
    generated_text = ""
    print("Assistant: ", end="", flush=True)
    for new_text in streamer:
        generated_text += new_text
        print(new_text, end="", flush=True)

    thread.join()
    print()  # New line after streaming
    return generated_text

google
/

gemma-3-1b-it

Transformers Pipeline Error: AttributeError: 'NoneType' object has no attribute 'apply_chat_template'