pengyuan committed
Commit 59a86a0 · verified · 1 Parent(s): b1514a5

Update README.md


Update the code example for using AutoProcessor, AutoModelForVision2Seq from transformers (main branch)

Files changed (1):
  1. README.md +7 -5
README.md CHANGED
@@ -48,16 +48,18 @@ The model is intended to be used in enterprise applications that involve process
 
 ## Generation:
 
-Granite Vision model is supported natively `transformers>=4.48`. Below is a simple example of how to use the `granite-vision-3.1-2b-preview` model.
+Granite Vision model is supported natively in `transformers` from the `main` branch. Below is a simple example of how to use the `granite-vision-3.1-2b-preview` model.
 
 ### Usage with `transformers`
 
 ```python
-from transformers import LlavaNextProcessor, LlavaNextForConditionalGeneration
+from transformers import AutoProcessor, AutoModelForVision2Seq
+
+device = "cuda" if torch.cuda.is_available() else "cpu"
 
 model_path = "ibm-granite/granite-vision-3.1-2b-preview"
-processor = LlavaNextProcessor.from_pretrained(model_path)
-model = LlavaNextForConditionalGeneration.from_pretrained(model_path, device_map="cuda:0")
+processor = AutoProcessor.from_pretrained(model_path)
+model = AutoModelForVision2Seq.from_pretrained(model_path).to(device)
 
 # prepare image and text prompt, using the appropriate prompt template
 url = "https://github.com/haotian-liu/LLaVA/blob/1a91fc274d7c35a9b50b3cb29c4247ae5837ce39/images/llava_v1_5_radar.jpg?raw=true"
@@ -77,7 +79,7 @@ inputs = processor.apply_chat_template(
     tokenize=True,
     return_dict=True,
     return_tensors="pt"
-).to("cuda:0")
+).to(device)
 
 
 # autoregressively complete prompt
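For reference, below is a minimal end-to-end sketch of the example as it reads after this change. It stitches the diff fragments together; the `conversation` structure, the question text, `add_generation_prompt=True`, and the `max_new_tokens` value are illustrative assumptions rather than part of the commit, and `import torch` is added explicitly because the new `device` check depends on it.

```python
# Minimal sketch of the updated example; assumes transformers built from source,
# e.g. pip install git+https://github.com/huggingface/transformers.git
import torch
from transformers import AutoProcessor, AutoModelForVision2Seq

device = "cuda" if torch.cuda.is_available() else "cpu"

model_path = "ibm-granite/granite-vision-3.1-2b-preview"
processor = AutoProcessor.from_pretrained(model_path)
model = AutoModelForVision2Seq.from_pretrained(model_path).to(device)

# prepare image and text prompt, using the appropriate prompt template
url = "https://github.com/haotian-liu/LLaVA/blob/1a91fc274d7c35a9b50b3cb29c4247ae5837ce39/images/llava_v1_5_radar.jpg?raw=true"
conversation = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": url},
            {"type": "text", "text": "What is shown in this image?"},  # example question (assumption)
        ],
    },
]
inputs = processor.apply_chat_template(
    conversation,
    add_generation_prompt=True,  # assumed; standard for generation with chat templates
    tokenize=True,
    return_dict=True,
    return_tensors="pt"
).to(device)

# autoregressively complete prompt
output = model.generate(**inputs, max_new_tokens=100)
print(processor.decode(output[0], skip_special_tokens=True))
```

Compared with the previous example, device placement is now explicit via `.to(device)` instead of `device_map="cuda:0"`, so the same snippet also runs on CPU-only machines.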