mfarre HF staff committed on
Commit 06ac4f8 · verified · 1 Parent(s): 5719134

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -46,7 +46,7 @@ SmolVLM2-500M-Video is a lightweight multimodal model designed to analyze video
 
 SmolVLM2 can be used for inference on multimodal (video / image / text) tasks where the input consists of text queries along with video or one or more images. Text and media files can be interleaved arbitrarily, enabling tasks like captioning, visual question answering, and storytelling based on visual content. The model does not support image or video generation.
 
-To fine-tune SmolVLM2 on a specific task, you can follow [the fine-tuning tutorial](UPDATE).
+To fine-tune SmolVLM2 on a specific task, you can follow [the fine-tuning tutorial](https://github.com/huggingface/smollm/blob/main/vision/finetuning/Smol_VLM_FT.ipynb).
 
 ## Evaluation
 