Llama-3.2-1B-Vision

A vision-enhanced version of the Llama-3.2-1B language model, capable of understanding and describing images while maintaining the base model's language capabilities.

Model Details

Base Model: Llama-3.2-1B
Model Type: Vision-Language Model
Last Updated: December ?, 2024
Model Architecture: Llama architecture with SigLIP vision encoder
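
To make the pairing concrete, here is a minimal shape-level sketch of how a SigLIP encoder is commonly attached to a Llama decoder in LLaVA-style models. It is not taken from this repository. The SigLIP variant (384-pixel input, 14-pixel patches, 1152-wide features) and the single linear projector are assumptions chosen for illustration. The 2048 hidden size is the value published in the Llama-3.2-1B configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VisionConfig:
    # Hypothetical values matching a common SigLIP checkpoint; the encoder
    # variant actually used by this model is not stated in the card.
    image_size: int = 384
    patch_size: int = 14
    hidden_size: int = 1152


@dataclass(frozen=True)
class TextConfig:
    # Hidden size of Llama-3.2-1B as published in its configuration.
    hidden_size: int = 2048


def num_image_tokens(vision: VisionConfig) -> int:
    """Number of patch embeddings the vision encoder emits per image."""
    patches_per_side = vision.image_size // vision.patch_size
    return patches_per_side * patches_per_side


def projector_shape(vision: VisionConfig, text: TextConfig) -> tuple:
    """Weight shape of an assumed linear projector from vision to text space."""
    return (vision.hidden_size, text.hidden_size)


def fused_sequence_length(vision: VisionConfig, num_text_tokens: int) -> int:
    """Decoder input length when projected image tokens are prepended to the prompt."""
    return num_image_tokens(vision) + num_text_tokens


vision, text = VisionConfig(), TextConfig()
print(num_image_tokens(vision))           # patch tokens per image
print(projector_shape(vision, text))      # (vision width, text width)
print(fused_sequence_length(vision, 16))  # image tokens plus 16 prompt tokens
```

Under these assumptions a single image costs several hundred decoder positions, so the image usually dominates the context length for short prompts.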

Model Size: 1.24B params
Tensor Type: BF16
Weights Format: Safetensors
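
The parameter count and tensor type above give a rough lower bound on the memory needed just to hold the weights. The sketch below does that arithmetic. It ignores activations, the KV cache, and framework overhead, so real usage will be higher.

```python
def weight_memory_bytes(num_params: float, bytes_per_param: int) -> float:
    """Bytes needed to store the weights alone."""
    return num_params * bytes_per_param


NUM_PARAMS = 1.24e9  # parameter count stated on the model card
BF16_BYTES = 2       # bfloat16 stores each value in 16 bits

weights_bytes = weight_memory_bytes(NUM_PARAMS, BF16_BYTES)
print(f"{weights_bytes / 1e9:.2f} GB")     # 2.48 GB
print(f"{weights_bytes / 2**30:.2f} GiB")  # 2.31 GiB
```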

The model cannot be deployed to the HF Inference API, because the API does not support models that require custom code execution.
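
Because the repository ships custom code, loading it locally with the transformers library typically requires opting in through `trust_remote_code=True`. The following is a minimal sketch, not the author's documented usage: the choice of `AutoModelForCausalLM` as the entry point is an assumption, and the helper names `build_load_kwargs` and `load_model` are hypothetical.

```python
REPO_ID = "kadirnar/Llama-3.2-1B-Vision"


def build_load_kwargs() -> dict:
    """Keyword arguments for from_pretrained on a custom-code repository."""
    return {
        # Lets transformers execute the modeling code shipped in the repo.
        # Review that code before enabling this flag.
        "trust_remote_code": True,
    }


def load_model(repo_id: str = REPO_ID):
    """Download and instantiate the model (needs torch, transformers, network)."""
    import torch
    from transformers import AutoModelForCausalLM  # assumed entry point

    return AutoModelForCausalLM.from_pretrained(
        repo_id,
        torch_dtype=torch.bfloat16,  # matches the BF16 checkpoint
        **build_load_kwargs(),
    )


print(build_load_kwargs())
```

Calling `load_model()` downloads the weights and runs code from the repository, so it is kept out of the top-level script here. The image preprocessing and generation interface are defined by the repository's custom code and are not shown.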

Model tree for kadirnar/Llama-3.2-1B-Vision: this model is finetuned from the base model meta-llama/Llama-3.2-1B.
