---
library_name: transformers
license: apache-2.0
base_model:
- allenai/Molmo-7B-D-0924
base_model_relation: quantized
tags:
- bitsandbytes
- Molmo
- chat
- multimodal
---
Quantization of the [original Molmo-7B-D-0924](https://huggingface.co/allenai/Molmo-7B-D-0924) model using `bitsandbytes`.
This model differs from the [one located here](https://huggingface.co/cyan2k/molmo-7B-D-bnb-4bit) in that it includes modified source code that reduces dependencies while producing the same results, so it works out of the box.
# NOTE:
The example script below requires an Nvidia GPU and that you `pip install` the CUDA libraries into your virtual environment. This is NOT NECESSARY if you plan to install CUDA systemwide (as most people do). If you install CUDA systemwide, simply remove the `set_cuda_paths` function from the example script, but make sure that you've installed a proper version of CUDA and a compatible version of the PyTorch libraries.
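The `set_cuda_paths` function in the script works by prepending the CUDA library directories bundled inside the virtual environment to `CUDA_PATH` and `PATH`, so the runtime can locate the DLLs. A minimal, stdlib-only sketch of that prepend logic (the variable name `DEMO_PATH` and the directory paths are purely illustrative):

```python
import os

def prepend_to_env_var(var_name: str, new_dirs: list) -> str:
    """Prepend new_dirs to an environment variable, keeping any existing value."""
    current = os.environ.get(var_name, "")
    parts = new_dirs + ([current] if current else [])
    os.environ[var_name] = os.pathsep.join(parts)
    return os.environ[var_name]

# Illustrative CUDA bin directories prepended to a demo variable.
result = prepend_to_env_var("DEMO_PATH", ["/venv/nvidia/cublas/bin", "/venv/nvidia/cudnn/bin"])
print(result)
```

Prepending (rather than appending) matters: it ensures the venv-local CUDA libraries win over any older copies already on the search path.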
<details><summary>COMPATIBLE CUDA AND PYTORCH 2.2.2 VERSIONS</summary>

PyTorch is only tested against specific versions of CUDA. When using PyTorch 2.2.2, the following CUDA library versions are required:
- `pip install nvidia-cublas-cu12==12.1.3.1`
- `pip install nvidia-cuda-runtime-cu12==12.1.105`
- `pip install nvidia-cuda-nvrtc-cu12==12.1.105`
- `pip install nvidia-cufft-cu12==11.0.2.54`
- `pip install nvidia-cudnn-cu12==8.9.2.26`

Then install [`torch==2.2.2`](https://download.pytorch.org/whl/cu121/torch/), [`torchvision==0.17.2`](https://download.pytorch.org/whl/cu121/torchvision/), and [`torchaudio==2.2.2`](https://download.pytorch.org/whl/cu121/torchaudio/) by visiting each of the three links and building a `pip install` command from the wheel that matches your Python version and platform.
For example, on Windows with Python 3.11 you would run:
```
pip install https://download.pytorch.org/whl/cu121/torch-2.2.2%2Bcu121-cp311-cp311-win_amd64.whl#sha256=efbcfdd4399197d06b32f7c0e1711c615188cdd65427b933648c7478fb880b3f
```
```
pip install https://download.pytorch.org/whl/cu121/torchvision-0.17.2%2Bcu121-cp311-cp311-win_amd64.whl#sha256=10ad542aab6b47dbe73c441381986d50a7ed5021cbe01d593a14477ec1f067a0
```
```
pip install https://download.pytorch.org/whl/cu121/torchaudio-2.2.2%2Bcu121-cp311-cp311-win_amd64.whl#sha256=c7dee68cd3d2b889bab71d4a0c345bdc3ea2fe79a62b921a6b49292c605b6071
```
</details>
<details><summary>COMPATIBLE CUDA AND PYTORCH 2.5.1 VERSIONS</summary>

PyTorch is only tested against specific versions of CUDA. When using PyTorch 2.5.1, the following CUDA library versions are required:
- `pip install nvidia-cublas-cu12==12.4.5.8`
- `pip install nvidia-cuda-runtime-cu12==12.4.127`
- `pip install nvidia-cuda-nvrtc-cu12==12.4.127`
- `pip install nvidia-cufft-cu12==11.2.1.3`
- `pip install nvidia-cudnn-cu12==9.1.0.70`

Then install [`torch==2.5.1`](https://download.pytorch.org/whl/cu124/torch/), [`torchvision==0.20.1`](https://download.pytorch.org/whl/cu124/torchvision/), and [`torchaudio==2.5.1`](https://download.pytorch.org/whl/cu124/torchaudio/) by visiting each of the three links and building a `pip install` command from the wheel that matches your Python version and platform.
For example, on Windows with Python 3.11 you would run:
```
pip install https://download.pytorch.org/whl/cu124/torch-2.5.1%2Bcu124-cp311-cp311-win_amd64.whl#sha256=6c8a7003ef1327479ede284b6e5ab3527d3900c2b2d401af15bcc50f2245a59f
```
```
pip install https://download.pytorch.org/whl/cu124/torchvision-0.20.1%2Bcu124-cp311-cp311-win_amd64.whl#sha256=15796b453a99ed0f0cbc249d129685ddc88157310135fb3addaf738a15db5306
```
```
pip install https://download.pytorch.org/whl/cu124/torchaudio-2.5.1%2Bcu124-cp311-cp311-win_amd64.whl#sha256=b3d75f4e6efc5412fe78c7f2787ee4f39cea1317652e1a47785879cde109f5c4
```
</details>
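After installing, you can sanity-check that the pinned versions above actually landed in your environment. A small helper using the standard library's `importlib.metadata` (the distribution names queried here are just examples; any name from the pin lists works):

```python
from importlib import metadata

def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None

# Example: check a couple of the pins from above.
for name in ["torch", "nvidia-cudnn-cu12"]:
    print(name, "->", installed_version(name) or "not installed")
```

If a name prints "not installed", re-run the corresponding `pip install` command before trying the script below.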
Example script (process single image):
```python
import sys
import os
from pathlib import Path

def set_cuda_paths():
    # Prepend the CUDA libraries bundled in the virtual environment to
    # CUDA_PATH and PATH so the DLLs can be located at runtime.
    venv_base = Path(sys.executable).parent.parent
    nvidia_base_path = venv_base / 'Lib' / 'site-packages' / 'nvidia'
    cuda_path = nvidia_base_path / 'cuda_runtime' / 'bin'
    cublas_path = nvidia_base_path / 'cublas' / 'bin'
    cudnn_path = nvidia_base_path / 'cudnn' / 'bin'
    nvrtc_path = nvidia_base_path / 'cuda_nvrtc' / 'bin'
    paths_to_add = [
        str(cuda_path),
        str(cublas_path),
        str(cudnn_path),
        str(nvrtc_path),
    ]
    env_vars = ['CUDA_PATH', 'PATH']
    for env_var in env_vars:
        current_value = os.environ.get(env_var, '')
        new_value = os.pathsep.join(paths_to_add + [current_value] if current_value else paths_to_add)
        os.environ[env_var] = new_value

set_cuda_paths()

import torch
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor, GenerationConfig

model_path = r"[INSERT THE PATH TO THE FOLDER HOLDING THE MODEL FILES HERE]"

class VisionModel:
    def __init__(self):
        self.model = None
        self.processor = None
        self.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

    def initialize_model_and_processor(self):
        self.processor = AutoProcessor.from_pretrained(
            model_path,
            trust_remote_code=True,
            torch_dtype='auto',
            device_map='auto'
        )
        self.model = AutoModelForCausalLM.from_pretrained(
            model_path,
            trust_remote_code=True,
            torch_dtype='auto',
            device_map='auto'
        )

    def process_single_image(self, image_path):
        image = Image.open(image_path)
        if image.mode != "RGB":
            image = image.convert("RGB")
        text = "Describe this image in as much detail as possible, but be succinct and don't repeat yourself."
        inputs = self.processor.process(images=[image], text=text)
        # Move tensors to the device and add a batch dimension.
        inputs = {k: v.to(self.device).unsqueeze(0) for k, v in inputs.items()}
        output = self.model.generate_from_batch(
            inputs,
            GenerationConfig(max_new_tokens=500, stop_strings=["<|endoftext|>"]),
            tokenizer=self.processor.tokenizer
        )
        # Decode only the newly generated tokens, skipping the prompt.
        generated_text = self.processor.tokenizer.decode(
            output[0, inputs['input_ids'].size(1):], skip_special_tokens=True
        )
        print(f"\nGenerated Text:\n{generated_text}\n")

if __name__ == "__main__":
    image_path = r"[INSERT THE PATH TO THE IMAGE YOU WANT TO PROCESS HERE]"
    vision_model = VisionModel()
    vision_model.initialize_model_and_processor()
    vision_model.process_single_image(image_path)
```
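The decode step at the end works because `generate_from_batch` returns each output row as the prompt tokens followed by the newly generated tokens, so slicing from `inputs['input_ids'].size(1)` onward keeps only the model's answer. A plain-Python sketch of that slicing (the token ids below are made up):

```python
# Hypothetical token ids: the output row echoes the prompt, then appends new tokens.
prompt_ids = [101, 2054, 2003]         # made-up prompt token ids
generated = prompt_ids + [7592, 2088]  # output row: prompt + newly generated tokens

# Slice off the prompt, exactly as output[0, inputs['input_ids'].size(1):] does.
new_tokens = generated[len(prompt_ids):]
print(new_tokens)  # → [7592, 2088]
```

Without this slice, the printed text would repeat your prompt before the description.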