Wow so fast!

#1
by sujitvasanth - opened

thanks, this model is great! I was using bnb and transformers with UI-TARS-1.5 2B for computer use
...this is so much faster on exllamav2!!!!
It took a little work to rearrange my pipeline, but it works flawlessly.
Could you please quantise OpenCUA-7B? Or Holo1.5?

It was a lot of hard work, but I got OpenCUA-7B quantised and working on exllamav2... it needed a custom architecture profile and some experimentation.
https://huggingface.co/sujitvasanth/OpenCUA-7B-exl2
Just uploading at present; I've successfully tried basic image-text prompts in Ubuntu, but I haven't fully tested or optimised it yet.

Sorry, I just saw your message.

I'm pretty busy with college and don't have much free time. When I have time and see interesting unquantized models, I'll continue quantizing more models.

I've got OpenCUA fully working now... it's a non-standard model, so I had to develop special inference code to replace the transformers custom code, and it needed an exllamav2/architecture.py monkeypatch for both quantization and inference.
Please check it out at https://huggingface.co/sujitvasanth/OpenCUA-7B-exl2
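The actual patch isn't shown in this thread, so here's a minimal stdlib-only sketch of the general monkeypatch pattern described (wrapping a library lookup so an unsupported architecture falls back to a custom profile). The module and function names below are placeholders for illustration, not exllamav2's real internals:

```python
import types

# Stand-in for a library module such as exllamav2.architecture
# (placeholder only, NOT the real exllamav2 internals)
lib_arch = types.ModuleType("lib_arch")

def original_profile(name):
    # The library only knows a fixed set of architectures
    profiles = {"llama": {"norm": "rmsnorm"}}
    return profiles[name]

lib_arch.get_profile = original_profile

# Monkeypatch: keep a reference to the original, then wrap it so an
# unrecognised model name resolves to a custom profile instead of failing
_orig_get_profile = lib_arch.get_profile

def patched_profile(name):
    if name == "opencua":
        # Hypothetical custom profile for the non-standard model
        return {"norm": "rmsnorm", "custom": True}
    return _orig_get_profile(name)

lib_arch.get_profile = patched_profile

print(lib_arch.get_profile("opencua"))  # custom profile now resolves
print(lib_arch.get_profile("llama"))    # standard models still work
```

The key point of the pattern is that the patch must be applied before the library code that calls the lookup runs, for both the quantization pass and inference.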

I've also written some special "computer use" inference software for your models and mine, and for other computer-use models such as Holo1.5.
