---
license: apache-2.0
base_model:
- Qwen/Qwen2.5-VL-32B-Instruct
---

# TongUI: Building Generalized GUI Agents by Learning from Multimodal Web Tutorials

Model trained on the [GUI-Net Dataset](https://huggingface.co/datasets/Bofeee5675/GUI-Net-1M).

See details on our [Project Page](https://github.com/TongUI-agent/TongUI-agent).

## Model Details

The base model is `Qwen/Qwen2.5-VL-32B-Instruct`. We fine-tuned the base model with LoRA.

**Note:** Because the 32B model is large, we release only the LoRA weights. To merge them into the base model, use the following script:

```python
import torch
from peft import PeftModel
from transformers import AutoConfig, AutoProcessor, Qwen2_5_VLForConditionalGeneration


def load_model_and_processor(model_path, precision="bf16", lora_path=None, merge_lora=True):
    """
    Load the Qwen2.5-VL model and processor with optional LoRA weights.

    Args:
        model_path: Path to the base model
        precision: Model precision ("fp16", "bf16", or "fp32")
        lora_path: Path to LoRA weights (optional)
        merge_lora: Whether to merge the LoRA weights into the base model

    Returns:
        tuple: (processor, model) - The initialized processor and model
    """
    # Initialize processor; on failure, print the model config to help debugging
    try:
        processor = AutoProcessor.from_pretrained(model_path)
    except Exception as e:
        print(f"Error loading processor: {e}")
        config = AutoConfig.from_pretrained(model_path)
        print(config)
        raise

    # Map the precision string to a torch dtype (anything else falls back to fp32)
    dtype = {"fp16": torch.float16, "bf16": torch.bfloat16}.get(precision, torch.float32)

    # Initialize base model (flash_attention_2 requires the flash-attn package)
    model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
        model_path,
        device_map="auto",
        torch_dtype=dtype,
        attn_implementation="flash_attention_2",
    )

    # Load LoRA weights if a path is provided
    if lora_path is not None and len(lora_path) > 0:
        print(f"Loading LoRA weights from {lora_path}")
        model = PeftModel.from_pretrained(model, lora_path)
        if merge_lora:
            print("Merging LoRA weights into base model")
            model = model.merge_and_unload()

    model.eval()
    return processor, model
```

`model_path` is the path to the base model, and `lora_path` is the local directory where you downloaded this repo.
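
To give a rough idea of what merging LoRA weights means, here is a minimal sketch in plain Python with toy matrices. It assumes the standard LoRA formulation, where the merged weight is `W + (alpha / r) * B @ A`. The helper names (`matmul`, `merge_lora`) and all values are hypothetical and only for illustration; the actual merge in PEFT operates on tensors and handles many more cases.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(w, lora_a, lora_b, alpha, r):
    """Fold a LoRA update into a base weight: W + (alpha / r) * B @ A."""
    scale = alpha / r
    delta = matmul(lora_b, lora_a)  # shape (out, in), same as w
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


# Toy example (hypothetical values): 2x2 base weight with a rank-1 adapter
w = [[1, 0], [0, 1]]        # base weight, shape (out=2, in=2)
lora_a = [[1, 2]]           # A, shape (r=1, in=2)
lora_b = [[0.5], [1.0]]     # B, shape (out=2, r=1)
alpha, r = 2, 1
x = [[1], [1]]              # input column vector

merged = merge_lora(w, lora_a, lora_b, alpha, r)

# Unmerged forward pass: base output plus the scaled adapter output
base_out = matmul(w, x)
adapter_out = matmul(lora_b, matmul(lora_a, x))
unmerged_out = [[b[0] + (alpha / r) * a[0]] for b, a in zip(base_out, adapter_out)]

# Merged forward pass: a single matrix multiply with the folded weight
merged_out = matmul(merged, x)

print(merged)
print(unmerged_out, merged_out)
```

Both forward passes give the same result, so after merging the adapter matrices are no longer needed at inference time and the model behaves like an ordinary checkpoint.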