---
license: apache-2.0
base_model:
- Qwen/Qwen2.5-VL-32B-Instruct
---
# TongUI: Building Generalized GUI Agents by Learning from Multimodal Web Tutorials

This model was trained on the [GUI-Net dataset](https://huggingface.co/datasets/Bofeee5675/GUI-Net-1M).

See our [Project Page](https://github.com/TongUI-agent/TongUI-agent) for details.


## Model Details

The base model is `Qwen/Qwen2.5-VL-32B-Instruct`. We fine-tuned it with LoRA.

Note: Because the 32B model is large, we release only the LoRA weights. To merge them into the base model, use the following script:

```python
import torch
from peft import PeftModel
from transformers import AutoConfig, AutoProcessor, Qwen2_5_VLForConditionalGeneration


def load_model_and_processor(model_path, precision="bf16", lora_path=None, merge_lora=True):
    """
    Load the Qwen2.5-VL model and processor with optional LoRA weights.

    Args:
        model_path: Path to the base model
        precision: Model precision ("fp16", "bf16", or "fp32")
        lora_path: Path to LoRA weights (optional)
        merge_lora: Whether to merge the LoRA weights into the base model

    Returns:
        tuple: (processor, model) - The initialized processor and model
    """
    # Initialize processor
    try:
        processor = AutoProcessor.from_pretrained(model_path)
    except Exception as e:
        # Print the model config to help diagnose the failure, then re-raise
        print(f"Error loading processor: {e}")
        config = AutoConfig.from_pretrained(model_path)
        print(config)
        raise

    # Initialize base model; any unrecognized precision falls back to fp32
    model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
        model_path,
        device_map="auto",
        torch_dtype=torch.float16 if precision == "fp16" else torch.bfloat16 if precision == "bf16" else torch.float32,
        # Requires the flash-attn package; remove this argument to use the default attention
        attn_implementation="flash_attention_2",
    )

    # Load LoRA weights if a path is provided
    if lora_path is not None and len(lora_path) > 0:
        print(f"Loading LoRA weights from {lora_path}")
        model = PeftModel.from_pretrained(model, lora_path)

        # Merging only makes sense once the adapter has been attached
        if merge_lora:
            print("Merging LoRA weights into base model")
            model = model.merge_and_unload()

    model.eval()

    return processor, model
```

`model_path` is the path or Hub ID of the base model (`Qwen/Qwen2.5-VL-32B-Instruct`), and `lora_path` is the local directory where you downloaded this repository.
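If you would rather merge once and keep a standalone checkpoint, the steps can be wrapped in a small command-line helper. The following is a minimal sketch, not part of the TongUI codebase: the function names (`build_parser`, `merge_and_save`, `main`), the option names, and the example paths are all illustrative assumptions. It loads the base model without `device_map`, so the machine needs enough RAM to hold the full model.

```python
import argparse


def build_parser():
    """Command-line options for the merge sketch (option names are illustrative)."""
    parser = argparse.ArgumentParser(
        description="Merge LoRA weights into the base model and save the result."
    )
    parser.add_argument("--model_path", default="Qwen/Qwen2.5-VL-32B-Instruct",
                        help="Base model path or Hub ID")
    parser.add_argument("--lora_path", required=True,
                        help="Local directory containing this repository")
    parser.add_argument("--output_dir", required=True,
                        help="Directory to write the merged checkpoint to")
    parser.add_argument("--precision", default="bf16",
                        choices=["fp16", "bf16", "fp32"])
    return parser


def merge_and_save(model_path, lora_path, output_dir, precision="bf16"):
    """Load the base model and adapter, merge them, and save a standalone checkpoint."""
    # Heavy imports live here so the argument parser can be used on its own.
    import torch
    from peft import PeftModel
    from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration

    dtype = {"fp16": torch.float16, "bf16": torch.bfloat16, "fp32": torch.float32}[precision]

    # The processor comes from the base model, as in the script above.
    processor = AutoProcessor.from_pretrained(model_path)
    model = Qwen2_5_VLForConditionalGeneration.from_pretrained(model_path, torch_dtype=dtype)

    # Attach the LoRA adapter, then fold its weights into the base model.
    model = PeftModel.from_pretrained(model, lora_path).merge_and_unload()

    # Save model and processor together so the output directory loads by itself.
    model.save_pretrained(output_dir)
    processor.save_pretrained(output_dir)


def main(argv=None):
    args = build_parser().parse_args(argv)
    merge_and_save(args.model_path, args.lora_path, args.output_dir, args.precision)
```

For example, `main(["--lora_path", "./TongUI-32B", "--output_dir", "./TongUI-32B-merged"])` would write a merged checkpoint to the (hypothetical) `./TongUI-32B-merged` directory, which can then be loaded with `Qwen2_5_VLForConditionalGeneration.from_pretrained` without `peft`.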