---
base_model:
- Qwen/Qwen2-VL-2B
pipeline_tag: image-text-to-text
library_name: transformers
tags:
- rknn
- rkllm
- chat
- vision
- rk3588
- multimodal
---

## 3ib0n's RKLLM Guide

These models and binaries require an RK3588 board running rknpu driver version 0.9.7 or above.
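
You can check which rknpu driver version the board is running before you start. On most RK3588 BSP images one of the following works (the debugfs path assumes debugfs is mounted, so adjust for your kernel/image):

```shell
# report the rknpu driver version (usually needs root)
sudo cat /sys/kernel/debug/rknpu/version

# or look for the version line the driver printed at boot
sudo dmesg | grep -i rknpu
```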

## Steps to reproduce conversion

```shell
# Download and set up miniforge3
curl -L -O "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh"
bash Miniforge3-$(uname)-$(uname -m).sh

# activate the base environment
source ~/miniforge3/bin/activate

# create and activate a python 3.8 environment
conda create -n rknn-llm-1.1.4 python=3.8
conda activate rknn-llm-1.1.4

# clone the latest rknn-llm toolkit
git clone https://github.com/airockchip/rknn-llm.git

# update the following 4 files to your desired models and output locations
cd rknn-llm/examples/rkllm_multimodal_demo
nano export/export_vision.py                  # update model path and output path
nano export/export_vision_rknn.py             # update model path
nano export/export_rkllm.py                   # update input and output paths
nano data/make_input_embeds_for_quantize.py   # update model path

# install the necessary dependencies for the above
pip install transformers accelerate torchvision rknn-toolkit2==2.2.1
pip install --upgrade torch pillow            # needed to export the vision model with opset_version=18

# export the vision models and create the input embeddings
cd export/
python export_vision.py
python export_vision_rknn.py
cd ..
python data/make_input_embeds_for_quantize.py

# install the rkllm-toolkit wheel and export the language model
pip install ../../rkllm-toolkit/packages/rkllm_toolkit-1.1.4-cp38-cp38-linux_x86_64.whl
python export/export_rkllm.py
```
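
If the exports finish without errors you should end up with an .rknn vision encoder and an .rkllm language model. The exact file names and locations depend on the output paths you configured in the export scripts; the names below are the ones used by the run steps later in this guide:

```shell
# sanity check: confirm both converted models exist (adjust names/paths to
# whatever you set in export_vision_rknn.py and export_rkllm.py)
ls -lh qwen2_vl_2b_vision_rk3588.rknn Qwen2-VL-2B-Instruct.rkllm
```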

## Steps to build and run demo

```shell
# Download the correct toolchain for working with rkllm
# Documentation here: https://github.com/airockchip/rknn-llm/blob/main/doc/Rockchip_RKLLM_SDK_EN_1.1.0.pdf
wget https://developer.arm.com/-/media/Files/downloads/gnu-a/10.2-2020.11/binrel/gcc-arm-10.2-2020.11-x86_64-aarch64-none-linux-gnu.tar.xz
tar -xf gcc-arm-10.2-2020.11-x86_64-aarch64-none-linux-gnu.tar.xz

# ensure that the gcc compiler path is set to the location where the toolchain downloaded above was unpacked
nano deploy/build-linux.sh   # update the gcc compiler path

# compile the demo app
cd deploy/
./build-linux.sh
```
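
If the build succeeds, the demo binaries and their runtime libraries are collected in an install directory next to the build script (exact contents can vary between rknn-llm versions):

```shell
# the binaries used in the next section should now exist here,
# roughly: demo, imgenc, llm and a lib/ directory
ls install/demo_Linux_aarch64
```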

## Steps to run the app

More information and the original guide: https://github.com/airockchip/rknn-llm/tree/main/examples/rkllm_multimodel_demo

```shell
# push install dir to device
adb push ./install/demo_Linux_aarch64 /data
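
# the /data/models directory is assumed by the push commands below; create it
# first so adb push does not create a file literally named "models"
adb shell mkdir -p /data/models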

# push model files to device
adb push qwen2_vl_2b_vision_rk3588.rknn /data/models
adb push Qwen2-VL-2B-Instruct.rkllm /data/models

# push demo image to device
adb push ../data/demo.jpg /data/demo_Linux_aarch64

adb shell
cd /data/demo_Linux_aarch64

# export lib path
export LD_LIBRARY_PATH=./lib

# soft link models dir
ln -s /data/models .

# run imgenc
./imgenc models/qwen2_vl_2b_vision_rk3588.rknn demo.jpg

# run llm (Pure Text Example)
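# the two trailing numbers are generation limits passed to the runtime
# (presumably max_new_tokens and max_context_len, following the upstream
# demo; check the demo source if in doubt)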
./llm models/Qwen2-VL-2B-Instruct.rkllm 128 512

# run demo (Multimodal Example)
./demo demo.jpg models/qwen2_vl_2b_vision_rk3588.rknn models/Qwen2-VL-2B-Instruct.rkllm 128 512
```