# Quick Test

1. Download the Hugging Face model from [Tinytron/MLC-Tinytron at main](https://huggingface.co/Tinytron/MLC-Tinytron/tree/main).
2. Execute the Python script:

```bash
python bundle_weight.py --apk-path ./app-release.apk
```

# Full Reproduction

## Clone mlc-llm

Clone the repository: [Tinytron/mlc-llm: mlc-llm modified by tinytron](https://github.com/Tinytron/mlc-llm).

```bash
git clone --recursive https://github.com/Tinytron/mlc-llm.git
```

## Compile TVM Unity

- Use the TVM Unity included in the `mlc-llm/3rdparty` directory.
- Follow the instructions in [build-from-source](https://llm.mlc.ai/docs/install/tvm.html#option-2-build-from-source).
- Set `set(USE_OPENCL ON)` in `config.cmake` to enable OpenCL.

## Compile mlc-llm

1. Compile mlc_llm:

```bash
cd mlc-llm/
mkdir -p build && cd build
python ../cmake/gen_cmake_config.py
cmake .. && cmake --build . --parallel $(nproc) && cd ..
```

2. Install the Python package:

```bash
export MLC_LLM_SOURCE_DIR=/path-to-mlc-llm
export PYTHONPATH=$MLC_LLM_SOURCE_DIR/python:$PYTHONPATH
alias mlc_llm="python -m mlc_llm"
```

## Build Android Application

1. **Weight Conversion and Config File Generation**

   Modify `build_mlc_android.sh`, changing `MODEL_PATH` to the corresponding Hugging Face model path, and then execute the script. The converted weights and config files are written to the `bundle` directory under the current path (a sketch of the commands such a script typically wraps appears at the end of this README).

2. **Package and Compile the Model**

   Navigate to `mlc-llm/android/MLCChat` and edit `mlc-package-config.json` following the example below, changing each `model` field to the directory of the converted weights for that model.

```json
{
    "device": "android",
    "model_list": [
        {
            "model": "path-to-qwen",
            "estimated_vram_bytes": 4250586449,
            "model_id": "Qwen2-7B-Instruct-Tinytron-MLC",
            "bundle_weight": true
        },
        {
            "model": "path-to-llama",
            "estimated_vram_bytes": 4250586449,
            "model_id": "Llama3.1-8B-Instruct-Tinytron-MLC",
            "bundle_weight": true
        },
        {
            "model": "path-to-phi2",
            "estimated_vram_bytes": 4250586449,
            "model_id": "Phi-2-Tinytron-preview-MLC",
            "bundle_weight": true
        },
        {
            "model": "path-to-cauchy",
            "estimated_vram_bytes": 4250586449,
            "model_id": "Cauchy-3B-preview-MLC",
            "bundle_weight": true
        }
    ]
}
```

   Then execute:

```bash
mlc_llm package
```

3. **Generate APK**

   Use Android Studio to generate the APK, and then execute the installation script:

```bash
python bundle_weight.py --apk-path ./app/release/app-release.apk
```

# Successful Run Screenshots

We successfully ran the Phi-2 and Cauchy models on a P70 phone with 12 GB of RAM, and all four models on an iQOO 12 Pro phone with 16 GB of RAM. Below are screenshots of the successful runs.

## P70

![phi_12G.jpg#455px #998px](screenshot/phi_12G.jpg) ![cauchy_12G.jpg#455px #998px](screenshot/cauchy_12G.jpg)

## iQOO 12 Pro

![qwen_16G.jpg#455px #998px](screenshot/qwen_16G.jpg) ![llama_16G.jpg#455px #998px](screenshot/llama_16G.jpg) ![phi_16G.jpg#455px #998px](screenshot/phi_16G.jpg) ![cauchy_16G.jpg#455px #998px](screenshot/cauchy_16G.jpg)
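
# Appendix: Weight Conversion Sketch

The contents of `build_mlc_android.sh` are not reproduced in this README. As a rough guide only, the weight conversion and config generation it performs can be done manually with the stock `mlc_llm` CLI. The sketch below covers a single model; the quantization mode, conversation template, and output paths are assumptions and may differ from what the script actually uses — check the script in the Tinytron/mlc-llm repository for the authoritative values.

```bash
# Hypothetical manual equivalent of build_mlc_android.sh for one model.
# MODEL_PATH, QUANT, and the conv-template are placeholders / assumptions.
MODEL_PATH=/path/to/Qwen2-7B-Instruct-Tinytron   # Hugging Face model directory
QUANT=q4f16_1                                    # assumed quantization mode
OUTPUT=./bundle/Qwen2-7B-Instruct-Tinytron-MLC   # converted weights + config

mkdir -p "$OUTPUT"

# Convert the Hugging Face weights into MLC format.
mlc_llm convert_weight "$MODEL_PATH" --quantization "$QUANT" -o "$OUTPUT"

# Generate mlc-chat-config.json and copy the tokenizer files.
mlc_llm gen_config "$MODEL_PATH" --quantization "$QUANT" \
    --conv-template qwen2 -o "$OUTPUT"
```

Running this once per model (with the matching template for Llama, Phi-2, and Cauchy) produces the per-model directories that the `model` fields in `mlc-package-config.json` should point to.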