---
license: cc-by-nc-nd-4.0
base_model:
- Qwen/Qwen2.5-1.5B
language:
- zho
- eng
- fra
- spa
- por
- deu
- ita
- rus
- jpn
- kor
- vie
- tha
- ara
tags:
- Function_Call
- Automotive
- SLM
- GGUF
---

# Qwen2.5-1.5B-Auto-FunctionCaller

## Model Details

* **Model Name:** Qwen2.5-1.5B-Auto-FunctionCaller
* **Base Model:** [Qwen/Qwen2.5-1.5B](https://huggingface.co/Qwen/Qwen2.5-1.5B)
* **Model Type:** Language Model fine-tuned for Function Calling.
* **Recommended Quantization:** `Qwen2.5-1.5B-Auto-FunctionCaller.Q4_K_M_I.gguf`
    * This GGUF file using Q4\_K\_M quantization with Importance Matrix is recommended as offering the best balance between performance and computational efficiency (inference speed, memory usage) based on evaluation.

## Intended Use

* **Primary Use:** Function calling extraction from natural language queries within an automotive context. The model is designed to identify user intent and extract relevant parameters (arguments/slots) for triggering vehicle functions or infotainment actions.
* **Research Context:** This model was specifically developed and fine-tuned as part of a research publication investigating the feasibility and performance of Small Language Models (SLMs) for function-calling tasks in resource-constrained automotive environments.
* **Target Environment:** Embedded systems or edge devices within vehicles where computational resources may be limited.
* **Out-of-Scope Uses:** General conversational AI, creative writing, tasks outside automotive function calling, safety-critical vehicle control.

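
Because the primary use is extracting an intent and its arguments, an application will usually want to validate the model's output before triggering anything in the vehicle. The sketch below illustrates one way to do that. It rests on assumptions this card does not confirm: that the model's output is a JSON object with `name` and `arguments` keys, and that the application keeps its own allow-list of functions. The registry entries (`set_cabin_temperature`, `play_media`) and the helper `parse_function_call` are hypothetical names chosen for illustration.

```python
import json

# Hypothetical allow-list: function name -> {argument name: expected type}.
# These entries are illustrative only; they do not come from the model card.
ALLOWED_FUNCTIONS = {
    "set_cabin_temperature": {"zone": str, "celsius": (int, float)},
    "play_media": {"title": str},
}


def parse_function_call(raw_output: str) -> dict:
    """Parse and validate one function call emitted by the model.

    Assumes the output is a JSON object shaped like
    {"name": ..., "arguments": {...}}; raises ValueError otherwise.
    """
    try:
        call = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        raise ValueError(f"output is not valid JSON: {exc}") from exc

    if not isinstance(call, dict):
        raise ValueError("output must be a JSON object")

    name = call.get("name")
    if not isinstance(name, str) or name not in ALLOWED_FUNCTIONS:
        raise ValueError(f"unknown or disallowed function: {name!r}")

    arguments = call.get("arguments", {})
    if not isinstance(arguments, dict):
        raise ValueError("'arguments' must be a JSON object")

    schema = ALLOWED_FUNCTIONS[name]
    for key, value in arguments.items():
        if key not in schema:
            raise ValueError(f"unexpected argument {key!r} for {name}")
        # bool is a subclass of int in Python; no argument in this sketch
        # is boolean, so booleans are rejected explicitly.
        if isinstance(value, bool) or not isinstance(value, schema[key]):
            raise ValueError(f"argument {key!r} has the wrong type")

    missing = set(schema) - set(arguments)
    if missing:
        raise ValueError(f"missing arguments: {sorted(missing)}")

    return {"name": name, "arguments": arguments}
```

Rejecting every function that is not on the allow-list is one way to keep the model confined to non-critical actions, in line with the out-of-scope uses listed above.
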
## Performance Metrics

The following metrics were measured on the `Qwen2.5-1.5B-Auto-FunctionCaller.Q4_K_M_I.gguf` model:

* **Evaluation Setup:**
    * Total Evaluation Samples: 2074
* **Performance:**
    * **Exact Match Accuracy:** 0.8414
    * **Average Component Accuracy:** 0.9352
* **Efficiency & Confidence:**
    * **Throughput:** 10.31 tokens/second
    * **Latency (Per Token):** 0.097 seconds
    * **Latency (Per Instruction):** 0.427 seconds
    * **Average Model Confidence:** 0.9005
    * **Calibration Error:** 0.0854

*Note: Latency and throughput figures are hardware-dependent and should be benchmarked on the target deployment environment.*

## Limitations

* **Domain Specificity:** Performance is optimized for automotive function calling. Generalization to other domains or to complex, unstructured conversations may be limited.
* **Quantization Impact:** The `Q4_K_M_I` quantization significantly improves efficiency but may result in a slight reduction in accuracy compared to higher-precision versions (e.g., FP16).
* **Complex Queries:** The model may struggle with highly nested, ambiguous, or unusually phrased requests that are not well represented in the fine-tuning data.
* **Safety Criticality:** This model is **not** intended or validated for safety-critical vehicle operations (e.g., braking, steering). Use should be restricted to non-critical systems like infotainment and comfort controls.
* **Bias:** As with any model, performance and fairness depend on the underlying data. Biases present in the fine-tuning or evaluation datasets may be reflected in the model's behavior.

## Training Data (Summary)

The model was fine-tuned on a synthetic dataset specifically curated for automotive function calling tasks. Details will be referenced in the associated publication.

## Citation

TBD