Qualcomm NPU Collection: latest SOTA models supported on Qualcomm NPU.
Run Phi-3.5-Mini optimized for Qualcomm NPUs with nexaSDK.
1. Install nexaSDK and create a free account at sdk.nexa.ai.
2. Activate your device with your access token:
   nexa config set license '<access_token>'
3. Run the model on the Qualcomm NPU in one line:
   nexa infer NexaAI/phi3.5-mini-npu
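The activation and inference commands above can be combined into a short setup script. This is a sketch only: it assumes nexaSDK is already installed, and the `NEXA_TOKEN` environment variable is an illustrative stand-in for your access token, not part of nexaSDK itself.

```shell
#!/bin/sh
# Quickstart sketch for running Phi-3.5-Mini on a Qualcomm NPU with nexaSDK.
# Assumes nexaSDK is installed and NEXA_TOKEN holds the access token from
# your sdk.nexa.ai account (the variable name is hypothetical).
set -e

# Activate this device with your access token.
nexa config set license "$NEXA_TOKEN"

# Run the NPU-optimized model build in one line.
nexa infer NexaAI/phi3.5-mini-npu
```

Storing the token in an environment variable rather than pasting it inline keeps it out of your shell history and any scripts you commit.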
Phi-3.5-Mini is a ~3.8B-parameter instruction-tuned language model from Microsoft's Phi family. It is designed to deliver strong reasoning and instruction-following quality in a compact footprint, making it well suited to on-device and latency-sensitive applications. This Turbo build uses Nexa's Qualcomm NPU path for faster inference and higher throughput while preserving model quality.