ONNX Runtime C++ Example for Nomic-embed-text-v1.5 on Ryzen AI NPU

This project demonstrates how to run ONNX models using ONNX Runtime with C++ on AMD Ryzen AI NPU hardware. The application compares performance between CPU and NPU execution when the system configuration supports it.
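
At a high level, the comparison is built from two Ort::Session objects for the same model: one with default session options (CPU execution provider) and one with the VitisAI execution provider registered for the NPU. The sketch below shows only this session setup; the paths, provider-option keys, and the AppendExecutionProvider_VitisAI call follow the Ryzen AI 1.4 C++ API documentation and are an illustration, not a copy of src/main.cpp.

    // Sketch of the CPU-vs-NPU session setup (illustrative; see src/main.cpp
    // for the project's actual implementation).
    #include <onnxruntime_cxx_api.h>
    #include <iostream>
    #include <string>
    #include <unordered_map>

    int main() {
        Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "quicktest");

        // CPU baseline: default session options use the CPU execution provider.
        Ort::SessionOptions cpu_opts;
        Ort::Session cpu_session(env, L"nomic_bf16.onnx", cpu_opts);

        // NPU session: register the VitisAI execution provider and point it at
        // the compiler configuration (API as documented for Ryzen AI 1.4).
        Ort::SessionOptions npu_opts;
        std::unordered_map<std::string, std::string> vai_options{
            {"config_file", "vaiml_config.json"}};
        npu_opts.AppendExecutionProvider_VitisAI(vai_options);
        Ort::Session npu_session(env, L"nomic_bf16.onnx", npu_opts);

        // The application then feeds identical inputs to both sessions and
        // times repeated Run() calls to compare CPU and NPU latency.
        std::cout << "CPU and NPU sessions created." << std::endl;
        return 0;
    }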

Prerequisites

Software Requirements

  • Ryzen AI 1.4 (RAI 1.4) - AMD's AI acceleration software stack
  • CMake (version 3.15 or higher)
  • Visual Studio 2022 with C++ development tools
  • Python/Conda environment with Ryzen AI 1.4 installed

Hardware Requirements

  • AMD Ryzen AI processor with an integrated NPU (e.g., Phoenix, Hawk Point, or Strix Point)

Environment Variables

Before building and running the application, ensure the following environment variables are properly configured:

  • XLNX_VART_FIRMWARE: Path to the Xilinx VART firmware directory
  • RYZEN_AI_INSTALLATION_PATH: Path to your Ryzen AI 1.4 installation directory

These variables are typically set during the Ryzen AI 1.4 installation process. If not set, the NPU execution will fail.
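
If you want to confirm that these variables are visible to your build and run environment, a trivial standalone check (illustrative, not part of this project) could look like:

    // Print the Ryzen AI-related environment variables, or flag them as missing.
    #include <cstdlib>
    #include <iostream>

    int main() {
        for (const char* name : {"XLNX_VART_FIRMWARE", "RYZEN_AI_INSTALLATION_PATH"}) {
            const char* value = std::getenv(name);
            std::cout << name << " = " << (value ? value : "<not set>") << '\n';
        }
        return 0;
    }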

Build Instructions

  1. Activate the Ryzen AI environment:

    conda activate <your-rai-environment-name>
    
  2. Build the project:

    compile.bat
    

The build process will generate the executable in the build\Release directory along with all necessary dependencies.

Usage

By default, the application runs the model on the CPU first and then on the NPU.

Navigate to the build output directory and run the application:

Basic Example

cd build\Release
quicktest.exe -m <model_name> -c <configuration_file_name> --cache_dir <directory_containing_model_cache> --cache_key <name_of_cache_directory> -i <number_of_iters>

To run Nomic using the pre-built model cache

Using the pre-built cache eliminates model compilation, which can take several minutes. To use the existing nomic_model_cache directory for faster startup, run:

cd build\Release
quicktest.exe -m ..\..\nomic_bf16.onnx -c vaiml_config.json --cache_dir . --cache_key modelcachekey -i 5

This example:

  • Uses the pre-compiled model cache in nomic_model_cache for faster inference initialization (the provider-option mapping behind the cache flags is sketched after this list)
  • Runs 5 iterations to better demonstrate performance differences between CPU and NPU
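
Under the hood, the cache directory and cache key are passed to the VitisAI execution provider as provider options, which is what allows it to reuse the compiled model instead of recompiling it. A minimal sketch of that mapping is shown below; the option key names ("config_file", "cacheDir", "cacheKey") follow the Ryzen AI documentation and are an assumption about how quicktest.exe wires its flags internally.

    // Illustrative helper: build session options that reuse an existing
    // VitisAI model cache so the EP can skip recompilation.
    #include <onnxruntime_cxx_api.h>
    #include <string>
    #include <unordered_map>

    Ort::SessionOptions MakeCachedNpuOptions(const std::string& config_json,
                                             const std::string& cache_dir,
                                             const std::string& cache_key) {
        Ort::SessionOptions opts;
        std::unordered_map<std::string, std::string> vai{
            {"config_file", config_json},  // -c vaiml_config.json
            {"cacheDir", cache_dir},       // --cache_dir .
            {"cacheKey", cache_key}};      // --cache_key modelcachekey
        opts.AppendExecutionProvider_VitisAI(vai);
        return opts;
    }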

Command Line Options

Option   Long Form     Description
-m       --model       Path to the ONNX model file
-c       --config      Path to the VitisAI configuration JSON file
-d       --cache_dir   Directory path for model cache storage
-k       --cache_key   Name of the model cache subdirectory (cache key)
-i       --iters       Number of inference iterations to execute
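
These options are declared with the bundled cxxopts header. The sketch below shows how such a declaration typically looks; the exact descriptions and defaults are assumptions rather than a copy of the project's main.cpp.

    // Hypothetical cxxopts declaration matching the table above.
    #include "cxxopts.hpp"
    #include <iostream>
    #include <string>

    int main(int argc, char** argv) {
        cxxopts::Options options("quicktest", "ONNX Runtime CPU vs NPU comparison");
        options.add_options()
            ("m,model", "Path to the ONNX model file", cxxopts::value<std::string>())
            ("c,config", "Path to the VitisAI configuration JSON file", cxxopts::value<std::string>())
            ("d,cache_dir", "Directory path for model cache storage", cxxopts::value<std::string>())
            ("k,cache_key", "Name of the model cache subdirectory", cxxopts::value<std::string>())
            ("i,iters", "Number of inference iterations", cxxopts::value<int>()->default_value("1"));

        auto result = options.parse(argc, argv);
        if (!result.count("model")) {
            std::cout << options.help() << std::endl;
            return 1;
        }
        std::cout << "model: " << result["model"].as<std::string>()
                  << ", iterations: " << result["iters"].as<int>() << '\n';
        return 0;
    }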

Project Structure

  • src/main.cpp - Main application entry point
  • src/npu_util.cpp/h - NPU utility functions and helpers
  • src/cxxopts.hpp - Command-line argument parsing library
  • nomic_bf16.onnx - Sample ONNX model (bf16 precision)
  • vaiml_config.json - VitisAI EP configuration file
  • CMakeLists.txt - CMake build configuration

Notes

  • The application automatically detects NPU availability and falls back to CPU execution if the NPU is not accessible (a fallback sketch follows these notes)
  • Model caching is used to improve subsequent inference performance
  • The included cxxopts header library provides robust command-line argument parsing
  • Ensure your conda environment is activated before building to access the necessary Ryzen AI libraries
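
A minimal sketch of the detect-and-fall-back behavior mentioned in the first note, assuming the straightforward approach of attempting to create a VitisAI-backed session and catching the failure; the project's npu_util helpers may implement this differently:

    // Try to create an NPU-backed session; if the VitisAI EP is unavailable
    // or misconfigured, fall back to a default (CPU) session.
    #include <onnxruntime_cxx_api.h>
    #include <iostream>
    #include <memory>

    std::unique_ptr<Ort::Session> CreateSessionWithFallback(
            const Ort::Env& env, const ORTCHAR_T* model_path,
            const Ort::SessionOptions& npu_opts) {
        try {
            return std::make_unique<Ort::Session>(env, model_path, npu_opts);
        } catch (const Ort::Exception& e) {
            std::cerr << "NPU session creation failed (" << e.what()
                      << "); falling back to CPU." << std::endl;
            Ort::SessionOptions cpu_opts;
            return std::make_unique<Ort::Session>(env, model_path, cpu_opts);
        }
    }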