Sample App Tutorial¶
Introduction¶
This tutorial describes how to build a C++ application using QNN APIs that can execute models created with one of the QNN converters on a Linux host or an Android device. It also describes the workings of qnn-sample-app.
Warning
The qnn-sample-app is subject to change without notice.
qnn-sample-app is an example C++ application available with the SDK at ${QNN_SDK_ROOT}/examples/QNN/SampleApp, where QNN_SDK_ROOT is the path to the extracted QNN SDK. This tutorial walks through the source code of qnn-sample-app, showing how QNN APIs are used to execute a model.
For creating a C++ application based on QNN APIs, we prescribe the pattern described in the sections below.
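In outline, the lifecycle the following sections walk through looks like this (a sketch of the call sequence, not exact signatures):

```text
load backend shared library (dlopen)
resolve QnnInterface_getProviders  -> function-pointer table
logCreate          -> Qnn_LogHandle_t
backendCreate      -> Qnn_BackendHandle_t
profileCreate      -> Qnn_ProfileHandle_t   (optional)
deviceCreate       -> Qnn_DeviceHandle_t
backendRegisterOpPackage             (per op package, optional)
contextCreate      -> Qnn_ContextHandle_t
composeGraphs + graphFinalize        (or contextCreateFromBinary + graphRetrieve)
graphExecute                         (per input batch)
contextFree, backendFree             (teardown)
```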
Usage of QNN APIs¶
This section demonstrates the usage of QNN APIs in a client application.
Use QNN Interface to obtain function pointers¶
The QNN Interface mechanism can be used to set up a table of function pointers to the QNN APIs in a backend, instead of manually resolving a symbol for each and every API. QNN Interface can be used as shown below:
QnnInterface_t** interfaceProviders{nullptr};
uint32_t numProviders{0};
// Query for all available interfaces
if (QNN_SUCCESS !=
    getInterfaceProviders((const QnnInterface_t***)&interfaceProviders, &numProviders)) {
  QNN_ERROR("Failed to get interface providers.");
  return StatusCode::FAIL_GET_INTERFACE_PROVIDERS;
}
// Check for validity of returned interfaces
if (nullptr == interfaceProviders) {
  QNN_ERROR("Failed to get interface providers: null interface providers received.");
  return StatusCode::FAIL_GET_INTERFACE_PROVIDERS;
}
if (0 == numProviders) {
  QNN_ERROR("Failed to get interface providers: 0 interface providers.");
  return StatusCode::FAIL_GET_INTERFACE_PROVIDERS;
}
bool foundValidInterface{false};
// Loop through all available interface providers and pick the one that suits the current API
// version
for (size_t pIdx = 0; pIdx < numProviders; pIdx++) {
  if (QNN_API_VERSION_MAJOR == interfaceProviders[pIdx]->apiVersion.coreApiVersion.major &&
      QNN_API_VERSION_MINOR <= interfaceProviders[pIdx]->apiVersion.coreApiVersion.minor) {
    foundValidInterface = true;
    m_qnnFunctionPointers.qnnInterface = interfaceProviders[pIdx]->QNN_INTERFACE_VER_NAME;
    break;
  }
}
if (!foundValidInterface) {
  QNN_ERROR("Unable to find a valid interface.");
  libBackendHandle = nullptr;
  return StatusCode::FAIL_GET_INTERFACE_PROVIDERS;
}
QNN System Interface can be used to resolve all symbols related to QNN System APIs as shown below:
typedef Qnn_ErrorHandle_t (*QnnSystemInterfaceGetProvidersFn_t)(
    const QnnSystemInterface_t*** providerList, uint32_t* numProviders);

QnnSystemInterfaceGetProvidersFn_t getSystemInterfaceProviders{nullptr};
getSystemInterfaceProviders = resolveSymbol<QnnSystemInterfaceGetProvidersFn_t>(
    systemLibraryHandle, "QnnSystemInterface_getProviders");
if (nullptr == getSystemInterfaceProviders) {
  return StatusCode::FAIL_SYM_FUNCTION;
}
QnnSystemInterface_t** systemInterfaceProviders{nullptr};
uint32_t numProviders{0};
if (QNN_SUCCESS != getSystemInterfaceProviders(
        (const QnnSystemInterface_t***)&systemInterfaceProviders, &numProviders)) {
  QNN_ERROR("Failed to get system interface providers.");
  return StatusCode::FAIL_GET_INTERFACE_PROVIDERS;
}
if (nullptr == systemInterfaceProviders) {
  QNN_ERROR("Failed to get system interface providers: null interface providers received.");
  return StatusCode::FAIL_GET_INTERFACE_PROVIDERS;
}
if (0 == numProviders) {
  QNN_ERROR("Failed to get system interface providers: 0 interface providers.");
  return StatusCode::FAIL_GET_INTERFACE_PROVIDERS;
}
bool foundValidSystemInterface{false};
for (size_t pIdx = 0; pIdx < numProviders; pIdx++) {
  if (QNN_SYSTEM_API_VERSION_MAJOR == systemInterfaceProviders[pIdx]->systemApiVersion.major &&
      QNN_SYSTEM_API_VERSION_MINOR <= systemInterfaceProviders[pIdx]->systemApiVersion.minor) {
    foundValidSystemInterface = true;
    m_qnnFunctionPointers.qnnSystemInterface =
        systemInterfaceProviders[pIdx]->QNN_SYSTEM_INTERFACE_VER_NAME;
    break;
  }
}
Set up logging¶
Logging can be set up after a backend shared library has been dynamically loaded and before the backend is initialized.
To initialize logging, a callback of type QnnLog_Callback_t has to be defined. An example is defined below:
void logStdoutCallback(const char* fmt,
                       QnnLog_Level_t level,
                       uint64_t timestamp,
                       va_list argp) {
  const char* levelStr = "";
  switch (level) {
    case QNN_LOG_LEVEL_ERROR:
      levelStr = " ERROR ";
      break;
    case QNN_LOG_LEVEL_WARN:
      levelStr = "WARNING";
      break;
    case QNN_LOG_LEVEL_INFO:
      levelStr = " INFO  ";
      break;
    case QNN_LOG_LEVEL_DEBUG:
      levelStr = " DEBUG ";
      break;
    case QNN_LOG_LEVEL_VERBOSE:
      levelStr = "VERBOSE";
      break;
    case QNN_LOG_LEVEL_MAX:
      levelStr = "UNKNOWN";
      break;
  }
  // Derive milliseconds from the backend-supplied timestamp
  double ms = (double)timestamp / 1000000.0;
  fprintf(stdout, "%8.1fms [%-7s] ", ms, levelStr);
  vfprintf(stdout, fmt, argp);
  fprintf(stdout, "\n");
}
The above callback can be registered with the backend along with a maximum log level. Sample code to initialize with a max log level of QNN_LOG_LEVEL_INFO:
Qnn_LogHandle_t logHandle;
if (QNN_SUCCESS !=
    m_qnnFunctionPointers.qnnInterface.logCreate(logStdoutCallback, QNN_LOG_LEVEL_INFO, &logHandle)) {
  QNN_ERROR("Unable to initialize logging in the backend.");
  return StatusCode::FAILURE;
}
Initialize backend¶
Once logging has been successfully initialized, the backend can be initialized as shown below:
Qnn_BackendHandle_t backendHandle;
const QnnBackend_Config_t* backendConfigs{nullptr};
/* Set up any necessary backend configurations */
if (QNN_BACKEND_NO_ERROR != m_qnnFunctionPointers.qnnInterface.backendCreate(logHandle,
                                                                             &backendConfigs,
                                                                             &backendHandle)) {
  QNN_ERROR("Could not initialize backend");
  return StatusCode::FAILURE;
}
Initialize Profiling¶
If profiling is desired, after the backend is initialized, a profile handle can be set up. This profile handle can be used at a later point in any API that supports profiling.
A profile handle can be created in the backend with basic profiling level as shown below:
Qnn_ProfileHandle_t profileHandle;
if (QNN_PROFILE_NO_ERROR != m_qnnFunctionPointers.qnnInterface.profileCreate(
                                backendHandle, QNN_PROFILE_LEVEL_BASIC, &profileHandle)) {
  QNN_WARN("Unable to create profile handle in the backend.");
  return StatusCode::FAILURE;
}
Create device¶
A device can be created as shown below:
Qnn_DeviceHandle_t deviceHandle{nullptr};
const QnnDevice_Config_t* devConfigArray[] = {&devConfig, nullptr};
Qnn_ErrorHandle_t ret =
    m_qnnFunctionPointers.qnnInterface.deviceCreate(logHandle, devConfigArray, &deviceHandle);
if (QNN_SUCCESS != ret) {
  QNN_ERROR("Failed to create device: %llu", (unsigned long long)ret);
  return StatusCode::FAILURE;
}
Set devConfig as defined in the QNN HTP Backend API documentation.
Register op packages¶
Op packages are a way to supply libraries containing ops to backends. They can be registered as shown below:
uint32_t opPackageCount{0};
const char** opPackagePath{nullptr};
const char** opPackageInterfaceProvider{nullptr};
/* Set opPackageCount and populate the op package paths and interface providers as necessary */
for (uint32_t idx = 0; idx < opPackageCount; idx++) {
  if (QNN_BACKEND_NO_ERROR !=
      m_qnnFunctionPointers.qnnInterface.backendRegisterOpPackage(backendHandle,
                                                                  opPackagePath[idx],
                                                                  opPackageInterfaceProvider[idx])) {
    QNN_ERROR("Could not register Op Package: %s and interface provider: %s",
              opPackagePath[idx],
              opPackageInterfaceProvider[idx]);
    return StatusCode::FAILURE;
  }
}
Create context¶
A context can be created in a backend as shown below:
Qnn_ContextHandle_t context;
Qnn_DeviceHandle_t deviceHandle{nullptr};  // handle obtained from deviceCreate
const QnnContext_Config_t* contextConfigs{nullptr};
/* Set up any context configs that are necessary */
if (QNN_CONTEXT_NO_ERROR !=
    m_qnnFunctionPointers.qnnInterface.contextCreate(backendHandle,
                                                     deviceHandle,
                                                     &contextConfigs,
                                                     &context)) {
  QNN_ERROR("Could not create context");
  return StatusCode::FAILURE;
}
Prepare graphs¶
qnn-sample-app relies on the output from one of the converters to create a QNN network in the backend. composeGraphsFnHandle is mapped to the QnnModel_composeGraphs API in the model shared library, which takes qnn_wrapper_api::GraphInfo_t*** as one of its parameters. composeGraphsFnHandle makes the necessary backend calls to create the network(s). It also writes all the information required to execute a graph, such as details about the graph's input and output tensors, into the graphsInfo structure, as shown in the following code block:
/* Structure to retrieve information about graphs, like graph name,
   details about input and output tensors present in libQnnSampleModel.so */
qnn_wrapper_api::GraphInfo_t** graphsInfo;
// No. of graphs present in libQnnSampleModel.so
uint32_t graphsCount;
// true to enable intermediate outputs, false for network outputs only
bool debug{false};
if (qnn_wrapper_api::ModelError_t::MODEL_NO_ERROR !=
    m_qnnFunctionPointers.composeGraphsFnHandle(backendHandle,
                                                m_qnnFunctionPointers.qnnInterface,
                                                context,
                                                &graphsInfo,
                                                &graphsCount,
                                                debug)) {
  QNN_ERROR("Failed in composeGraphs()");
  return StatusCode::FAILURE;
}
At this point, the context will contain all the graphs that were present in libQnnSampleModel.so.
Finalize Graphs¶
Graphs that were added in the previous step can be finalized as shown below:
// information about graphs obtained in the previous step
qnn_wrapper_api::GraphInfo_t** graphsInfo;
// No. of graphs obtained in the previous step
uint32_t graphsCount;
/* A valid profile handle if profiling is desired,
   nullptr if profiling is not needed */
Qnn_ProfileHandle_t profileHandle;

for (size_t graphIdx = 0; graphIdx < graphsCount; graphIdx++) {
  if (QNN_GRAPH_NO_ERROR !=
      m_qnnFunctionPointers.qnnInterface.graphFinalize(
          (*graphsInfo)[graphIdx].graph, profileHandle, nullptr)) {
    return StatusCode::FAILURE;
  }
  /* Extract profiling information if desired and if a valid handle was supplied to the
     finalize graphs API */
}
Save context into a binary¶
After all the graphs in a context are finalized, the user application may choose to save the context into a binary for future use. The advantage of saving a context is that it can later be retrieved and its graphs executed without having to finalize them again, which saves considerable initialization time when executing a network.
The context can be saved as shown below:
// Get the expected size of the buffer from the backend in which the context can be saved
Qnn_ContextBinarySize_t requiredBufferSize{0};
if (QNN_CONTEXT_NO_ERROR !=
    m_qnnFunctionPointers.qnnInterface.contextGetBinarySize(context, &requiredBufferSize)) {
  QNN_ERROR("Could not get the required binary size.");
  return StatusCode::FAILURE;
}

// Allocate a buffer of the required size
uint8_t* saveBuffer = (uint8_t*)malloc(requiredBufferSize * sizeof(uint8_t));
if (nullptr == saveBuffer) {
  QNN_ERROR("Could not allocate buffer to save binary.");
  return StatusCode::FAILURE;
}

auto status = StatusCode::SUCCESS;
Qnn_ContextBinarySize_t writtenBufferSize{0};
// Pass the allocated buffer and obtain a copy of the context binary written into the buffer
if (QNN_CONTEXT_NO_ERROR !=
    m_qnnFunctionPointers.qnnInterface.contextGetBinary(context,
                                                        reinterpret_cast<void*>(saveBuffer),
                                                        requiredBufferSize,
                                                        &writtenBufferSize)) {
  QNN_ERROR("Could not get binary.");
  status = StatusCode::FAILURE;
}

// Check that the backend did not write more data than the supplied buffer can hold
if (requiredBufferSize < writtenBufferSize) {
  QNN_ERROR(
      "Illegal written buffer size [%llu] bytes. Cannot exceed allocated memory of [%llu] bytes",
      (unsigned long long)writtenBufferSize,
      (unsigned long long)requiredBufferSize);
  status = StatusCode::FAILURE;
}

// Use caching utility to save metadata along with the binary buffer from the backend
if (status == StatusCode::SUCCESS &&
    tools::datautil::StatusCode::SUCCESS != tools::datautil::writeBinaryToFile(outputPath,
                                                                               saveBinaryName + ".bin",
                                                                               (uint8_t*)saveBuffer,
                                                                               writtenBufferSize)) {
  QNN_ERROR("Could not serialize to file.");
  status = StatusCode::FAILURE;
}
Load context from a cached binary¶
A context that was saved into a binary, like in the previous step, can be loaded as an alternative to creating a new context every time. The code snippet below demonstrates this step:
auto returnStatus = StatusCode::SUCCESS;
std::shared_ptr<uint8_t> buffer{nullptr};
uint32_t graphsCount{0};
// bufferSize: size of the cached binary file, determined beforehand
buffer = std::shared_ptr<uint8_t>(new uint8_t[bufferSize], std::default_delete<uint8_t[]>());
if (!buffer) {
  QNN_ERROR("Failed to allocate memory.");
  return StatusCode::FAILURE;
}

if (tools::datautil::StatusCode::SUCCESS !=
    tools::datautil::readBinaryFromFile(
        cachedBinaryPath, reinterpret_cast<uint8_t*>(buffer.get()), bufferSize)) {
  QNN_ERROR("Failed to read binary file.");
  returnStatus = StatusCode::FAILURE;
}

/* Create a QnnSystemContext handle to access system context APIs. */
QnnSystemContext_Handle_t sysCtxHandle{nullptr};
if (QNN_SUCCESS != m_qnnFunctionPointers.qnnSystemInterface.systemContextCreate(&sysCtxHandle)) {
  QNN_ERROR("Could not create system handle.");
  returnStatus = StatusCode::FAILURE;
}

/* Retrieve metadata from the context binary through QNN System Context API. */
QnnSystemContext_BinaryInfo_t* binaryInfo{nullptr};
uint32_t binaryInfoSize{0};
if (StatusCode::SUCCESS == returnStatus &&
    QNN_SUCCESS != m_qnnFunctionPointers.qnnSystemInterface.systemContextGetBinaryInfo(
                       sysCtxHandle,
                       static_cast<void*>(buffer.get()),
                       bufferSize,
                       &binaryInfo,
                       &binaryInfoSize)) {
  QNN_ERROR("Failed to get context binary info");
  returnStatus = StatusCode::FAILURE;
}

qnn_wrapper_api::GraphInfo_t** graphsInfo;
/* Make a copy of the metadata. */
if (StatusCode::SUCCESS == returnStatus &&
    !copyMetadataToGraphsInfo(binaryInfo, graphsInfo, graphsCount)) {
  QNN_ERROR("Failed to copy metadata.");
  returnStatus = StatusCode::FAILURE;
}

/* Release resources associated with previously created QnnSystemContext handle. */
m_qnnFunctionPointers.qnnSystemInterface.systemContextFree(sysCtxHandle);
sysCtxHandle = nullptr;

/* buffer contains the binary data that was previously obtained from a backend. Pass this
   cached binary data to the backend to recreate the same context. */
if (StatusCode::SUCCESS == returnStatus &&
    QNN_SUCCESS !=
        m_qnnFunctionPointers.qnnInterface.contextCreateFromBinary(backendHandle,
                                                                   deviceHandle,
                                                                   (const QnnContext_Config_t**)&contextConfig,
                                                                   static_cast<void*>(buffer.get()),
                                                                   bufferSize,
                                                                   &context,
                                                                   profileBackendHandle)) {
  QNN_ERROR("Could not create context from binary.");
  returnStatus = StatusCode::FAILURE;
}

// Optionally, extract profiling numbers if desired
if (ProfilingLevel::OFF != m_profilingLevel) {
  extractBackendProfilingInfo(profileBackendHandle);
}

/* Obtain and save graph handles for each graph present in the context based on the saved graph
   names in the metadata */
if (StatusCode::SUCCESS == returnStatus) {
  for (size_t graphIdx = 0; graphIdx < graphsCount; graphIdx++) {
    if (QNN_SUCCESS !=
        m_qnnFunctionPointers.qnnInterface.graphRetrieve(
            context, (*graphsInfo)[graphIdx].graphName, &((*graphsInfo)[graphIdx].graph))) {
      QNN_ERROR("Unable to retrieve graph handle for graph Idx: %zu", graphIdx);
      returnStatus = StatusCode::FAILURE;
    }
  }
}
Execute graphs¶
After a context has been created and its graphs have been added and finalized, or after a context has been retrieved from a binary, one or more graphs in the context can be executed.
Executing a graph involves:
Setting up input and output tensors.
Populating input data into input tensors.
Calling the execute method in the backend.
Obtaining outputs and saving them.
This is demonstrated using the code snippet below:
// Select a graph from graphsInfo if there is more than one graph in this context.
// (Excerpt from a loop over all graphs in the context.)
uint32_t graphIdx;
QNN_DEBUG("Starting execution for graphIdx: %d", graphIdx);
Qnn_Tensor_t* inputs = nullptr;
Qnn_Tensor_t* outputs = nullptr;
// IOTensor utility is used to set up input and output tensor structures
if (iotensor::StatusCode::SUCCESS !=
    ioTensor.setupInputAndOutputTensors(&inputs, &outputs, (*graphsInfo)[graphIdx])) {
  QNN_ERROR("Error in setting up Input and output Tensors for graphIdx: %d", graphIdx);
  returnStatus = StatusCode::FAILURE;
  break;
}

// Grab input raw file paths to read input data
auto inputFileList = inputFileLists[graphIdx];
auto graphInfo = (*graphsInfo)[graphIdx];
if (!inputFileList.empty()) {
  /* qnn-sample-app reads data based on the batch size until the whole buffer is filled.
     If there isn't sufficient data, it pads the rest with zeroes. */
  size_t totalCount = inputFileList[0].size();
  while (!inputFileList[0].empty()) {
    size_t startIdx = (totalCount - inputFileList[0].size());

    // IOTensor utility is used to populate input tensors with input data
    if (iotensor::StatusCode::SUCCESS !=
        ioTensor.populateInputTensors(
            graphIdx, inputFileList, inputs, graphInfo, inputDataType)) {
      returnStatus = StatusCode::FAILURE;
    }

    if (StatusCode::SUCCESS == returnStatus) {
      // Execute the graph in the backend with optional profile handle
      QNN_DEBUG("Successfully populated input tensors for graphIdx: %d", graphIdx);
      Qnn_ErrorHandle_t executeStatus = QNN_GRAPH_NO_ERROR;
      executeStatus = m_qnnFunctionPointers.qnnInterface.graphExecute(graphInfo.graph,
                                                                     inputs,
                                                                     graphInfo.numInputTensors,
                                                                     outputs,
                                                                     graphInfo.numOutputTensors,
                                                                     profileBackendHandle,
                                                                     nullptr);
      if (QNN_GRAPH_NO_ERROR != executeStatus) {
        returnStatus = StatusCode::FAILURE;
      }
      if (StatusCode::SUCCESS == returnStatus) {
        QNN_DEBUG("Successfully executed graphIdx: %d ", graphIdx);
        // IOTensor utility is used to write output tensors to raw files
        if (iotensor::StatusCode::SUCCESS !=
            ioTensor.writeOutputTensors(graphIdx,
                                        startIdx,
                                        graphInfo.graphName,
                                        outputs,
                                        graphInfo.outputTensors,
                                        graphInfo.numOutputTensors,
                                        outputDataType,
                                        graphsCount,
                                        outputPath)) {
          returnStatus = StatusCode::FAILURE;
        }
      }
    }
    if (StatusCode::SUCCESS != returnStatus) {
      QNN_ERROR("Execution of Graph: %d failed!", graphIdx);
      break;
    }
  }
}

// Clean up all the tensors after execution is completed
ioTensor.tearDownInputAndOutputTensors(
    inputs, outputs, graphInfo.numInputTensors, graphInfo.numOutputTensors);
inputs = nullptr;
outputs = nullptr;
if (StatusCode::SUCCESS != returnStatus) {
  break;
}
}  // end of the per-graph loop
IOTensor is a utility provided with the source code at ${QNN_SDK_ROOT}/examples/QNN/SampleApp/SampleApp/src/Utils/IOTensor.cpp. It exposes a few methods, used in the previous code snippet, that help with the execution of a graph:
setupInputAndOutputTensors to set up structures related to input and output tensors.
populateInputTensors to copy input data into input tensor structures.
writeOutputTensors to write output tensor data to raw files.
tearDownInputAndOutputTensors to clean up resources associated with input and output tensors.
Refer to the IOTensor source code for more details about these APIs.
Free context¶
After all execution is completed, the context can be freed as shown below:
if (QNN_CONTEXT_NO_ERROR !=
    m_qnnFunctionPointers.qnnInterface.contextFree(context, profileBackendHandle)) {
  QNN_ERROR("Could not free context");
  return StatusCode::FAILURE;
}
Terminate backend¶
The backend can be terminated as shown below:
if (QNN_BACKEND_NO_ERROR != m_qnnFunctionPointers.qnnInterface.backendFree(backendHandle)) {
  QNN_ERROR("Could not free backend");
  return StatusCode::FAILURE;
}
Building and running qnn-sample-app¶
Setup¶
Linux
Building qnn-sample-app has two external dependencies:
clang compiler
ndk-build (for Android targets only)
If the clang compiler is not available in your system PATH, the script ${QNN_SDK_ROOT}/bin/check-linux-dependency.sh provided with the SDK can be used to install and prepare your environment. Alternatively, you could install these dependencies and make them available in your PATH.
Command to automatically install required dependencies:
$ sudo bash ${QNN_SDK_ROOT}/bin/check-linux-dependency.sh
To satisfy the second dependency, set up ndk-build as described in the Compiler Toolchains section of the general setup. To verify the environment:
$ ${QNN_SDK_ROOT}/bin/envcheck -n
Note: qnn-sample-app has been verified to work with Android NDK version r25c.
GCC Toolchain
To build qnn-sample-app for devices running a Yocto-based OS, the GCC compiler is needed. To support Yocto Kirkstone based devices, the SDK libraries are compiled with GCC 11.2. The following section provides steps to acquire the toolchain, taking Yocto Kirkstone as an example.
If the required compiler is not available in your system PATH, use the steps below to install the dependency and make it available in your PATH.
Follow the Qualcomm build guide to generate the eSDK, which contains the cross-compiler toolchain (qcom-wayland-x86_64-qcom-console-image-armv8-2a-qcm6490-toolchain-ext-0.0.sh) required to build the sample application.
Steps to build the eSDK are available at https://docs.qualcomm.com/bundle/publicresource/topics/80-70020-254/how_to.html#generate-an-esdk
After building the eSDK, qcom-wayland-x86_64-qcom-console-image-armv8-2a-qcm6490-toolchain-ext-0.0.sh will be generated at <WORKSPACE DIR>/build-qcom-wayland/tmp-glibc/deploy/sdk.
Extract the toolchain using ./qcom-wayland-x86_64-qcom-console-image-armv8-2a-qcm6490-toolchain-ext-0.0.sh
Windows
The tutorial assumes general setup instructions have been followed at Setup. Please use “Developer PowerShell for VS 2022” in the following steps.
Hexagon
Building libQnnSampleApp.so has one external dependency:
Hexagon SDK
HEXAGON_SDK_ROOT is the path of the Hexagon SDK installation. Refer to the HTP and DSP setup sections to set up the Hexagon SDK.
To setup environment:
$ source ${HEXAGON_SDK_ROOT}/setup_sdk_env.source
Build¶
Once the setup is complete, qnn-sample-app can be built as follows:
Linux
$ cd ${QNN_SDK_ROOT}/examples/QNN/SampleApp/SampleApp
$ make all_x86 all_android
After executing make from above, you should be able to see two new folders in the same directory:
bin: contains qnn-sample-app binaries for each platform within respective directories.
obj: contains all the object files that were used for building and linking the executable.
To delete all the artifacts that were generated in the above step, run:
$ cd ${QNN_SDK_ROOT}/examples/QNN/SampleApp/SampleApp
$ make clean
Linux (Yocto Based)
For devices running a Yocto-based Linux OS, the GCC compiler must be used to build the sample source code. To support Yocto Kirkstone based devices, libraries are compiled with GCC 11.2. Use the following steps to build the QNN sample app:
$ cd ${QNN_SDK_ROOT}/examples/QNN/SampleApp/SampleApp/
$ export QNN_AARCH64_LINUX_OE_GCC_112=/path/to/extracted/toolchain
$ make CXX="<installed_toolchain_path>/tmp/sysroots/x86_64/usr/bin/aarch64-qcom-linux/aarch64-qcom-linux-g++ --sysroot=<installed_toolchain_path>/tmp/sysroots/qcm6490" all_linux_oe_aarch64_gcc112
After executing make from above, you should be able to see two new folders in the same directory:
bin: contains qnn-sample-app binaries for each platform within respective directories.
obj: contains all the object files that were used for building and linking the executable.
To delete all the artifacts that were generated in the above step, run:
$ cd ${QNN_SDK_ROOT}/examples/QNN/SampleApp/SampleApp
$ make clean
Windows
Warning
AsyncExecution and MultiCore features are not supported on the Windows platform. Running sample apps with these features might fail at runtime.
$ cd $QNN_SDK_ROOT/examples/QNN/SampleApp/SampleApp
$ mkdir build
$ cd build
$ cmake ../ -A [x64, ARM64]
$ cmake --build ./ --config Release
After executing commands from above, you should be able to see $QNN_SDK_ROOT/examples/QNN/SampleApp/SampleApp/build/src/Release/qnn-sample-app.exe
Hexagon
Assuming the desired Hexagon architecture version is v69, QnnSampleApp can be built for Hexagon with the following commands:
$ cd ${QNN_SDK_ROOT}/examples/QNN/SampleApp/SampleApp
$ make hexagon V=v69
After executing make from above, you should be able to see two new folders in the same directory:
bin: contains libQnnSampleAppv69.so shared library in hexagon directory.
obj: contains all the object files that were used for building and linking the executable.
To delete the artifacts that were generated in the above step, run:
$ cd ${QNN_SDK_ROOT}/examples/QNN/SampleApp/SampleApp
$ make clean_hexagon
Run¶
Linux
The qnn-sample-app executable generated in the build step can be used to execute a model using any QNN backend available for linux-x86_64 and aarch64-android. It is very similar to executing qnn-net-run, except when retrieving a context from a cached binary. To retrieve a cached context, qnn-sample-app additionally needs the QNN System library (libQnnSystem.so) to extract metadata; it can be provided through the --system_library option. libQnnSystem.so can be found in the SDK for a particular target under the lib/<target> folder. Refer to further documentation on qnn-net-run here.
For example, let’s consider execution of the shallow model on the CPU backend on a Linux host from Tutorial 1. Replacing qnn-net-run with qnn-sample-app should produce the same results:
$ cd ${QNN_SDK_ROOT}/examples/QNN/converter/models # access input data
$ ${QNN_SDK_ROOT}/examples/QNN/SampleApp/SampleApp/bin/x86_64-linux-clang/qnn-sample-app \
  --backend ${QNN_SDK_ROOT}/lib/x86_64-linux-clang/libQnnCpu.so \
  --model ${QNN_SDK_ROOT}/examples/QNN/example_libs/x86_64-linux-clang/libqnn_model_float.so \
  --input_list ${QNN_SDK_ROOT}/examples/QNN/converter/models/input_list_float.txt \
  --op_packages ${QNN_SDK_ROOT}/examples/QNN/OpPackage/CPU/libs/x86_64-linux-clang/libQnnCpuOpPackageExample.so:QnnOpPackage_interfaceProvider
For more tool help, run:
$ qnn-sample-app --help
Linux (Yocto Based)
The qnn-sample-app executable generated in the build step can be used to execute a model using any QNN backend. To support Yocto Kirkstone based devices, backends are available for aarch64-oe-linux-gcc11.2. To run the executable, follow the same steps as in the Linux section above.
For more tool help, run:
$ qnn-sample-app --help
Windows
Warning
AsyncExecution and MultiCore features are not supported on the Windows platform. Running sample apps with these features might fail at runtime.
The qnn-sample-app.exe executable generated in the build step can be used to execute a model using any QNN backend available for the windows-x86_64 and aarch64-windows platforms. It is very similar to executing qnn-net-run. Refer to the general QNN documentation here to see how to run qnn-net-run. Simply replacing qnn-net-run with qnn-sample-app.exe in the tutorials should work.
For example, let’s consider execution of the Inception_v3 model on the CPU backend on a Windows host from Converting and executing a CNN model with QNN. Replacing qnn-net-run with qnn-sample-app.exe should produce the same results:
$ & "<QNN_SDK_ROOT>/bin/envsetup.ps1"
$ cd $QNN_SDK_ROOT/examples/QNN/converter/models
$ qnn-sample-app.exe \
  --backend QnnCpu.dll \
  --model Inception_v3.dll \
  --input_list $QNN_SDK_ROOT/examples/QNN/converter/models/input_list_float.txt
For more tool help, run:
$ qnn-sample-app.exe --help
Hexagon
The libQnnSampleAppv69.so shared library generated in the build step can be used to execute a model using a QNN backend available for the same Hexagon architecture. It is executed using run_main_on_hexagon on the device.
DEVICE_PATH refers to the path on the device where the required files are pushed.
Push the required files to the device (the commands below are for Android devices only):
$ adb push ${HEXAGON_SDK_ROOT}/libs/run_main_on_hexagon/ship/android_aarch64/run_main_on_hexagon /vendor/bin/run_main_on_hexagon
$ adb push ${HEXAGON_SDK_ROOT}/libs/run_main_on_hexagon/ship/hexagon_toolv87_v69/librun_main_on_hexagon_skel.so /vendor/lib/rfsa/adsp/
$ adb push ${QNN_SDK_ROOT}/examples/QNN/SampleApp/SampleApp/bin/hexagon/libQnnSampleAppv69.so ${DEVICE_PATH}
$ adb push ${QNN_SDK_ROOT}/lib/hexagon-v69/unsigned/libQnnHtpV69.so ${DEVICE_PATH}
$ adb push ${QNN_SDK_ROOT}/lib/hexagon-v69/unsigned/libQnnSystem.so ${DEVICE_PATH}
$ adb push qnnmodel.serialized.bin ${DEVICE_PATH}
$ adb push input_list.txt ${DEVICE_PATH}
Run the command below to execute QnnSampleApp on the device.
Note
run_main_on_hexagon requires specifying the DSP domain on which to offload the program. In the Hexagon QnnSampleApp case, the cDSP domain is used, which is expressed by the numeric domain ID 3.
$ cd /vendor/bin
$ ./run_main_on_hexagon 3 ${DEVICE_PATH}/libQnnSampleAppv69.so \
  --backend ${DEVICE_PATH}/libQnnHtpV69.so \
  --system_library ${DEVICE_PATH}/libQnnSystem.so \
  --retrieve_context ${DEVICE_PATH}/qnnmodel.serialized.bin \
  --input_list ${DEVICE_PATH}/input_list.txt
Building and Running LPAI qnn-sample-app¶
Overview¶
This guide provides step-by-step instructions for building and running the qnn-sample-app using the LPAI backend. The sample demonstrates how to integrate and execute the QNN LPAI backend on both Linux and Hexagon DSP platforms. It includes environment setup, compilation, and execution procedures tailored for different targets.
Reference Implementation¶
The reference implementation is located at:
${QNN_SDK_ROOT}/examples/Qnn/SampleApp/SampleAppLPAI
This directory contains source code and build scripts for the sample application. It serves as a practical example of how to use the LPAI backend with QNN SDK.
Prerequisites¶
Before proceeding, ensure the following prerequisites are met:
General Requirements
QNN SDK installed and QNN_SDK_ROOT environment variable set.
Hexagon SDK installed and HEXAGON_SDK_ROOT environment variable set.
Android NDK (required for Android targets).
GCC toolchain (required for Yocto-based Linux targets).
Clang compiler (required for x86 Linux builds).
adb tool (for pushing binaries to Android devices).
Root access on target device (for Hexagon execution).
Environment Setup¶
Linux Setup¶
To build qnn-sample-app on Linux, you need:
Clang Compiler
ndk-build (only for Android targets)
If Clang is not available in your system PATH, use the SDK-provided script to install it:
$ sudo bash ${QNN_SDK_ROOT}/bin/check-linux-dependency.sh
To verify the environment setup:
$ ${QNN_SDK_ROOT}/bin/envcheck -n
Note
qnn-sample-app has been verified with Android NDK version r25c.
GCC Toolchain (Yocto)¶
For Yocto-based Linux targets, the GCC compiler is required. The QNN SDK libraries are compiled using GCC 11.2 to support Yocto Kirkstone.
To obtain the correct toolchain:
Follow Qualcomm’s build guide: https://docs.qualcomm.com/bundle/publicresource/topics/80-70020-254/how_to.html#generate-an-esdk
After building the eSDK, locate the toolchain script:
<WORKSPACE DIR>/build-qcom-wayland/tmp-glibc/deploy/sdk/qcom-wayland-x86_64-qcom-console-image-armv8-2a-qcm6490-toolchain-ext-0.0.sh
Extract the toolchain:
./qcom-wayland-x86_64-qcom-console-image-armv8-2a-qcm6490-toolchain-ext-0.0.sh
Hexagon Setup¶
To build for Hexagon DSP, the Hexagon SDK must be installed and configured.
Set up the environment:
$ source ${HEXAGON_SDK_ROOT}/setup_sdk_env.source
Refer to the LPAI (Low Power AI) section of the Linux setup documentation for detailed setup instructions.
Compilation¶
Linux Build¶
To build for x86 Linux:
cd ${QNN_SDK_ROOT}/examples/Qnn/SampleApp/SampleAppLPAI
make all_x86
Hexagon Build¶
To build for Hexagon DSP (e.g., v79 architecture):
cd ${QNN_SDK_ROOT}/examples/Qnn/SampleApp/SampleAppLPAI
make hexagon V=v79
Execution¶
Linux Execution¶
Set the library path and run the sample app:
export LD_LIBRARY_PATH=${QNN_SDK_ROOT}/lib/x86_64-linux-clang
${QNN_SDK_ROOT}/examples/QNN/SampleApp/SampleAppLPAI/bin/x86_64-linux-clang/qnn-sample-app \
--backend ${QNN_SDK_ROOT}/lib/x86_64-linux-clang/libQnnLpai.so \
--retrieve_context qnnmodel.serialized.bin \
--systemlib ${QNN_SDK_ROOT}/lib/x86_64-linux-clang/libQnnSystem.so
Hexagon Execution¶
Note
The run_main_on_hexagon utility is used to offload execution to the Hexagon DSP. In this case, the aDSP domain is used, which corresponds to domain ID 0. The unsigned_pd=0 argument specifies the use of a signed PD.
Note
To execute the LPAI backend on an Android device, the following conditions must be met:
The following Hexagon artifacts must be signed by the client:
${HEXAGON_SDK_ROOT}/libs/run_main_on_hexagon/ship/hexagon_toolv19_v79/librun_main_on_hexagon_skel.so
${QNN_SDK_ROOT}/examples/QNN/SampleApp/SampleApp/bin/hexagon/libQnnSampleAppv79.so
${QNN_SDK_ROOT}/lib/lpai-v6/unsigned/libQnnLpai.so
${QNN_SDK_ROOT}/lib/hexagon-v79/unsigned/libQnnSystem.so
Note
The run_main_on_hexagon binary from the Hexagon SDK should not be signed.
qnn-sample-app must be executed with root permissions.
Steps to Execute:
DEVICE_PATH refers to the path on the device where the required files are pushed.
Push binaries to the device:
adb push ${HEXAGON_SDK_ROOT}/libs/run_main_on_hexagon/ship/android_aarch64/run_main_on_hexagon /vendor/bin/run_main_on_hexagon
adb push ${HEXAGON_SDK_ROOT}/libs/run_main_on_hexagon/ship/hexagon_toolv19_v79/librun_main_on_hexagon_skel.so /vendor/lib/rfsa/adsp/
adb push ${QNN_SDK_ROOT}/examples/QNN/SampleApp/SampleApp/bin/hexagon/libQnnSampleAppv79.so ${DEVICE_PATH}
adb push ${QNN_SDK_ROOT}/lib/lpai-v6/unsigned/libQnnLpai.so ${DEVICE_PATH}
adb push ${QNN_SDK_ROOT}/lib/hexagon-v79/unsigned/libQnnSystem.so ${DEVICE_PATH}
adb push qnnmodel.serialized.bin ${DEVICE_PATH}
Run the sample app:
cd /vendor/bin
./run_main_on_hexagon 0 ${DEVICE_PATH}/libQnnSampleAppv79.so \
unsigned_pd=0 \
--backend ${DEVICE_PATH}/libQnnLpai.so \
--systemlib ${DEVICE_PATH}/libQnnSystem.so \
--retrieve_context ${DEVICE_PATH}/qnnmodel.serialized.bin