Struct Qnn_BwBlockMapped_t

Defined in File QnnTypes.h

Struct Documentation

struct Qnn_BwBlockMapped_t

A struct to express block quantization parameters. A tensor is divided into blocks of size blockSize, where blockSize is an array of length rank.

Note

The number of scaleOffset entries (i.e. the number of blocks) must equal ceil(dimensions[0] / blockSize[0]) * ceil(dimensions[1] / blockSize[1]) * ... * ceil(dimensions[rank-1] / blockSize[rank-1]).
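The block-count rule in the note can be checked with a few lines of C. The following is a minimal sketch, not part of the QNN API: the helper name expected_num_blocks and its parameter names are hypothetical, and it assumes the tensor dimensions are available as a plain uint32_t array of length rank.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of the QNN API): computes the number of
 * blocks, and hence the required number of scaleOffset entries, for a tensor
 * with the given dimensions and per-dimension block sizes.
 * Returns 0 if any block size is 0. */
static size_t expected_num_blocks(const uint32_t *dimensions,
                                  const uint32_t *block_size,
                                  uint32_t rank) {
    assert(dimensions != NULL && block_size != NULL);
    size_t num_blocks = 1;
    for (uint32_t i = 0; i < rank; ++i) {
        if (block_size[i] == 0) {
            return 0; /* invalid block size */
        }
        /* Integer ceiling division: ceil(dimensions[i] / block_size[i]). */
        size_t blocks_along_dim = dimensions[i] / block_size[i]
                                + (dimensions[i] % block_size[i] != 0);
        num_blocks *= blocks_along_dim;
    }
    return num_blocks;
}
```

For example, with dimensions {64, 100} and blockSize {16, 32}, the helper yields 4 * 4 = 16 blocks, because the partial block along the second dimension still counts as a full block.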

Public Members

uint32_t bitwidth: Must be less than or equal to the number of bits specified by the data type of the tensor.

Qnn_QuantizationEncodingMapping_t mapping: Specifies the mapping from low-bitwidth values to quantized values. For example, for a custom symmetric encoding with bitwidth=2 and mapping=QNN_QUANTIZATION_ENCODING_MAPPING_LINEAR_SYMMETRIC_EXCLUDE_ZERO, the signed values {-2, -1, 0, 1} map to the quantized values {-1.5, -0.5, 0.5, 1.5}, so that dequantized_values = scale * {-1.5, -0.5, 0.5, 1.5}. Backends are free to manage the integer representation at execution time. In the example above, if 4-bit values are used at execution time, the backend may use the mapping {-2, -1, 0, 1} -> {-3, -1, 1, 3} and adjust the scale to scale/2.
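The arithmetic in the bitwidth=2 example can be sketched in C as follows. These helpers are hypothetical and are not part of the QNN API; they only reproduce the numbers from the example, assume a single scale value, and ignore any offset. They show that the half-integer mapping q -> q + 0.5 with scale and the odd-integer mapping q -> 2q + 1 with scale/2 dequantize to the same values.

```c
#include <stdint.h>

/* Hypothetical helpers, not part of the QNN API. */

/* Dequantize a stored signed value q using the example's mapping
 * q -> q + 0.5, i.e. {-2, -1, 0, 1} -> {-1.5, -0.5, 0.5, 1.5}. */
static float dequant_exclude_zero(int32_t q, float scale) {
    return scale * ((float)q + 0.5f);
}

/* Integer representation a backend might use at execution time:
 * q -> 2q + 1, i.e. {-2, -1, 0, 1} -> {-3, -1, 1, 3}. */
static int32_t to_odd_integer(int32_t q) {
    return 2 * q + 1;
}

/* Dequantize via the odd-integer form, with the scale halved to compensate. */
static float dequant_odd_integer(int32_t q, float scale) {
    return (scale / 2.0f) * (float)to_odd_integer(q);
}
```

Both dequantization paths agree because scale * (q + 0.5) = (scale / 2) * (2q + 1); the second form simply avoids fractional quantized values.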

uint32_t *blockSize: Dimensions of the block, in number of tensor elements. Pointer to an array of size RANK(Weight); each element specifies the block size along the corresponding dimension.

Qnn_ScaleOffset_t *scaleOffset: Array of numBlocks scale-offset pairs, one per block.