Struct Qnn_BwAxisScaleOffsetMapped_t

Defined in File QnnTypes.h

Struct Documentation

struct Qnn_BwAxisScaleOffsetMapped_t

A struct to express per-axis quantization parameters as a collection of scales, offsets, bitwidth, and mapping.

bitwidth must be > 0 and applies uniformly to all axes. It expresses the true number of bits used to quantize the value, which may differ from the bitwidth of the tensor indicated by its data type. For example, the quantization encoding for a tensor of type QNN_DATATYPE_UFIXED_POINT_8 that is quantized to 4-bit precision may be expressed by setting bitwidth = 4. In such cases, data quantized to a lower precision still occupies, in unpacked form, the full extent of bits allotted to the tensor by its data type.

Tensor elements are expected to occupy the least significant bits of the total size allotted to the data type, and all bits above the specified bitwidth are ignored. For example, an 8-bit data type tensor quantized to 4-bit precision is interpreted as a 4-bit value contained in the lower 4 bits of each element, and the upper 4 bits are ignored. For signed data types, the value is interpreted as a two's complement integer whose sign bit is the most significant bit permitted by the specified bitwidth. For example, -3 is represented as 0b11111101 as a signed 8-bit integer, but can also be represented as 0b00001101 as a signed 4-bit integer stored in an 8-bit container. Either representation is valid for expressing -3 as a 4-bit signed integer in an 8-bit container, and both are treated identically because the upper 4 bits are ignored.
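The interpretation rule above can be sketched in C. This is a minimal illustration, not part of the QNN API: the helper name `decode_signed` is hypothetical, and it assumes an 8-bit container with 1 <= bitwidth <= 8.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a QNN function): interpret the low `bitwidth`
 * bits of an 8-bit container as a two's complement integer. Bits above
 * `bitwidth` are ignored, as described in the text. */
static int32_t decode_signed(uint8_t container, uint32_t bitwidth) {
    assert(bitwidth >= 1 && bitwidth <= 8);
    uint32_t mask = (1u << bitwidth) - 1u;      /* keeps only the low bits */
    uint32_t value = container & mask;
    uint32_t sign_bit = 1u << (bitwidth - 1u);  /* MSB within the bitwidth */
    /* XOR-then-subtract sign-extends without relying on signed shifts. */
    return (int32_t)(value ^ sign_bit) - (int32_t)sign_bit;
}
```

With bitwidth = 4, both `decode_signed(0xFD, 4)` and `decode_signed(0x0D, 4)` yield -3, matching the two container encodings given in the example.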

Public Members

uint32_t bitwidth: bitwidth must be <= the number of bits specified by the data type of the tensor

int32_t axis

Qnn_QuantizationEncodingMapping_t mapping: Specifies the mapping from low-bitwidth values to quantized values. For example, for custom symmetric encodings with bitwidth=2 and mapping=QNN_QUANTIZATION_ENCODING_MAPPING_LINEAR_SYMMETRIC_EXCLUDE_ZERO, the signed values {-2, -1, 0, 1} map to quantized values of {-1.5, -0.5, 0.5, 1.5}, such that dequantized_values = scale * {-1.5, -0.5, 0.5, 1.5}. Backends are free to manage the integer representation at execution time. For the above example, if 4-bit values are used at execution time, the backend may use the mapping {-2, -1, 0, 1} -> {-3, -1, 1, 3}, adjusting the scale to scale/2.
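The two equivalent forms in the bitwidth=2 example can be sketched as follows. Both function names are hypothetical, and the closed forms q + 0.5 and 2q + 1 are inferred from the listed values rather than taken from the header, so treat them as an assumption.

```c
#include <stdint.h>

/* Hypothetical helper: dequantize a signed low-bitwidth value q under the
 * exclude-zero symmetric mapping of the example, where
 * {-2, -1, 0, 1} -> {-1.5, -0.5, 0.5, 1.5}, i.e. q -> q + 0.5 (inferred). */
static float dequant_exclude_zero(int32_t q, float scale) {
    return scale * ((float)q + 0.5f);
}

/* Integer form a backend might use instead:
 * {-2, -1, 0, 1} -> {-3, -1, 1, 3}, i.e. q -> 2q + 1, with scale / 2. */
static float dequant_exclude_zero_int(int32_t q, float scale) {
    int32_t mapped = 2 * q + 1;
    return (scale / 2.0f) * (float)mapped;
}
```

Both functions produce the same dequantized value for each q in {-2, -1, 0, 1}, which is why a backend can choose the integer form without changing results.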

uint32_t numElements: numElements applies to both scales and offsets, which are expected to match one-to-one

float *scales: scales must be strictly positive

int32_t *offsets: offsets must match scales in dimension, except that it may be NULL to indicate that the value is symmetrically quantized and hence offset = 0
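The member constraints above could be checked along these lines. This is a sketch under stated assumptions: `BwAxisParamsSketch` is a local stand-in that mirrors the documented members (the real definition lives in QnnTypes.h), `mapping` is reduced to an `int`, and `params_valid` and `offset_at` are hypothetical helpers. Rejecting a NULL `scales` pointer is also an assumption, since the text only states that scales must be strictly positive.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-in mirroring the documented members; not the real type. */
typedef struct {
    uint32_t bitwidth;
    int32_t axis;
    int mapping;           /* placeholder for Qnn_QuantizationEncodingMapping_t */
    uint32_t numElements;
    float *scales;
    int32_t *offsets;      /* may be NULL for symmetric quantization */
} BwAxisParamsSketch;

/* Returns 1 if the documented constraints hold, 0 otherwise.
 * container_bits is the bit count implied by the tensor data type. */
static int params_valid(const BwAxisParamsSketch *p, uint32_t container_bits) {
    if (p == NULL || p->bitwidth == 0 || p->bitwidth > container_bits) return 0;
    if (p->scales == NULL) return 0;            /* assumption, see lead-in */
    for (uint32_t i = 0; i < p->numElements; ++i) {
        if (!(p->scales[i] > 0.0f)) return 0;   /* also rejects NaN */
    }
    return 1;
}

/* NULL offsets mean symmetric quantization, so the offset is 0. */
static int32_t offset_at(const BwAxisParamsSketch *p, uint32_t i) {
    return p->offsets ? p->offsets[i] : 0;
}
```

Routing every offset read through a helper like `offset_at` keeps the NULL case in one place, so callers do not need to special-case symmetric encodings.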