V853 NPU model quantization: some network nodes are left unquantized

carpediem LV 4

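I am porting the gaitset model to the V853, but during quantization some network nodes were left unquantized, and the model then fails to run on the board's NPU.
This is the quantization command: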

pegasus quantize --model gaitset-sim.json --model-data gaitset-sim.data --batch-size 1 --device CPU --with-input-meta gaitset-sim_inputmeta.yml --rebuild --model-quantize gaitset-sim.quantize --quantizer asymmetric_affine --qtype uint8

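After quantization I ran inference and found in the log that some nodes are reported as plain float32 rather than float32(fake asymmetric_affine), meaning they were not quantized. Because the NPU can only process uint8, int8, and int16 data and cannot handle float32 directly, I suspect that the incomplete quantization is what makes the model fail on the board.
Below is the log from the inference run. The tensors from MatMul_MatMul_323_2 onward are not quantized: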

D Acuity output shape(add): (1 8 128)
D Tensor @Add_Add_306_13:out0 type: float32(fake asymmetric_affine)
D Process Add_Add_320_15 ...
D Acuity output shape(add): (1 16 128)
D Tensor @Add_Add_320_15:out0 type: float32(fake asymmetric_affine)
D Process Concat_Concat_321_5 ...
D Acuity output shape(concat): (1 62 128)
D Tensor @Concat_Concat_321_5:out0 type: float32(fake asymmetric_affine)
D Process Transpose_Transpose_322_3 ...
D Acuity output shape(permute): (62 1 128)
D Tensor @Transpose_Transpose_322_3:out0 type: float32(fake asymmetric_affine)
D Process MatMul_MatMul_323_2 ...
D Acuity output shape(matmul): (62 1 256)
D Tensor @MatMul_MatMul_323_2:out0 type: float32
D Process Transpose_Transpose_324_1 ...
D Acuity output shape(permute): (1 62 256)
D Tensor @Transpose_Transpose_324_1:out0 type: float32
D Process attach_Transpose_Transpose_324/out0_0 ...
D Acuity output shape(output): (1 62 256)
D Tensor @attach_Transpose_Transpose_324/out0_0:out0 type: float32
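
For longer logs, the unquantized tensors can be picked out automatically instead of by eye. The following is a minimal sketch that only assumes the `D Tensor @<name>:out<N> type: <dtype>` line format seen in the excerpt above; the function name `find_unquantized` and the `SAMPLE` text are illustrative and are not part of the Acuity toolkit.

```python
import re

# Matches lines such as:
#   D Tensor @MatMul_MatMul_323_2:out0 type: float32
# The dtype group stops at the first space, so a quantized tensor yields
# "float32(fake" and an unquantized one yields exactly "float32".
TENSOR_RE = re.compile(r"D Tensor @(?P<name>\S+):out\d+ type: (?P<dtype>\S+)")


def find_unquantized(log_text):
    """Return the names of tensors whose reported type is plain float32."""
    names = []
    for line in log_text.splitlines():
        match = TENSOR_RE.search(line)
        if match and match.group("dtype") == "float32":
            names.append(match.group("name"))
    return names


# A few lines copied from the inference log above.
SAMPLE = """\
D Tensor @Transpose_Transpose_322_3:out0 type: float32(fake asymmetric_affine)
D Process MatMul_MatMul_323_2 ...
D Tensor @MatMul_MatMul_323_2:out0 type: float32
D Tensor @Transpose_Transpose_324_1:out0 type: float32
D Tensor @attach_Transpose_Transpose_324/out0_0:out0 type: float32
"""

if __name__ == "__main__":
    for name in find_unquantized(SAMPLE):
        print(name)
```

On the excerpt above this lists the MatMul, the final Transpose, and the output node, which matches the point where the log switches from float32(fake asymmetric_affine) to float32.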

Below is the error log from running the NPU model on the board:

root@TinaLinux:~# gaitset network_binary.nb 0000.bin
Usage:
    nbg_name input_data1 input_data2...
[0xb6ffa560]vip_init[104],
The version of Viplite is: 1.8.0-0-AW-2022-04-21
Create Neural Network: 24.63ms or 24631.92us
As input, scale=0.003922, zeroPoint=0
data_format=2, num of dimension=4
size=44,64,100,1
data_format=1
, num of dimension=3
size=256,62,1,0
Input size match for 0000.bin, file data size:281600, expected:281600
Start run graph [1] times...
[0xb6ffa560]gcvip_os_call_kernel[344], fail to ioctl vipcore, command[4]:CMD_WAIT, status=-1
[0xb6ffa560]gcvip_user_wait[484], failed to check status=-1
[0xb6ffa560]gcvip_capture_init[1240], catpure file name .//viplite_hang_capture_3cf88a_457_b6ffa560.log
[0xb6ffa560]gcvip_wait_network_segment[801], wait network=gaitset-simprj_NCHW timeout, cmd size=0x9d38, phy=0x48d6e000
[0xb6ffa560]gcvip_wait_network[2975], failed to wait network finish in gcvip wait network
[0xb6ffa560]gcvip_run_network[3021], failed to wait network finish in run network status=-1
Error: main.c: vnn_RunNeuralNetwork at 161
Error: main.c: main at 231
root@TinaLinux:~#
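
The input side of this log looks consistent with uint8 quantization: the reported scale of 0.003922 is roughly 1/255, and the expected size of 281600 bytes equals 44 x 64 x 100 x 1 elements at one byte each. The sketch below reproduces that arithmetic. It assumes the usual asymmetric affine formula q = clamp(round(x / scale) + zero_point, 0, 255); whether the toolkit rounds in exactly this way is an assumption, and the helper names are illustrative.

```python
import math


def quantize_uint8(x, scale, zero_point):
    """Map a float to uint8 with the usual asymmetric affine formula (assumed)."""
    q = round(x / scale) + zero_point
    return max(0, min(255, q))  # clamp to the uint8 range


def expected_bytes(dims, bytes_per_element=1):
    """Size in bytes of a tensor with the given dimensions."""
    return math.prod(dims) * bytes_per_element


# Values taken from the board log above.
SCALE = 0.003922
ZERO_POINT = 0
INPUT_DIMS = (44, 64, 100, 1)

if __name__ == "__main__":
    print(quantize_uint8(0.0, SCALE, ZERO_POINT))  # lowest representable value
    print(quantize_uint8(1.0, SCALE, ZERO_POINT))  # close to the top of the range
    print(expected_bytes(INPUT_DIMS))              # compare with "expected:281600"
```

Since the input size check passes, the input file itself does not appear to be the cause of the timeout.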

Given all of the above, does anyone know how to solve this?

WhycanService LV 8

An unquantized node means that the NN matrix unit does not support that node; it is emulated on the PPU instead.