ARM Cortex A7 NEON instruction acceleration
-
Repost: NEON Calendar 2020
-
NEON simulator: https://szeged.github.io/nevada/
-
Repost: adding two arrays

    #include <arm_neon.h>
    #include <iostream>

    int main(int argc, char** argv)
    {
        unsigned char src0[] = {  0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15};
        unsigned char src1[] = {100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115};
        unsigned char dst [] = {  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0};

        uint8x16_t s0 = vld1q_u8(src0);   // load 16 bytes from src0
        uint8x16_t s1 = vld1q_u8(src1);   // load 16 bytes from src1
        uint8x16_t d  = vaddq_u8(s0, s1); // the addition happens here
        vst1q_u8(dst, d);                 // store the 16 sums to dst

        for (auto i = 0; i < 16; i++)
        {
            std::cout << i << '\t' << (int)src0[i] << '\t' << (int)src1[i] << '\t' << (int)dst[i] << std::endl;
        }
        return 0;
    }
Build command:
arm-linux-gnueabihf-g++ -mfpu=neon -o test1 test1.cpp
Output:
    0	0	100	100
    1	1	101	102
    2	2	102	104
    3	3	103	106
    4	4	104	108
    5	5	105	110
    6	6	106	112
    7	7	107	114
    8	8	108	116
    9	9	109	118
    10	10	110	120
    11	11	111	122
    12	12	112	124
    13	13	113	126
    14	14	114	128
    15	15	115	130
I gave it a try, and sure enough it works well.
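The sample handles exactly 16 bytes, which is one `uint8x16_t` register. For arrays of other lengths, one common approach is to process 16 elements per iteration and finish the remainder with a scalar loop. The following is only a minimal sketch of that idea, not code from the reposted article: the helper name `add_u8` is hypothetical, and the `__ARM_NEON` guard is an assumption added so that the same file also builds (scalar only) on a non-ARM host.

```cpp
#include <cstddef>
#include <cstdint>
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

// Hypothetical helper (not from the original post):
// dst[i] = src0[i] + src1[i] for n elements.
// Like vaddq_u8, the sum wraps around modulo 256 on overflow.
void add_u8(const std::uint8_t* src0, const std::uint8_t* src1,
            std::uint8_t* dst, std::size_t n)
{
    std::size_t i = 0;
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    // Main loop: 16 lanes per iteration, same intrinsics as the sample.
    for (; i + 16 <= n; i += 16)
    {
        uint8x16_t s0 = vld1q_u8(src0 + i);
        uint8x16_t s1 = vld1q_u8(src1 + i);
        vst1q_u8(dst + i, vaddq_u8(s0, s1));
    }
#endif
    // Scalar tail: the last n % 16 elements
    // (or the whole array when NEON is not available).
    for (; i < n; ++i)
    {
        dst[i] = static_cast<std::uint8_t>(src0[i] + src1[i]);
    }
}
```

Calling `add_u8(src0, src1, dst, 16)` with the arrays from the sample should give the same `dst` values as shown in the output. If clamping at 255 is wanted instead of wrap-around, the saturating intrinsic `vqaddq_u8` can replace `vaddq_u8`, in which case the scalar tail would need the same clamping.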
-
@yuzukitsuru said in ARM Cortex A7 NEON instruction acceleration:
The addition happens here