    Allwinner Online Developer Forum

    Best posts by yao0718

    • Re: T113-S3 ARM and DSP Benchmark

      xtensa-hifidsp-gcc.tar.bz2
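      For learning and research purposes only, with no warranty of any kind. For commercial use, please contact Allwinner to purchase Cadence Xtensa Xplorer.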

      The C library headers have been replaced with the XCLIB headers from the SDK. If you want to use newlib, rebuild and install newlib yourself.

      Posted in: Other Allwinner Chips discussion board
      yao0718
    • T113-S3 ARM and DSP Benchmark

      ARM: dual-core SMP, OS RTEMS 5, clock 1.08 GHz

       [/] # coremark
      2K performance run parameters for coremark.
      CoreMark Size    : 666
      Total ticks      : 12680
      Total time (secs): 12.680000
      Iterations/Sec   : 3154.574132
      Iterations       : 40000
      Compiler version : GCC7.5.0 20191114 (RTEMS 5, RSB 5.not_released, Newlib 7947581)
      Compiler flags   : -O2
      Memory location  : Please put data memory location here
                              (e.g. code in flash, data on heap etc)
      seedcrc          : 0xe9f5
      [0]crclist       : 0xe714
      [0]crcmatrix     : 0x1fd7
      [0]crcstate      : 0x8e3a
      [0]crcfinal      : 0x25b5
      Correct operation validated. See README.md for run and reporting rules.
      CoreMark 1.0 : 3154.574132 / GCC7.5.0 20191114 (RTEMS 5, RSB 5.not_released, Newlib 7947581) -O2 / Heap
       [/] # dhry 10000000
      
      Dhrystone Benchmark, Version 2.1 (Language: C)
      
      Program compiled without 'register' attribute
      
      Execution starts, 10000000 runs through Dhrystone
      Execution ends
      
      Final values of the variables used in the benchmark:
      
      Int_Glob:            5
              should be:   5
      Bool_Glob:           1
              should be:   1
      Ch_1_Glob:           A
              should be:   A
      Ch_2_Glob:           B
              should be:   B
      Arr_1_Glob[8]:       7
              should be:   7
      Arr_2_Glob[8][7]:    10000010
              should be:   Number_Of_Runs + 10
      Ptr_Glob->
        Ptr_Comp:          1080941968
              should be:   (implementation-dependent)
        Discr:             0
              should be:   0
        Enum_Comp:         2
              should be:   2
        Int_Comp:          17
              should be:   17
        Str_Comp:          DHRYSTONE PROGRAM, SOME STRING
              should be:   DHRYSTONE PROGRAM, SOME STRING
      Next_Ptr_Glob->
        Ptr_Comp:          1080941968
              should be:   (implementation-dependent), same as above
        Discr:             0
              should be:   0
        Enum_Comp:         1
              should be:   1
        Int_Comp:          18
              should be:   18
        Str_Comp:          DHRYSTONE PROGRAM, SOME STRING
              should be:   DHRYSTONE PROGRAM, SOME STRING
      Int_1_Loc:           5
              should be:   5
      Int_2_Loc:           13
              should be:   13
      Int_3_Loc:           7
              should be:   7
      Enum_Loc:            1
              should be:   1
      Str_1_Loc:           DHRYSTONE PROGRAM, 1'ST STRING
              should be:   DHRYSTONE PROGRAM, 1'ST STRING
      Str_2_Loc:           DHRYSTONE PROGRAM, 2'ND STRING
              should be:   DHRYSTONE PROGRAM, 2'ND STRING
      
      Microseconds for one run through Dhrystone:    0.3 
      Dhrystones per Second:                      3527884.4 
      DMIPS:                                      2007.90 
      
       [/] # linpack 100
      Rolled Double Precision Linpack Benchmark - PC Version in 'C/C++'
      
      Compiler     rtems arm-rtems5-gcc 7.5.0 20191114
      Optimisation -O2
      
      norm resid      resid           machep         x[0]-1          x[n-1]-1
         1.7    7.41628980e-14   2.22044605e-16  -1.49880108e-14  -1.89848137e-14
      
      Times are reported for matrices of order          100
      1 pass times for array with leading dimension of  201
      
            dgefa      dgesl      total     Mflops       unit      ratio
          0.00519    0.00016    0.00535     128.35     0.0156     0.0955
      
      Calculating matgen overhead
             100 times   0.04 seconds
            1000 times   0.37 seconds
            2000 times   0.74 seconds
            4000 times   1.48 seconds
            8000 times   2.96 seconds
           16000 times   5.92 seconds
      Overhead for 1 matgen      0.00037 seconds
      
      Calculating matgen/dgefa passes for 5 seconds
             100 times   0.55 seconds
             200 times   1.09 seconds
             400 times   2.18 seconds
             800 times   4.36 seconds
            1600 times   8.72 seconds
      Passes used        917 
      
      Times for array with leading dimension of 201
      
            dgefa      dgesl      total     Mflops       unit      ratio
          0.00508    0.00016    0.00524     131.12     0.0153     0.0935
          0.00508    0.00016    0.00524     131.12     0.0153     0.0935
          0.00508    0.00016    0.00524     131.11     0.0153     0.0935
          0.00508    0.00016    0.00524     131.13     0.0153     0.0935
          0.00508    0.00016    0.00524     131.13     0.0153     0.0935
      Average                               131.12
      
      Calculating matgen2 overhead
      Overhead for 1 matgen      0.00037 seconds
      
      Times for array with leading dimension of 200
      
            dgefa      dgesl      total     Mflops       unit      ratio
          0.00508    0.00016    0.00523     131.22     0.0152     0.0934
          0.00508    0.00016    0.00523     131.22     0.0152     0.0934
          0.00508    0.00016    0.00523     131.21     0.0152     0.0935
          0.00508    0.00016    0.00523     131.21     0.0152     0.0935
          0.00508    0.00016    0.00523     131.21     0.0152     0.0935
      Average                               131.21
      
      Rolled Double  Precision      131.12 Mflops 
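
As a quick cross-check of the ARM log above, the headline numbers can be recomputed from the raw fields it prints. This is only a minimal sketch: the function names are made up for illustration, the 1757 divisor is the conventional VAX 11/780 baseline for DMIPS, and the operation count 2n^3/3 + 2n^2 is the one commonly used by the C Linpack source (assumed here, not stated in the post).

```python
# Recompute the headline figures from the raw fields of the ARM log.
# Function names are illustrative; they are not part of any benchmark suite.

VAX_DHRYSTONES_PER_SEC = 1757.0  # conventional DMIPS baseline (VAX 11/780)


def coremark_score(iterations: int, total_secs: float) -> float:
    """CoreMark score: iterations completed per second."""
    return iterations / total_secs


def dmips(dhrystones_per_sec: float) -> float:
    """DMIPS: Dhrystones per second divided by the VAX 11/780 baseline."""
    return dhrystones_per_sec / VAX_DHRYSTONES_PER_SEC


def linpack_mflops(n: int, total_secs: float) -> float:
    """MFLOPS for one factor+solve of order n (assumed op count 2n^3/3 + 2n^2)."""
    ops = 2.0 * n**3 / 3.0 + 2.0 * n**2
    return ops / total_secs / 1e6


if __name__ == "__main__":
    # Inputs copied from the ARM log: iterations, times, Dhrystones/s.
    print(f"CoreMark : {coremark_score(40000, 12.680):.2f} iterations/s")
    print(f"DMIPS    : {dmips(3527884.4):.2f}")
    print(f"Linpack  : {linpack_mflops(100, 0.00524):.2f} MFLOPS")
```

The recomputed values agree with the log to within the rounding of the printed times, so the ARM figures are at least internally consistent.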
      
      
      
      **HiFi4 DSP, clock 600 MHz, OS FreeRTOS**
      
      Dhrystone Benchmark, Version 2.1 (Language: C)
      
      Program compiled without 'register' attribute
      
      Execution starts, 10000000 runs through Dhrystone
      Execution ends
      
      Final values of the variables used in the benchmark:
      
      Int_Glob:            5
              should be:   5
      Bool_Glob:           1
              should be:   1
      Ch_1_Glob:           A
              should be:   A
      Ch_2_Glob:           B
              should be:   B
      Arr_1_Glob[8]:       7
              should be:   7
      Arr_2_Glob[8][7]:    10000010
              should be:   Number_Of_Runs + 10
      Ptr_Glob->
        Ptr_Comp:          933400000
              should be:   (implementation-dependent)
        Discr:             0
              should be:   0
        Enum_Comp:         2
              should be:   2
        Int_Comp:          17
              should be:   17
        Str_Comp:          DHRYSTONE PROGRAM, SOME STRING
              should be:   DHRYSTONE PROGRAM, SOME STRING
      Next_Ptr_Glob->
        Ptr_Comp:          933400000
              should be:   (implementation-dependent), same as above
        Discr:             0
              should be:   0
        Enum_Comp:         1
              should be:   1
        Int_Comp:          18
              should be:   18
        Str_Comp:          DHRYSTONE PROGRAM, SOME STRING
              should be:   DHRYSTONE PROGRAM, SOME STRING
      Int_1_Loc:           5
              should be:   5
      Int_2_Loc:           13
              should be:   13
      Int_3_Loc:           7
              should be:   7
      Enum_Loc:            1
              should be:   1
      Str_1_Loc:           DHRYSTONE PROGRAM, 1'ST STRING
              should be:   DHRYSTONE PROGRAM, 1'ST STRING
      Str_2_Loc:           DHRYSTONE PROGRAM, 2'ND STRING
              should be:   DHRYSTONE PROGRAM, 2'ND STRING
      
      Microseconds for one run through Dhrystone:    0.8 
      Dhrystones per Second:                      1243935.8 
      DMIPS:                                      707.99 
      
      Rolled Double Precision Linpack Benchmark - PC Version in 'C/C++'
      
      Compiler     xtensa hifi4 dsp xtensa-elf-gcc 10.2
      Optimisation -O2
      
      norm resid      resid           machep         x[0]-1          x[n-1]-1
         1.7    7.41628980e-14   2.22044605e-16  -1.49880108e-14  -1.89848137e-14
      
      Times are reported for matrices of order          100
      1 pass times for array with leading dimension of  201
      
            dgefa      dgesl      total     Mflops       unit      ratio
          0.07600    0.00300    0.07900       8.69     0.2301     1.4107
      
      Calculating matgen overhead
             100 times   0.43 seconds
             200 times   0.85 seconds
             400 times   1.70 seconds
             800 times   3.41 seconds
            1600 times   6.81 seconds
      Overhead for 1 matgen      0.00426 seconds
      
      Calculating matgen/dgefa passes for 5 seconds
             100 times   8.05 seconds
      Passes used         62 
      
      Times for array with leading dimension of 201
      
            dgefa      dgesl      total     Mflops       unit      ratio
          0.07621    0.00231    0.07851       8.75     0.2287     1.4021
          0.07621    0.00231    0.07851       8.75     0.2287     1.4021
          0.07621    0.00231    0.07851       8.75     0.2287     1.4021
          0.07621    0.00231    0.07851       8.75     0.2287     1.4021
          0.07621    0.00231    0.07851       8.75     0.2287     1.4021
      Average                                 8.75
      
      Calculating matgen2 overhead
      Overhead for 1 matgen      0.00426 seconds
      
      Times for array with leading dimension of 200
      
            dgefa      dgesl      total     Mflops       unit      ratio
          0.07629    0.00231    0.07860       8.74     0.2289     1.4035
          0.07629    0.00231    0.07860       8.74     0.2289     1.4035
          0.07629    0.00231    0.07860       8.74     0.2289     1.4035
          0.07627    0.00232    0.07860       8.74     0.2289     1.4035
          0.07629    0.00231    0.07860       8.74     0.2289     1.4035
      Average                                 8.74
      
      Rolled Double  Precision        8.74 Mflops 
      
      2K performance run parameters for coremark.
      CoreMark Size    : 666
      Total ticks      : 16258
      Total time (secs): 16.258000
      Iterations/Sec   : 1414.688154
      Iterations       : 23000
      Compiler version : GCC10.2.0
      Compiler flags   : -O2
      Memory location  : STACK
      seedcrc          : 0xe9f5
      [0]crclist       : 0xe714
      [0]crcmatrix     : 0x1fd7
      [0]crcstate      : 0x8e3a
      [0]crcfinal      : 0xd340
      Correct operation validated. See README.md for run and reporting rules.
      CoreMark 1.0 : 1414.688154 / GCC10.2.0 -O2 / STACK
      

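      The HiFi DSP build currently uses a self-compiled GCC with the instruction-set configuration provided by the R528 SDK. I have doubts about the Linpack score. Could anyone with Cadence Xtensa Xplorer run the benchmark for comparison? Thanks.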
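
One way to see why the Linpack score looks suspicious is to normalize each result by clock speed and compare the two cores. The sketch below does only that, using the figures from the two logs and the clock speeds stated in the post (1.08 GHz taken as 1080 MHz). The dictionary layout and function names are assumptions made for illustration.

```python
# Per-MHz comparison of the ARM and HiFi4 DSP results quoted in the post.
# Scores are copied from the logs; the data layout and names are illustrative.

ARM = {"mhz": 1080.0, "coremark": 3154.574132, "dmips": 2007.90, "linpack_mflops": 131.12}
DSP = {"mhz": 600.0, "coremark": 1414.688154, "dmips": 707.99, "linpack_mflops": 8.74}

BENCHMARKS = ("coremark", "dmips", "linpack_mflops")


def per_mhz(result: dict, key: str) -> float:
    """Score divided by the core clock in MHz."""
    return result[key] / result["mhz"]


def arm_to_dsp_ratio(key: str) -> float:
    """How many times faster the ARM core is than the DSP, per MHz."""
    return per_mhz(ARM, key) / per_mhz(DSP, key)


if __name__ == "__main__":
    for key in BENCHMARKS:
        print(
            f"{key:15s} ARM/MHz={per_mhz(ARM, key):.4f} "
            f"DSP/MHz={per_mhz(DSP, key):.4f} "
            f"ratio={arm_to_dsp_ratio(key):.2f}"
        )
```

Per MHz, the two integer benchmarks put the ARM core ahead by roughly 1.2x (CoreMark) and 1.6x (Dhrystone), while Linpack puts it ahead by roughly 8x. That gap is what a run built with the vendor toolchain would help confirm or rule out.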

      Posted in: Other Allwinner Chips discussion board
      yao0718