Nios II Custom Instruction User Guide

ID 683242
Date 4/27/2020
Public

7.3.2. Running and Analyzing the FPH1 Example Software

Perform the following steps to analyze the results of the software project:

  1. Build the software project. The Nios II SBT for Eclipse detects the presence of the FPH1 custom instructions at build time, and uses them for all single precision floating point arithmetic.
  2. Run the software on your Nios II target design. The program runs four tests, one each for the add, subtract, multiply, and divide operations. In each test, the program carries out the floating point operation on 1000 pairs of random operands, once with the FPH1 custom instruction and once with the equivalent software implementation. The software uses the performance counter component to measure and compare the hardware and software execution times.

    The following program output shows the results:

    --Performance Counter Report--
    Total Time: 0.01222420 seconds  (611210 clock-cycles)
    +---------------+-----+-----------+---------------+-----------+
    | Section       |  %  | Time (sec)|  Time (clocks)|Occurrences|
    +---------------+-----+-----------+---------------+-----------+
    |FP CI ADD      | 2.29|    0.00028|          14000|       1000|
    +---------------+-----+-----------+---------------+-----------+
    |FP SW ADD      | 50.2|    0.00613|         306640|       1000|
    +---------------+-----+-----------+---------------+-----------+
    
    --Performance Counter Report--
    Total Time: 0.00987798 seconds  (493899 clock-cycles)
    +---------------+-----+-----------+---------------+-----------+
    | Section       |  %  | Time (sec)|  Time (clocks)|Occurrences|
    +---------------+-----+-----------+---------------+-----------+
    |FP CI SUBTRACT | 2.83|    0.00028|          14000|       1000|
    +---------------+-----+-----------+---------------+-----------+
    |FP SW SUBTRACT | 50.8|    0.00502|         250975|       1000|
    +---------------+-----+-----------+---------------+-----------+
    
    --Performance Counter Report--
    Total Time: 0.0110131 seconds  (550654 clock-cycles)
    +---------------+-----+-----------+---------------+-----------+
    | Section       |  %  | Time (sec)|  Time (clocks)|Occurrences|
    +---------------+-----+-----------+---------------+-----------+
    |FP CI MULTIPLY | 2.18|    0.00024|          12000|       1000|
    +---------------+-----+-----------+---------------+-----------+
    |FP SW MULTIPLY |   59|    0.00650|         325076|       1000|
    +---------------+-----+-----------+---------------+-----------+
    
    --Performance Counter Report--
    Total Time: 0.0142152 seconds  (710758 clock-cycles)
    +---------------+-----+-----------+---------------+-----------+
    | Section       |  %  | Time (sec)|  Time (clocks)|Occurrences|
    +---------------+-----+-----------+---------------+-----------+
    |FP CI DIVIDE   |  4.5|    0.00064|          32000|       1000|
    +---------------+-----+-----------+---------------+-----------+
    |FP SW DIVIDE   | 67.8|    0.00963|         481698|       1000|
    +---------------+-----+-----------+---------------+-----------+
  3. Analyze the results report for each test. In each report, the FP CI <instruction> entry lists the performance of the custom instruction, and the FP SW <instruction> entry lists the performance of the software implementation. The Time (sec) and Time (clocks) columns show the aggregate time spent executing the floating point operation across all 1000 occurrences, in seconds and in Nios II clock cycles respectively. Total Time is the duration of the whole test, also expressed in seconds and in Nios II clock cycles. The % column shows the time spent executing the floating point operation as a percentage of Total Time.
    Note: You might have different speed results, depending on your target hardware and on the actual values of the random operands.

    The software uses the Nios II performance counter component to collect timing information on the floating point operations. For more information, refer to the Performance Counter Core chapter in volume 5 of the Intel® Quartus® Prime Handbook.
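One way to read the reports is to convert the aggregate clock counts into cycles per operation and a hardware-versus-software speedup. The following host-side sketch is not part of the FPH1 example software. It only repeats the arithmetic on the clock counts printed in the sample output above, and the names `REPORTS` and `summarize` are hypothetical, chosen for illustration. Your own numbers will differ, as the note in step 3 explains.

```python
# Clock counts copied from the four sample reports:
# (Total Time clocks, FP CI clocks, FP SW clocks).
REPORTS = {
    "ADD": (611210, 14000, 306640),
    "SUBTRACT": (493899, 14000, 250975),
    "MULTIPLY": (550654, 12000, 325076),
    "DIVIDE": (710758, 32000, 481698),
}
OCCURRENCES = 1000  # operations per section, from the Occurrences column


def summarize(total_clocks, ci_clocks, sw_clocks, occurrences=OCCURRENCES):
    """Return per-operation cycle counts, speedup, and share of the test total."""
    return {
        "ci_cycles_per_op": ci_clocks / occurrences,
        "sw_cycles_per_op": sw_clocks / occurrences,
        "speedup": sw_clocks / ci_clocks,
        "ci_percent": 100.0 * ci_clocks / total_clocks,
        "sw_percent": 100.0 * sw_clocks / total_clocks,
    }


if __name__ == "__main__":
    for name, clocks in REPORTS.items():
        s = summarize(*clocks)
        print(
            f"{name:<9} CI {s['ci_cycles_per_op']:6.1f} cycles/op  "
            f"SW {s['sw_cycles_per_op']:6.1f} cycles/op  "
            f"speedup {s['speedup']:5.1f}x"
        )
```

For the sample output, this works out to roughly 14 cycles per custom-instruction add against about 307 cycles for the software add, a speedup of about 22x. The same calculation gives about 18x for subtract, 27x for multiply, and 15x for divide. The two percentages in each report do not sum to 100 because the rest of each test's Total Time is spent outside the two measured sections.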