Intel® FPGA AI Suite
Find out how Intel® FPGA AI Suite can add FPGA AI to embedded systems and data centers.
"The ease-of-use of the Intel® FPGA AI Suite and the Intel® Distribution of OpenVINO™ toolkit enabled Stryker* to develop optimized Intel® FPGA IP for deep learning inference. The inference IP was successfully integrated into an Intel® FPGA using Intel® Quartus® Prime Software. The example designs provided with the suite enabled the team to quickly evaluate different algorithms for different image sources. Intel® FPGA AI Suite and the Intel® Distribution of OpenVINO toolkit enable data scientists and FPGA engineers to seamlessly work together to develop optimized deep learning inference for medical applications."
— Stryker Engineering Team
Overview
Intel FPGAs enable real-time, low-latency, and low-power deep learning inference combined with the following advantages:
- I/O flexibility
- Reconfiguration
- Ease of integration into custom platforms
- Long lifetime
Intel FPGA AI Suite was developed with the vision of making artificial intelligence (AI) inference on Intel FPGAs easy to use. The suite enables FPGA designers, machine learning engineers, and software developers to create optimized FPGA AI platforms efficiently.
Utilities in Intel FPGA AI Suite speed up FPGA development for AI inference using familiar and popular industry frameworks such as TensorFlow* and PyTorch* along with the OpenVINO toolkit, while also leveraging robust, proven FPGA development flows with Intel Quartus Prime Software.
The Intel FPGA AI Suite tool flow works with the OpenVINO toolkit, an open-source project for optimizing inference on a variety of hardware architectures. The toolkit takes deep learning models from all major frameworks (such as TensorFlow, PyTorch, and Keras*) and optimizes them for inference on targets including CPUs, CPU+GPU combinations, and FPGAs.
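As a concrete sketch of this hand-off, the snippet below converts a trained model into the OpenVINO intermediate representation (IR) consumed later in the flow. The file name model.onnx is hypothetical; recent OpenVINO releases expose conversion as openvino.convert_model, while older releases use the separate Model Optimizer (mo) command-line tool.

```python
# A minimal conversion sketch, assuming a trained model exported to ONNX
# as "model.onnx" (hypothetical). TensorFlow and PyTorch models convert
# the same way through openvino.convert_model.
import openvino as ov

ir_model = ov.convert_model("model.onnx")  # framework model -> in-memory IR
ov.save_model(ir_model, "model.xml")       # writes model.xml and model.bin
```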
Key Features
High Performance
Intel® Agilex™ M-Series FPGAs can achieve a maximum theoretical performance of 38 INT8 TOPS, or 3,679 ResNet-50 frames per second at 90% FPGA utilization.
Easy System Integration
Supports integration with custom IP such as ADCs/DACs, video, and Ethernet to achieve the smallest footprint and lowest latency.
Low Total Cost of Ownership
Minimize TCO with highly scalable, customizable, fine-grained AI inference across a wide range of performance targets and batch sizes.
Simple and Standard Flows
Create and add AI inference IP to current or new FPGA designs with Intel Quartus Prime Software or Platform Designer.
AI Front End Support
Use your favorite AI front end, such as TensorFlow, Caffe, PyTorch, MXNet, Keras, or ONNX.
OpenVINO Optimization
The OpenVINO toolkit optimizes performance and power while minimizing logic and memory footprint.
FPGA AI Inference Development Flow
The AI inference development flow is shown in Figure 1. The flow seamlessly combines the hardware and software workflows into a single end-to-end AI workflow. The steps are as follows:
1. The Model Optimizer in the OpenVINO toolkit creates the intermediate representation of the network: a topology file (.xml) and a weights-and-biases file (.bin).
2. The Intel FPGA AI Suite compiler is used to:
- Provide estimated area or performance metrics for a given architecture file, or produce an optimized architecture file. (Architecture refers to inference IP parameters such as the size of the PE array, precisions, activation functions, interface widths, and window sizes.)
- Compile the network files into a .bin file containing the network partitions for the FPGA, the CPU, or both, along with the weights and biases.
3. The compiled .bin file is imported by the user inference application at runtime (see the runtime sketch below).
- Runtime application programming interfaces (APIs) include the Inference Engine API (partitions the network between CPU and FPGA at runtime and schedules inference) and the FPGA AI Suite API (manages DDR memory and FPGA hardware blocks).
- Reference designs demonstrate the basic operations of importing the .bin file and running inference on the FPGA with supporting host CPUs (x86 and Arm* processors).
Figure 1: Intel FPGA AI Suite Development Flow
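As an illustration of step 3, here is a minimal runtime sketch using the OpenVINO runtime Python API. The model file name, input shape, and the HETERO:FPGA,CPU device string are illustrative assumptions, not the suite's documented defaults; the actual device name depends on how the FPGA AI Suite runtime plugin is installed and registered on your system.

```python
# A minimal runtime sketch (not the suite's reference design). The model
# path, input shape, and device string below are illustrative assumptions.
import numpy as np
import openvino as ov

core = ov.Core()
model = core.read_model("resnet50.xml")  # IR topology (.xml) + weights (.bin)

# HETERO scheduling lets the FPGA run its compiled partitions while any
# unsupported layers fall back to the CPU.
compiled = core.compile_model(model, "HETERO:FPGA,CPU")

# Run one inference on a dummy NCHW image batch.
image = np.random.rand(1, 3, 224, 224).astype(np.float32)
result = compiled([image])[compiled.output(0)]
print("Top class:", int(np.argmax(result)))
```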
Notes:
Devices supported: Intel® Agilex™ FPGA, Intel® Cyclone® 10 GX FPGA, Intel® Arria® 10 FPGA
Tested networks, layers, and activation functions:
- ResNet-50, MobileNet v1/v2/v3, YOLO v3, TinyYOLO v3, UNET
- ReLU, 2D Conv, BatchNorm, EltWise Mult, Fully Connected, Clamp, pReLU, SoftMax
System Level Architectures
Intel FPGA AI Suite is flexible and configurable for a variety of system-level use cases. Typical ways to incorporate FPGA AI Suite IP into a system are shown in Figure 2. The use cases span a range of verticals, from embedded platforms with host CPUs (Intel® Core™ processors, Arm processors), to data center environments with Intel® Xeon® processors, to host-less designs (including those with soft processors such as Nios® V processors).
Figure 2: Typical Intel FPGA AI Suite System Topologies
- CPU offload: AI Accelerator
- Multi-function CPU offload: AI Accelerator + Additional Hardware Function
- Ingest / Inline Processing + AI: AI Accelerator + Direct Ingest and Data Streaming
- Embedded SoC FPGA + AI: AI Accelerator + Direct Ingest and Data Streaming + Hardware Function + Embedded Arm, Nios® II, or Nios V Processors
Videos
Overview of Intel FPGA AI Suite
Watch this video to get familiar with the design flow for Intel FPGA AI Suite.
Intel® FPGA AI Suite Installation Demo Video
Installing Intel FPGA AI Suite is easy; watch this video for a demo of the installation.
Intel® FPGA AI Suite Compilation Demo Video
Watch a quick demo of Intel FPGA AI Suite compiling a pretrained ResNet-50 model and outputting inference results.
Intel FPGA AI Suite is Available Today for Pricing and Evaluation
Reference designs with prebuilt FPGA design examples are available for initial evaluation on the Terasic DE10-Agilex Development Board and the Intel Arria 10 SoC Development Kit, and for further development of custom system-level designs.