Performance Benchmarks and Configuration Details for Intel® Xeon® Scalable Processors

Notices & Disclaimers

Intel® technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at intel.com.

Performance results are based on testing as of the dates shown in configurations and may not reflect all publicly available security updates. See configuration disclosure for details. No product or component can be absolutely secure.

Software and workloads used in performance tests may have been optimized for performance only on Intel® microprocessors. Performance tests, such as SYSmark* and MobileMark*, are measured using specific computer systems, components, software, operations, and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information, go to www.intel.com/benchmarks.

Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel® microprocessors. These optimizations include SSE2 and SSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel.

Microprocessor-dependent optimizations in this product are intended for use with Intel® microprocessors. Certain optimizations not specific to Intel® microarchitecture are reserved for Intel® microprocessors. Please refer to the applicable product user and reference guides for more information regarding the specific instruction sets covered by this notice. Notice revision #20110804.

Cost reduction scenarios described are intended as examples of how a given Intel®-based product, in the specified circumstances and configurations, may affect future costs and provide cost savings. Circumstances will vary. Intel does not guarantee any costs or cost reduction.

No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.

Intel does not control or audit third-party benchmark data or the websites referenced in this document. You should visit the referenced website and confirm whether referenced data are accurate.

© Intel Corporation. Intel, Xeon, Optane, AVX, and DL Boost are trademarks of Intel Corporation in the U.S. and/or other countries.
*Other names and brands may be claimed as property of others.

Configuration Details

1. Up to 33% Average Generational Gains (1.33x) on Intel® Xeon® Gold Processor Mainstream CPUs: Geomean of est SPECrate2017_int_base, est SPECrate2017_fp_base, STREAM-Triad, Intel® Distribution of LINPACK, server-side Java*. Gold 5218 vs Gold 5118. Baseline: 1-node, 2x Intel® Xeon® Gold 5118 processor on Wolf Pass with 384 GB (12 X 32GB 2666 (2400)) total memory, ucode 0x200004D on RHEL7.6, 3.10.0-957.el7.x86_64, IC18u2, AVX2, HT on all (off Stream, LINPACK), Turbo on, result: est int throughput=119, est fp throughput=134, STREAM-Triad=148.6, LINPACK=822, server-side Java=67434, test by Intel on 11/12/2018. New configuration: 1-node, 2x Intel® Xeon® Gold 5218 processor on Wolf Pass with 384 GB (12 X 32GB 2933 (2666)) total memory, ucode 0x4000013 on RHEL7.6, 3.10.0-957.el7.x86_64, IC18u2, AVX2, HT on all (off Stream, LINPACK), Turbo on, result: est int throughput=162, est fp throughput=172, STREAM-Triad=185, LINPACK=1088, server-side java=98333, test by Intel on 12/7/2018.
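The 1.33x figure in item 1 can be reproduced from the listed scores. The following is a minimal sketch, assuming the headline is the geometric mean of the five per-benchmark ratios (new score divided by baseline score); the dictionary keys and the function name are illustrative and do not come from the source.

```python
import math

# Scores copied from item 1: Gold 5118 (baseline) vs. Gold 5218 (new configuration).
baseline = {"int": 119, "fp": 134, "stream_triad": 148.6, "linpack": 822, "java": 67434}
new = {"int": 162, "fp": 172, "stream_triad": 185, "linpack": 1088, "java": 98333}


def geomean_speedup(base, improved):
    """Geometric mean of the per-benchmark ratios improved/base."""
    ratios = [improved[name] / base[name] for name in base]
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


speedup = geomean_speedup(baseline, new)
print(f"Geomean speedup: {speedup:.2f}x")
```

A geometric mean is used rather than an arithmetic mean so that no single benchmark with a large ratio dominates the average.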

2. 2x Average Generational Gains: On 2-socket servers with 2nd Gen Intel® Xeon® Platinum 9200 processor. Geomean of est SPECrate2017_int_base, est SPECrate2017_fp_base, STREAM-Triad, Intel® Distribution of LINPACK, server-side Java*. Platinum 92xx vs. Platinum 8180. Baseline: 1-node, 2x Intel® Xeon® Platinum 8180 processor on Wolf Pass with 384 GB (12 X 32GB 2666) total memory, ucode 0x200004D on RHEL7.6, 3.10.0-957.el7.x86_64, IC19u1, AVX512, HT on all (off Stream, LINPACK), Turbo on all (off Stream, LINPACK), result: est int throughput=307, est fp throughput=251, STREAM-Triad=204, LINPACK=3238, server-side Java=165724, test by Intel on 1/29/2019. New configuration: 1-node, 2x Intel® Xeon® Platinum 9282 processor on Walker Pass with 768 GB (24x 32GB 2933) total memory, ucode 0x400000A on RHEL7.6, 3.10.0-957.el7.x86_64, IC19u1, AVX512, HT on all (off Stream, LINPACK), Turbo on all (off Stream, LINPACK), result: est int throughput=635, est fp throughput=526, STREAM-Triad=407, LINPACK=6411, server-side Java=332913, test by Intel on 2/16/2019.

3. 30x Inference Throughput Improvement on Intel® Xeon® Platinum 9282 processor with Intel® Deep Learning Boost (Intel® DL Boost): Tested by Intel as of 2/26/2019. Platform: Dragon rock 2 socket Intel® Xeon® Platinum 9282 processor (56 cores per socket), HT ON, turbo ON, Total Memory 768 GB (24 slots/ 32 GB/ 2933 MHz), BIOS: SE5C620.86B.0D.01.0241.112020180249, CentOS 7 Kernel 3.10.0-957.5.1.el7.x86_64, Deep Learning Framework: Intel® Optimization for Caffe* version: https://github.com/intel/caffe d554cbf1, ICC 2019.2.187, MKL DNN version: v0.17 (commit hash: 830a10059a018cd2634d94195140cf2d8790a75a), model: https://github.com/intel/caffe/blob/master/models/intel_optimized_models/int8/resnet50_int8_full_conv.prototxt, BS=64, No datalayer synthetic Data: 3x224x224, 56 instance/2 socket, Datatype: INT8 vs Tested by Intel as of July 11th 2017: 2S Intel® Xeon® Platinum 8180 processor CPU @ 2.50GHz (28 cores), HT disabled, turbo disabled, scaling governor set to “performance” via Intel_pstate driver, 384GB DDR4-2666 ECC RAM. CentOS Linux* release 7.3.1611 (Core), Linux* kernel 3.10.0-514.10.2.el7.x86_64. SSD: Intel® SSD Data Center S3700 Series (800GB, 2.5in SATA 6Gb/s, 25nm, MLC). Performance measured with: Environment variables: KMP_AFFINITY='granularity=fine, compact', OMP_NUM_THREADS=56, CPU Freq set with cpupower frequency-set -d 2.5G -u 3.8G -g performance. Caffe: (http://github.com/intel/caffe/), revision f96b759f71b2281835f690af267158b82b150b5c. Inference measured with “caffe time --forward_only” command, training measured with “caffe time” command. For “ConvNet” topologies, synthetic dataset was used. For other topologies, data was stored on local storage and cached in memory before training. Topology specs from https://github.com/intel/caffe/tree/master/models/intel_optimized_models (ResNet-50). Intel® C++ Compiler ver. 17.0.2 20170213, Intel® Math Kernel Library (Intel® MKL) small libraries version 2018.0.20170425. Caffe run with “numactl -l”.

4. Up to 14x Improvement in Inference Performance on Intel® Xeon® Platinum 8280 processor with Intel® Deep Learning Boost (Intel® DL Boost): Tested by Intel as of 2/20/2019. 2 socket Intel® Xeon® Platinum 8280 processor, 28 cores, HT On, Turbo ON, Total Memory 384 GB (12 slots/ 32GB/ 2933 MHz), BIOS: SE5C620.86B.0D.01.0271.120720180605 (ucode: 0x200004d), Ubuntu 18.04.1 LTS, kernel 4.15.0-45-generic, SSD 1x sda Intel® SSD DC S3700 Series SSD 745.2GB, nvme1n1 Intel® SSD DC P4500 Series SSD 3.7TB, Deep Learning Framework: Intel® Optimization for Caffe* version: 1.1.3 (commit hash: 7010334f159da247db3fe3a9d96a3116ca06b09a), ICC version 18.0.1, MKL DNN version: v0.17 (commit hash: 830a10059a018cd2634d94195140cf2d8790a75a), model: https://github.com/intel/caffe/blob/master/models/intel_optimized_models/int8/resnet50_int8_full_conv.prototxt, BS=64, synthetic Data, 4 instance/2 socket, Datatype: INT8 vs Tested by Intel as of July 11th 2017: 2S Intel® Xeon® Platinum 8180 processor CPU @ 2.50GHz (28 cores), HT disabled, turbo disabled, scaling governor set to “performance” via Intel_pstate driver, 384GB DDR4-2666 ECC RAM. CentOS Linux* release 7.3.1611 (Core), Linux* kernel 3.10.0-514.10.2.el7.x86_64. SSD: Intel® SSD DC S3700 Series (800GB, 2.5in SATA 6Gb/s, 25nm, MLC). Performance measured with: Environment variables: KMP_AFFINITY='granularity=fine, compact', OMP_NUM_THREADS=56, CPU Freq set with cpupower frequency-set -d 2.5G -u 3.8G -g performance. Caffe: (http://github.com/intel/caffe/), revision f96b759f71b2281835f690af267158b82b150b5c. Inference measured with “caffe time --forward_only” command, training measured with “caffe time” command. For “ConvNet” topologies, synthetic dataset was used. For other topologies, data was stored on local storage and cached in memory before training. Topology specs from https://github.com/intel/caffe/tree/master/models/intel_optimized_models (ResNet-50). Intel® C++ Compiler ver. 17.0.2 20170213, Intel® Math Kernel Library (Intel® MKL) small libraries version 2018.0.20170425. Caffe run with “numactl -l”.

5. Up to 3.7x Avg Gain w/Intel® Xeon® Platinum 9242 Processor Vs 3-Year Old Server: Average geomean of STREAM, HPCG, HPL, WRF, OpenFOAM, LS-Dyna, VASP, NAMD, LAMMPS, Black Scholes, and Monte Carlo. Individual workload may vary. Intel® Xeon® E5-2697 v4 processor: Intel reference platform with 2S Intel® Xeon® E5-2697 v4 processors (2.3GHz, 18C), 8x16GB DDR4-2400, 1 SSD, Cluster File System: Panasas (124 TB storage) Firmware v6.3.3.a & OPA based IEEL Lustre, BIOS: SE5C610.86B.01.01.0027.071020182329, Microcode: 0xb00002e, Oracle Linux* Server release 7.6 (compatible with RHEL 7.6) on a 7.5 kernel using ksplice for security fixes, Kernel: 3.10.0-862.14.4.el7.crt1.x86_64, OFED stack: OFED OPA 10.8 on RH7.5 with Lustre v2.10.4, HBA: 100Gbps Intel® Omni-Path Architecture (Intel® OPA) 1 port PCIe* x16, Switch: Intel® Omni-Path Edge Switch (Intel® OP Edge Switch) 100 Series 48 Port. STREAM OMP 5.10, Triad, HT=ON, Turbo=OFF, 1 thread per core, score: 128.36. HPCG, Binary included MKL 2019u1, HT=ON, Turbo=OFF, 1 thread per core, score: 23.78. HPL 2.1, HT=ON, Turbo=OFF, 2 threads per core, score: 1204.64. WRF 3.9.1.1, conus-2.5km, HT=ON, SMT=ON, 1 thread per core, score: 4.54. OpenFOAM 6.0, 42M_cell_motorbike, HT=ON, Turbo=OFF, 1 thread per core, score: 3500. LS-Dyna 9.3-Explicit AVX2 binary, 3car, HT=ON, SMT=ON, 1 thread per core, score: 2814. VASP 5.4.4, CuC, HT=ON, Turbo=OFF, 1 thread per core, score: 384.99. NAMD 2.13, apoa1, HT=ON, Turbo=OFF, 2 threads per core, score: 4.4. LAMMPS version 12 Dec 2018, Water, HT=ON, Turbo=ON, 2 threads per core, score: 54.72. Black Scholes, HT=ON, Turbo=ON, 2 threads per core, score: 2573.77. Monte Carlo, HT=ON, Turbo=ON, 2 threads per core, score: 43.2.
Intel® Xeon® 9242 processor: Intel reference platform with 2S Intel® Xeon® 9242 processors (2.2GHz, 48C), 24x16GB DDR4-2933, 1 SSD, Cluster File System: 2.12.0-1 (server) 2.11.0-14.1 (client), BIOS: PLYXCRB1.86B.0572.D02.1901180818, Microcode: 0x4000017, CentOS 7.6, Kernel: 3.10.0-957.5.1.el7.x86_64, OFED stack: OFED OPA 10.8 on RH7.5 with Lustre v2.10.4, HBA: 100Gbps Intel® Omni-Path Architecture (Intel® OPA) 1 port PCIe* x16, Switch: Intel® Omni-Path Edge Switch (Intel® OP Edge Switch) 100 Series 48 Port. STREAM OMP 5.10, Triad, HT=ON, Turbo=OFF, 1 thread per core, score: 407. HPCG, Binary included MKL 2019u1, HT=ON, Turbo=OFF, 1 thread per core, score: 81.91. HPL 2.1, HT=ON, Turbo=OFF, 2 threads per core, score: 5314. WRF 3.9.1.1, conus-2.5km, HT=ON, SMT=ON, 1 thread per core, score: 1.44. OpenFOAM 6.0, 42M_cell_motorbike, HT=ON, Turbo=OFF, 1 thread per core, score: 1106. LS-Dyna 9.3-Explicit AVX2 binary, 3car, HT=ON, SMT=ON, 1 thread per core, score: 768. VASP 5.4.4, CuC, HT=ON, Turbo=OFF, 1 thread per core, score: 133.96. NAMD 2.13, apoa1, HT=ON, Turbo=OFF, 2 threads per core, score: 19.9. LAMMPS version 12 Dec 2018, Water, HT=ON, Turbo=ON, 2 threads per core, score: 276.1. Black Scholes, HT=ON, Turbo=ON, 2 threads per core, score: 9044.32. Monte Carlo, HT=ON, Turbo=ON, 2 threads per core, score: 227.62. OpenFOAM Disclaimer: This offering is not approved or endorsed by OpenCFD Limited, producer and distributor of the OpenFOAM software via www.openfoam.com, and owner of the OPENFOAM* and OpenCFD* trademark.

6. Up to 3.5x VM Density Performance: 1-node, 2x Intel® Xeon® processor E5-2697 v2 on Canon Pass with 256 GB (16 slots / 16GB / 1600) total memory, ucode 0x42c on RHEL7.6, 3.10.0-957.el7.x86_64, 1x Intel® SSD 400GB OS drive, 2x P4500 4TB PCIe*, 2*82599 dual port Ethernet, Virtualization Benchmark, VM kernel 4.19, HT on, Turbo on, result: VM density=21, test by Intel on 1/15/2019. 1-node, 2x Intel® Xeon® Platinum 8280 processor on Wolf Pass with 768 GB (24 slots / 32GB / 2666) total memory, ucode 0x2000056 on RHEL7.6, 3.10.0-957.el7.x86_64, 1x Intel® SSD 400GB OS Drive, 2x P4500 4TB PCIe*, 2*82599 dual port Ethernet, Virtualization Benchmark, VM kernel 4.19, HT on, Turbo on, result: VM density=74, test by Intel on 1/15/2019.

7. Up to 59% Savings with Fewer Servers (1.59x) at Similar Performance Levels When Upgrading 5 Year Old Server to 2nd Gen Intel® Xeon® Scalable processor and Reduce data center footprint by replacing twenty 5-year old servers with six servers based on 2S Intel® Xeon® Platinum 8280 processor. Configuration details: 1-node, 2x Intel® Xeon® processor E5-2697 v2 on Canon Pass with 256 GB (16 slots / 16GB / 1600) total memory, ucode 0x42c on RHEL7.6, 3.10.0-957.el7.x86_64, 1x Intel® SSD 400GB OS drive, 2x P4500 4TB PCIe*, 2*82599 dual port Ethernet, Virtualization Benchmark, VM kernel 4.19, HT on, Turbo on, result: VM density=21, test by Intel on 1/15/2019. 1-node, 2x Intel® Xeon® Platinum 8280 processor on Wolf Pass with 768 GB (24 slots / 32GB / 2666) total memory, ucode 0x2000056 on RHEL7.6, 3.10.0-957.el7.x86_64, 1x Intel® SSD 400GB OS drive, 2x P4500 4TB PCIe*, 2*82599 dual port Ethernet, Virtualization Benchmark, VM kernel 4.19, HT on, Turbo on, result: VM density=74, test by Intel on 1/15/2019. Cost reduction scenarios described are intended as examples of how a given Intel®-based product, in the specified circumstances and configurations, may affect future costs and provide cost savings. Circumstances will vary. Intel does not guarantee any costs or cost reduction. Example based on estimates as of March 2019 of equivalent rack performance over 4-year operation on virtualization workload running VMware* vSphere Enterprise Plus on Red Hat Enterprise Linux* Server and comparing 20 installed 2-socket servers with Intel® Xeon® processor E5-2697 v2 (formerly “IvyBridge”) at a total cost of $796,563 [Per server cost $39.8K: acquisition=13.7K, infrastructure, and utility=4.2K, os & software=12.2K, maintenance=9.7K ] vs. 6 new Intel® Xeon® Platinum 8280 processor (costs based on Platinum 8180 assumptions) at a total cost of $325,805 [Per server cost $54.3K: acquisition=28.9K, infrastructure, and utility=3.5K, os & software=12.2K, maintenance=9.7K]. 
Assumptions based on https://xeonprocessoradvisor.intel.com, assumptions as of Feb 13, 2019.
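The consolidation arithmetic in item 7 can be checked from the numbers it states. This is a rough sketch, assuming that "similar performance" means hosting at least as many VMs as the twenty old servers did, and that the savings percentage compares the two quoted 4-year totals; the variable names are illustrative only.

```python
import math

# Figures copied from item 7.
old_servers = 20
vms_per_old_server = 21   # VM density, E5-2697 v2
vms_per_new_server = 74   # VM density, Platinum 8280
old_total_cost = 796_563  # USD, 20 old servers over 4 years
new_total_cost = 325_805  # USD, 6 new servers over 4 years

# VMs the old fleet hosts, and the fewest new servers that can host them.
vms_needed = old_servers * vms_per_old_server
new_servers = math.ceil(vms_needed / vms_per_new_server)

# Fractional saving of the new total against the old total.
savings = 1 - new_total_cost / old_total_cost

print(f"VMs to host: {vms_needed}")
print(f"New servers needed: {new_servers}")
print(f"Savings: {savings:.1%}")
```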

8. Analytics 1 Up to 24.8x Performance Gains on Data Warehousing Queries on the new 2nd Gen Intel® Xeon® Platinum 8280 processor with Windows* Server 2016 vs. 4-year old legacy server with old hardware and software. 1-node, 2x Intel® Xeon® processor E5-2699 v3 on Wildcat Pass with 768 GB (24 slots / 32GB / 2666) total memory (workload uses 691GB), ucode 0x3D on Windows* Server 2008 R2, 1 x S710 (200GB), 1 x S3500 (1.6TB), 2 x P4608 (6.4TB), SQL Server 2008 R2 SP1 (Enterprise Edition), HT on, Turbo on, result: queries per hour at 1TB =33681, test by Intel on 12/21/2018. 1-node, 2x Intel® Xeon® Platinum 8280 processor on Wolf Pass with 1536 GB (24 slots / 64GB / 2666 (1866)) total memory (workload uses 691GB), ucode 0xA on Windows* Server 2016 (RS1 14393), 1 x S710 (200GB), 1 x S3500 (1.6TB), 4 x P4610 (7.6TB), SQL Server 2017 RTM - CU13 (Enterprise Edition), HT on, Turbo on, result: queries per hour at 1TB =836261, test by Intel on 3/13/2019.

9. Up to 4.3X Big Data SPARK* Performance on 2nd Gen Intel® Xeon® Scalable processor vs 5 year old 2-socket server. Geomean of SparkKmeans, SparkSort, SparkTerasort. 1+4-node, 2x Intel® Xeon® processor E5-2697 v2 on S2600JF with 128 GB (8 slots / 16GB / 1866) total memory, ucode 0x42d on CentOS-7.6.1810, 4.20.0-1.el7.x86_64, 1x 180GB SATA3 SSD, 3 x Seagate ST4000NM0033 (4TB), 1 x Intel I350, HiBench v7.1 / bigdata, Mllib, OpenJDK-1.8.0_191, python-2.7.5, Apache Hadoop-2.9.1, Apache Spark-2.2.2, HT on, Turbo on, result: SparkKmeans=119.5M, SparkSort=121.4M, SparkTerasort=107.4M, test by Intel on 1/23/2019. 1+4-node, 2x Intel® Xeon® Gold 6248 processor on S2600WF with 768 GB (384 GB used) (12 slots* / 64 GB / 2400 (384GB used)) total memory, ucode 0x400000A on CentOS-7.6.1810, 4.20.0-1.el7.x86_64, Intel® SSD DC S3710, 6 x Seagate ST2000NX0253 (2TB), 1 x Intel X722, HiBench v7.1 / bigdata, Mllib, OpenJDK-1.8.0_191, python-2.7.5, Apache Hadoop-2.9.1, Apache Spark-2.2.2, HT on, Turbo on, result: SparkKmeans=1235.8M, SparkSort=518.4M, SparkTerasort=589.3M, test by Intel on 1/23/2019.

10. 36% More VMs Per Node (1.36x) for Multi-Tenant Virtualized OLTP Databases with Intel® Optane™ DC Persistent Memory Module (DCPMM): 1-node, 2x 26-core 2nd Gen Intel® Xeon® Scalable Processor, HT on, Turbo on, 768GB, 0 (24 slots / 32GB / 2666 DDR), 1x Samsung PM963 M.2 960GB, 7x Samsung PM963 M.2 960GB, 4x Intel® SSDs S4600 (1.92TB), 1x Intel X520 SR2 (10Gb), Windows* Server 2019 RS5-17763, OLTP Cloud Benchmark, test by Intel as of 1/31/2019. 1-node, 2x 26-core 2nd Gen Intel® Xeon® Scalable Processor, HT on, Turbo on, 192GB, 1TB (12 slots / 16 GB / 2666 DDR + 8 slots / 128GB / 2666 Intel® Optane™ DCPMM), 1x Samsung PM963 M.2 960GB, 7x Samsung PM963 M.2 960GB, 4x Intel® SSDs S4600 (1.92TB), 1x Intel X520 SR2 (10Gb), Windows* Server 2019 RS5-17763, OLTP Cloud Benchmark, test by Intel as of 1/31/2019.

30% Lower Cost Per VM (1.30x) for Multi-Tenant Virtualized OLTP Databases with Intel® Optane™ DC Persistent Memory Module (DCPMM): Cost footnotes (Pricing as of 15 Mar 2019): Baseline cost: Total System Cost: $29,408 [CPU cost=$7310, Memory subsystem cost @ Capacity: 24x 64GB=$16,998, Storage cost=$2,100, Chassis, PSUs, Boot drive, etc.=$3,000] vs. New configuration cost: Total System Cost: $22,024 [CPU cost=$7310, Memory subsystem cost @ full capacity: =$9,614 ($2,690 for DDR4 + $6,924 for Intel® Optane™ DC PMEM), Storage cost=$2,100, Chassis, PSUs, Boot drive, etc.=$3,000].

11. Up to 4X More VMs When Quadrupling Memory Capacity with Intel® Optane™ DC Persistent Memory Module (DCPMM) Running Redis+Memtier: 1-node, 2x Intel® Xeon® Platinum 8280L processor on Intel reference platform with 768 GB (12 slots / 32GB / 2666) total memory, ucode 0x400000A on Fedora-27, 4.20.4-200.fc29.x86_64, 2x40GB, Redis 4.0.11, memtier_benchmark-1.2.12, KVM, 1 45GB instance/VM, centos-7.0, ww06'19 BKC, HT on, Turbo on, score VM=14, test by Intel on 2/21/2019. 1-node, 2x Intel® Xeon® Platinum 8280L processor on Intel reference platform with 192GB DDR, 3072GB Intel® Optane™ DCPMM (12 slots / 16 GB / 2666 DDR + 12 slots / 256GB/ 2666) total memory, ucode 0x400000A on Fedora-27, 4.20.4-200.fc29.x86_64, 2x40GB, Redis 4.0.11, memtier_benchmark-1.2.12, KVM, 1 45GB instance /VM, centos-7.0, ww06'19 BKC, AEP firmware 5346, HT on, Turbo on, score VM=56, test by Intel on 2/21/2019.

12. Network Specialized 2nd Gen Intel® Xeon® Scalable Processor SKUs Offer 1.25x to 1.58x Gains on Various Network Workloads. VPP IP Security: Tested by Intel on 1/17/2019 1-Node, 2x Intel® Xeon® Gold 6130 processor on Neon City platform with 12x 16GB DDR4 2666MHz (384GB total memory), Storage: 1x 240GB Intel® SSD, Network: 6x Intel XXV710-DA2, Bios: PLYDCRB1.86B.0155.R08.1806130538, ucode: 0x200004d (HT= ON, Turbo= OFF), OS: Ubuntu* 18.04 with kernel: 4.15.0-42-generic, Benchmark: VPP IPSec w/AESNI (AES-GCM-128) (Max Gbits/s (1420B)), Workload version: VPP v17.10, Compiler: gcc7.3.0, Results: 179. Tested by Intel on 1/17/2019 1-Node, 2x Intel® Xeon® Gold 6230N processor on Neon City platform with 12x 16GB DDR4 2999MHz (384GB total memory), Storage: 1x 240GB Intel® SSD, Network: 6x Intel XXV710-DA2, Bios: PLYXCRB1.PFT.0569.D08.1901141837, ucode: 0x4000019 (HT= ON, Turbo= OFF), OS: Ubuntu* 18.04 with kernel: 4.20.0-042000rc6-generic, Benchmark: VPP IPSec w/AESNI (AES-GCM-128) (Max Gbits/s (1420B)), Workload version: VPP v17.10, Compiler: gcc7.3.0, Results: 225.

VPP FIB: Tested by Intel on 1/17/2019 1-Node, 2x Intel® Xeon® Gold 6130 processor on Neon City platform with 12x 16GB DDR4 2666MHz (384GB total memory), Storage: 1x 240GB Intel® SSD, Network: 6x Intel XXV710-DA2, Bios: PLYDCRB1.86B.0155.R08.1806130538, ucode: 0x200004d (HT= ON, Turbo= OFF), OS: Ubuntu* 18.04 with kernel: 4.15.0-42-generic, Benchmark: VPP FIB (Max Mpackets/s (64B)), Workload version: VPP v17.10 in ipv4fib configuration, Compiler: gcc7.3.0, Results: 160. Tested by Intel on 1/17/2019 1-Node, 2x Intel® Xeon® Gold 6230N processor on Neon City platform with 12x 16GB DDR4 2999MHz (384GB total memory), Storage: 1x 240GB Intel® SSD, Network: 6x Intel XXV710-DA2, Bios: PLYXCRB1.PFT.0569.D08.1901141837, ucode: 0x4000019 (HT= ON, Turbo= OFF), OS: Ubuntu* 18.04 with kernel: 4.20.0-042000rc6-generic, Benchmark: VPP FIB (Max Mpackets/s (64B)), Workload version: VPP v17.10 in ipv4fib configuration, Compiler: gcc7.3.0, Results: 212.9.

Virtual Firewall: Tested by Intel on 10/26/2018 1-Node, 2x Intel® Xeon® Gold 6130 processor on Neon City platform with 12x 16GB DDR4 2666MHz (384GB total memory), Storage: 1x 240GB Intel® SSD, Network: 4x Intel X710-DA4, Bios: PLYDCRB1.86B.0155.R08.1806130538, ucode: 0x200004d (HT= ON, Turbo= OFF), OS: Ubuntu* 18.04 with kernel: 4.15.0-42-generic, Benchmark: Virtual Firewall (64B Mpps), Workload version: opnfv 6.2.0, Compiler: gcc7.3.0, Results: 38.9. Tested by Intel on 2/04/2019 1-Node, 2x Intel® Xeon® Gold 6230N processor on Neon City platform with 12x 16GB DDR4 2999MHz (384GB total memory), Storage: 1x 240GB Intel® SSD, Network: 6x Intel XXV710-DA2, Bios: PLYXCRB1.PFT.0569.D08.1901141837, ucode: 0x4000019 (HT= ON, Turbo= OFF), OS: Ubuntu* 18.04 with kernel: 4.20.0-042000rc6-generic, Benchmark: Virtual Firewall (64B Mpps), Workload version: opnfv 6.2.0, Compiler: gcc7.3.0, Results: 52.3.

Virtual Broadband Network Gateway: Tested by Intel on 11/06/2018 1-Node, 2x Intel® Xeon® Gold 6130 processor on Neon City platform with 12x 16GB DDR4 2666MHz (384GB total memory), Storage: 1x 240GB Intel® SSD, Network: 6x Intel XXV710-DA2, Bios: PLYDCRB1.86B.0155.R08.1806130538, ucode: 0x200004d (HT= ON, Turbo= OFF), OS: Ubuntu* 18.04 with kernel: 4.15.0-42-generic, Benchmark: Virtual Broadband Network Gateway (88B Mpps), Workload version: DPDK v18.08 ip_pipeline application, Compiler: gcc7.3.0, Results: 56.5. Tested by Intel on 1/2/2019 1-Node, 2x Intel® Xeon® Gold 6230N processor on Neon City platform with 12x 16GB DDR4 2999MHz (384GB total memory), Storage: 1x 240GB Intel® SSD, Network: 6x Intel XXV710-DA2, Bios: PLYXCRB1.PFT.0569.D08.1901141837, ucode: 0x4000019 (HT= ON, Turbo= OFF), OS: Ubuntu* 18.04 with kernel: 4.20.0-042000rc6-generic, Benchmark: Virtual Broadband Network Gateway (88B Mpps), Workload version: DPDK v18.08 ip_pipeline application, Compiler: gcc7.3.0, Results: 78.7.

VCMTS: Tested by Intel on 1/22/2019 1-Node, 2x Intel® Xeon® Gold 6130 processor on Supermicro*-X11DPH-Tq platform with 12x 16GB DDR4 2666MHz (384GB total memory), Storage: 1x 240GB Intel® SSD, Network: 4x Intel XXV710-DA2, Bios: American Megatrends Inc.* version: '2.1', ucode: 0x200004d (HT= ON, Turbo= OFF), OS: Ubuntu* 18.04 with kernel: 4.20.0-042000rc6-generic, Benchmark: Virtual Converged Cable Access Platform (iMIX Gbps), Workload version: vcmts 18.10, Compiler: gcc7.3.0, Other software: Kubernetes* 1.11, Docker* 18.06, DPDK 18.11, Results: 54.8. Tested by Intel on 1/22/2019 1-Node, 2x Intel® Xeon® Gold 6230N processor on Neon City platform with 12x 16GB DDR4 2999MHz (384GB total memory), Storage: 1x 240GB Intel® SSD, Network: 6x Intel XXV710-DA2, Bios: PLYXCRB1.PFT.0569.D08.1901141837, ucode: 0x4000019 (HT= ON, Turbo= OFF), OS: Ubuntu* 18.04 with kernel: 4.20.0-042000rc6-generic, Benchmark: Virtual Converged Cable Access Platform (iMIX Gbps), Workload version: vcmts 18.10, Compiler: gcc7.3.0, Other software: Kubernetes* 1.11, Docker* 18.06, DPDK 18.11, Results: 83.7.

OVS DPDK: Tested by Intel on 1/21/2019. Baseline: 1-Node, 2x Intel® Xeon® Gold 6130 processor on Neon City platform with 12x 16GB DDR4 2666MHz (384GB total memory), Storage: 1x 240GB Intel® SSD, Network: 4x Intel XXV710-DA2, Bios: PLYXCRB1.86B.0568.D10.1901032132, ucode: 0x200004d (HT= ON, Turbo= OFF), OS: Ubuntu* 18.04 with kernel: 4.15.0-42-generic, Benchmark: Open Virtual Switch (on 4C/4P/8T 64B Mpacket/s), Workload version: OVS 2.10.1, DPDK-17.11.4, Compiler: gcc7.3.0, Other software: QEMU-2.12.1, VPP v18.10, Results: 9.6. Tested by Intel on 1/18/2019 New configuration: 1-Node, 2x Intel® Xeon® Gold 6230N processor on Neon City platform with 12x 16GB DDR4 2999MHz (384GB total memory), Storage: 1x 240GB Intel® SSD, Network: 6x Intel XXV710-DA2, Bios: PLYXCRB1.86B.0568.D10.1901032132, ucode: 0x4000019 (HT= ON, Turbo= OFF), OS: Ubuntu* 18.04 with kernel: 4.20.0-042000rc6-generic, Benchmark: Open Virtual Switch (on 6P/6C/12T 64B Mpacket/s), Workload version: OVS 2.10.1, DPDK-17.11.4, Compiler: gcc7.3.0, Other software: QEMU-2.12.1, VPP v18.10, Results: 15.2. Tested by Intel on 1/18/2019 1-Node, 2x Intel® Xeon® Gold 6230N processor with SST-BF enabled on Neon City platform with 12x 16GB DDR4 2999MHz (384GB total memory), Storage: 1x 240GB Intel® SSD, Network: 6x Intel XXV710-DA2, Bios: PLYXCRB1.86B.0568.D10.1901032132, ucode: 0x4000019 (HT= ON, Turbo= ON (SST-BF)), OS: Ubuntu* 18.04 with kernel: 4.20.0-042000rc6-generic, Benchmark: Open Virtual Switch (on 6P/6C/12T 64B Mpacket/s), Workload version: OVS 2.10.1, DPDK-17.11.4, Compiler: gcc7.3.0, Other software: QEMU-2.12.1, VPP v18.10, Results: 16.9.
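The 1.25x to 1.58x range quoted for item 12 can be compared against the per-workload results listed above. The following sketch assumes each gain is simply the Gold 6230N result divided by the Gold 6130 result; the third OVS DPDK result (16.9, with SST-BF enabled) is left out because that is the only workload with an SST-BF run. The workload labels are shorthand, not official benchmark names.

```python
# (baseline Gold 6130 result, new Gold 6230N result), copied from item 12.
# The SST-BF OVS DPDK run (16.9) is deliberately excluded; see the lead-in.
results = {
    "VPP IPSec (Gbits/s)": (179, 225),
    "VPP FIB (Mpackets/s)": (160, 212.9),
    "Virtual Firewall (Mpps)": (38.9, 52.3),
    "Virtual BNG (Mpps)": (56.5, 78.7),
    "vCMTS (Gbps)": (54.8, 83.7),
    "OVS DPDK (Mpackets/s)": (9.6, 15.2),
}

# Gain per workload: new result divided by baseline result.
gains = {name: new / base for name, (base, new) in results.items()}

for name, gain in gains.items():
    print(f"{name}: {gain:.2f}x")

lowest = min(gains.values())
highest = max(gains.values())
print(f"Range: {lowest:.2f}x to {highest:.2f}x")
```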

Configuration Details for Twitter Hadoop Animation (13-15)

13. 50% Faster Runtimes (1.5x). Configuration details: Baseline: Dual-socket Intel® Xeon® processor E5-2630 v4 @ 2.2 GHz (10 cores/20 threads per socket); 128 GB RAM; 12x 6 TB 7200 RPM SATA HDD; 1x SATA SSD boot disk; 25 GbE Ethernet; 102 nodes spread across 6 racks. Workload: Gridmix* and Terasort*. Gridmix Score: 3309 seconds; Terasort Score: 5504 seconds. New configuration: Dual-socket Intel® Xeon® processor E5-2630 v4 @ 2.2 GHz (10 cores/20 threads per socket); 128 GB RAM; 12x 6 TB 7200 RPM SATA HDD; 1x SATA SSD boot disk; 1x 750 GB Intel® Optane™ DC SSD P4800X NVMe*-based; 25 GbE Ethernet; 102 nodes spread across 6 racks. Workload: Gridmix and Terasort. Gridmix Score: 2396 seconds; Terasort Score: 2640 seconds. OS: Twitter CentOS* 6 Derivative, Kernel Version 2.6.74-t1.el6.x86_64 (based on upstream 4.14.12 Kernel), BIOS Version: D3WWM11. Microcode Version: 0xb000021. Note that the test cluster used a higher core count than Twitter’s production Hadoop* clusters, which provided only 4 cores/8 threads per HDD.

14. 30% Lower TCO (1.30x). Configuration details: Baseline: Single-socket Intel® Xeon® processor E3-1230 v6 (4 cores); 32 to 64 GB RAM; 1x 1 TB or 2 TB HDDs; Intel® SSD DC S4500 Series 240GB boot disk; 1 GbE to 10 GbE Ethernet; no caching. New configuration: Single-socket Intel® Xeon® Gold 6262 processor (24 cores); 192 GB RAM; Intel S4500 240 GB boot disk; 8x 6 TB HDDs; 1x Intel® SSD DC P4610 6.4TB; 25 GbE Ethernet; caching using Intel® Cache Acceleration Software (Intel® CAS). OS: Twitter CentOS* 6 Derivative, Kernel Version 2.6.74-t1.el6.x86_64 (based on upstream 4.14.12 Kernel), BIOS Version: D3WWM11, Microcode Version: 0xb000021.

15. Approximately 75% Lower Power Consumption (1.75x). Source: Twitter estimate, based on 4 racks (10KW each) consolidating into 1 rack (10KW).

Configuration Details for Intel® Optane™ DC Persistent Memory Use Cases (16-20)

16. 25% More Data (1.25x) to be Available in the Database Main Store and Saves 10% in Costs (1.10x): Baseline: 1-node, 4x Intel® Xeon® Platinum 8280M processor on Lightning Ridge with 48x 128GB DDR4 2666 MHz total memory, ucode TBD on SUSE 15, 60x Intel® SSD DC S4600 SATA 480GB, SAP HANA* analytic workload operating on 5.83TB database, HT OFF, Turbo OFF, test by Intel on 3/15/2019. New configuration: 1-node, 4x Intel® Xeon® Platinum 8280L processor on Lightning Ridge with total memory (24x 128GB DRAM and 24x 256GB Intel® Optane™ DC PMEM), ucode TBD on SUSE 15, 75x Intel® SSD DC S4600 SATA 480GB, SAP HANA* analytic workload operating on 7.3TB database, HT OFF, Turbo OFF, test by Intel on 3/15/2019.
Cost footnotes (Pricing as of 15 Mar 2019): Baseline cost: Total System Cost=$171,453 [CPU cost=$52,048, Memory subsystem cost @ Capacity: 48x128GB=$91,834, Storage cost=$19,968, Chassis, PSUs, Boot drive, etc.=$7,603] vs. New configuration cost: Total System Cost=$152,609 [CPU cost=$52,048, Memory subsystem cost @ full capacity: =$67,998 ($16,998 for DDR4 + $51,000 for Intel® Optane™ DC PMEM), Storage cost=$24,960, Chassis, PSUs, Boot drive, etc.=$7,603].
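The totals and headline percentages in item 16 follow from the itemized costs. This is a small sketch, assuming the cost saving is measured against the baseline total and the data gain is the ratio of the two database sizes; the dictionary keys are illustrative labels.

```python
# Itemized costs in USD, copied from the item 16 cost footnotes.
baseline_cost = {"cpu": 52_048, "memory": 91_834, "storage": 19_968, "chassis_etc": 7_603}
new_cost = {"cpu": 52_048, "memory": 16_998 + 51_000, "storage": 24_960, "chassis_etc": 7_603}

baseline_total = sum(baseline_cost.values())
new_total = sum(new_cost.values())

# Fractional saving relative to the baseline system.
cost_saving = 1 - new_total / baseline_total

# Database sizes in TB from the two configurations.
data_gain = 7.3 / 5.83

print(f"Baseline total: ${baseline_total:,}")
print(f"New total: ${new_total:,}")
print(f"Cost saving: {cost_saving:.1%}")
print(f"Data in main store: {data_gain:.2f}x")
```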

17. Up to 39% Lower Total System Cost (1.39x) Per Database: Baseline: 5-node, 4x Intel® Xeon® Platinum 8280L processor on Lightning Ridge with 6TB total memory (48 slots / 128 GB / 2666), ucode TBD on SUSE 15, 60x Intel® SSD DC S4600 SATA 480GB, SAP HANA* analytic workload operating on 3TB database, HT OFF, Turbo OFF, test by Intel on 3/15/2019. New configuration: 5-node, 4x Intel® Xeon® Platinum 8280M processor on Lightning Ridge with 9TB total memory (24x 256GB Intel® Optane™ DCPMM + 24x 128GB DDR4 2666), ucode TBD on SUSE 15, 90x Intel® SSD DC S4600 SATA 480GB, SAP HANA* analytic workload operating on 6TB, HT OFF, Turbo OFF, test by Intel on 3/15/2019.
Cost footnotes (Pricing as of 15 Mar 2019): Baseline cost: Total System Cost per TB of database: $67,215 [CPU cost=$52,048, Memory subsystem cost @ Capacity: 48x128GB=$91,834, Storage cost=$20,160, Chassis, PSUs, Boot drive, etc.=$7,603, single system cost=$201,645] vs. New configuration cost: Total System Cost per TB of database: $40,717 [CPU cost=$52,048, Memory subsystem cost @ full capacity: =$96,917 ($45,917 for DDR4 + $51,000 for Intel® Optane™ DC PMEM), Storage cost=$38,156, Chassis, PSUs, Boot drive, etc.=$7,603, single system cost=$244,300].
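The per-TB figures in item 17 can be recovered from the quoted single-system costs. The sketch below assumes "cost per TB of database" is the stated single system cost divided by the database size in TB (3TB baseline, 6TB new); it takes the single-system costs as given rather than re-deriving them from the itemized components.

```python
# Single system costs (USD) and database sizes (TB), copied from item 17.
baseline_system_cost = 201_645
baseline_db_tb = 3
new_system_cost = 244_300
new_db_tb = 6

# Assumed definition: system cost divided by database size.
baseline_per_tb = baseline_system_cost / baseline_db_tb
new_per_tb = new_system_cost / new_db_tb

# Fractional reduction in cost per TB relative to the baseline.
reduction = 1 - new_per_tb / baseline_per_tb

print(f"Baseline: ${baseline_per_tb:,.0f} per TB")
print(f"New: ${new_per_tb:,.0f} per TB")
print(f"Reduction: {reduction:.1%}")
```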

18. Up to 40% Lower Memory Cost (1.40x) for Content Delivery of High-Quality Video: Baseline: 1-node, 2x Intel® Xeon® Gold 6252 processor at 2.10 GHz on S2600WFT platform with 1.5 TB total memory (24x 64 GB @ 2666 MT/s), ucode 0x04000010 on CentOS* Linux release 7.5.1804 (Core) 4.19.0-rc3+ (Host), 4.19.0-rc3, Intel® SSD DC P4510 1TB, 2x Dual port Intel Corporation Ethernet Controller XXV710 for 25GbE SFP28 (rev 02) NUMA Aligned 100Gbps LAG, Apache Traffic server 7.1.4, NGINX 1.12.2, HT on, Turbo on, Dataset = 512 X 103, 1 MB Randomized Web Page Content, test by Intel on 1/15/2019. New configuration: 1-node, 2x Intel® Xeon® Gold 6252 processor at 2.10 GHz S2600WFT platform with 192 GB + 1.5 TB total memory (12x 128 GB Intel® Optane™ DCPMM + 12x 16 GB @ 2666 MT/s DDR4), ucode 0x04000010 on CentOS* Linux release 7.5.1804 (Core) 4.19.0-rc3+ (Host), 4.19.0-rc3, Intel® SSD DC P4510 1TB, 2x Dual port Intel Corporation Ethernet Controller XXV710 for 25GbE SFP28 (rev 02) NUMA Aligned 100Gbps LAG, Apache Traffic server 7.1.4, NGINX 1.12.2, HT on, Turbo on, Dataset = 512 X 103, 1 MB Randomized Web Page Content, test by Intel on 1/15/2019.
Cost footnotes (Pricing as of 15 Mar 2019): Baseline cost: Total System Cost: $29,408 [CPU cost=$7310, Memory subsystem cost @ Capacity: 24x 64GB=$16,998, Storage cost=$2,100, Chassis, PSUs, Boot drive, etc.=$3,000] vs. New configuration cost: Total System Cost: $22,024 [CPU cost=$7310, Memory subsystem cost @ full capacity: =$9,614 ($2,690 for DDR4 + $6,924 for Intel® Optane™ DC PMEM), Storage cost=$2,100, Chassis, PSUs, Boot drive, etc.=$3,000].

19. Up to 43% Lower Memory Cost (1.43x) Running SAS Machine Learning Workload: Baseline: 1-node, 2x Intel® Xeon® 8280 CPU on Purley Wolfpass (2S) with 1536GB total memory (24x 64GB DDR4), ucode 0x4000013 on CentOS 7.6, 4.19.8, 1x 1.5TB Intel® SSD DC P4610 NVMe* Drive, SAS Machine learning workload running 3 concurrent logistic regression tasks on 400GB of data each, HT on, Turbo on, elapsed time=15:39min, test by Intel on 2/14/2019. New configuration: 1-node, 2x Intel® Xeon® 8280 CPU on Purley Wolfpass (2S) with 1536GB total memory (12x 128GB Intel® Optane™ DCPMM + 12x 16GB DDR4), ucode 0x4000013 on CentOS 7.6, 4.19.8, 1x 1.5TB Intel® SSD DC P4610 NVMe Drive, SAS Machine learning workload running 3 concurrent logistic regression tasks on 400GB of data each, HT on, Turbo on, elapsed time=16:06min, test by Intel on 2/15/2019.
Cost footnotes (Pricing as of 15 Mar 2019): Baseline cost: Total System Cost: $38,316 [CPU cost=$20,018, Memory subsystem cost @ Capacity: 24x 64GB=$16,998, Chassis, PSUs, Boot drive, etc.=$1,300] vs. New configuration cost: Total System Cost: $30,932 [CPU cost=$20,018, Memory subsystem cost @ full capacity: =$9,614 ($2,690 for DDR4 + $6,924 for Intel® Optane™ DC PMEM), Chassis, PSUs, Boot drive, etc.=$1,300].
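
As a rough check on the memory-cost comparison in this footnote, the relative reduction can be computed from the two memory subsystem costs quoted above. This is only a sketch of one way to express the saving; the variable names are illustrative, and the source does not state which formula produced its headline figure.

```python
# Memory subsystem costs in US dollars, copied from the cost footnote for item 19.
baseline_memory = 16998          # 24x 64GB DDR4
new_memory = 2690 + 6924         # DDR4 + Intel Optane DC persistent memory

# Relative reduction: the share of the baseline memory cost that is saved.
reduction = 1 - new_memory / baseline_memory
print(f"{reduction:.1%}")
```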

20. Up to 35% More VMs (1.35x) and 27% Lower Costs (1.27x) for Hyper-Converged Infrastructures: Baseline: 4-node, 2x Intel® Xeon® Gold 6230 processor on S2600WFD platform with 384GB total memory (24x 16GB DDR4 @ 2933 MT/s), ucode 0x04000013, running on Windows* Server 2019, 10.0.17763, 2x 375GB Intel® Optane™ DC SSD P4800X, 1x Chelsio 25G NIC (iWARP), workload: vmfleet and diskspd result=41 VMs (settings: Benchmark Setup: Vmfleet Test: Each VM with 1 Core, 8 GB Memory, 40 GB VHDX, Test setup: Threads=2, Buffer Size= 4KiB, Pattern: Random, Duration = 300 Seconds, Queue Depth=16, 30% write, OS: Windows* Server 2019 Standard (Desktop) with updated patch), HT On, Turbo On, test by Microsoft on 2/15/2019. New configuration: 4-node, 2x Intel® Xeon® Gold 6230 processor on S2600WFD platform with 512GB total memory (12x 16GB DDR4 + 4x 128GB Intel® Optane™ DCPMM), ucode 0x04000013, running on Windows* Server 2019, 10.0.17763, 2x 375GB Intel® Optane™ DC SSD P4800X, 1x Chelsio 25G NIC (iWARP), workload: vmfleet and diskspd result=56 VMs (settings: Benchmark Setup: Vmfleet Test: Each VM with 1 Core, 8 GB Memory, 40 GB VHDX, Test setup: Threads=2, Buffer Size= 4KiB, Pattern: Random, Duration = 300 Seconds, Queue Depth=16, 30% write, OS: Windows* Server 2019 Standard (Desktop) with updated patch), HT On, Turbo On, test by Microsoft on 2/15/2019.
Cost footnotes (Pricing as of 15 Mar 2019):
Baseline cost: Total System Cost (4 nodes): $136,882 [CPU cost=$3,788, Memory subsystem cost @ Capacity: 24x 16GB=$5,379, Storage=$8,338, Chassis, PSUs, Boot drive, etc.=$1,300, total system cost=$34,220] vs. New configuration cost: Total System Cost: (4 nodes): $135,355 [CPU cost=$3,788, Memory subsystem cost @ full capacity=$4,998 ($2,690 for DDR4 + $2,308 for Intel® Optane™ DC PMEM), Storage=$8,338, Chassis, PSUs, Boot drive, etc.=$1,300, total system cost=$33,839].

Configuration Details for Data Centric Innovation Day Demos (21-26)

21. SPEC CPU2017* Floating Point Rate World Record on Intel® Xeon® Platinum 9282 Processor: 1-Node, 2x Intel® Xeon® Platinum 9282 processor with 768GB (24 x 32GB 2933MT/s) total memory. CentOS 7.6.1810 with kernel 4.20.0+, Version 19.0.1.144 of Intel C/C++ Compiler. IMC Interleaving set to 1-way Interleave, Sub_NUMA Cluster set to Enabled. Source: https://spec.org/cpu2017/results/res2019q2/cpu2017-20190513-13797.pdf, SPECrate2017_fp_base score: 522. Tested by Intel as of 3/2019.

22. Up to 2.41x Performance Advantage Over Nvidia* V100 GPUs: 2 socket Intel® Xeon® Platinum 8268 processor, 24 cores HT On Turbo ON Total Memory 384 GB (12 slots/ 32GB/ 2933 MHz), BIOS: SE5C620.86B.0D.01.0286.011120190816 (ucode: 0x4000013), CentOS 7.6, Kernel 4.19.5-1.el7.elrepo.x86_64, SSD 1x Intel® SSD D3-S4610 Series 960GB, Deep Learning Framework: MXNet https://github.com/apache/incubator-mxnet.git commit f1de8e51999ce3acaa95538d21a91fe43a0286ec applying https://github.com/intel/optimized-models/blob/v1.0.2/mxnet/wide_deep_criteo/patch/patch.diff, Compiler: gcc 6.3.1, MKL DNN version: commit: 08bd90cca77683dd5d1c98068cea8b92ed05784, Wide & Deep: https://github.com/intel/optimized-models/tree/v1.0.2/mxnet/wide_deep_criteo commit: c3e7cbde4209c3657ecb6c9a142f71c3672654a5, Dataset: Criteo Display Advertisement Challenge, Batch Size=512, 2 instance/2 socket, Datatype: FP32; with recommendation results: 678,000 records/second. vs. host system: 2 socket Intel® Xeon® Platinum 8180 processor (28 cores), HT ON, Total memory 128 GB (16 slots/8 GB/ 2666 MHz), Ubuntu* 18.04.2 LTS Accelerator: Nvidia* Volta V100 GPU accelerator, 32GB HBM2, 32GB/sec Interconnect BW, System interface x16 PCIe* Gen3, Driver Version 410.78, CUDA Version 10.0.130, CUDNN Version 7.5, CUDA CUBLAS 10.0.130 Deep learning workload: MxNet 1.4.0 https://pypi.org/project/mxnet-cu92/, Datatype: FP32, Batch Size= 512, Running 2 instances Model: Wide & Deep: https://github.com/intel/optimized-models/blob/master/mxnet/wide_deep_criteo/model.py Commit ID for the current state is c3e7cbde4209c3657ecb6c9a142f71c3672654a5 Training dataset (8,000,000 samples): wget https://storage.googleapis.com/dataset-uploader/criteo-kaggle/large_version/train.csv Evaluation dataset (2,000,000 samples): wget https://storage.googleapis.com/dataset-uploader/criteo-kaggle/large_version/eval.csv python3 inference.py --batch-size $bs --num-batches 10000 >> $outdir/bs$bs-$runid.2xbgout 2>&1 & python3 inference.py --batch-size $bs --num-batches 10000 >> $outdir/bs$bs-$runid.2xfgout 2>&1. Recommendation results: 281,211 records/second. Tested by Intel as of 3/26/2019.
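
The headline multiplier for item 22 follows from the two throughput figures quoted in the entry. A small Python sketch of that division (the variable names are illustrative, not from the source):

```python
# Throughput figures in records/second, as quoted in item 22.
xeon_8268_fp32 = 678_000
v100_fp32 = 281_211

# Ratio of the Xeon result to the GPU result.
ratio = xeon_8268_fp32 / v100_fp32
print(f"{ratio:.2f}x")  # 2.41x
```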

23. Up to 90% Average Generational Gains (1.90x) on Intel® Xeon® Gold Processor Mainstream CPUs: 2 socket Intel® Xeon® Platinum 8268 processor, 24 cores HT On Turbo ON Total Memory 384 GB (12 slots/ 32GB/ 2933 MHz), BIOS: SE5C620.86B.0D.01.0286.011120190816 (ucode:0x4000013), CentOS 7.6, Kernel 4.19.5-1.el7.elrepo.x86_64, SSD 1x Intel® SSD D3-S4610 Series 960GB, Deep Learning Framework: MXNet https://github.com/apache/incubator-mxnet.git commit f1de8e51999ce3acaa95538d21a91fe43a0286ec applying https://github.com/intel/optimized-models/blob/v1.0.2/mxnet/wide_deep_criteo/patch/patch.diff, Compiler: gcc 6.3.1, MKL DNN version: commit: 08bd90cca77683dd5d1c98068cea8b92ed05784, Wide & Deep: https://github.com/intel/optimized-models/tree/v1.0.2/mxnet/wide_deep_criteo commit: c3e7cbde4209c3657ecb6c9a142f71c3672654a5, Dataset: Criteo Display Advertisement Challenge, Batch Size=512, 2 instance/2 socket, Datatype: Int8, with 1,299,000 records/ second vs. processor, 24 cores HT On Turbo ON Total Memory 384 GB (12 slots/ 32GB/ 2933 MHz), BIOS: SE5C620.86B.0D.01.0286.011120190816 (ucode: 0x4000013), CentOS 7.6, Kernel 4.19.5-1.el7.elrepo.x86_64, SSD 1x Intel® SSD D3-S4610 Series 960GB, Deep Learning Framework: MXNet https://github.com/apache/incubator-mxnet.git commit f1de8e51999ce3acaa95538d21a91fe43a0286ec applying https://github.com/intel/optimized-models/blob/v1.0.2/mxnet/wide_deep_criteo/patch/patch.diff, Compiler: gcc 6.3.1, MKL DNN version: commit: 08bd90cca77683dd5d1c98068cea8b92ed05784, Wide & Deep: https://github.com/intel/optimized-models/tree/v1.0.2/mxnet/wide_deep_criteo commit: c3e7cbde4209c3657ecb6c9a142f71c3672654a5, Dataset: Criteo Display Advertisement Challenge, Batch Size=512, 2 instance/2 socket, Datatype: FP32, with 678,000 records/ second. Tested by Intel as of 3/26/2019.

24. Up to 76% Average Gains (1.76x) on Intel® Xeon® Gold Processor 6230N CPU Over Intel® Xeon® Gold Processor 6130 CPU: Packet switching with Open vSwitch V2.10.1 and Intel optimized DPDK 17.11.4. Baseline: 1-node, 1x Intel® Xeon® Gold processor 6130 CPU on Wolf Pass with 192 GB (12 X 16GB 2666) total memory, 2x Intel Corporation Ethernet Controller X710-DA2, Ubuntu 18.04, 4.15.0-33, Bios PLYXCRB1.86B.0532.D14.1804240330, 64B MPackets, 4P/4C/8T 500000 flows per port w/ 2VM's, result: 9.6 Mpps. New configuration: 1-node, 1x Intel® Xeon® Gold processor 6230N CPU on Wolf Pass with 192 GB (12 X 16GB 2666) total memory, 3x Intel Corporation Ethernet Controller X710-DA2, Ubuntu 18.04, 4.20.0-042000rc6, Bios PLYXCRB1.86B.0568.D10.1901032132, 64B MPackets, 6P/6C/12T 500,000 flows per port w/ 3 VM's, SST-BF result: 16.9 Mpps.

25. Delivering Additional Memory at the Edge: Compares two systems containing 2x 2nd Generation Intel® Xeon® Scalable Gold 6252 CPU @ 24 core, 2.10GHz. System A: 1536GB (6x256GB AEP + 6x16GB DRAM, 2-2-2, Memory Mode, 16:1) System B: 768GB (6x128GB) 2666 DRAM. Configured CDN software: Qwilt* Instance software with cache on ramdisk. Testing performed March 2019.

26. Up to 5.5x Average Generational Gains on Intel® Xeon® Gold Processor Mainstream CPUs: Baseline: Intel® Xeon® Scalable processor Platinum 8280, configured with 192 GB of memory, Intel® Solid State Drive Data Center 480 GB, and CentOS Linux* 7.4.1708; Intel® Math Kernel Library for Deep Neural Networks (Intel® MKL-DNN), OpenVINO™ toolkit (R5 Release), Siemens Healthineers custom topology, and dataset, Datatype: Int8, inference result: 201/seconds vs. New configuration: Intel® Xeon® Scalable processor Platinum 8180, configured with 192 GB of memory, Intel® Solid State Drive Data Center 480 GB, and CentOS Linux* 7.4.1708; Intel® Math Kernel Library for Deep Neural Networks (Intel® MKL-DNN), OpenVINO™ toolkit (R5 Release), Siemens Healthineers custom topology, and dataset, Datatype: FP32, inference result: 37/seconds. Tested by Siemens in March 2019.

27. Up to 8x Speedup Improvement in IO Intensive Queries with Intel® Optane™ DC Persistent Memory + HDD vs. DRAM + HDDs: Baseline: 1-node, 2x Intel® Xeon® Platinum 8280L @ 2.70GHz processor on S2600WF (Wolf Pass) with 768GB DDR (DDR Mem: 24 slots / 32GB / 2666 MT/s) total memory, ucode 0x0400000a running Fedora release 29 kernel Linux-4.18.8-100.fc27.x86_64-x86_64-with-fedora-27, and 9 decision support I/O intensive queries, storage is 8x HDD (ST1000NX0313), 10-Gigabit SFI/SFP+ network connection. Score: geomean baseline 1. Tested by Intel, on 24 Feb 2019, vs. New configuration: 1-node, 2x Intel® Xeon® Platinum 8280L @ 2.70GHz processor on S2600WF (Wolf Pass) with 192GB DDR + 1TB Intel® Optane™ DC persistent memory (DDR Mem: 12 slots / 16GB / 2666 MT/s + 8 slots / 128GB / 2666 MT/s) total memory, ucode 0x0400000a running Fedora release 29 kernel Linux-4.18.8-100.fc27.x86_64-x86_64-with-fedora-27, and 9 decision support I/O intensive queries, storage is 8x HDD (ST1000NX0313), 10-Gigabit SFI/SFP+ network connection. Score: Geomean speedup 8x. Tested by Intel, on 24 Feb 2019.

28. Up to 96% Efficiency on Redis* Memtier Benchmark Running Inside Intel® Software Guard Extensions (Intel® SGX): Enclave on Intel® SGX Card. 1 node, 2x Intel® Xeon® Platinum 8180 processor with 128GB total memory running Centos 7.6 3.10.0-514.el7.centos.2.1.13.VCA.x86_64, Redis Memtier executing with 50,000 keys over 50 clients & 64 byte objects. Baseline: Unmodified Redis 5.0-rc6 without SGX enclave. Score: 55062.81 ops/sec, vs. New configuration: One Intel® SGX Card with three Intel® Xeon® E3-1585L v5 processors, 16GB memory per node, EEPROM 2.3.26, BIOS 2.3.26, SGX enabled with GPU-APERTURE set to 256. All three nodes running Ubuntu_16.04.3_2.3.26 kernel 4.14.20-1.2.3.26.vca, modified Redis SGX 5.0-rc6 with Linux SGX driver 2.0 inside SGX enclave. Score: 53239.08 ops/sec.
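
The efficiency figure in item 28 can be read as the enclave throughput relative to the unmodified baseline. A minimal sketch under that assumption (the source does not spell out the formula, and the variable names are illustrative):

```python
# Redis Memtier scores in ops/sec, as quoted in item 28.
baseline_ops = 55062.81   # unmodified Redis without SGX enclave
enclave_ops = 53239.08    # modified Redis running inside the SGX enclave

# Assumed definition: enclave throughput as a fraction of baseline throughput.
efficiency = enclave_ops / baseline_ops
print(f"{efficiency:.1%}")
```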

29. LAMMPS Protein Models Runs 51% Faster on 2-Socket Intel® Xeon® Platinum 8260 Processors: Baseline: AMD* EPYC* 7601 Processor: Supermicro* AS -1023US-TR4, 2S AMD EPYC 7601 (2.2GHz, 32C), 16x16GB DDR4-2666, 1 SSD, BIOS ver: 1.1c (10/04/2018), Microcode ver: 0x8001227, Oracle* Linux Server release 7.6 (compatible with Red Hat Enterprise Linux* (RHEL) 7.6) on a 7.5 kernel using ksplice for security fixes, Kernel: 3.10.0-957.5.1.el7.crt1.x86_64, Cluster File System: Panasas (124 TB storage) Firmware v6.3.3.a & EDR based IEEL Lustre*, HBA: 100Gbps Mellanox* EDR MT27700, 36 Port Mellanox EDR IB Switch, OFED stack: OFED MLNX mlnx-4.3-3.0.2.0, LAMMPS version 12 Dec 2018, Protein workload, Intel® Compiler 2019u2, Intel® MPI Benchmarks 2019u2, SMT=ON, Turbo=ON. Score: 1 node= 15.813 timestamp/sec, 16 node= 95.72 timestamp/sec, higher is better, tested by Intel, March 7, 2019. New configuration: Intel Reference Platform with 2x Intel® Xeon® Platinum 8260 processor (2.4GHz, 24C), 12x16GB DDR4-2933, 1 SSD, Cluster File System: Panasas (124 TB storage) Firmware v6.3.3.a & Intel® Omni-Path Architecture (Intel® OPA) based IEEL Lustre*, BIOS: SE5C620.86B.0D.01.0286.011120190816, Microcode: 0x4000013, Oracle* Linux Server release 7.6 (compatible with Red Hat Enterprise Linux* (RHEL) Server 7.6) on a 7.5 kernel using ksplice for security fixes, Kernel: 3.10.0-957.5.1.el7.crt1.x86_64, OFED stack: OFED Intel® OPA 10.9 on Oracle* Linux 7.6 (Compatible w/RHEL 7.6) w/Lustre v2.10.6, HBA: 100Gbps Intel® OPA 1 port PCIe x16, Switch: Intel® Omni-Path Edge Switch (Intel® OP Edge Switch) 100 Series 48 Port, LAMMPS version 12 Dec 2018, Protein workload, Intel® Compiler 2019u2, Intel® MPI Benchmarks 2019u2, HT=ON, Turbo=ON. Score: 1 node=24.015 timestamp/sec, 16 node=226.691 timestamp/sec, higher is better, tested by Intel, March 5, 2019.
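
The scores in item 29 support two simple derived quantities: the single-node gain and how well each cluster scales from 1 to 16 nodes. The sketch below computes both from the quoted scores. Parallel efficiency is a standard derived metric rather than one reported in the source, the helper name `parallel_efficiency` is illustrative, and the two clusters use different interconnects, so the scaling numbers are not a like-for-like CPU comparison.

```python
# Scores quoted in item 29 (higher is better), keyed by node count.
epyc_7601 = {1: 15.813, 16: 95.72}
xeon_8260 = {1: 24.015, 16: 226.691}

def parallel_efficiency(scores, nodes):
    """Measured multi-node score divided by ideal linear scaling of the 1-node score."""
    return scores[nodes] / (nodes * scores[1])

# Relative gain of the new configuration over the baseline on a single node.
single_node_gain = xeon_8260[1] / epyc_7601[1] - 1.0
print(f"1-node gain: {single_node_gain:.1%}")

for name, scores in (("EPYC 7601", epyc_7601), ("Xeon 8260", xeon_8260)):
    print(f"{name}: 16-node efficiency {parallel_efficiency(scores, 16):.2f}")
```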

30. When Running Molecular Dynamics HPC Applications (NAMD Workloads), Intel’s 2-Socket Intel® Xeon® Platinum 9282 Processor Delivers Up to 23% Higher Performance than a 2-Socket AMD* Rome CPU: 2S Intel® Xeon® Platinum 9282 processor configuration: Intel Reference Platform with 2x Intel® Xeon® Platinum 9282 processor (2.6GHz, 56C), 24x32GB DDR4-2933, 1 SSD, BIOS: SE5C620.86B.0D.01.0541.052120190651, Microcode: 0x4000024, Red Hat Enterprise Linux* (RHEL) Server release 7.6, Kernel: 3.10.0-957.12.2.el7.x86_64, Intel® Compiler 2019, Intel® MPI Benchmarks 2019, HT=ON, Turbo=ON, 1 thread per core, NAMD ver 2.13 Intel Optimized Build, apoa1 workload, FFTW 3.3.8, Charm++ 6.8.2, Tcl 8.6.8.
Score: 24.16ns/day, tested by Intel on May 29, 2019. 2S Intel® Xeon® Platinum 9242 processor configuration: Intel Reference Platform with 2x Intel® Xeon® Platinum 9242 processor (2.3GHz, 48C), 24x16GB DDR4-2933, 1 SSD, BIOS: PLYXCRB1.86B.0572.D02.1901180818, Microcode: 0x4000017, CentOS 7.6, Kernel: 3.10.0-957.5.1.el7.x86_64, Intel® Compiler 2019, Intel® MPI Benchmarks 2019, HT=ON, Turbo=OFF, 2 threads per core, NAMD ver 2.13 Intel Optimized Build, apoa1 workload, FFTW 3.3.8, Charm++ 6.8.2, Tcl 8.6.8. Score: 19.9ns/day, tested by Intel on February 28, 2019.
2S Intel® Xeon® Platinum 8280 processor configuration: Intel Reference Platform with 2x Intel® Xeon® Platinum 8280 processors (2.7GHz, 48C), 12x16GB DDR4-2933, 1 SSD, BIOS: SE5C620.86B.0D.01.0286.011120190816, Microcode: 0x4000013, Oracle* Linux Server release 7.6 (compatible with RHEL 7.6) on a 7.5 kernel using ksplice for security fixes, Kernel: 3.10.0-957.5.1.el7.crt1.x86_64, Intel® Compiler 2019, Intel® MPI Benchmarks 2019, HT=ON, Turbo=ON, 2 threads per core, NAMD ver 2.13 Intel Optimized Build, apoa1 workload, FFTW 3.3.8, Charm++ 6.8.2, Tcl 8.6.8. Score: 12.65ns/day, tested by Intel on May 29, 2019.
Intel build notes:
FLOATOPTS = -xCORE-AVX512 -qopt-zmm-usage=high -O3 -g -fp-model fast=2 -no-prec-div -qoverride-limits -DNAMD_DISABLE_SSE
CXX = icpc -std=c++11 -DNAMD_KNL
CXXOPTS = -static-intel -O2 $(FLOATOPTS)
CXXNOALIASOPTS = -O3 -fno-alias $(FLOATOPTS) -qopt-report-phase=loop,vec -qopt-report=4
CXXCOLVAROPTS = -O2 -ip
CC = icc
COPTS = -static-intel -O2 $(FLOATOPTS)
./config Linux-KNL-icc --charm-base $base_charm --charm-arch mpi-linux-x86_64-ifort-smp-mpicxx --with-fftw3 --fftw-prefix $base/fftw3_icc19.1_SKX --tcl-prefix $base/tcl --charm-opts -verbose
No CPU performance changes between NAMD v2.13 versus v2.12. NAMD v2.13 added changes for GPU.
Configurations for AMD* Computex claims unknown. Score: 19.6ns/day (Source: Lisa Su Computex* keynote May 27, 2019).

31. 31% Higher Performance with 2S Intel Xeon-AP vs 2S AMD* EPYC* “Rome” 7742: Intel measured as of October 8, 2019 using geomean of STREAM Triad, HPCG, HPL, WRF (2 workloads), OpenFOAM 42M_cell_motorbike, ANSYS® (14 workloads), LS-DYNA (3 workloads), VASP (4 workloads), NAMD (2 workloads), GROMACS (9 workloads), LAMMPS (9 workloads), FSI Kernels (3 workloads).
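
Item 31 reports a geomean across benchmark suites, several of which are themselves geomeans over multiple workloads. The sketch below shows one way such a two-level geometric mean could be computed. The ratios are HYPOTHETICAL placeholders chosen for easy checking, not measurements from this disclosure, and the source does not state whether the headline figure was taken over suite-level geomeans or over all workloads directly.

```python
import math

def geomean(values):
    """Geometric mean of positive numbers, computed via logarithms."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))

# HYPOTHETICAL per-workload ratios (new score / baseline score), for illustration only.
suite_ratios = {
    "suite_a": [1.0, 4.0],       # a suite with two workloads
    "suite_b": [2.0],            # a single-workload suite
    "suite_c": [1.0, 2.0, 4.0],  # a suite with three workloads
}

# First level: one geomean per suite. Second level: geomean across suites.
per_suite = {name: geomean(ratios) for name, ratios in suite_ratios.items()}
overall = geomean(per_suite.values())
print(per_suite)
print(round(overall, 2))
```

With this two-level scheme each suite carries equal weight regardless of how many workloads it contains, which differs from taking a single geomean over every workload.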

Intel® Xeon® Platinum 9282 processor configuration: Intel “Walker Pass” S9200WKL platform with 2-socket Intel® Xeon® Platinum 9282 processors (2.6GHz, 56C), 24x16GB DDR4-2933, 1 SSD, BIOS: SE5C620.86B.2X.01.0053, Microcode: 0x5000029, Red Hat Enterprise Linux* 7.7, kernel 3.10.0-1062.1.1.

AMD EPYC™ 7742 processor configuration: Supermicro AS-2023-TR4 (HD11DSU-iN) with 2-socket AMD EPYC™ 7742 “Rome” processors (2.25GHz, 64C), 16x32GB DDR4-3200, 1 SSD, BIOS: 2.0 CPLD 02.B1.01, Microcode: 830101C, CentOS* Linux release 7.7.1908, kernel 3.10.0-1062.1.1.el7.crt1.x86_64.

STREAM OMP 5.1 Triad: Intel® Xeon® Platinum 9282 processor: Intel® Compiler 2019u5, BIOS: HT ON, Turbo ON, SNC ON, 1 thread/core; AMD EPYC™ 7742: Intel® Compiler 2019u5, BIOS: SMT ON, Boost ON, NPS 4, 1 thread/core.

HPCG Intel optimized version: Intel® Xeon® Platinum 9282 Processor: Intel® Compiler 2019u4, Intel® Math Kernel Library (Intel® MKL) 2019u4, Intel MPI 2019u4, BIOS: HT ON, Turbo OFF, SNC OFF, 1 thread/core; AMD EPYC™ 7742: Intel® Compiler 2019u4, Intel® MKL 2019u4, Intel MPI 2019u4, BIOS: SMT ON, Boost ON OFF, NPS 4, 1 thread/core.

HPL v2.3: Intel® Xeon® Platinum 9282 processor: Intel Optimized LINPACK Benchmark, Intel® Distribution for LINPACK* Benchmark, Compiler: Intel MPI 2018u1, N=80000, NB=384, P=2, Q=1, BIOS: HT ON, Turbo ON, SNC OFF, 1 thread/core; AMD EPYC™ 7742: AMD official HPL binary https://developer.amd.com/amd-aocl/blas-library/, Compiler: Netlib HPL + BLIS, OpenMPI3, N=16000, NB=192, P=2, Q=4; BIOS: SMT ON, Boost ON, NPS 4, 1 thread/core.

WRF 3.9.1.1: Geomean (2 workloads: conus-12km, conus-2.5km): Intel® Xeon® Platinum 9282 processor: Intel® Compiler 2018u3, Intel MPI 2018u3, AVX2 build, BIOS: HT ON, Turbo ON, SNC OFF, 1 thread/core; AMD EPYC™ 7742: Intel® Compiler 2018u3, Intel MPI 2018u3, AVX2 build, BIOS: SMT ON, Boost ON, NPS 4, 1 thread/core.

OpenFOAM v6.0 42M_cell_motorbike: Intel® Xeon® Platinum 9282 processor: Intel® Compiler 2019u3, Intel MPI 2019u3, BIOS: HT ON, Turbo ON, SNC OFF, 1 thread/core; AMD EPYC™ 7742: Intel® Compiler 2019u3, Intel MPI 2019u3, BIOS: SMT ON, Boost ON, NPS 4, 1 thread/core.

ANSYS® Fluent® 2019R1: Geomean (14 workloads: aircraft_wing_14m, aircraft_wing_2m, combustor_12m, combustor_16m, combustor_71m, exhaust_system_33m, f1_racecar_140m, fluidized_bed_2m, ice_2m, landing_gear_15m, oil_rig_7m, pump_2m, rotor_3m, sedan_4m): Intel® Xeon® Platinum 9282 Processor: Intel® Compiler 2017u3, Intel MPI 2018u3, BIOS: HT ON, Turbo ON, SNC ON, 1 thread/core; AMD EPYC™ 7742: Intel® Compiler 2017u3, Intel MPI 2018u3, BIOS: SMT ON, Boost ON, NPS 4, 1 thread/core.

LS-DYNA v9.3: Geomean (3 workloads: 3cars/150ms, car2car/120ms, ODB_10M/30ms): Intel® Xeon® Platinum 9282 processor: Intel® Compiler 2016u3, Intel MPI 2018u3, AVX2 build, BIOS: HT OFF, Turbo ON, SNC ON, 1 thread per core; AMD EPYC™ 7742: Intel® Compiler 2016u3, Intel MPI 2018u3, AVX2 build, BIOS: SMT OFF, Boost ON, NPS 4, 1 thread/core.

VASP, developer branch based on v5.4.4: Geomean (4 workloads: CuC, PdO4, PdO4_K221, Si): Intel® Xeon® Platinum 9282 processor: Intel® Compiler 2019u4, Intel® Math Kernel Library (Intel® MKL) 2019u4, Intel MPI 2019u4, BIOS: HT ON, Turbo OFF, SNC OFF, 1 thread per core; AMD EPYC™ 7742: Intel® Compiler 2019u4, Intel® MKL 2019u4, Intel MPI 2019u4, BIOS: SMT ON, Boost ON, NPS 4, 1 thread per core.

NAMD v2.13: Geomean (2 workloads: Apoa1, STMV): Intel® Xeon® Platinum 9282 processor: Intel® Compiler 2019u4, Intel MPI 2019u4, BIOS: HT ON, Turbo ON, SNC OFF, 2 threads per core; AMD EPYC™ 7742: Compiler: AOCC 2.0, Intel MPI 2019u4, BIOS: SMT ON, Boost ON, NPS 4, 2 threads/core.

GROMACS 2019.4: Geomean (5 workloads: archer2_small, ion_channel_pme, lignocellulose_rf, water_pme, water_rf): Intel® Xeon® Platinum 9282 processor: Intel® Compiler 2019u4, Intel® Math Kernel Library (Intel® MKL) 2019u4, Intel MPI 2019u4, AVX-512 build, BIOS: HT ON, Turbo OFF, SNC OFF, 2 threads per core for: ion_channel_pme, lignocellulose_rf, water_rf. 1 thread per core for: water_pme, archer2_small; AMD EPYC™ 7742: Intel® Compiler 2019u4, Intel® MKL 2019u4, Intel MPI 2019u4, AVX2_256 build, BIOS: SMT ON, Boost ON, NPS 4, 2 threads per core.

LAMMPS v2019: Geomean (9 workloads: Atomic Fluid, Copper, DPD, Liquid Crystal, Polyethylene, Protein, Stillinger-Weber, Tersoff, Water): Intel® Xeon® Platinum 9282 processor: Intel® Compiler 2019u5, BIOS: HT ON, Turbo ON, SNC ON, 2 threads/core; AMD EPYC™ 7742: Compiler: AOCC 2.0, Intel MPI 2019u5, BIOS: SMT ON, Boost ON, NPS 4, 2 threads/core.

FSI Kernels v2.0: Geomean (3 workloads: Binomial Options, Black Scholes, Monte Carlo): Intel® Xeon® Platinum 9282 processor: Intel® Compiler 2019u5, Intel® Math Kernel Library (Intel® MKL) 2019u5, BIOS: HT ON, Turbo ON, SNC OFF, 2 threads/core, HT OFF, Turbo ON, SNC OFF, 1 thread/core, HT ON, Turbo ON, SNC OFF, 2 threads/core; AMD EPYC™ 7742: Intel® Compiler 2019u5, Intel® MKL 2019u5, BIOS: SMT ON, Boost ON, NPS 4, 2 threads/core, SMT OFF, Boost ON, NPS 4, 1 thread/core, SMT ON, Boost ON, NPS 4, 2 threads/core.

32. Recommender Engines Up to 2.2X Faster on 2-Socket Intel® Xeon® Platinum 8280 Processors with Intel® Deep Learning Boost (Intel® DL Boost) Baseline: Supermicro AS-2023-TR4 (HD11DSU-iN) with 2-socket AMD EPYC™ 7742 “Rome” processors (2.25GHz, 64C), 32x 32GB DDR4-3200, BIOS: 2.0 CPLD 02.B1.01, Microcode: 0x830101C, running Ubuntu Linux release 19.10, kernel 5.3.0-rc3-custom, gcc version 9.2.0 Software: OOB TF build with Eigen (pip install tensorflow), Wide & Deep: https://github.com/IntelAI/models/tree/master/benchmarks/recommendation/tensorflow/wide_deep_large_ds, commit id: 4ead44aa254a84109ac8019f5d386e3adb75ac26, Model: https://storage.googleapis.com/intel-optimized-tensorflow/models/wide_deep_int8_pretrained_model.pb, https://storage.googleapis.com/intel-optimized-tensorflow/models/wide_deep_fp32_pretrained_model.pb, Dataset: Criteo Display Advertisement Challenge, Batch Size=512, 2instance/2socket, Datatype: FP32 vs New configuration: Intel reference platform “WolfPass” with 2x Intel® Xeon® Platinum 8280L processor (2.7GHz, 28C), 12x 32GB DDR4-2933, 1 SSD, BIOS: SE5C620.86B.02.01.0008.031920191559, Microcode: 0x400001c, running Ubuntu* Linux* release 19.10, kernel 5.3.0-rc3-custom, gcc version 9.2.0. Software: TensorFlow public docker: docker.io/intelaipg/intel-optimized-tensorflow:nightly-latestprs-bdw (https://github.com/tensorflow/tensorflow.git A3262818d9d8f9f630f04df23033032d39a7a413 + Pull Request PR26169 + Pull Request PR26261 + Pull Request PR26271), MKL DNN version: v0.18, Wide & Deep: https://github.com/IntelAI/models/tree/master/benchmarks/recommendation/tensorflow/wide_deep_large_ds, commit id: 4ead44aa254a84109ac8019f5d386e3adb75ac26, Model: https://storage.googleapis.com/intel-optimized-tensorflow/models/wide_deep_int8_pretrained_model.pb, https://storage.googleapis.com/intel-optimized-tensorflow/models/wide_deep_fp32_pretrained_model.pb, Dataset: Criteo Display Advertisement Challenge, Batch Size=512, 2instance/2socket, Datatype: INT8. 
Intel measured as of November 14, 2019.

33. 9x Higher Inference Performance with Intel® Xeon® Platinum 9282 processor. New Configuration: Tested by Intel as of 11/13/2019. 2 socket Intel® Xeon® Platinum 9282 processors (56C), HT ON, Turbo ON, Total Memory 384 GB (24 slots, 16GB, 2934Mhz), BIOS: SE5C620.86B.2X.01.0053.081920190637, Microcode: 0x500002c, Ubuntu 19.10, Kernel 5.3.0-22-generic, SSD 1x Micron_5100_MTFDDAV480TBY 447G, Intel® Deep Learning Framework: PyTorch (master + PR for MLPerf)*git fetch origin pull/25235/head:mlperf; git checkout mlperf, Compiler GCC 9.2.1.20191008, MobileNetV1, Batch Size=64, Iterations: 1000, Datatype: INT8 vs Baseline: AMD EPYC™ 7742 processor configuration: Tested by Intel as of 11/13/2019. 2-socket AMD EPYC™ 7742 “Rome” processors (64C), HT ON, Turbo ON, Total Memory 512 GB (16 slots, 32GB, 3200Mhz), BIOS: 2.0, Microcode 0x830101C, Ubuntu 19.10, Kernel 5.3.0-22-generic, SSD 1x Intel® SSD D3-S4610 1.8T, Deep Learning Framework: PyTorch (master + PR for MLPerf) *git fetch origin pull/25235/head:mlperf; git checkout mlperf, GCC 9.2.1.20191008, MobileNetV1, Batch Size=64, Iterations: 1000, Datatype: FP32.

34. Supports Up to 24 Streams in Parallel Using Visual Compute Accelerator Card for Analytics: Intel Reference Platform “WolfPass” with 2x Intel® Xeon® Gold 6252 processor (2.3GHz, 24C), 12x 16GB DDR4-2666, 2x 480 GB Intel® SSD SATA (for OS and primary data), 1x Intel® C627 chipset with Intel® QuickAssist Technology (Intel® QAT), 1x Dual Port 25GbE Intel® Ethernet Network Adapter XXV710 SFP28, running CentOS 7.3, kernel 5.1.3-1.el7.elrepo.x86_64, Bios: SE5C620.86B.0D.01.0438. 1x high density Intel® Visual Compute Accelerator (Intel® VCA) for Analytics running Ubuntu18.01.1, kernel 3.10.0-693.17.1.el7.2.5.20.VCA.x86_64. Software workloads: Host system (Libvirt-4.10.0, QEMU-4.0.0, SST-3.0.1054, CollectD-5.8.1.git-master-090afcd, DPDK-19.05, i40evf-3.2.3-k, ixgbe-5.1.0-k, ixgbevf-4.1.0-k, Docker-19.03.1 build 74b1e89, IPR-clx_Media_Analytics_R2_CI_84, Gstreamer-GST 1.16 Package clx_1.2, FFMPEG-FFMPEG4.1.0), Intel VCA-A card (Docker-19.08.1 build 75c2e88, IPR-vca_disk48_reference_k4.19_ubuntu16.04_1.0.51_00_ISS_release, Gstreamer-GST 1.16 Package vcaa_1.2.1, FFMPEG-FFMPEG4.2). BIOS Options enabled: Intel® Virtualization Technology (Intel® VT) with Intel CPU VMX Support and Intel IO Virtualization; Intel Boot Guard; Intel® Trusted Execution Technology (Intel® TXT). Input video: AVC 1080p30 at 6 Mbps. All data points are using the IPR software framework except for Object Detection on the Plus configuration. Inference Models are available from the Open Model Zoo project at https://github.com/opencv/open_model_zoo.git, checkout 2019_R2; Object Detection: Mobilenet-SSD; Face Recognition: face-detection-adas-0001 and face-reidentification-retail-0095; Car Classification: vehicle-detection-adas-0002 and vehicle-attributes-recognition-barrier-0039.