Hewlett Packard Enterprise’s Bill Mannel explores how, as big data continues to explode, data centers are benefiting from a relatively new type of offload accelerator: FPGAs.
Bill Mannel, Vice President & General Manager – High Performance Computing & Artificial Intelligence, Hewlett Packard Enterprise
Data centers have traditionally been dominated by CPU-based computing.
However, with the advent of next generation workloads, such as Big Data and streaming analytics, Artificial Intelligence (AI), Internet of Things (IoT), genomics, and network security, CPUs are seeing different data types, mixtures of file sizes, and new algorithms with different processing requirements.
Although server CPUs continue to execute applications faster with each new processor generation, the amount of data that needs processing is increasing explosively.
The latest Cisco Visual Networking Index, dated February 27, 2019, forecasts that global IP traffic “will grow at a Compound Annual Growth Rate (CAGR) of 26 percent from 2017 to 2022,” which would result in an annual IP traffic rate of 4.8 Zettabytes per year (396 Exabytes per month).
As a result of this constantly increasing data load, coupled with new workloads, data center operators will need to add more and more servers to meet the computational demands of new services and of processing ever larger amounts of data.
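The growth figures quoted above can be sanity-checked with a few lines of arithmetic. The sketch below compounds the 26 percent CAGR over the five years from 2017 to 2022 and converts the 396 Exabytes per month into an annual total. The implied 2017 value is derived here purely for illustration and is not a figure quoted from the forecast.

```python
# Figures taken from the forecast quoted in the text.
CAGR = 0.26              # 26 percent compound annual growth rate
START_YEAR = 2017
END_YEAR = 2022
MONTHLY_EB_2022 = 396    # exabytes per month forecast for 2022


def growth_factor(cagr: float, years: int) -> float:
    """Total multiplier after compounding `cagr` once per year for `years` years."""
    return (1.0 + cagr) ** years


years = END_YEAR - START_YEAR
factor = growth_factor(CAGR, years)

# 1 zettabyte = 1000 exabytes, 12 months per year.
annual_zb_2022 = MONTHLY_EB_2022 * 12 / 1000

# Derived for illustration only: what 2017 traffic would have to be
# for the quoted CAGR to reach the 2022 figure.
implied_annual_zb_2017 = annual_zb_2022 / factor

print(f"growth over {years} years: {factor:.2f}x")
print(f"2022 annual traffic: {annual_zb_2022:.2f} ZB")
print(f"implied 2017 annual traffic: {implied_annual_zb_2017:.2f} ZB")
```

Compounding 26 percent for five years roughly triples traffic, and 396 Exabytes per month works out to about 4.75 Zettabytes per year, which rounds to the 4.8 quoted.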
When Data Centers Reach Their Limits
At some point, every data center reaches concrete limits in terms of electrical power capacity, cooling capability, and rack space.
Consequently, data centers are increasingly turning to accelerators to offload some of the algorithms in these applications. The goal is either to perform the required computations more quickly or to deliver more performance for less power, which eases the pressure on the data center’s electrical power, cooling, and rack space.
One or both improvements—performance and performance per watt—are important for various applications.
New workloads targeted for acceleration include data storage and analytics, networking applications and cybersecurity, media transcoding, financial analysis, and genomics. All of these workloads employ algorithms that can be accelerated by other computational hardware, resulting in better data throughput and lower response latency.
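To see why performance per watt matters once a data center hits its power ceiling, consider the minimal sketch below. Every number in it (throughput, power draw, power budget) is hypothetical and chosen only to make the arithmetic easy to follow. It simply counts how many servers fit under a fixed power budget and multiplies by per-server throughput.

```python
from dataclasses import dataclass


@dataclass
class ServerConfig:
    name: str
    throughput: float    # work units per second per server (hypothetical)
    power_watts: float   # power draw per server (hypothetical)

    @property
    def perf_per_watt(self) -> float:
        return self.throughput / self.power_watts


def max_throughput(cfg: ServerConfig, power_budget_watts: float) -> float:
    """Aggregate throughput when the power budget caps the server count."""
    servers = int(power_budget_watts // cfg.power_watts)
    return servers * cfg.throughput


# Hypothetical configurations: the accelerated server draws more power
# but does proportionally much more work.
cpu_only = ServerConfig("CPU only", throughput=100.0, power_watts=500.0)
accelerated = ServerConfig("CPU + accelerator", throughput=400.0, power_watts=600.0)

budget = 30_000.0  # watts available (hypothetical)

for cfg in (cpu_only, accelerated):
    total = max_throughput(cfg, budget)
    print(f"{cfg.name}: {cfg.perf_per_watt:.3f} units/W, {total:.0f} units/s under budget")
```

Under these assumed numbers, the accelerated configuration fits fewer servers into the same power envelope yet delivers more total work, which is the trade-off the power limit forces.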
FPGAs as Offload Accelerators
A relatively new type of offload accelerator in current use is based on FPGAs.
FPGAs have a 30-year history throughout the electronics industry but their use as server accelerators in data centers is relatively new.
The hardware resources incorporated into FPGA hardware can be dynamically reprogrammed to exactly match the precise computational needs of a workload or algorithm and this close matching results in faster computational speed and lower power and energy consumption.
Like microprocessors, FPGAs are integrated circuits.
For server use, FPGAs are packaged as acceleration cards that plug into a slot in the server’s motherboard.
FPGA Use Cases
Many companies are now developing workload-specific FPGA-accelerated applications. For example:
rENIAC has developed a distributed data engine that accelerates the Apache Cassandra NoSQL database by improving both throughput and latency.
The distributed Cassandra database application handles large amounts of data across many commodity servers while providing high availability.
The accelerated rENIAC version of the Apache Cassandra NoSQL database already delivers up to 4x the throughput of the unaccelerated version, with further performance gains expected through additional development.
Myrtle has developed an FPGA-accelerated deep neural network (DNN) inference engine for machine-learning (ML) applications. Based on a speech-transcription model called DeepSpeech, it delivers 165x the throughput of a multi-core server CPU with a 1,000x improvement in performance per watt.
In other words, from a performance perspective, one server using an FPGA-based accelerator card could replace many servers, greatly reducing Total Cost of Ownership (TCO) through capital expense and operational expense reductions.
Algo-Logic builds FPGA-accelerated financial trading solutions that have very low delay and time variation.
Speed directly translates into money for high-speed financial trading.
Algo-Logic’s library of pre-built infrastructure software for financial trading can implement order-processing algorithms with sub-microsecond processing delays and just nanoseconds of time variation.
Algo-Logic’s Tick-To-Trade (T2T) System can read market data and execution reports, obtained directly from 10 Gbps SFP+ links, and then send buy/sell orders instantly back to the market over the same links.
IBEX Technology is developing a growing number of media and video transcoding algorithms that are specifically adapted to FPGA-based acceleration hardware to meet the needs of massively multi-stream, mass-market video and gaming services such as YouTube and Twitch.
IBEX encoders and decoders cover a variety of video standards including HEVC/H.264, MPEG-2, Sony XAVC, and Apple ProRes.
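The claim that one accelerated server could replace many servers follows from simple arithmetic on the gains quoted above. The sketch below uses the 4x (rENIAC) and 165x (Myrtle) throughput figures from the text. The 1,000-server workload is hypothetical, and the calculation assumes the quoted gain applies to the entire workload and scales linearly, which real deployments may not achieve. It also derives the relative power draw implied by the Myrtle figures, assuming both ratios were measured on the same basis.

```python
import math

# Gains quoted in the text.
RENIAC_THROUGHPUT_GAIN = 4.0      # up to 4x throughput
MYRTLE_THROUGHPUT_GAIN = 165.0    # 165x throughput
MYRTLE_PERF_PER_WATT_GAIN = 1000.0

# perf-per-watt gain = throughput gain / power ratio, so the implied
# power ratio (accelerated vs. baseline) is throughput / perf-per-watt.
# Assumes both gains were measured against the same baseline.
implied_power_ratio = MYRTLE_THROUGHPUT_GAIN / MYRTLE_PERF_PER_WATT_GAIN


def servers_needed(baseline_servers: int, throughput_gain: float) -> int:
    """Accelerated servers needed to match `baseline_servers` of CPU-only capacity.

    Assumes the gain applies to the whole workload and scales linearly.
    """
    return math.ceil(baseline_servers / throughput_gain)


baseline = 1000  # hypothetical workload, expressed in CPU-only servers

print(f"implied power ratio: {implied_power_ratio:.3f}")
print(f"servers at 4x gain: {servers_needed(baseline, RENIAC_THROUGHPUT_GAIN)}")
print(f"servers at 165x gain: {servers_needed(baseline, MYRTLE_THROUGHPUT_GAIN)}")
```

The server counts are best read as an upper bound on the consolidation possible, since only the accelerated portion of a real application sees the full gain.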
Adding Software into the FPGA Accelerator Solution for Greater Performance
Hardware alone is not sufficient to realize the acceleration benefits of an FPGA-based acceleration card.
You need a fair bit of software as well.
Traditionally, FPGAs are programmed with hardware description languages (HDLs)—typically Verilog or VHDL—but these HDLs are not in common use among software teams.
These teams are far more familiar with software programming languages such as C and C++.
For maximum programming efficiency, the application-development teams prefer to use their familiar programming languages and development tools.
This aspect of accelerated application development is critical to the adoption of FPGA-based accelerators.
Consequently, Intel has developed a software stack to help integrate its programmable acceleration cards (PACs) into server environments.
The Intel FPGA Programmable Acceleration Card (Intel FPGA PAC) D5005 is supported by the Intel Acceleration Stack for Intel Xeon CPUs with FPGAs, which provides a common developer interface and includes drivers, application programming interfaces, and an FPGA Interface Manager.
The acceleration stack works with industry-leading operating systems and virtualization and orchestration software, providing a common interface for software developers to get fast time to revenue, simplified management, and access to a growing ecosystem of acceleration workloads.
The difference in effort between not having and having a software stack associated with an FPGA-based accelerator card is like the difference between using a hand crank to start an automobile engine and using an ignition key to start the engine.
For this reason, Hewlett Packard Enterprise (HPE) is now supplying Intel FPGA PAC D5005 accelerators, based on Intel Stratix® 10 SX FPGAs, as an installed option for its new line of ProLiant DL380 Gen10 servers (as well as individually for installed servers).
These servers are based on various first- and second-generation multicore Intel Xeon Scalable processors.
Adding the Intel FPGA PAC D5005 as an installed option to the ProLiant DL380 Gen10 Server family reduces installation times by relieving the end user of the need to install and configure the Intel acceleration card in the server and simplifies the ordering process by turning the acquisition into a one-stop shopping effort.
Intel accelerator cards open the HPE ProLiant DL380 Gen10 server’s adaptability to an even wider range of applications.
Software developers develop their application code, targeting the HPE ProLiant DL380 Gen10 server’s Intel Xeon Scalable processor.
They then migrate that code to the Intel FPGA PAC D5005 using the acceleration libraries and development tools within the Intel Acceleration Stack to achieve even higher performance.
For additional information:
Intel Acceleration Hub – Solutions
HPE FPGA Accelerators
HPE ProLiant DL380 Gen10 Servers
Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation.
Performance varies depending on system configuration.
No product or component can be absolutely secure.
Check with your system manufacturer or retailer or learn more at intel.
Bill Mannel is Vice President & General Manager of High-Performance Computing and Artificial Intelligence at Hewlett Packard Enterprise.