“If all you have is a hammer, everything looks like a nail.”
“It’s not a one-size-fits-all world. There are a lot of different ways to solve a whole variety of AI problems.”
AI is here to stay. It’s already everywhere: tagging photos, answering voice commands, guiding financial advisors, reading X-rays, reshaping thousands of business applications – even helping to locate missing children. But the head-snapping velocity and variety that’s made artificial intelligence a technology and investment phenomenon in just a few short years has now produced growing pains.
Early enterprise adopters of AI and other AI builders are seeing model sizes grow, algorithms become more complex, and the volume and variety of data explode, requiring new technologies from devices to data centers. As the scope and complexity of AI-enabled workloads expand, so does the need for workload-optimized solutions to fuel them. So right when more data scientists are sharing the wheel with everyday business leaders to find new horizons, the computing industry is challenged to ensure that the underlying infrastructure meets the requirements of this rapidly advancing innovation.
The solution: Ditch outdated reflexes like looking at AI as a monolithic workload requiring one specific hardware solution. Instead, take a fresh, holistic look at the opportunity that hardware, software, and ecosystems represent when working in tandem across a wide range of workloads, algorithms and customer advancements from data center to edge.
Organizations that will succeed with AI in the new era already upon us will be those that create the most cost-efficient, capable, scalable silicon infrastructures that can provide a solid foundation for advancing AI. The logical beginning: understanding the importance of a portfolio approach to AI chip architecture, from AI-optimized CPUs, to general-purpose accelerators like GPUs and FPGAs, to purpose-built ASICs including Intel’s forthcoming neural network processors. As Wei Li, Vice President of Intel Architecture, Graphics and Software, and General Manager of Machine Learning and Translation at Intel, puts it: “AI problems demand a variety of silicon.”
To better understand why multiple architectures are a strategic key to AI success – today and tomorrow – consider the biggest business and technology factors shaping adoption and implementation, and how modern options can help enterprises handle these powerful global trends.
Organizations are drowning in data — an estimated 90% of it generated in just the past two years. Analysts forecast worldwide data will grow tenfold by 2025, reaching 163 zettabytes. Yet only an estimated 2% has been analyzed, leaving a great untapped opportunity to propel business and fuel societal insights. In fact, much of the interest and activity in AI is driven by the desire to unlock business value from these growing torrents. According to Gartner, through 2023, computational resources used in AI will increase 5x from 2018, making AI the top category of workloads driving infrastructure decisions. “All apps,” says Lisa Spelman, VP of the Intel Data Center Group and General Manager of Intel Xeon Systems, “will have AI built in.”
Indeed, a Deloitte & Touche global survey of 1,900 IT and line-of-business leaders in seven countries found: 61% are using machine learning; 60% have adopted natural language processing (NLP); 56% are using computer vision; and 51% are using deep learning.
Making smart, strategic architecture choices starts with understanding the full range of modern silicon options available for enterprise AI tasks. Here’s a quick rundown of major choices and how each helps AI.
Central Processing Units (CPUs) are super-fast generalists. Traditionally, they’re faster at executing a variety of tasks, but don’t have as many parallel execution units: Manage all input and output. NEXT! Run virtual memory! NEXT! Send files to a disk! NEXT! These flexible, multi-tasking generalists can be programmed to do basically anything very, very quickly. In AI, these traits make CPUs ideal for inferencing tasks and the complete pipeline of data processing pre- and post-ML/DL inference, as well as the application that uses the inference results. CPUs scale throughout the compute infrastructure, from workstations to affordable public cloud instances to deployments on edge servers, PCs and devices.
Graphics Processing Units (GPUs) run at clock speeds slower than a CPU’s, but they contain thousands of cores (compared to tens of cores for a server CPU). So they’re very good at rapidly running a single mathematical operation over and over on many, many pieces of data, making GPUs ideal for video rendering, gaming and, in AI, a host of functions – especially deep learning training.
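The distinction drawn above — a few fast, flexible cores versus thousands of simple ones — boils down to serial versus data-parallel execution. As a rough illustration only (a NumPy sketch on a CPU, not actual GPU code), the same arithmetic can be expressed one element at a time or over an entire array at once; the second form is what GPUs fan out across their cores:

```python
import numpy as np

# One mathematical operation (a multiply-add) applied to a million elements --
# the kind of uniform, data-parallel work GPUs excel at.
data = np.arange(1_000_000, dtype=np.float32)

# Serial view: one element at a time (only the first few, for brevity).
serial = np.empty_like(data)
for i in range(10):
    serial[i] = data[i] * 2.0 + 1.0

# Data-parallel view: the identical operation expressed over the whole
# array in one step, which parallel hardware can split across many cores.
parallel = data * 2.0 + 1.0

assert np.allclose(serial[:10], parallel[:10])
```

The operation and array here are arbitrary; the point is the access pattern, not the math.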
Custom ASICs are architectures highly optimized for deep learning, designed to improve performance and minimize power. Indeed, the AI landscape has evolved to the point where the industry is seeing strong demand for special-purpose ASICs as AI is implemented. But moving forward, the diversity of applications incorporating AI will require a diversity of solutions. The custom ASIC category includes the forthcoming Intel Nervana Neural Network Processors for training (NNP-T) and inference (NNP-I), both of which Intel is on track to deliver to customers later this year. The category also features vision processing units (VPUs) like Intel’s Movidius solutions, which deliver extreme low-power inference for cameras and enable AI at the edge.
Field Programmable Gate Arrays (FPGAs) provide excellent throughput and low latency for real-time inferencing. They also offer potentially more compute power for lower-precision data types, flexibility for custom operations and data types, and some memory advantages. FPGAs are one of the fastest-growing architectures, according to McKinsey, especially for edge inference and training.
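The “lower-precision data types” mentioned above usually mean quantized integers: trading some numeric fidelity for smaller data and cheaper arithmetic. A minimal sketch of symmetric int8 quantization — the function names and scale choice are illustrative assumptions, not any vendor’s API:

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Symmetric linear quantization of float32 values to int8."""
    scale = np.abs(x).max() / 127.0  # map the largest magnitude to +/-127
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float32 values from the int8 representation."""
    return q.astype(np.float32) * scale

weights = np.array([0.5, -1.27, 0.02, 1.0], dtype=np.float32)
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
# Small rounding error is the price of 4x smaller data and cheaper math.
```

Hardware like FPGAs (and int8-capable CPUs) can then run the multiply-accumulate work directly on the int8 values, which is where the throughput gain comes from.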
Indeed, CPUs are the leading architecture for AI inference; Facebook, for example, runs nearly all of its inference on CPUs. Because AI inference is being integrated across a large array of workloads, it is a naturally occurring workload for CPU architectures.
For organizations looking for modern AI platforms, updated general-purpose microprocessors provide an incredibly robust foundation. New more powerful chips like the 2nd-Generation Intel Xeon Scalable processors are ready and optimized for AI with huge memory, more cores, AI acceleration instructions, and increasingly optimized software.
AI-enabled. The 2nd Generation Intel Xeon Scalable processors contain Deep Learning Boost (DL Boost) technology for built-in inference acceleration, reducing the need to bolt on additional accelerators. For example, Microsoft has seen a 3.4X boost in image recognition, Target a 4.43X improvement in ML inference, and JD.com a 2.4X boost in text detection using Intel's DL Boost. Popular AI frameworks, like TensorFlow, PyTorch, Caffe, and MXNet, are being optimized for Intel DL Boost. Additionally, Cooper Lake, the Intel Xeon Scalable processor following 2nd Generation Xeon Scalable (Cascade Lake), will be the first Xeon processor to deliver built-in high-performance AI training acceleration through new bfloat16 support added to Intel DL Boost—further improving training performance.
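The bfloat16 format mentioned above keeps float32’s 8-bit exponent (so the same dynamic range) but truncates the mantissa to 7 bits — in effect, the top 16 bits of a float32. A minimal sketch of that truncation in plain Python, for illustration only (real hardware also applies proper rounding rather than simple truncation):

```python
import struct

def float_to_bfloat16_bits(x: float) -> int:
    """Return the 16-bit bfloat16 pattern: the top half of the float32 bits."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    return bits >> 16  # drop the low 16 mantissa bits

def bfloat16_bits_to_float(b: int) -> float:
    """Widen back to float32 by zero-filling the dropped mantissa bits."""
    (x,) = struct.unpack("<f", struct.pack("<I", b << 16))
    return x

# Same exponent range as float32, coarser precision: 3.14159 survives
# the round trip only to about two decimal places.
roundtrip = bfloat16_bits_to_float(float_to_bfloat16_bits(3.14159))
```

Because conversion is just a truncation of float32, bfloat16 is cheap to support in hardware and mixes easily with float32 accumulation — one reason it has become a popular training format.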
New technology advances, along with a fast-widening set of use cases, have prompted many leading organizations to choose modern CPUs as their foundational AI processor of choice. Here are innovative examples of how CPUs are meeting the real-world demands of deep learning applications.
For iFLYTEK’s Cloud Computing Research Institute, openness to a new way of expanding AI capacity produced a pleasant surprise: Intel Xeon Scalable processors with DL Boost enabled matched or beat the performance of their general-purpose GPU solution on a real AI cloud workload, saving money in the process, according to Zhang Zhiajang, vice dean.
For others, the pay-off is doing heavy-duty AI processing without buying new GPUs or other accelerators. Siemens Healthineers, for instance, leveraged its existing Intel CPU-infrastructure to run AI-inference workloads, including segmentation and analysis, in new models for diagnosing cardiovascular disease. “We can now develop multiple real-time, often critical medical imaging use cases, such as cardiac MRI and others, using Intel Xeon Scalable processors, without the added cost or complexity of hardware accelerators,” explains Dorin Comaniciu, senior vice president, Siemens Healthineers.
The ultimate show of confidence in a modern AI upgrade may be bringing it to a customer-facing software service. That’s what Taboola did for its publishing, marketing, and advertising clients, optimizing and speeding a custom TensorFlow Serving application with the Intel Math Kernel Library for Deep Neural Networks (Intel MKL-DNN) on Intel Xeon Scalable processors. Taboola evaluated GPUs and CPUs side by side as the company planned to scale out inference across seven data centers. Taboola found that even though performance was comparable, it lost precious time transferring data back and forth between GPU and CPU architectures, plus it cost less to just keep things on CPU overall. Result: Optimized, much faster delivery of AI apps via SaaS cloud platforms.
And the ability to use large-capacity memory and performant compute can open new deep learning training doors for enterprises. Pharmaceutical maker Novartis, for instance, wanted to use deep learning to accelerate the analysis of cell culture microscopy images, used to study the effects of various treatments and to discover new therapies. The images were more than 26 times larger than those used in common deep learning benchmarks. Despite the size, the system sustained more than 120 3.9-megapixel images per second – due in part to memory capacity, a key Xeon advantage. Overall, the team achieved a 20x improvement in time to train on a single system.
New architectures also give enterprises another choice: creating sophisticated AI on standard, general-purpose CPU architectures, then moving to more specialized hardware when and if it makes sense. That’s especially appealing for tech-driven start-ups like DataCubes, Inc.
The Illinois company, founded in 2015 by an insurance industry executive, wanted to modernize commercial property and casualty insurance for small and medium business. “Underwriting is soul-sucking work,” says Phil Alampi, vice president of customer engagement. “If you walked into the typical office writing commercial P&C insurance today it wouldn’t look too different from the 1990s. All manual.” Errors were common; the process could take weeks – or longer.
So DataCubes began building a “frictionless” AI system that would help carriers and agents speed the slow, tedious process of manually gathering, inputting and processing far-flung data needed to complete an application.
In 2018 the company created an AI system atop existing Intel Xeon CPU infrastructure, using sophisticated algorithms to automate data collection and the application process. DataCubes' platform, d3 CORE®, automates the intake of submission documents including PDFs, scans and other forms of unstructured data using machine learning models pretrained on thousands of sources. More than 4 billion data points are included in DataCubes' data lake of information on businesses drawn from government entities, public records, company websites and other sources. Results are delivered to clients via an online portal, email or API.
The system can automatically answer underwriting questions and provide risk assessments, all powered by machine learning. Carriers like that they can write policies quickly, which helps make them the carrier of choice, while improving their customer experience, underwriting productivity, and profitability. Today, DataCubes serves national and regional operators in 50 states under the motto: “Commercial Underwriting Powered by Data Science.” As the business grows over the next few years beyond hundreds of customers, DataCubes expects to build on the foundation created by Intel Xeon processors.
“We’ve been really successful with a generalist approach. We don’t have a hardware optimization team,” Alampi says. “But we’ve been able to be part of transforming an entire industry on top our existing infrastructure. That’s enabled a small start-up to make a big impact. You don’t need a big company with a huge technical staff to do something like this.”
“It’s not an A or B world. It’s A and B. There are at least 3, 4, 5 ways of doing what you are trying to do.”
While AI has enjoyed astonishing growth over the last decade, it’s still in its infancy. The way we create and run AI systems today won’t efficiently take us into tomorrow.
For enterprises and the industry to advance and derive maximum benefit will require fresh approaches to new challenges. Success in the new, data-driven, “AI everywhere” environment will go to organizations that ignore simplistic and outdated “rules” about processor architecture, treating AI infrastructure as a strategic advantage and adopting a flexible, modern, portfolio approach to building a solid silicon foundation.