Expedera Inc., emerging from stealth, today announced the availability of its Origin neural engine, the industry’s fastest and most energy-efficient AI inference IP for edge systems. The silicon-proven deep-learning accelerator (DLA) provides up to 18 TOPS/W at 7nm, up to ten times more than competitive offerings, while minimizing memory requirements. Origin accelerates neural network models for tasks such as object detection, recognition, segmentation, super-resolution, and natural language processing. It targets markets including mobile, consumer, industrial, and automotive.
AI processing is increasingly moving to the edge, creating skyrocketing demand for high-performance, power-efficient silicon solutions. Smartphones, smart speakers, home security cameras, surveillance systems, and cars with advanced driver-assistance systems (ADAS) all use built-in deep learning accelerators. Requirements for edge AI processing differ from those in the cloud because of constraints on power consumption, cooling, and the cost of deployed products, and they vary widely depending on the application. Current solutions are unable to provide the required performance while keeping power at a minimum. Expedera addresses the diverse requirements of edge applications with its Origin family of IPs, which enables configurable, energy-efficient AI inference. A top-five smartphone customer has already licensed the IP, validating this approach.
“Expedera has created the unique concept of native execution, which greatly simplifies the AI hardware and software stack. As a result, the architecture is much more efficient than the competition when measured in TOPS/W or, more important, IPS/W on real neural networks,” said Linley Gwennap, principal analyst at The Linley Group. “On either metric, Expedera’s design outperforms other DLA blocks from leading vendors such as Arm, MediaTek, Nvidia, and Qualcomm by at least 4–5x. This advantage is validated by measurements using Expedera’s 7nm test chip.”
“We’ve taken a novel approach to AI acceleration inspired by the team’s extensive background in network processing,” said Da Chuang, CEO and co-founder of Expedera. “We’ve created an AI architecture that allows us to load the entire network model as metadata and run it natively using very little memory. If you plot performance in terms of TOPS/W or ResNet-50 IPS/W you’ll see that all other vendors hit a wall around 4 TOPS/W or 550 IPS/W. However, we can break through the wall with 18 TOPS/W or 2000 IPS/W. As our hardware processes the model monolithically, we are not constrained by memory bandwidth and can scale up to over 100 TOPS.”
Technology Details and Specifications
Origin’s high TOPS/W and minimized memory requirements mean that die area is reduced, bandwidth is significantly improved, and thermal design power (TDP) is reduced, allowing passive cooling. All of this means lower-cost silicon, a low-cost bill of materials (BOM), and higher performance. Expedera’s scheduler operates on metadata, which simplifies the software stack and requires only about 128 bytes of memory for control sequences per layer. Origin IP can be run in a “fire-and-forget” manner, without interacting with the host processor.
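To put the per-layer control figure in perspective, the short Python sketch below multiplies the quoted figure of roughly 128 bytes by a layer count. The function name and the 50-layer example (a ResNet-50-sized network) are assumptions chosen for illustration; how Origin's scheduler actually counts layers is not stated here.

```python
# Approximate control-sequence memory per layer, as quoted in the text above.
CONTROL_BYTES_PER_LAYER = 128


def control_memory_bytes(num_layers: int,
                         bytes_per_layer: int = CONTROL_BYTES_PER_LAYER) -> int:
    """Estimate total control-sequence memory for a network (hypothetical helper)."""
    if num_layers < 0:
        raise ValueError("num_layers must be non-negative")
    return num_layers * bytes_per_layer


# Assumption: 50 layers, roughly the depth of a ResNet-50-style model.
layers = 50
total = control_memory_bytes(layers)
print(f"{layers} layers -> about {total} bytes ({total / 1024:.2f} KiB)")
```

Under these assumptions the control sequences for the whole network fit in a few kilobytes, which is consistent with the claim that the scheduler needs very little memory.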
Expedera Origin Product Families
Origin E2 is appropriate for low-power edge devices like smartphones and tablets. Available configurations include 2.25K, 4.5K, or 9K INT8 MACs.
Origin E6 offers higher performance for a wide variety of devices including smartphones, computers, edge servers, and automotive systems. Available configurations include 4.5K, 9K, or 18K INT8 MACs.
Origin E8 delivers performance for the most demanding applications including data centers and autonomous vehicles. Available configurations include 36K or 54K INT8 MACs.
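The MAC counts above can be related to peak throughput with the usual rule of thumb that one multiply-accumulate counts as two operations. The Python sketch below applies that rule. The 1 GHz clock and the reading of "K" as 1,024 are assumptions made for illustration only; neither is stated in this announcement, and real throughput depends on utilization.

```python
OPS_PER_MAC = 2            # one multiply plus one accumulate
ASSUMED_CLOCK_HZ = 1.0e9   # hypothetical clock frequency, not from the announcement
K = 1024                   # assumed meaning of "K" in the MAC counts


def peak_tops(num_macs: float, clock_hz: float = ASSUMED_CLOCK_HZ) -> float:
    """Theoretical peak throughput in TOPS for a given number of INT8 MACs."""
    return num_macs * OPS_PER_MAC * clock_hz / 1e12


# MAC configurations listed above, in units of K.
configs = {
    "Origin E2": [2.25, 4.5, 9],
    "Origin E6": [4.5, 9, 18],
    "Origin E8": [36, 54],
}

for family, sizes in configs.items():
    for size_k in sizes:
        tops = peak_tops(size_k * K)
        print(f"{family} {size_k}K MACs: ~{tops:.1f} peak TOPS at the assumed clock")
```

Changing `ASSUMED_CLOCK_HZ` scales every result linearly, so the sketch is best read as a way to compare configurations rather than as a specification.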
Availability
Origin IP is available now. A test chip is available for evaluation and benchmarking purposes.