Tensor Unit

World’s first fully programmable RISC-V Tensor Unit

The Tensor Unit handles the matrix operations essential for AI and integrates seamlessly with the Vector Unit.

What is the Tensor Unit?

Most of the computation in Large Language Models (LLMs) takes place in fully-connected layers, which can be implemented efficiently as matrix multiplications. The Tensor Unit provides hardware tailored specifically to matrix multiplication workloads, delivering a large AI performance boost without a large increase in power consumption.
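To make the link between fully-connected layers and matrix multiplication concrete, here is a minimal sketch in plain Python. It is an illustration only, not Semidynamics code: the function names `matmul` and `fully_connected` and the toy sizes are assumptions chosen for this example.

```python
def matmul(a, b):
    """Multiply an (m x k) matrix by a (k x n) matrix, both given as lists of rows."""
    m, k, n = len(a), len(b), len(b[0])
    assert all(len(row) == k for row in a), "inner dimensions must match"
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(n)]
            for i in range(m)]


def fully_connected(x, w, bias):
    """Apply y = x @ w + bias to a batch of inputs x (one input per row)."""
    y = matmul(x, w)
    return [[v + b for v, b in zip(row, bias)] for row in y]


# Hypothetical toy sizes: a batch of 2 inputs, 3 input features, 2 output features.
x = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
w = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]
bias = [0.5, -0.5]

y = fully_connected(x, w, bias)
print(y)
```

Almost all of the arithmetic sits inside `matmul`; the bias addition is a single pass over the output. That matrix multiplication is the kind of workload the text says the Tensor Unit is tailored to.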

Fully-coherent RISC-V Tensor Unit

Directly connected to the Vector Unit

Ultra-Fast AI

Delivers Ultra-Fast AI solutions

64-bit Core Integration

Optimised to integrate with our 64-bit cores

Vector Unit and Gazzillion Misses™ Integration

Seamless integration with our Vector Unit and Gazzillion Misses™

Universal RISC-V Compatibility

Works under any RISC-V vector-enabled Linux without any changes

DMA-Free Programming

Easy to program, as no DMAs are needed

Power Efficiency

Low power consumption

The Tensor Unit is designed to fully integrate with our other innovative technologies to provide solutions with outstanding AI performance.

At the heart is our fully customisable 64-bit RISC-V core. Next comes our Vector Unit, which is constantly fed data by our Gazzillion Misses™ technology, effectively hiding memory latency. Finally, the Tensor Unit is connected directly to the vector registers in the Vector Unit and performs the matrix multiplications required by AI. Every stage of this solution has been designed to integrate fully with the others for optimal AI performance and very easy programming. The result is a 128x performance increase compared with running the AI software on the scalar core alone.

The world wants super-fast AI solutions and that is what our unique set of technologies can now provide.

Very Dense Compute

The Tensor Unit is built on top of the Semidynamics RVV1.0 Vector Processing Unit and uses the existing vector registers to store matrices. The Tensor Unit can therefore handle the layers that require matrix multiplication, such as Fully Connected and Convolution, while the Vector Unit handles the activation function layers (ReLU, Sigmoid, Softmax, etc.). This is a big improvement over stand-alone NPUs, which usually have trouble dealing with activation layers.
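The split between matrix-multiply work and activation work can be sketched in plain Python. This is an illustration under assumptions, not the Semidynamics programming interface: `layer`, `relu`, `softmax` and the toy values are hypothetical, and the comments only mark which step corresponds to which kind of unit in the description above.

```python
import math


def matmul(a, b):
    """Multiply an (m x k) matrix by a (k x n) matrix, both given as lists of rows."""
    k, n = len(b), len(b[0])
    return [[sum(row[p] * b[p][j] for p in range(k)) for j in range(n)]
            for row in a]


def relu(v):
    """Element-wise max(0, x)."""
    return [max(0.0, e) for e in v]


def softmax(v):
    """Numerically stable softmax over one row."""
    peak = max(v)
    exps = [math.exp(e - peak) for e in v]
    total = sum(exps)
    return [e / total for e in exps]


def layer(x, w, activation):
    z = matmul(x, w)                       # matrix-multiply step (Tensor Unit style work)
    return [activation(row) for row in z]  # activation step (Vector Unit style work)


# Hypothetical toy data: one input row with 2 features, 2 output features.
x = [[1.0, 2.0]]
w = [[1.0, -1.0],
     [1.0, -1.0]]

hidden = layer(x, w, relu)     # pre-activation is [3.0, -3.0]
probs = layer(x, w, softmax)
print(hidden, probs)
```

Because both steps operate on the same rows of data, no copy between separate accelerators is needed in this sketch, which mirrors the point about sharing the vector registers.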

The Tensor Unit uses both the Vector Unit and the Atrevido-423 Gazzillion Misses™ capabilities to fetch the data it needs from memory. Tensor Units consume data at an astounding rate and, without Gazzillion Misses™, a normal core could not keep up with the Tensor Unit's demands. Other solutions rely on difficult-to-program DMAs to solve this problem. Semidynamics instead integrates the Tensor Unit seamlessly into its cache-coherent subsystem, opening a new era of programming simplicity for AI software.
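As a rough picture of what programming without DMAs can look like, the sketch below performs a blocked (tiled) matrix multiplication in plain Python. It is a generic textbook technique, not a description of the actual hardware: the tile size `TILE` and the function name `tiled_matmul` are hypothetical. The point is that every tile is read with ordinary indexed loads from the same memory the rest of the program uses, with no separate staging buffer or transfer descriptor to manage.

```python
TILE = 2  # hypothetical tile edge; real tile shapes depend on the hardware configuration


def tiled_matmul(a, b, tile=TILE):
    """Blocked matrix multiply: c = a @ b, processed one tile at a time."""
    m, k, n = len(a), len(b), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    for i0 in range(0, m, tile):          # tile rows of the output
        for j0 in range(0, n, tile):      # tile columns of the output
            for p0 in range(0, k, tile):  # walk along the shared dimension
                # Plain loads from a and b; partial sums accumulate in c.
                for i in range(i0, min(i0 + tile, m)):
                    for j in range(j0, min(j0 + tile, n)):
                        acc = c[i][j]
                        for p in range(p0, min(p0 + tile, k)):
                            acc += a[i][p] * b[p][j]
                        c[i][j] = acc
    return c


a = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0],
     [7.0, 8.0, 9.0]]
b = [[1.0, 0.0, 2.0],
     [0.0, 1.0, 0.0],
     [3.0, 0.0, 1.0]]

c = tiled_matmul(a, b)
print(c)
```

In a DMA-based design, the programmer would typically also have to schedule explicit transfers of each tile into local memory before computing on it; here that bookkeeping is absent.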

In addition, because the Tensor Unit stores its data in the vector registers and adds no new architecturally visible state, it works seamlessly under any RISC-V vector-enabled Linux without any changes.