Vision at scale
Efficient, low-latency AI inference that enables vision processing to be deployed at massive scale
Scaled for heavier inference and higher throughput
Boost AI throughput with a unified fabric that keeps memory bandwidth saturated for maximum compute density
Same code, more performance
Move from edge to rack under one architecture
Transformer‑ready
Gazillion™ absorbs KV‑cache bandwidth demand so compute utilization stays high
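
Why KV‑cache bandwidth dominates transformer decode: every generated token must re-read the entire key/value cache. A minimal back-of-envelope sketch in C, where the model shape (layers, heads, head dimension, context length) is an illustrative assumption for a 7B-class transformer, not a claim about any particular supported model:

    #include <stdio.h>

    /* Back-of-envelope KV-cache traffic per decoded token. The model
     * shape below is an illustrative assumption (roughly a 7B-class
     * transformer), not a spec of any particular model. */
    int main(void) {
        const long long layers   = 32;   /* transformer layers (assumed) */
        const long long kv_heads = 32;   /* KV heads, no GQA (assumed)   */
        const long long head_dim = 128;  /* per-head dimension (assumed) */
        const long long seq_len  = 4096; /* context length (assumed)     */
        const long long elt      = 2;    /* bytes per fp16/bf16 element  */

        /* K and V each hold layers*kv_heads*head_dim*seq_len elements,
         * and every decode step streams the whole cache through compute. */
        long long kv_bytes = 2 * layers * kv_heads * head_dim * seq_len * elt;
        printf("KV cache read per generated token: %.2f GiB\n",
               kv_bytes / (1024.0 * 1024.0 * 1024.0));
        return 0;
    }

At these assumed dimensions that is about 2 GiB of reads per token, which is why decode throughput tracks sustained memory bandwidth rather than peak compute.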
64-bit RISC-V AI Accelerator
64‑bit RISC‑V CPU + RVV 1.0 Vector + programmable Tensor unit
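
A minimal sketch of what "same code, more performance" means in practice: RVV 1.0 code is vector-length-agnostic, so one binary adapts to whatever vector width an implementation provides. This uses the standard RVV 1.0 C intrinsics (compile with a -march string that includes the v extension); it is generic RVV code, not vendor-specific API:

    #include <riscv_vector.h>
    #include <stddef.h>

    /* Vector-length-agnostic SAXPY (y += a*x) with RVV 1.0 intrinsics.
     * The same source runs on any VLEN the hardware provides, which is
     * what lets one code base span small and large cores. */
    void saxpy_rvv(size_t n, float a, const float *x, float *y) {
        for (size_t vl; n > 0; n -= vl, x += vl, y += vl) {
            vl = __riscv_vsetvl_e32m8(n);            /* elements this pass */
            vfloat32m8_t vx = __riscv_vle32_v_f32m8(x, vl);
            vfloat32m8_t vy = __riscv_vle32_v_f32m8(y, vl);
            vy = __riscv_vfmacc_vf_f32m8(vy, a, vx, vl); /* vy += a * vx */
            __riscv_vse32_v_f32m8(y, vy, vl);
        }
    }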
Sustained DRAM bandwidth
Delivered by Gazillion™ memory streaming
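
The kind of kernel this matters for, as a sketch: a STREAM-style triad is the classic bandwidth-bound pattern, where throughput is limited by how many memory requests the core keeps in flight rather than by arithmetic. The code below is generic C for illustration, not vendor-specific API:

    #include <stddef.h>

    /* STREAM-style triad: 2 streamed reads and 1 streamed write per
     * iteration, almost no arithmetic. Sustained DRAM bandwidth, not
     * peak FLOPS, sets the speed of loops like this. */
    void triad(size_t n, double scalar,
               const double *restrict b, const double *restrict c,
               double *restrict a) {
        for (size_t i = 0; i < n; i++)
            a[i] = b[i] + scalar * c[i];
    }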
Simple Linux programming
Linux‑ready bring‑up; unified programming model with C1
Delivers fast, personalized user recommendations by accelerating core ML algorithms
Dedicated acceleration of matrix multiplication for high-throughput, low-latency LLM/Transformer inference
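
For reference, the operation being accelerated, as a minimal scalar sketch: general matrix multiplication is the core of transformer layers (QKV projections, attention scores, MLP blocks). This is a generic illustration of the math, not the accelerator's programming interface:

    #include <stddef.h>

    /* Naive reference GEMM: C[M,N] += A[M,K] * B[K,N], row-major.
     * A tensor unit accelerates exactly this pattern; the k-outer loop
     * order here just improves scalar cache locality. */
    void gemm_ref(size_t M, size_t N, size_t K,
                  const float *A, const float *B, float *C) {
        for (size_t i = 0; i < M; i++)
            for (size_t k = 0; k < K; k++) {
                float aik = A[i * K + k];
                for (size_t j = 0; j < N; j++)
                    C[i * N + j] += aik * B[k * N + j];
            }
    }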
Our IP is silicon-ready and already proven in silicon implementations. Speak to us about reference designs