Memory-Optimized AI Inference
LLM and vision inference where memory stalls dominate
When you need serious per-core performance, Atrevido delivers 4-wide out-of-order execution and optional tensor math, built to keep models fed and cores busy
High IPC, low stall time
Wide OOO pipeline plus Gazillion Misses™ to keep issuing while data streams in
Seamless scale-up
Coherent or non-coherent integration. Available in single-core or four-core configurations (Atrevido A426 MP4)
Linux-ready and customizable
Tailor caches, widths, and memory paths without breaking the software stack
AI-ready out of the box
Integrate AI capabilities into your existing workflow with a single software stack
High-Throughput OOO Pipeline
4-wide OOO pipeline with register renaming, branch speculation, and deep queues for AI throughput
Unblock Memory Bottlenecks
Gazillion Misses™ memory subsystem sustaining up to 128 outstanding misses to pierce the memory wall
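Why miss concurrency matters can be seen from Little's Law, which relates sustained bandwidth to the number of requests in flight. A minimal sketch, where the latency, bandwidth, and line-size figures are illustrative assumptions, not Atrevido specifications:

```python
# Little's Law: outstanding requests = bandwidth * latency / transfer size.
# All figures below are illustrative assumptions, not product specs.

def misses_needed(bandwidth_gbs: float, latency_ns: float, line_bytes: int = 64) -> float:
    """Outstanding cache-line misses needed to sustain a given bandwidth."""
    bytes_per_second = bandwidth_gbs * 1e9
    latency_s = latency_ns * 1e-9
    return bytes_per_second * latency_s / line_bytes

# Example: saturating an assumed 100 GB/s of DRAM bandwidth at an assumed
# 100 ns load-to-use latency with 64-byte lines needs ~156 misses in flight,
# far beyond the ~10 outstanding misses of a typical in-order core.
print(round(misses_needed(100, 100)))  # -> 156
```

The same arithmetic run in reverse shows the memory wall: a core limited to 10 outstanding misses under those assumptions can sustain only about 6 GB/s, regardless of how fast the DRAM itself is.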
Integrated Tensor Power
Tensor Unit integrates with the core pipeline to keep GEMMs at high utilization
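The reason GEMM is a good fit for a tensor unit is its arithmetic intensity: a blocked matrix multiply does O(n³) work on O(n²) data, so large enough tiles become compute-bound rather than bandwidth-bound. A roofline-model sketch, where the peak-compute and bandwidth numbers are assumed for illustration only:

```python
# Roofline sketch of why GEMM can reach high compute utilization.
# Peak compute and bandwidth values are illustrative assumptions.

def gemm_intensity(n: int, bytes_per_elem: int = 2) -> float:
    """FLOPs per byte for an n x n GEMM tile held in local storage:
    2*n^3 FLOPs over 3*n^2 elements moved (A, B, and C)."""
    return (2 * n**3) / (3 * n**2 * bytes_per_elem)

def attainable_gflops(peak_gflops: float, bw_gbs: float, intensity: float) -> float:
    """Roofline model: attainable rate = min(peak, bandwidth * intensity)."""
    return min(peak_gflops, bw_gbs * intensity)

# A 64x64 fp16 tile yields ~21 FLOP/byte; on an assumed machine with
# 1000 GFLOP/s peak and 100 GB/s bandwidth, that is enough to hit the
# compute roof instead of stalling on memory.
ai = gemm_intensity(64)
print(attainable_gflops(1000, 100, ai))  # -> 1000.0 (compute-bound)
```

By contrast, a memory-bound kernel like a vector dot product (intensity well below 1 FLOP/byte) would top out near the bandwidth roof, which is why the Tensor Unit's utilization argument applies specifically to blocked GEMM-style kernels.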
Multi-Fabric Coherency Design
Coherency-friendly design for CHI/NoC fabrics, or AXI for non-coherent deployments
Single software stack from CPU-only to CPU+Vector+Tensor (All-in-One IP)
Linux-ready with RVV and TU libraries
LLM and vision inference where memory stalls dominate
Recommendation & analytics pipelines needing wide OOO and vector/tensor throughput
AI kernels that benefit from coherent multi-core scale-up
Our IP is silicon-ready and proven in silicon implementations. Speak to us about reference designs