Powering Open-Model Inference at Scale with TensorWave and Featherless AI

The growth of open-weight models has created a new infrastructure challenge: serving many distinct models at scale without the constraints of closed ecosystems. This session shows how TensorWave built an AMD-first cloud on AMD Instinct and ROCm to support open-model inference at production scale, and how Featherless AI uses that infrastructure to serve hundreds of models. The speakers share practical lessons on throughput, cost, and ecosystem readiness.

July 23, 2026 1:00 PM - 1:45 PM PDT

CEO | Featherless AI

Founder & CEO | TensorWave

Director, Software Development, Silo AI | AMD

Topic

AI Training & Inference

AI Infrastructure

Session Type

Breakout Session

Related Sessions

Supercomputing for All: Bringing Exascale Innovation to Enterprise AI

HPE and AMD have built some of the world's most advanced supercomputers, including Frontier and El Capitan. Join HPE leaders who helped deploy these systems and learn how the technologies, expertise, and operational practices developed for exascale computing are enabling enterprise AI. Discover how organizations can accelerate AI adoption, scale workloads, and reduce deployment risk.

July 23, 2026
Domain-Specific AI at Scale: Open Models, Post-Training, and AI Infrastructure

Learn how domain-specific AI moves beyond generic models using post-training, domain evals, and scalable open infrastructure. Using Open Telco Models as a case study, this session covers curated data, reward loops, unified training and serving, and AMD Instinct/ROCm-based stacks for building specialized AI systems at enterprise scale.

July 23, 2026
Unlocking Secure Enterprise Intelligence at Scale with Cisco

As organizations transition from AI experimentation to production-scale infrastructure, demand for high-performance compute must be matched by security and reliability. This session explores Cisco's vision for secure, high-performance AI environments and a framework for accelerating AI deployment while mitigating risks associated with large-scale data processing. Learn how the Cisco UCS C845A M8 and AMD are enabling the next generation of enterprise AI.

July 23, 2026
From Models to Production—A Blueprint for AI at Scale

Moving AI from training to production takes more than GPUs. Hear how Microsoft and Chai AI built scalable AI infrastructure on Vultr using AMD Instinct GPUs and ROCm. Learn best practices for data locality, secure networking, Kubernetes orchestration, benchmarking, cost optimization, and scale-out operations. Leave with a practical blueprint for deploying fast, portable, production-ready AI workloads.

July 23, 2026
Zyphra: Large-Model Training Lessons on AMD

Learn what it took to train ZAYA1-74B, a 74B-parameter mixture-of-experts model, end-to-end on AMD Instinct MI300X. This session shares key engineering lessons from designing an efficient training stack, optimizing long-context performance, and building a reinforcement learning pipeline for math, code, and agentic AI workloads. Discover practical insights for training and deploying large AI models on AMD infrastructure.

July 23, 2026
Right Size Your Memory Footprint to Move IT Refresh Forward

Memory has rarely been in such short supply, and the shortage is impeding customers' data center refresh plans. In this interactive conversation, we’ll discuss tips and tools for right-sizing memory configurations to help move your data center efficiency initiatives forward and preserve ROI. Bring your questions, and our experts will provide answers!

July 23, 2026
Training at Scale with AMD Primus

Primus makes large-scale training on Instinct reliable, debuggable, and highly performant. It supports the latest OSS training frameworks and models, and is expanding support to new, cutting-edge model architectures, training techniques, and datatypes. SOTA pre- and post-training performance with Primus, proven at scales of thousands of GPUs, positions AMD Instinct GPUs as a competitive solution for model development at frontier labs, enterprises, and AI startups.

July 23, 2026
From Tokens to Outcomes: Driving AI ROI with Lenovo Hybrid Infrastructure

AI success is increasingly measured by business outcomes, not model size. As agentic AI accelerates inference demand, organizations must improve token efficiency, infrastructure utilization, and energy consumption to maximize ROI. Learn how Lenovo Hybrid AI Factories, powered by AMD, help enterprises deploy AI from personal systems to rack-scale environments while reducing token costs, increasing control and utilization, and supporting more sustainable AI growth.

July 23, 2026
