Inference Performance and Economics at Scale on DigitalOcean

Abstract

Every AI company hits an inference cost cliff. DigitalOcean collaborated with AMD to help customers such as Character.AI cut inference costs by 50% on AMD Instinct GPUs. By integrating vLLM and a prefix-aware Inference Gateway, customers can achieve up to 4× lower cost per request. This session shows how full-stack optimization on AMD turns inference economics into a competitive advantage.

July 22, 2026 4:00 PM - 4:45 PM PDT

Speakers


Presented By