Rethinking AI from Silicon to Systems: Efficiency will Define the Next Era of Intelligence

Apr 29, 2026

AMD Chief Technology Officer Mark Papermaster and Liquid AI CEO Ramin Hasani explore how efficient AI architecture can unlock scalable, enterprise-ready intelligence across PCs, devices and the edge.

As generative AI races from novelty to necessity, the industry is confronting a hard reality: building bigger models alone is not a sustainable path forward. Compute intensity, energy consumption and latency are all mounting constraints. The next phase of AI will be defined not only by scale but by efficiency – how intelligently systems are designed, from silicon through software, to deliver real-world value.

Efficiency was among the central themes of a recent episode of AMD’s Advanced Insights series with Mark Papermaster, executive vice president and chief technology officer at AMD, and Ramin Hasani, co-founder and CEO of Liquid AI. Their conversation offered a view into how first-principles thinking is reshaping AI inference and enabling intelligence to run where it matters most: on devices, close to users and data.

Moving Beyond Cloud-Only AI

For much of the past decade, AI innovation has been synonymous with hyperscale data centers and ever-larger foundation models trained on vast GPU clusters. While that approach has driven remarkable breakthroughs, both leaders agreed it represents only part of the AI opportunity.

Hasani explained that Liquid AI was founded around a different question: Where should intelligence live? “It’s not just about quality,” he said. “We think about where AI is served, the latency you get, how much battery it consumes and what that means for real hardware outside the data center.”

As AI expands into PCs, edge devices and embedded systems, the economics of inference – power efficiency, responsiveness and cost – become paramount. Papermaster said AI efficiency is already visible in the cloud, with models shrinking while delivering higher accuracy and larger context windows. Today’s challenge is to extend that efficiency beyond the data center to systems constrained by power, thermals and footprint.

Liquid AI’s approach centers on building compact, specialized foundation models designed with hardware in the loop from the start. Instead of trillion-parameter architectures, the company focuses on models orders of magnitude smaller that are optimized to run efficiently on processors such as neural processing units (NPUs).

“NPUs are where you can sustain AI workloads at very low power,” Hasani said. “That’s critical for battery health and for enabling intelligence that runs continuously in the background.”

Papermaster said this philosophy aligns with AMD’s long-standing focus on holistic design. From CPUs and GPUs to NPUs and system architectures, AMD engineers optimize performance per watt across the computing stack so AI can move out of the data center and into everyday systems.

From Reactive to Proactive AI Agents

They also discussed the shift from reactive AI assistants to proactive, always-on agents. When AI runs locally, efficiently and offline, it can observe context continuously and act without explicit prompts.

Papermaster pointed to early examples of agentic AI tools gaining rapid adoption, noting how quickly developers are embracing workflows that chain together multiple models and tasks. And Hasani described a future where multiple specialized models work together on a single system, each optimized for a specific function: “It’s not one model doing everything. It’s an orchestration of intelligent components running efficiently on the device.”

Running AI locally also strengthens security and privacy, creating a natural air gap that gives users and enterprises greater control over what data leaves the system and when cloud connectivity is required.

Efficiency as a Sustainability Imperative

Beyond performance, both leaders underscored the societal importance of efficient AI. As global demand for AI accelerates, energy consumption is becoming a constraint at both the infrastructure and policy levels.

Papermaster acknowledged that large-scale data centers will remain essential for training and complex workloads. But he argued that shifting appropriate inference tasks to efficient devices can dramatically reduce overall energy demand while expanding access to AI capabilities.

Hasani’s long-term vision is planet-scale deployment of efficient AI across billions of processors, from laptops and phones to vehicles, robotics and industrial systems. “Democratizing AI depends on making it efficient enough to run everywhere,” he said.

Papermaster returned to a theme that has long defined AMD’s technology strategy: end-to-end optimization. From silicon design to application software, efficiency is the outcome of intentional, integrated engineering.

A Blueprint for the Next Phase of AI

The discussion with AMD and Liquid AI leaders illustrated that principle in action. By rethinking AI architectures around real hardware constraints, the industry can unlock scalable, sustainable intelligence that reaches beyond the data center.

In an era where AI’s impact will be measured not just by capability but by accessibility, AMD’s leadership perspective is clear: The future of AI belongs to systems that are efficient by design, open by nature and built to run wherever computing happens.

Watch: To see all of the Advanced Insights videos, visit the AMD YouTube channel.
