AMD x IIT-Delhi AI Reinforcement Learning Hackathon Recap
Mar 02, 2026
India’s AI community gathered in Delhi for an intensive two-day, hands-on workshop and hackathon, powered by AMD Instinct™ MI300X GPUs and hosted in collaboration with IIT Delhi and Unsloth. The event brought together students, researchers, and industry practitioners with a shared goal: learning how to build real-world AI systems under real constraints.
Unlike typical hackathons that focus on prompt engineering or API-level demos, this event emphasized disciplined AI engineering, spanning reinforcement learning, fine-tuning, evaluation design, and systems thinking. The goal was clear: equip participants with the skills required to build AI systems that are robust, efficient, and production-ready.
Workshops: Hands-On Learning at Scale
The event kicked off with large-scale workshops attended by approximately 330 participants, each receiving dedicated access to the Instinct MI300X GPUs. The workshops were designed to be practical and progressive, enabling beginners to grasp core concepts while still challenging advanced developers with system-level complexity.
Fine-Tuning LLMs with PEFT and Unsloth
A major highlight was the Unsloth fine-tuning workshop led by Daniel Han, CEO of Unsloth. Participants learned how to apply parameter-efficient fine-tuning (PEFT) techniques such as LoRA to adapt LLMs for task-specific use cases while dramatically reducing GPU memory requirements. Although Daniel participated virtually, the session received strong praise for its clear explanations, hands-on notebooks, and step-by-step guidance, with resources that many participants continued building on after the event. Beginners found the material approachable, while advanced developers appreciated the focus on realistic engineering trade-offs.
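The memory savings behind LoRA come from a simple idea: freeze the pretrained weight matrix and train only a low-rank correction. A minimal NumPy sketch (dimensions and rank are illustrative, not tied to any specific model or the workshop notebooks) makes the parameter-count trade-off concrete:

```python
import numpy as np

# LoRA sketch: adapt a frozen weight matrix W with a low-rank update B @ A.
d, k, r = 512, 512, 8                   # output dim, input dim, LoRA rank

rng = np.random.default_rng(0)
W = rng.standard_normal((d, k))         # frozen pretrained weight
A = rng.standard_normal((r, k)) * 0.01  # trainable down-projection
B = np.zeros((d, r))                    # trainable up-projection (zero-init)

def lora_forward(x, scale=1.0):
    # Base path uses frozen W; the adapter adds the low-rank correction.
    return W @ x + scale * (B @ (A @ x))

x = rng.standard_normal(k)
# With B initialized to zero, the adapter starts as a no-op on the base model.
assert np.allclose(lora_forward(x), W @ x)

full_params = d * k          # 262,144 weights in a full fine-tune
lora_params = r * (d + k)    # 8,192 weights in the LoRA factors (~3%)
print(f"trainable: {lora_params} vs full fine-tune: {full_params}")
```

Frameworks like Unsloth apply this same factorization inside attention and MLP layers, which is what lets a single GPU hold adapters for models that would otherwise not fit in memory during training.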
Reinforcement Learning with GRPO
Another core workshop introduced Group Relative Policy Optimization (GRPO) through a hands-on project where participants trained LLMs to play and master the 2048 game. Participants explored reward normalization, gradient stability, sample-efficiency improvements over PPO-style approaches, evaluation pipeline design, and techniques for debugging training instability.
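The reward-normalization step that distinguishes GRPO from PPO-style methods fits in a few lines: rewards for a group of sampled completions of the same prompt are standardized against the group's own mean and standard deviation, removing the need for a learned critic. A small sketch (toy rewards, not the workshop's actual 2048 scoring):

```python
import numpy as np

def grpo_advantages(rewards, eps=1e-8):
    """Group-relative advantages as in GRPO: each sampled completion for
    the same prompt is scored against its group's mean and std, so no
    separate value (critic) network is needed."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + eps)

# Toy example: rewards for four completions of one 2048-game prompt.
adv = grpo_advantages([1.0, 0.0, 0.0, 1.0])
print(adv)                       # completions above the group mean go positive
assert abs(adv.sum()) < 1e-6     # advantages are zero-mean within the group
```

This per-group standardization is also where much of the training stability discussed in the session comes from: a degenerate group where every sample gets the same reward produces near-zero advantages rather than a large, noisy gradient.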
This session reinforced a central theme of the event: reinforcement learning as an engineering discipline, grounded in evaluation, stability, and system behavior, not just algorithms.
Hackathon Tracks: Engineering Under Constraints
The hackathon featured two tracks, with teams receiving access to the AMD Instinct MI300X GPUs.
- Track 1: AMD AI Premier League, with about 59 teams, focused on adversarial multi-agent systems, pairing a question-generating agent against an answering agent in a tournament-style setup.
- Track 2: Gaming the Models – RL for Minesweeper, with about 46 teams, challenged participants to train LLMs using GRPO-based reinforcement learning, incorporating supervised fine-tuning, synthetic data generation, logical deduction rewards, and latency-aware evaluation.
Across both tracks, teams were required to account for latency limits, token budgets, compute constraints, and structured reasoning enforcement, closely mirroring real production environments.
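Those constraints typically surface in the reward function itself. The sketch below is a hypothetical composite reward for a Minesweeper-style setup, not the official hackathon rubric: it pays for structured, parseable output, pays more for a logically safe move, and penalizes responses that blow the latency budget.

```python
import re

def score_move(response: str, is_safe_move: bool, latency_s: float,
               latency_budget_s: float = 2.0) -> float:
    """Illustrative composite reward (hypothetical shaping):
      + format reward if the model emits a parseable move,
      + correctness reward if the move is logically safe,
      - penalty proportional to latency over the budget."""
    reward = 0.0
    if re.fullmatch(r"reveal \d+,\d+", response.strip()):
        reward += 0.2                   # structured-output compliance
        if is_safe_move:
            reward += 1.0               # logical-deduction correctness
    overrun = max(0.0, latency_s - latency_budget_s)
    reward -= 0.1 * overrun             # latency-aware evaluation
    return reward

print(score_move("reveal 3,4", is_safe_move=True, latency_s=1.2))  # rewarded
print(score_move("I think maybe row 3?", False, latency_s=5.0))    # penalized
```

Separating the format, correctness, and latency terms like this also makes failure analysis easier: a falling format score and a falling correctness score point to very different training problems.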
Hackathon Winners & Standout Submissions
The top teams distinguished themselves by building robust, production-ready AI systems rather than optimizing narrow benchmarks. Standout submissions commonly featured:
- Adversarial multi-agent architectures with self-play loops that encouraged structured reasoning and iterative improvement.
- Synthetic data pipelines that generated thousands of reasoning traces, enabling scalable alignment and self-supervised refinement.
- Latency-aware evaluation metrics that optimized reasoning quality alongside computational efficiency.
- Strong emphasis on failure analysis and structured reasoning traces.
Winning teams demonstrated that engineering excellence and evaluation rigor are as important as model performance, highlighting the event’s goal of training participants to think like AI system architects.
Conclusion: Building Production-Ready AI Engineers
The impact of the event extended well beyond the final submissions. Community engagement was strong, with participants sharing their experiences and highlighting how the workshops reshaped their understanding of reinforcement learning, fine-tuning, and production-grade AI system design. The Unsloth workshop, in particular, received consistent praise for its practical notebooks and expert guidance, while many attendees noted how the structured tracks made advanced topics approachable without oversimplifying them.
More importantly, the event stood apart by emphasizing disciplined AI engineering under constraints. Participants were encouraged to reason about evaluation, infrastructure, and system behavior—thinking like system designers rather than model users. By providing access to high-end hardware and realistic workflows, the hackathon lowered barriers to experimenting with large models, strengthened the bridge between academia and industry, and fostered a systems-thinking mindset.
Participants left not just with new techniques, but with a deeper understanding of how real AI systems are built, evaluated, and deployed. In doing so, the AMD AI Reinforcement Learning Hackathon became more than a learning event—it served as a training ground for the next generation of production-ready AI engineers in India.
Ready to continue your AI journey? Enroll in the AMD AI Dev Program to get free Dev Cloud credits, access developer resources, and connect with our developer community. And don’t forget to save the date for AMD AI DevDay 2026—a must-attend event to explore the latest in AI infrastructure, hands-on workshops, and cutting-edge developer content.