Build Your Local Coding Copilot with AMD Radeon GPU Platform

Jun 11, 2024


Generative AI is changing the way software engineers work today. Did you know that you can build your own coding copilot locally, using just an AMD Radeon™ graphics card? That's right. AMD provides powerful large-model inference acceleration through the latest AMD RDNA™ architecture, which powers not only cutting-edge gaming but also high-performance AI experiences. With the help of the open software development platform AMD ROCm™, software developers can now implement GPT-like code generation on desktop machines. This blog shows how to build your personal coding copilot with a Radeon graphics card, Continue (an open-source extension for VSCode and JetBrains that enables developers to easily create their own modular AI software development systems), and LM Studio, plus the latest open-source large model, Llama3.

Here is the recipe to set up the environment:

| Item | Version | Character | URL |
| --- | --- | --- | --- |
| Windows | Windows 11 | Host | |
| VSCode | | Integrated Development Environment | |
| Continue | | Copilot Extension | https://www.continue.dev/ |
| LM Studio | v0.2.20 ROCm | LLM inference server (supports Llama3) | https://lmstudio.ai/rocm |
| AMD Radeon 7000 Series | | LLM Inference Accelerator | |

In this implementation, LM Studio is used to deploy Llama3-8B as an inference server. The Continue extension, connected to the LM Studio server, acts as the copilot client in VSCode.

[Figure] A Brief Structure of the Coding Copilot System

The latest version of LM Studio ROCm, v0.2.22, supports AMD Radeon 7000 Series graphics cards (gfx1030/gfx1100/gfx1101/gfx1102) and has added Llama3 to its support list. It also supports other state-of-the-art LLMs, such as Mistral, with excellent performance based on AMD ROCm.


Step 1: Please follow Experience Meta Llama 3 with AMD Ryzen™ AI and Radeon™ 7000 Series Graphics to set up LM Studio with Llama3.

In addition to working as a standalone chatbot, LM Studio can also act as an inference server. With an LLM such as Llama3-8B selected, one click on the Local Inference Server button on the left-hand side of the LM Studio user interface launches an OpenAI-compatible HTTP inference service. The default address is http://localhost:1234.


You may use the curl example code to verify the service with PowerShell.
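For example, here is a minimal check, assuming the default port 1234 and LM Studio's OpenAI-compatible /v1 endpoints; the prompt and sampling parameters are illustrative:

```powershell
# List the models currently loaded in LM Studio (sanity check).
# Note: use curl.exe, since plain "curl" is a PowerShell alias for Invoke-WebRequest.
curl.exe http://localhost:1234/v1/models

# Send a chat completion request to the OpenAI-compatible endpoint.
curl.exe http://localhost:1234/v1/chat/completions `
  -H "Content-Type: application/json" `
  -d '{
    "messages": [
      { "role": "system", "content": "You are a helpful coding assistant." },
      { "role": "user", "content": "Write a Python function that reverses a string." }
    ],
    "temperature": 0.7,
    "stream": false
  }'
```

If the server is running, the first call returns the identifier of the loaded model and the second returns a JSON chat completion.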


Step 2: Set up Continue in VSCode

Search for and install the Continue extension in VSCode.


You will find that Continue works with LM Studio as well as other inference frameworks.


Refer to https://continuedev.netlify.app/model-setup/configuration to modify Continue's config.json and set LM Studio as the default model provider. Open config.json and add an entry like the one below.
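A minimal sketch: Continue ships with an "lmstudio" provider, but the title and model name here are illustrative and should match the model you loaded in LM Studio, and apiBase can be omitted when using the default server address:

```json
{
  "models": [
    {
      "title": "LM Studio (Llama3-8B)",
      "provider": "lmstudio",
      "model": "llama3-8b",
      "apiBase": "http://localhost:1234/v1"
    }
  ]
}
```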


Then choose LM Studio as the copilot backend from the model selector at the lower-left corner of the Continue panel. You can now chat with Llama3 through Continue in VSCode.


Continue provides a button to copy code from the chat directly into your code file.


Right-click in the code editing window to bring up Continue's quick menu.


At this point, the automatic coding workflow, using the Llama3 model served by LM Studio through Continue, has been successfully launched. Continue enables users to select the right AI model for the job, whether open-source or commercial, running locally or remotely, and used for chat, autocomplete, or embeddings. You can find more usage information at https://continuedev.netlify.app/intro/.
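For instance, a single config.json can declare separate models for chat and tab autocomplete. The sketch below is illustrative, assuming Continue's "tabAutocompleteModel" setting; the titles and model names are placeholders that should match whatever is loaded in LM Studio:

```json
{
  "models": [
    {
      "title": "LM Studio (Llama3-8B)",
      "provider": "lmstudio",
      "model": "llama3-8b"
    }
  ],
  "tabAutocompleteModel": {
    "title": "Local autocomplete",
    "provider": "lmstudio",
    "model": "llama3-8b"
  }
}
```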

Now you have your own AI copilot running on an AMD Radeon graphics card. This is a simple, easy-to-use setup for individual developers, especially those who do not yet have access to cloud instances for large-scale AI inference.

The AMD ROCm open ecosystem is developing rapidly, with the latest LLMs supported on AMD GPUs and excellent software applications such as LM Studio. If you need more information about AMD AI acceleration solutions and developer ecosystem plans, please email amd_ai_mkt@amd.com.
