Porting x86 Linux device drivers to AMD64 Technology

The AMD64 instruction set architecture is a 64-bit extension of the x86 instruction set. AMD64 processors can run in long mode, with AMD64 instructions and 64-bit registers and addressing, or in compatibility mode, with x86 instructions and 32-bit registers and addressing. Although the processor can schedule processes to run in compatibility mode while the operating system runs in long mode, a single process cannot have both x86 and AMD64 code segments linked together, nor can a single process run in both long and compatibility modes. This means that the AMD64 Linux kernel cannot use x86 drivers unless the drivers are recompiled for the AMD64 architecture. Recompiling a well-designed driver for the AMD64 architecture should be a painless process.

Device driver developers should keep the following issues in mind when developing or porting a driver, so that it recompiles cleanly for both architectures.

Data Types

The Linux ABI for x86 specifies that long, int, and pointer data types are all 4 bytes long. The ABI for AMD64 architecture also specifies int as 4 bytes, but long and pointer data types are each 8 bytes long. The common x86 practice of using an unsigned int to represent a pointer causes the most bugs when recompiling drivers for AMD64 architecture. The correct Linux idiom is to use an unsigned long, which is defined to be the size of a general-purpose register in the current architecture.
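The difference between the two idioms can be sketched in user-space C. This is only an illustration, not driver code: the function names roundtrip_ulong and truncate_to_uint are hypothetical, and the second function takes the address as a 64-bit integer so that it behaves the same on any host, assuming a 32-bit unsigned int.

```c
#include <stdint.h>

/* Round-trip a pointer through unsigned long, the idiom recommended above.
 * Lossless under the Linux x86 and AMD64 ABIs, where long and pointers
 * have the same size. */
void *roundtrip_ulong(void *p)
{
    unsigned long cookie = (unsigned long)p;
    return (void *)cookie;
}

/* Show what survives when a 64-bit address is stored in an unsigned int:
 * only the low-order bits are kept, the rest is silently discarded. */
uint64_t truncate_to_uint(uint64_t addr)
{
    unsigned int narrow = (unsigned int)addr;
    return narrow;
}
```

An address below 4 GB passes through truncate_to_uint unchanged, which is why this class of bug often goes unnoticed until a buffer happens to be allocated in high memory.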

One important example of this rule is that the flags argument for save_flags() and spin_lock_irqsave() is unsigned long, not unsigned int. Using an unsigned int in these macros will work under x86, but will cause errors in AMD64 architecture.

The intended size of most variables should not change when moving to AMD64. The Linux kernel provides standard fixed-size definitions through include/linux/types.h:

Declaration    Meaning
u8             unsigned byte (8 bits)
u16            unsigned word (16 bits)
u32            unsigned 32-bit value
u64            unsigned 64-bit value
s8             signed byte (8 bits)
s16            signed word (16 bits)
s32            signed 32-bit value
s64            signed 64-bit value

A simple rule of thumb is to use unsigned long to define pointers or values that are the same size as a general purpose register, and the above 8 types to define values that do not change, such as control registers or network packet field sizes.
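As a rough illustration of the second half of this rule, the sketch below lays out a hardware-defined record with fixed-size types. It is user-space C, so the u8, u16, and u32 names are declared here as stand-ins built on stdint.h (kernel code gets them from the kernel headers), and struct sample_desc with all of its fields is a hypothetical layout invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* User-space stand-ins for the kernel's fixed-size types. */
typedef uint8_t  u8;
typedef uint16_t u16;
typedef uint32_t u32;

/* Hypothetical device descriptor. The layout is dictated by the hardware,
 * so every field uses a type whose size is the same on x86 and AMD64. */
struct sample_desc {
    u8  type;       /* descriptor type code */
    u8  flags;      /* control bits */
    u16 length;     /* payload length in bytes */
    u32 buffer_lo;  /* low 32 bits of a bus address */
};

/* Size of the descriptor as the compiler lays it out. */
size_t sample_desc_size(void)
{
    return sizeof(struct sample_desc);
}
```

Had buffer_lo been declared unsigned long, the structure would keep this size on x86 but grow on AMD64, and the driver would no longer match the hardware layout.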

Dynamic DMA Mapping

Direct memory access in Linux for AMD64 technology requires the use of the dynamic DMA mapping API. This API provides a way to efficiently map 64-bit physical addresses to 32-bit Single Address Cycle PCI addresses, so that hardware that only supports 32-bit accesses can be used in 64-bit systems. A full description of the dynamic DMA mapping API is beyond the scope of this document. Device driver developers who need to use DMA should read DMA-mapping.txt in the /usr/src/linux/Documentation directory.
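The underlying constraint can be sketched as a simple range check. This is not part of the dynamic DMA mapping API, which handles the problem inside the kernel; it is a user-space illustration, and the function name dma_range_fits and its parameters are hypothetical.

```c
#include <stdint.h>

/* Return 1 if every byte of the buffer [addr, addr + len) can be reached
 * by a device limited to addresses at or below mask, otherwise 0. */
int dma_range_fits(uint64_t addr, uint64_t len, uint64_t mask)
{
    uint64_t last;

    if (len == 0)
        return addr <= mask;

    last = addr + (len - 1);
    if (last < addr)        /* address arithmetic wrapped around */
        return 0;

    return last <= mask;
}
```

With a mask of 0xFFFFFFFF, a buffer that ends exactly at the 4 GB boundary passes the check, while one that starts at or crosses the boundary does not. Buffers of the second kind are the ones that 32-bit-only hardware cannot reach directly on a 64-bit system.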

Assembly

AMD64 architecture assembly statements are not always in the same format as x86 assembly statements. If a driver uses inline assembly on x86, part or all of that assembly may have to be rewritten before the driver will compile for AMD64 architecture.

Privileged Instructions

Privileged instructions are operations, such as RDMSR or WRMSR, that can only execute in privilege ring 0. Drivers sometimes issue these and other low-level instructions, such as CPUID, through inline assembly. AMD recommends that the standard Linux API for these instructions be used instead. The Linux API implementation of these instructions is portable across architectures and makes it easier to maintain the driver. API definitions of common privileged and low-level instructions are found in the following include files.

Instruction    API Location
CPUID          asm/processor.h
RDMSR          asm/msr.h
WRMSR          asm/msr.h
RDTSC          asm/msr.h
RDPMC          asm/msr.h
INB            asm/io.h
OUTB           asm/io.h
TLB Flush      asm/pgtable.h

Prefetch Instructions

The assembly instructions that force prefetching have not changed between x86 and AMD64 assembly and may be used safely. However, there is a Linux API for prefetch that provides better support across multiple architectures. The prefetch API is defined in the asm/processor.h include file.
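The general pattern of prefetching without inline assembly can be sketched in user-space C. The sketch uses the GCC and Clang builtin __builtin_prefetch as a stand-in for the kernel prefetch API, which is an assumption made so the example can be compiled outside the kernel. The function name sum_with_prefetch and the PREFETCH_AHEAD distance are hypothetical.

```c
#include <stddef.h>

/* Assumed prefetch distance, in elements. A real value would be tuned
 * for the workload and the processor. */
#define PREFETCH_AHEAD 16

/* Sum an array while hinting that data a few elements ahead will be
 * needed soon. The hint affects only performance, never the result. */
long sum_with_prefetch(const long *data, size_t n)
{
    long total = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        /* Only form addresses that lie inside the array. */
        if (i + PREFETCH_AHEAD < n)
            __builtin_prefetch(&data[i + PREFETCH_AHEAD]);
        total += data[i];
    }
    return total;
}
```

Because the hint is expressed through a compiler builtin rather than a specific instruction, the same source compiles for x86 and AMD64 without change, which is the same benefit the kernel API offers to driver code.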

Semaphores and Spinlocks

Some drivers use inline assembly to implement semaphores or spinlocks. This practice is strongly discouraged by AMD and the Linux community. Please use the standard Linux APIs provided in linux/semaphore.h and linux/spinlock.h. These APIs are portable across all architectures and carefully designed to avoid race conditions.

Hand coded assembly optimization

Some drivers may use hand coded assembly to increase performance in selected code paths. Discussing the intricacies of writing optimal AMD64 assembly is beyond the scope of this document. Developers interested in this subject should look at volumes 2 and 3 of the AMD64 Architecture Programmer's Manual, available from http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/24593.pdf (volume 2) and http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/24594.pdf (volume 3), and the Application Binary Interface (ABI) specification, available at http://old.x86-64.org/abi.pdf.

Compiling the Driver

The Linux Application Binary Interface (ABI) document for AMD64 architecture allows for a 128 byte "red zone" below the stack pointer that can be used for local variables and will not be touched by signals. The GNU C Compiler (gcc) uses the red zone by default in version 3.3, though not in version 3.2 or earlier. With either version of the compiler, the kernel cannot use the red zone because hardware interrupts can corrupt it. All kernel and driver code must be compiled with the -mno-red-zone flag.

gcc will not correctly compile dynamically loadable kernel modules without the -mcmodel=kernel flag. This is done automatically for modules compiled in the main kernel source tree. If the same Makefile is used to compile both x86 and AMD64 code outside the main kernel source tree, the following snippet should be added to the Makefile:

SUBARCH := $(shell uname -m)

ifeq ($(SUBARCH),x86_64)
CFLAGS += -mcmodel=kernel -mno-red-zone
endif

Also, AMD64 drivers must be compiled against the include files for the currently running kernel. The correct value for the include flag is:

-I/lib/modules/`uname -r`/build/include

This will automatically point the compiler to the correct include files.

Compiling the Driver in 2.6

The 2.6 release of the Linux kernel provides a way to use the main kernel build system on driver modules that are not part of the official kernel. Driver code that has a kernel-compatible Makefile can be compiled with the kernel build system with the following command:

make -C /lib/modules/`uname -r`/build SUBDIRS=`pwd` modules

Kernel-compatible Makefiles are described in /usr/src/linux/Documentation/kbuild, and examples exist throughout the Linux source code. A very simple example for 2.6 follows:

# Simple sample kernel Makefile
# compiles sample1.c and sample2.c into object files and links them into
# sampledrv.o
# optionally, setting CONFIG_SAMPLE_SPECIAL=y compiles the special.c file
# and links it into sampledrv.o

# build sampledrv as a loadable module
obj-m := sampledrv.o

sampledrv-y := sample1.o sample2.o

sampledrv-$(CONFIG_SAMPLE_SPECIAL) += special.o

General Compilation Advice

Modules should be compiled with -Wall and should compile cleanly. Warnings during compilation that are acceptable under x86 will often cause problems under AMD64 technology.
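Format strings are one common source of such warnings. The sketch below shows the warning-free form for an unsigned long, using the standard snprintf in user space as a stand-in for kernel logging; the function name format_ulong is hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Format an unsigned long in hexadecimal. The l length modifier matches
 * the argument type on both x86 and AMD64. Passing an unsigned long to
 * a plain %x conversion draws a -Wall format warning and is not valid
 * once long is wider than int. */
int format_ulong(char *buf, size_t len, unsigned long value)
{
    return snprintf(buf, len, "%lx", value);
}
```

On x86 the mismatched form happens to work because int and long are the same size, which is exactly the kind of warning that is easy to ignore until the code is rebuilt for AMD64.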

Selecting a kernel and toolchain

Not all versions of the Linux kernel and the GNU compiler toolchain support AMD64 architecture. The first mainline kernel to fully support AMD64 architecture was the 2.4.19 release. All later kernels, including all 2.6 based kernels, support AMD64 architecture. The GNU C compiler started supporting AMD64 architecture with the gcc-3.2 release, though gcc-3.3 and later releases provide better code optimization.

Major distributions that support AMD64 technology include SUSE SLES 8, SUSE SL 9.0 Professional, Red Hat RHEL3, Turbolinux 8, and Mandrakesoft 9.2.

Acknowledgements

Parts of this paper were based on Greg Kroah-Hartman's Writing Portable Device Drivers and on Andreas Jaeger's Porting to 64-bit GNU/Linux Systems. Andi Kleen and Vojtech Pavlik of SUSE contributed valuable advice and experience.

Copyright 2004 Advanced Micro Devices, Inc.
AMD, the AMD Arrow logo, and all combinations thereof are trademarks of Advanced Micro Devices, Inc.

Last updated 3/11/2004