Introduction to GCC Inline Asm
------------------------------------------------------------------------
By Robin Miyagi
http://www.geocities.com/SiliconValley/Ridge/2544/
Wed Sep 13 19:18:50 UTC
------------------------------------------------------------------------
* `as' and AT&T Syntax
------------------------------------------------------------------------
The GNU C Compiler uses the assembler `as' as a backend. This
assembler uses AT&T syntax. Here is a brief overview of the syntax.
For more information about `as', look in the system info
documentation.
- as uses the form:
mnemonic source, destination (the opposite of Intel syntax)
- as prefixes registers with `%', and prefixes numeric constants
(immediate operands) with `$'.
- Effective addresses use the following general syntax:
SECTION:DISP(BASE, INDEX, SCALE)
As in other assemblers, any one or more of these components may be
omitted, within the constraints of valid Intel instruction syntax.
DISP and SCALE are plain numbers or symbols and take no `$' prefix.
The above syntax was shamelessly copied from the info pages under
the i386 dependent features of as.
- as suffixes the assembler mnemonics with a letter indicating the
operand size (`b' for byte, `w' for word, `l' for long word).
Read the info pages for more information, such as the suffixes for
floating point instructions.
Example code (raw asm, not gcc inline)
--------------------------------------------------------------------
movl %eax, %ebx /* intel: mov ebx, eax */
movl $56, %esi /* intel: mov esi, 56 */
movl %ecx, label(%edx,%ebx,4) /* intel: mov [label+edx+ebx*4], ecx */
movb %ah, (%ebx) /* intel: mov [ebx], ah */
--------------------------------------------------------------------
Notice that as accepts C comment syntax. as also accepts `#', which
starts a comment that runs to the end of the line, the same way `;'
does in most Intel-syntax assemblers.
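The effective-address form DISP(BASE, INDEX, SCALE) is easier to read
once the arithmetic behind it is written out. The following is a
minimal sketch in plain C, not something taken from the as
documentation; the function name effective_address is made up for
this illustration, and the SECTION (segment) part is ignored.

```c
#include <stdint.h>

/* Address that DISP(BASE, INDEX, SCALE) refers to.
   On the i386, scale must be 1, 2, 4 or 8. */
uintptr_t effective_address(uintptr_t disp, uintptr_t base,
                            uintptr_t index, unsigned scale)
{
    return disp + base + index * scale;
}
```

For instance, 8(%edx,%ebx,4) with edx = 1000 and ebx = 3 refers to
address 8 + 1000 + 3*4 = 1020.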
Above code in inline asm
--------------------------------------------------------------------
__asm__ ("movl %eax, %ebx\n\t"
"movl $56, %esi\n\t"
"movl %ecx, label(%edx,%ebx,4)\n\t"
"movb %ah, (%ebx)");
--------------------------------------------------------------------
Notice that in the above example, the __ prefixing and suffixing asm
are not necessary, but may prevent name conflicts in your program.
The plain `asm' keyword is also disabled by options such as -ansi,
whereas `__asm__' is always available.
You can read more about this in [C extensions|extended asm] under
the info documentation for gcc.
Also notice the '\n\t' at the end of each line except the last, and
that each line is enclosed in quotes. gcc joins the adjacent string
literals into one string and copies it into the assembler source
that it hands to as. The newline is required so that each
instruction ends up on a line of its own; the tab only keeps the
output in the usual format (recall that each line in assembler is
indented one tab stop, generally 8 characters).
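To see what as actually receives, it can help to treat the template
as ordinary string data. The sketch below is only an illustration:
asm_template and count_instructions are hypothetical names, and
nothing here is assembled. It relies on the standard C rule that
adjacent string literals are concatenated.

```c
#include <stddef.h>

/* The same literals one would pass to __asm__, kept as data. */
static const char asm_template[] =
    "movl %eax, %ebx\n\t"
    "movl $56, %esi\n\t"
    "movb %ah, (%ebx)";

/* One instruction per newline, plus the final unterminated line. */
size_t count_instructions(const char *text)
{
    size_t n = 1;
    for (; *text != '\0'; ++text)
        if (*text == '\n')
            ++n;
    return n;
}
```

Here count_instructions(asm_template) yields 3: the three quoted
pieces form a single string containing two newlines.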
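You can also use labels from your C code (names of global variables
and such). Local variables have no symbol of their own, so they
cannot be named this way. In Linux, underscores prefixing C
variables are not necessary in your code; e.g.
    int Cvariable; /* file scope, so that as sees this symbol */
    int main (void) {
        __asm__ ("movl Cvariable, %eax"); /* Cvariable contents > eax */
        __asm__ ("movl $Cvariable, %ebx"); /* ebx ---> Cvariable */
        return 0;
    }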
Notice that the documentation for DJGPP says that the underscore is
necessary. The difference is due to the differences between DJGPP's
COFF format and Linux's ELF format. I am not certain, but I think
that the old Linux a.out object files also use underscores (please
contact me if you have comments on this).
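If you need to know which convention a given target follows, gcc
predefines the macro __USER_LABEL_PREFIX__, which expands to the
prefix applied to C symbol names. The following is a small sketch of
how one might inspect it; the helper names STRINGIFY and
user_label_prefix are invented for this example.

```c
/* Two-step stringification so that the macro is expanded first. */
#define STRINGIFY_(x) #x
#define STRINGIFY(x) STRINGIFY_(x)

/* Returns "" on ELF targets such as Linux, and "_" on targets
   that prefix C symbols with an underscore. */
const char *user_label_prefix(void)
{
    return STRINGIFY(__USER_LABEL_PREFIX__);
}
```

Printing the returned string with printf shows at a glance whether
symbol names in hand-written asm need the leading underscore.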
* Extended Asm
------------------------------------------------------------------------
The code in the above example will most probably cause conflicts
with the rest of your C code, especially with compiler optimizations
(recall that gcc is an optimizing compiler). Any registers used in
your code may be used to hold C variable data from the rest of your
program. You would not want to inadvertently modify the register
without telling gcc to take this into account when compiling. This
is where extended asm comes into play.
Extended asm allows you to specify input registers, output
registers, and clobbered registers as interface information to your
block of asm code. You can even let gcc choose the actual physical
CPU registers itself, which usually fits into gcc's optimization
scheme better than registers picked by hand. An example will
demonstrate extended asm better.
Example code
--------------------------------------------------------------------
#include <stdlib.h>
int main (void) {
    int operand1, operand2, sum, accumulator;
    operand1 = rand (); operand2 = rand ();
    /* sum = operand1 + operand2; '&' keeps %0 apart from the inputs,
       because %0 is written before %2 is read */
    __asm__ ("movl %1, %0\n\t"
             "addl %2, %0"
             : "=&r" (sum) /* output operands */
             : "r" (operand1), "r" (operand2) /* input operands */
             : "cc"); /* clobbered: condition codes */
    accumulator = sum;
    /* accumulator += operand1 + operand2; %1 is the "0" input */
    __asm__ ("addl %2, %0\n\t"
             "addl %3, %0"
             : "=&r" (accumulator)
             : "0" (accumulator), "g" (operand1), "r" (operand2)
             : "cc");
    return accumulator;
}
--------------------------------------------------------------------
The first line that begins with ':' specifies the output operands,
the second indicates the input operands, and the last indicates
what the code clobbers. The "r", "g", and "0" are examples of
constraints. Output constraints must be prefixed with an '=', as in
"=r" (= is a constraint modifier, indicating write only). Each
input and output constraint must be followed by its corresponding C
expression enclosed in parentheses (this must not be done in the
clobber list; I figured this out after an hour of frustration).
Entries in the clobber list are not constraints: they are register
names such as "%eax", or the special names "cc" (condition codes)
and "memory". "r" means assign a general register for the argument,
"g" means that any general register, memory or immediate integer
operand may be used for it.
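The clobber list matters most when the asm code names a fixed
register itself. The following is a minimal sketch, assuming an x86
target (i386 or x86-64) and gcc or a compatible compiler; the
function name triple is hypothetical. Note that once an asm has
operands, a literal register must be written with a doubled `%%',
because a single `%' introduces an operand number.

```c
/* Returns 3 * x, using %eax as a hand-picked scratch register. */
int triple(int x)
{
    int result;
    __asm__ ("movl %1, %%eax\n\t"     /* eax = x      */
             "addl %%eax, %%eax\n\t"  /* eax = 2 * x  */
             "addl %1, %%eax\n\t"     /* eax = 3 * x  */
             "movl %%eax, %0"         /* result = eax */
             : "=r" (result)          /* output operand         */
             : "r" (x)                /* input operand          */
             : "%eax", "cc");         /* scratch register, flags */
    return result;
}
```

Because "%eax" is listed as clobbered, gcc will not place x or result
in that register, and it will save any live value held there.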
Notice the use of "0", "1", "2" etc. Operands are numbered in the
order in which they are listed, outputs first and then inputs,
starting from zero. Used as a constraint, a digit ensures that when
the same variable is indicated in more than one place in the
extended asm, that variable is only `mapped' to one register: "0"
means `the same location as operand 0'. If you had merely used
another "r" for example, the compiler may or may not assign this
variable to the same register as before. When the operands are used
in the asm code, they are referred to as "%0", "%1" etc.
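A matching constraint in its smallest form might look like the sketch
below. It assumes an x86 target and gcc or a compatible compiler, and
the function name add_to is made up for the example. Operand 0 is the
output, operand 1 is the input tied to it by "0", and operand 2 is
the value to add.

```c
/* Returns acc + x; the "0" constraint forces the input copy of
   acc into the same register as the output. */
int add_to(int acc, int x)
{
    __asm__ ("addl %2, %0"
             : "=r" (acc)            /* %0: result            */
             : "0" (acc), "g" (x)    /* %1 shares %0; %2 is x */
             : "cc");
    return acc;
}
```

Without the "0", gcc would be free to put the incoming value of acc
in a different register from the result, and the addl would then add
to whatever happened to be in %0.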
Summary of constraints. (copied from the system info documentation
for gcc)
--------------------------------------------------------------------
`m'
A memory operand is allowed, with any kind of address that the
machine supports in general.
`o'
A memory operand is allowed, but only if the address is
"offsettable". This means that adding a small integer
(actually, the width in bytes of the operand, as determined by
its machine mode) may be added to the address and the result is
also a valid memory address.
For example, an address which is constant is offsettable; so is
an address that is the sum of a register and a constant (as long
as a slightly larger constant is also within the range of
address-offsets supported by the machine); but an autoincrement
or autodecrement address is not offsettable. More complicated
indirect/indexed addresses may or may not be offsettable
depending on the other addressing modes that the machine
supports.
Note that in an output operand which can be matched by another
operand, the constraint letter `o' is valid only when
accompanied by both `<' (if the target machine has predecrement
addressing) and `>' (if the target machine has preincrement
addressing).
`V'
A memory operand that is not offsettable. In other words,
anything that would fit the `m' constraint but not the `o'
constraint.
`<'
A memory operand with autodecrement addressing (either
predecrement or postdecrement) is allowed.
`>'
A memory operand with autoincrement addressing (either
preincrement or postincrement) is allowed.
`r'
A register operand is allowed provided that it is in a general
register.
`d', `a', `f', ...
Other letters can be defined in machine-dependent fashion to
stand for particular classes of registers. `d', `a' and `f' are
defined on the 68000/68020 to stand for data, address and
floating point registers.
`i'
An immediate integer operand (one with constant value) is
allowed. This includes symbolic constants whose values will be
known only at assembly time.
`n'
An immediate integer operand with a known numeric value is
allowed. Many systems cannot support assembly-time constants
for operands less than a word wide. Constraints for these
operands should use `n' rather than `i'.
`I', `J', `K', ... `P'
Other letters in the range `I' through `P' may be defined in a
machine-dependent fashion to permit immediate integer operands
with explicit integer values in specified ranges. For example,
on the 68000, `I' is defined to stand for the range of values 1
to 8. This is the range permitted as a shift count in the shift
instructions.
`E'
An immediate floating operand (expression code `const_double')
is allowed, but only if the target floating point format is the
same as that of the host machine (on which the compiler is
running).
`F'
An immediate floating operand (expression code `const_double')
is allowed.
`G', `H'
`G' and `H' may be defined in a machine-dependent fashion to
permit immediate floating operands in particular ranges of
values.
`s'
An immediate integer operand whose value is not an explicit
integer is allowed.
This might appear strange; if an insn allows a constant operand
with a value not known at compile time, it certainly must allow
any known value. So why use `s' instead of `i'? Sometimes it
allows better code to be generated.
For example, on the 68000 in a fullword instruction it is
possible to use an immediate operand; but if the immediate value
is between -128 and 127, better code results from loading the
value into a register and using the register. This is because
the load into the register can be done with a `moveq'
instruction. We arrange for this to happen by defining the
letter `K' to mean "any integer outside the range -128 to 127",
and then specifying `Ks' in the operand constraints.
`g'
Any register, memory or immediate integer operand is allowed,
except for registers that are not general registers.
`X'
Any operand whatsoever is allowed, even if it does not satisfy
`general_operand'. This is normally used in the constraint of a
`match_scratch' when certain alternatives will not actually
require a scratch register.
`0', `1', `2', ... `9'
An operand that matches the specified operand number is allowed.
If a digit is used together with letters within the same
alternative, the digit should come last.
This is called a "matching constraint" and what it really means
is that the assembler has only a single operand that fills two
roles considered separate in the RTL insn. For example, an add
insn has two input operands and one output operand in the RTL,
but on most CISC machines an add instruction really has only two
operands, one of them an input-output operand:
addl #35,r12
Matching constraints are used in these circumstances. More
precisely, the two operands that match must include one
input-only operand and one output-only operand. Moreover, the
digit must be a smaller number than the number of the operand
that uses it in the constraint.
For operands to match in a particular case usually means that
they are identical-looking RTL expressions. But in a few
special cases specific kinds of dissimilarity are allowed. For
example, `*x' as an input operand will match `*x++' as an output
operand. For proper results in such cases, the output template
should always use the output-operand's number when printing the
operand.
`p'
An operand that is a valid memory address is allowed. This is
for "load address" and "push address" instructions.
`p' in the constraint must be accompanied by `address_operand'
as the predicate in the `match_operand'. This predicate
interprets the mode specified in the `match_operand' as the mode
of the memory reference for which the address would be valid.
`Q', `R', `S', ... `U'
Letters in the range `Q' through `U' may be defined in a
machine-dependent fashion to stand for arbitrary operand types.
The machine description macro `EXTRA_CONSTRAINT' is passed the
operand as its first argument and the constraint letter as its
second operand.
A typical use for this would be to distinguish certain types of
memory references that affect other insn operands.
Do not define these constraint letters to accept register
references (`reg'); the reload pass does not expect this and
would not handle it properly.
In order to have valid assembler code, each operand must satisfy
its constraint. But a failure to do so does not prevent the
pattern from applying to an insn. Instead, it directs the
compiler to modify the code so that the constraint will be
satisfied. Usually this is done by copying an operand into a
register.
Contrast, therefore, the two instruction patterns that follow:
(define_insn ""
[(set (match_operand:SI 0 "general_operand" "=r")
(plus:SI (match_dup 0)
(match_operand:SI 1 "general_operand" "r")))]
""
"...")
which has two operands, one of which must appear in two places,
and
(define_insn ""
[(set (match_operand:SI 0 "general_operand" "=r")
(plus:SI (match_operand:SI 1 "general_operand" "0")
(match_operand:SI 2 "general_operand" "r")))]
""
"...")
which has three operands, two of which are required by a
constraint to be identical. If we are considering an insn of
the form
(insn N PREV NEXT
(set (reg:SI 3)
(plus:SI (reg:SI 6) (reg:SI 109)))
...)
the first pattern would not apply at all, because this insn does
not contain two identical subexpressions in the right place.
The pattern would say, "That does not look like an add
instruction; try other patterns." The second pattern would say,
"Yes, that's an add instruction, but there is something wrong
with it." It would direct the reload pass of the compiler to
generate additional insns to make the constraint true. The
results might look like this:
(insn N2 PREV N
(set (reg:SI 3) (reg:SI 6))
...)
(insn N N2 NEXT
(set (reg:SI 3)
(plus:SI (reg:SI 3) (reg:SI 109)))
...)
It is up to you to make sure that each operand, in each pattern,
has constraints that can handle any RTL expression that could be
present for that operand. (When multiple alternatives are in
use, each pattern must, for each possible combination of operand
expressions, have at least one alternative which can handle that
combination of operands.) The constraints don't need to *allow*
any possible operand--when this is the case, they do not
constrain--but they must at least point the way to reloading any
possible operand so that it will fit.
* If the constraint accepts whatever operands the predicate
permits, there is no problem: reloading is never necessary for
this operand.
For example, an operand whose constraints permit everything
except registers is safe provided its predicate rejects
registers.
An operand whose predicate accepts only constant values is
safe provided its constraints include the letter `i'. If any
possible constant value is accepted, then nothing less than
`i' will do; if the predicate is more selective, then the
constraints may also be more selective.
* Any operand expression can be reloaded by copying it into a
register. So if an operand's constraints allow some kind of
register, it is certain to be safe. It need not permit all
classes of registers; the compiler knows how to copy a
register into another register of the proper class in order to
make an instruction valid.
* A nonoffsettable memory reference can be reloaded by copying
the address into a register. So if the constraint uses the
letter `o', all memory references are taken care of.
* A constant operand can be reloaded by allocating space in
memory to hold it as preinitialized data. Then the memory
reference can be used in place of the constant. So if the
constraint uses the letters `o' or `m', constant operands are
not a problem.
* If the constraint permits a constant and a pseudo register
used in an insn was not allocated to a hard register and is
equivalent to a constant, the register will be replaced with
the constant. If the predicate does not permit a constant and
the insn is re-recognized for some reason, the compiler will
crash. Thus the predicate must always recognize any objects
allowed by the constraint.
If the operand's predicate can recognize registers, but the
constraint does not permit them, it can make the compiler crash.
When this operand happens to be a register, the reload pass will
be stymied, because it does not know how to copy a register
temporarily into memory.
If the predicate accepts a unary operator, the constraint
applies to the operand. For example, the MIPS processor at ISA
level 3 supports an instruction which adds two registers in
`SImode' to produce a `DImode' result, but only if the registers
are correctly sign extended. This predicate for the input
operands accepts a `sign_extend' of an `SImode' register. Write
the constraint to indicate the type of register that is required
for the operand of the `sign_extend'.
------------------------------------------------------------------------
The '=' in "=r" is a constraint modifier. You can find more
information about constraint modifiers in the gcc info under
Machine Descriptions : Constraints : Modifiers.
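Besides '=', the modifiers section of the gcc documentation describes
'+', which marks an operand as both read and written. Combined with
the `m' constraint it lets an instruction update a variable in
memory directly. This is a sketch under the assumption of an x86
target; increment is a hypothetical name.

```c
/* Adds one to *counter in place, without going through a register. */
void increment(int *counter)
{
    __asm__ ("incl %0"
             : "+m" (*counter)   /* read-write memory operand */
             :                   /* no separate inputs        */
             : "cc");
}
```

Since the operand is declared read-write, gcc knows that the old
value is needed and that the memory location changes.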
I strongly recommend reading more in the system info documentation.
If you haven't had much experience with the info reader (also
accessible through emacs), learn it; it is an excellent source of
information.
The gcc info documentation also explains how to use a specific CPU
register for a constraint for various hardware including the i386.
You can find this information under [gcc : Machine Desc :
Constraints : Machine Constraints] in the info documentation.
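On the i386 you request a specific register with a constraint
letter, e.g. "a" for %eax, "b" for %ebx, "c" for %ecx, "d" for %edx,
"S" for %esi and "D" for %edi. Register names such as "%eax" belong
in the clobber list, not in a constraint.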
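Register-specific constraints are needed for instructions that use
fixed registers. The i386 divl instruction, for example, divides
%edx:%eax by its operand and leaves the quotient in %eax and the
remainder in %edx. The sketch below assumes an x86 target and a
nonzero divisor; div_rem is a name invented for this illustration.

```c
/* Unsigned division; the caller must pass a nonzero divisor. */
unsigned div_rem(unsigned dividend, unsigned divisor,
                 unsigned *remainder)
{
    unsigned quotient, rem;
    __asm__ ("divl %4"
             : "=a" (quotient), "=d" (rem)  /* results in eax, edx */
             : "0" (dividend),              /* low half in eax     */
               "1" (0u),                    /* high half, edx = 0  */
               "rm" (divisor)               /* register or memory  */
             : "cc");
    *remainder = rem;
    return quotient;
}
```

The "a" and "d" letters pin the outputs to %eax and %edx, and the
matching constraints "0" and "1" load the inputs into those same
registers before the instruction runs.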
* __asm__ __volatile__
------------------------------------------------------------------------
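Because of the compiler's optimization mechanism, your code may not
appear at exactly the location specified by the programmer. It may
even be interspersed with the rest of the code, or dropped entirely
if gcc decides that its outputs are never used. To keep gcc from
deleting the asm or moving it significantly, you can use
__asm__ __volatile__ instead (gcc may still reorder it relative to
code that it does not depend on). Like the '__' for asm, these are
also not needed for volatile, but can prevent name conflicts.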
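A typical case for __volatile__ is an instruction whose result
changes even though its inputs do not, such as reading the CPU
time-stamp counter. The following is a sketch that assumes an x86
processor on which rdtsc is available to user code; read_tsc is a
hypothetical name.

```c
/* Reads the time-stamp counter: low half in eax, high half in edx. */
unsigned long long read_tsc(void)
{
    unsigned int lo, hi;
    __asm__ __volatile__ ("rdtsc"
                          : "=a" (lo), "=d" (hi));
    return ((unsigned long long) hi << 32) | lo;
}
```

Without __volatile__, gcc could treat two calls as producing the same
value and keep only one of them, since the asm has no inputs that
change between the calls.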
========================================================================
comments and suggestions <deltak@telus.net>