------------------------------------------------------------------------
		     Introduction to GCC Inline Asm

			    By Robin Miyagi
			    
	   http://www.geocities.com/SiliconValley/Ridge/2544/
	   
			Wed Sep 13 19:18:50 UTC
------------------------------------------------------------------------

* `as' and AT&T Syntax
------------------------------------------------------------------------

    The  GNU C  Compiler uses  the assembler  `as' as  a  backend.  This
    assembler uses AT&T syntax.  Here is a brief overview of the syntax.
    For  more   information  about  `as',   look  in  the   system  info
    documentation.

    - as uses the form:

        mnemonic source, destination (the opposite of Intel syntax)

    - as  prefixes registers  with `%',  and prefixes  numeric constants
      with `$'.

    - Effective addresses use the following general syntax:

        SECTION:DISP(BASE, INDEX, SCALE)

      As in other assemblers, any one or more of these components may be
      omitted, within the constraints of valid Intel instruction syntax.
      DISP and SCALE are written as plain numbers or symbols, without a
      `$' prefix.  The above syntax was shamelessly copied from the info
      pages under the i386 dependent features of as.

    - as suffixes the instruction mnemonics with a letter indicating the
      operand size (`b' for byte, `w' for word, `l' for long word).
      Read the info pages for more information, such as the suffixes
      used for floating point operations.

    Example code (raw asm, not gcc inline)
    --------------------------------------------------------------------
    movl %eax, %ebx     /* intel: mov ebx, eax */
    movl $56, %esi      /* intel: mov esi, 56 */
    movl %ecx, label(%edx,%ebx,4) /* intel: mov [label+edx+ebx*4], ecx */
    movb %ah, (%ebx)    /* intel: mov [ebx], ah */
    --------------------------------------------------------------------

    Notice that `as' accepts C comment syntax.  It also accepts `#',
    which works the same way as `;' in most Intel-syntax assemblers.

    Above code in inline asm
    --------------------------------------------------------------------
      __asm__ ("movl %eax, %ebx\n\t"
               "movl $56, %esi\n\t"
               "movl %ecx, label(%edx,%ebx,4)\n\t"
               "movb %ah, (%ebx)");
    --------------------------------------------------------------------
   
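    Notice that in the above example, the `__' prefix and suffix on asm
    are not necessary, but they may prevent name conflicts in your
    program, and they keep working when gcc is run with -ansi or a
    strict -std= option, where plain asm is not recognized as a keyword.
    You can read more about this in [C extensions | extended asm] under
    the info documentation for gcc.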

    Also notice the `\n\t' at the end of each line except the last, and
    that each line is enclosed in quotes.  The compiler joins the
    adjacent string literals into one string and copies it into the
    assembly that it hands to as.  The newline is required so that each
    instruction lands on its own line; the tab merely indents the next
    instruction in the usual assembler layout (recall that each line in
    assembler is indented one tab stop, generally 8 characters).
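
    The joining of the quoted lines can be checked in plain C, with no
    assembler involved.  This is only a small sketch; the names
    split_template, joined_template and templates_match are made up for
    illustration and do not come from gcc.

```c
#include <string.h>

/* Two instructions written the way they appear inside __asm__ (...):
   one quoted line per instruction, with "\n\t" between them. */
static const char split_template[] = "movl %eax, %ebx\n\t"
                                     "movl $56, %esi";

/* The same text written as a single string literal. */
static const char joined_template[] = "movl %eax, %ebx\n\tmovl $56, %esi";

/* Returns 1 when the compiler has joined the adjacent literals into
   exactly the single-literal form, 0 otherwise. */
static int templates_match(void)
{
    return strcmp(split_template, joined_template) == 0;
}
```

    Calling templates_match () should return 1, since adjacent string
    literals are concatenated before the asm statement is processed.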

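    You can also use labels from your C code (the names of global
    variables and such).  Local variables live on the stack or in
    registers, so they have no symbol that as could refer to.  In Linux,
    underscores prefixing C variables are not necessary in your code;
    e.g.

       int Cvariable;          /* file scope, so as can see the symbol */

       int main (void) {
           __asm__ ("movl Cvariable, %eax");  /* Cvariable contents > eax */
           __asm__ ("movl $Cvariable, %ebx"); /* ebx ---> Cvariable */
           return 0;
       }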

    Notice that the documentation for DJGPP says that the underscore is
    necessary.  The difference is due to the differences between
    DJGPP's COFF object format and Linux's ELF format.  I am not
    certain, but I think that the old Linux a.out object files also use
    underscores (please contact me if you have comments on this).

* Extended Asm
------------------------------------------------------------------------

    The code in the above examples will most probably cause conflicts
    with the rest of your C code, especially with compiler optimizations
    (recall that gcc is an optimizing compiler).  Any register used in
    your asm may already hold C variable data from the rest of your
    program.  You would not want to modify such a register without
    telling gcc to take this into account when compiling.  This is
    where extended asm comes into play.

    Extended asm allows you to specify input registers, output
    registers, and clobbered registers as interface information to your
    block of asm code.  You can even let gcc choose the actual physical
    CPU registers automatically, which will probably fit into gcc's
    optimization scheme better.  An example will demonstrate extended
    asm better.

    Example code
    --------------------------------------------------------------------
    #include <stdlib.h>

    int main (void) {
      int operand1, operand2, sum, accumulator;

      operand1 = rand (); operand2 = rand ();

      /* "=&r": sum is written before %2 is read, so it must not share
         a register with an input (early clobber).  */
      __asm__ ("movl %1, %0\n\t"
               "addl %2, %0"
               : "=&r" (sum)                    /* output operands */
               : "r" (operand1), "r" (operand2) /* input operands */
               : "cc");                         /* clobbers: flags */

      accumulator = sum;

      /* %0 = accumulator (output), %1 = accumulator (input, same
         register as %0), %2 = operand1, %3 = operand2.  */
      __asm__ ("addl %2, %0\n\t"
               "addl %3, %0"
               : "=&r" (accumulator)
               : "0" (accumulator), "g" (operand1), "r" (operand2)
               : "cc");
      return accumulator;
    }
    --------------------------------------------------------------------

    The first line that begins with `:' specifies the output operands,
    the second specifies the input operands, and the last lists the
    clobbers.  The "r", "g", and "0" are examples of constraints.
    Output constraints must be prefixed with an `=', as in "=r" (= is a
    constraint modifier, indicating write only).  Each input and output
    constraint must be followed by its corresponding C expression
    enclosed in parentheses.  The clobber list takes no C expressions:
    it is just a list of quoted names, such as a register that the code
    overwrites ("%eax"), "cc" for the condition flags, or "memory" (I
    figured this out after an hour of frustration).  "r" means assign a
    general register to the argument; "g" means that any general
    register, memory or immediate integer operand may be used.
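
    The three lists can be seen together in a small function.  The
    following is a sketch under the assumption of an x86 target; the
    name add_ints is made up for illustration.  It deliberately uses
    %eax as a scratch register so that the clobber list has something
    to declare.  Note that inside extended asm a literal register name
    needs two percent signs, because a single `%' introduces an operand.

```c
/* Adds two ints through %eax, which gcc is told about in the clobber
   list so that it keeps its own values out of that register. */
static int add_ints(int a, int b)
{
    int sum;

    __asm__ ("movl %1, %%eax\n\t"   /* eax = a */
             "addl %2, %%eax\n\t"   /* eax += b */
             "movl %%eax, %0"       /* sum = eax */
             : "=r" (sum)           /* output: any general register */
             : "r" (a), "g" (b)     /* inputs: register; reg/mem/imm */
             : "%eax", "cc");       /* clobbers: scratch register, flags */
    return sum;
}
```

    A call such as add_ints (2, 3) should return 5.  Letting gcc pick
    the scratch register through an extra output operand is usually the
    better design; the fixed register is used here only to show a
    clobber.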

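    Notice the use of "0" as a constraint.  A digit is a matching
    constraint: it says that this input must be placed in the same
    location as the operand with that number.  It is used when the same
    variable appears as both an output and an input of the extended
    asm, so that the variable is only `mapped' to one register.  If you
    had merely used another "r", the compiler may or may not assign the
    variable to the same register as before.  Operands are numbered in
    the order they are listed, outputs first and then inputs, starting
    from zero: "0" refers to the first operand, "1" to the second, etc.
    When the operands are used in the asm code, they are referred to as
    "%0", "%1" etc.  Because `%' now introduces an operand, a literal
    register name in extended asm is written with two percent signs,
    e.g. `%%eax'.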
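
    A matching constraint on its own looks like the following sketch
    (x86 assumed; add_in_place is a hypothetical name).  The variable
    total is operand 0 as an output and operand 1 as an input, and the
    "0" ties the two to one register, which suits the two-operand addl.

```c
/* Adds amount to total in place: %0 and %1 are the same register. */
static int add_in_place(int total, int amount)
{
    __asm__ ("addl %2, %0"
             : "=r" (total)                 /* %0: result */
             : "0" (total), "r" (amount)    /* %1 shares %0; %2: amount */
             : "cc");                       /* addl changes the flags */
    return total;
}
```

    For example, add_in_place (40, 2) should return 42.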

    Summary of constraints (copied from the system info documentation
    for gcc)
    --------------------------------------------------------------------
    `m'

        A memory operand  is allowed, with any kind  of address that the
    	machine supports in general.
    
    `o'

        A  memory  operand  is  allowed,  but only  if  the  address  is
	"offsettable".    This  means  that   adding  a   small  integer
	(actually, the width  in bytes of the operand,  as determined by
	its machine mode) may be added  to the address and the result is
	also a valid memory address.
    
	For example, an address which  is constant is offsettable; so is
	an address that is the sum of a register and a constant (as long
	as  a slightly  larger  constant  is also  within  the range  of
	address-offsets supported by  the machine); but an autoincrement
	or autodecrement  address is not  offsettable.  More complicated
	indirect/indexed  addresses  may   or  may  not  be  offsettable
	depending  on  the  other  addressing  modes  that  the  machine
	supports.
    
	Note that in  an output operand which can  be matched by another
	operand,  the   constraint  letter   `o'  is  valid   only  when
	accompanied by both `<'  (if the target machine has predecrement
	addressing)  and `>'  (if  the target  machine has  preincrement
	addressing).
    
    `V'

        A  memory operand  that  is not  offsettable.   In other  words,
	anything  that would  fit the  `m'  constraint but  not the  `o'
	constraint.
    
    `<'

        A   memory  operand   with   autodecrement  addressing   (either
	predecrement or postdecrement) is allowed.
    
    `>'

        A   memory  operand   with   autoincrement  addressing   (either
	preincrement or postincrement) is allowed.
    
    `r'

        A register operand  is allowed provided that it  is in a general
	register.
    
    `d', `a', `f', ...
    
	Other  letters can  be defined  in machine-dependent  fashion to
	stand for particular classes of registers.  `d', `a' and `f' are
	defined  on  the 68000/68020  to  stand  for  data, address  and
	floating point registers.
    
    `i'
    
	An  immediate  integer  operand  (one with  constant  value)  is
	allowed.  This includes symbolic  constants whose values will be
	known only at assembly time.
    
    `n'
    
	An  immediate integer  operand  with a  known  numeric value  is
	allowed.   Many systems  cannot support  assembly-time constants
	for  operands less  than  a word  wide.   Constraints for  these
	operands should use `n' rather than `i'.
    
    `I', `J', `K', ... `P'
    
	Other letters in  the range `I' through `P' may  be defined in a
	machine-dependent fashion  to permit immediate  integer operands
	with explicit integer values  in specified ranges.  For example,
	on the 68000, `I' is defined  to stand for the range of values 1
	to 8.  This is the range permitted as a shift count in the shift
	instructions.
    
    `E'

        An immediate  floating operand (expression  code `const_double')
	is allowed, but only if  the target floating point format is the
	same  as that  of the  host machine  (on which  the  compiler is
	running).
    
    `F'
    
	An immediate  floating operand (expression  code `const_double')
	is allowed.
    
    `G', `H'
    
	`G' and  `H' may  be defined in  a machine-dependent  fashion to
	permit  immediate  floating  operands  in particular  ranges  of
	values.
    
    `s'
    
	An  immediate integer  operand whose  value is  not  an explicit
	integer is allowed.
    	
	This might appear strange; if  an insn allows a constant operand
	with a value not known  at compile time, it certainly must allow
	any known value.   So why use `s' instead  of `i'?  Sometimes it
	allows better code to be generated.
    	
	For  example, on  the  68000  in a  fullword  instruction it  is
	possible to use an immediate operand; but if the immediate value
	is between  -128 and 127,  better code results from  loading the
	value into a  register and using the register.   This is because
	the  load  into  the  register   can  be  done  with  a  `moveq'
	instruction.   We arrange  for this  to happen  by  defining the
	letter `K' to mean "any  integer outside the range -128 to 127",
	and then specifying `Ks' in the operand constraints.
    
    `g'
    
	Any register,  memory or  immediate integer operand  is allowed,
	except for registers that are not general registers.
    	
    `X'
    
	Any operand whatsoever  is allowed, even if it  does not satisfy
	`general_operand'.  This is normally used in the constraint of a
	`match_scratch'  when  certain  alternatives will  not  actually
	require a scratch register.
    
    `0', `1', `2', ... `9'
    
	An operand that matches the specified operand number is allowed.
	If  a  digit is  used  together  with  letters within  the  same
	alternative, the digit should come last.
    	
	This is called a "matching  constraint" and what it really means
	is that the  assembler has only a single  operand that fills two
	roles considered separate in the  RTL insn.  For example, an add
	insn has two  input operands and one output  operand in the RTL,
	but on most CISC machines an add instruction really has only two
	operands, one of them an input-output operand:
    	
	     addl #35,r12
    	
	Matching  constraints  are used  in  these circumstances.   More
	precisely,  the  two  operands   that  match  must  include  one
	input-only operand  and one output-only  operand.  Moreover, the
	digit must  be a smaller number  than the number  of the operand
	that uses it in the constraint.
    	
	For operands  to match in  a particular case usually  means that
	they  are  identical-looking  RTL  expressions.  But  in  a  few
	special cases specific kinds  of dissimilarity are allowed.  For
	example, `*x' as an input operand will match `*x++' as an output
	operand.  For proper results  in such cases, the output template
	should always use the  output-operand's number when printing the
	operand.
    
    `p'
    
	An operand that  is a valid memory address  is allowed.  This is
	for "load address" and "push address" instructions.
    	
	`p' in  the constraint must be  accompanied by `address_operand'
	as  the  predicate   in  the  `match_operand'.   This  predicate
	interprets the mode specified in the `match_operand' as the mode
	of the memory reference for which the address would be valid.
    
    `Q', `R', `S', ... `U'
    
	Letters  in  the range  `Q'  through `U'  may  be  defined in  a
	machine-dependent fashion to  stand for arbitrary operand types.
	The machine  description macro `EXTRA_CONSTRAINT'  is passed the
	operand as its  first argument and the constraint  letter as its
	second operand.
    	
	A typical use for this  would be to distinguish certain types of
	memory references that affect other insn operands.
    	
	Do  not  define  these  constraint letters  to  accept  register
	references  (`reg'); the reload  pass does  not expect  this and
	would not handle it properly.
    
        In order to have valid assembler code, each operand must satisfy
	its constraint.   But a  failure to do  so does not  prevent the
	pattern  from applying  to  an insn.   Instead,  it directs  the
	compiler  to modify  the code  so  that the  constraint will  be
	satisfied.  Usually  this is done  by copying an operand  into a
	register.
    
        Contrast, therefore, the two instruction patterns that follow:
    
	 (define_insn ""
	   [(set (match_operand:SI 0 "general_operand" "=r")
		 (plus:SI (match_dup 0)
			  (match_operand:SI 1 "general_operand" "r")))]
	   ""
	   "...")
    
	which has two operands, one  of which must appear in two places,
	and
    
	 (define_insn ""
	   [(set (match_operand:SI 0 "general_operand" "=r")
		 (plus:SI (match_operand:SI 1 "general_operand" "0")
			  (match_operand:SI 2 "general_operand" "r")))]
	   ""
	   "...")
    
	which  has  three operands,  two  of  which  are required  by  a
	constraint to  be identical.  If  we are considering an  insn of
	the form
    
	 (insn N PREV NEXT
	   (set (reg:SI 3)
		(plus:SI (reg:SI 6) (reg:SI 109)))
	   ...)
    
	the first pattern would not apply at all, because this insn does
	not  contain two  identical subexpressions  in the  right place.
	The  pattern  would  say,  "That  does  not  look  like  an  add
	instruction; try other patterns."  The second pattern would say,
	"Yes, that's  an add instruction,  but there is  something wrong
	with it."   It would direct the  reload pass of  the compiler to
	generate  additional insns  to  make the  constraint true.   The
	results might look like this:
    
	 (insn N2 PREV N
	   (set (reg:SI 3) (reg:SI 6))
	   ...)
	 
	 (insn N N2 NEXT
	   (set (reg:SI 3)
		(plus:SI (reg:SI 3) (reg:SI 109)))
	   ...)
    
	It is up to you to make sure that each operand, in each pattern,
	has constraints that can handle any RTL expression that could be
	present for  that operand.   (When multiple alternatives  are in
	use, each pattern must, for each possible combination of operand
	expressions, have at least one alternative which can handle that
	combination of operands.)  The constraints don't need to *allow*
	any  possible  operand--when  this  is  the case,  they  do  not
	constrain--but they must at least point the way to reloading any
	possible operand so that it will fit.
    
        * If  the  constraint accepts  whatever  operands the  predicate
	  permits, there is no problem: reloading is never necessary for
	  this operand.
    
	  For  example, an operand  whose constraints  permit everything
	  except  registers  is  safe  provided  its  predicate  rejects
	  registers.
    
	  An  operand whose  predicate accepts  only constant  values is
	  safe provided its constraints  include the letter `i'.  If any
	  possible constant  value is  accepted, then nothing  less than
	  `i'  will do;  if the  predicate is  more selective,  then the
	  constraints may also be more selective.
    
        * Any operand  expression can be  reloaded by copying it  into a
	  register.  So  if an operand's constraints allow  some kind of
	  register, it  is certain to be  safe.  It need  not permit all
	  classes  of  registers;  the  compiler  knows how  to  copy  a
	  register into another register of the proper class in order to
	  make an instruction valid.
    
       	* A nonoffsettable  memory reference can be  reloaded by copying
 	  the address  into a register.   So if the constraint  uses the
 	  letter `o', all memory references are taken care of.
     
       	* A  constant operand  can be  reloaded by  allocating  space in
 	  memory  to hold it  as preinitialized  data.  Then  the memory
 	  reference can  be used  in place of  the constant.  So  if the
 	  constraint uses the letters  `o' or `m', constant operands are
 	  not a problem.
     
       	* If  the constraint permits  a constant  and a  pseudo register
 	  used in  an insn was not  allocated to a hard  register and is
 	  equivalent to  a constant, the register will  be replaced with
 	  the constant.  If the predicate does not permit a constant and
 	  the insn  is re-recognized for some reason,  the compiler will
 	  crash.  Thus  the predicate must always  recognize any objects
 	  allowed by the constraint.
    
	If  the operand's  predicate  can recognize  registers, but  the
	constraint does not permit them, it can make the compiler crash.
	When this operand happens to be a register, the reload pass will
	be  stymied, because it  does not  know how  to copy  a register
	temporarily into memory.
    
	If  the  predicate  accepts  a unary  operator,  the  constraint
	applies to the operand.  For  example, the MIPS processor at ISA
	level  3 supports  an instruction  which adds  two  registers in
	`SImode' to produce a `DImode' result, but only if the registers
	are  correctly  sign extended.   This  predicate  for the  input
	operands accepts a `sign_extend' of an `SImode' register.  Write
	the constraint to indicate the type of register that is required
	for the operand of the `sign_extend'.
    --------------------------------------------------------------------

    The `=' in "=r" is a constraint modifier.  You can find more
    information about constraint modifiers in the gcc info
    documentation under Machine Descriptions : Constraints : Modifiers.
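
    Two other modifiers from that info node come up often: `+' marks an
    operand that is both read and written, and `&' marks an output that
    is written before all inputs have been read (an early clobber).
    The sketch below assumes an x86 target; negate and triple are
    made-up names.

```c
/* "+r": x is read and then overwritten by the same instruction. */
static int negate(int x)
{
    __asm__ ("negl %0" : "+r" (x) : : "cc");
    return x;
}

/* "=&r": result is written by the first instruction while x (%1) is
   still needed afterwards, so the two must not share a register. */
static int triple(int x)
{
    int result;

    __asm__ ("movl %1, %0\n\t"   /* result = x */
             "addl %1, %0\n\t"   /* result += x */
             "addl %1, %0"       /* result += x */
             : "=&r" (result)
             : "r" (x)
             : "cc");
    return result;
}
```

    With these definitions negate (5) should give -5 and triple (7)
    should give 21.  Without the `&', gcc would be free to place
    result and x in one register, and the later additions would read
    the wrong value.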

    I strongly recommend reading more in the system info documentation.
    If you haven't had much experience with the info reader (also
    accessible through emacs), learn it; it is an excellent source of
    information.

    The gcc info  documentation also explains how to  use a specific CPU
    register for  a constraint for various hardware  including the i386.
    You  can  find  this  information   under  [gcc  :  Machine  Desc  :
    Constraints : Machine Constraints] in the info documentation.

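    To force an operand into a specific register, use that register's
    constraint letter rather than its name: on the i386, "a" selects
    %eax, "b" %ebx, "c" %ecx, "d" %edx, "S" %esi and "D" %edi.
    Register names such as "%eax" belong in the clobber list.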
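
    Fixed-register constraints matter for instructions that use
    particular registers implicitly.  As a sketch (x86 assumed, and
    mul_u32 is a hypothetical name): mull multiplies %eax by its
    operand and leaves the 64-bit product in %edx:%eax, so the i386
    constraint letters "a" and "d" are used to pin the operands there.

```c
#include <stdint.h>

/* 32 x 32 -> 64 bit unsigned multiply using mull. */
static uint64_t mul_u32(uint32_t x, uint32_t y)
{
    uint32_t lo, hi;

    __asm__ ("mull %3"
             : "=a" (lo), "=d" (hi)   /* low half in eax, high in edx */
             : "0" (x), "rm" (y)      /* x starts in eax; y in reg/mem */
             : "cc");
    return ((uint64_t) hi << 32) | lo;
}
```

    For instance, mul_u32 (0xFFFFFFFF, 2) should return 0x1FFFFFFFE,
    a value that does not fit in 32 bits.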

* __asm__ __volatile__
------------------------------------------------------------------------

    Because of the compiler's optimization mechanism, your code may not
    appear at exactly the location specified by the programmer.  It may
    even be interspersed with the rest of the code, or dropped
    altogether if gcc decides that its outputs are unused.  To prevent
    the code from being deleted or hoisted out of loops, you can use
    __asm__ __volatile__ instead (gcc may still reorder it relative to
    unrelated code).  Like the `__' for asm, these are also not needed
    for volatile, but can prevent name conflicts.
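
    A common case is an instruction whose result changes on every
    execution, so that gcc must not merge or drop repeated uses.  The
    following sketch assumes an x86 CPU on which user code may execute
    rdtsc; read_tsc is a made-up name.

```c
#include <stdint.h>

/* Reads the CPU time stamp counter.  rdtsc returns it in %edx:%eax.
   __volatile__ keeps gcc from treating two calls as one computation
   and from deleting a call whose result looks unused. */
static uint64_t read_tsc(void)
{
    uint32_t lo, hi;

    __asm__ __volatile__ ("rdtsc" : "=a" (lo), "=d" (hi));
    return ((uint64_t) hi << 32) | lo;
}
```

    Two successive calls to read_tsc () will normally return different
    values; without __volatile__ gcc would be allowed to reuse the
    first result for the second call.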

========================================================================
comments and suggestions <deltak@telus.net>