Thursday, March 13, 2014

6.1 Commonly used constraints.

There are a number of constraints of which only a few are used frequently. We’ll have a look at those constraints.

  1. Register operand constraint (r)

    When operands are specified using this constraint, they are stored in General Purpose Registers (GPRs). Take the following example:

    asm ("movl %%eax, %0\n" :"=r"(myval));

    Here the variable myval is kept in a register, the value in register eax is copied into that register, and the value of myval is then written back to memory from this register. When the "r" constraint is specified, gcc may keep the variable in any of the available GPRs. To pin the operand to a particular register, use one of the specific register constraints instead. They are:


    +------------+--------------------+
    | Constraint |    Register(s)     |
    +------------+--------------------+
    | a          |   %eax, %ax, %al   |
    | b          |   %ebx, %bx, %bl   |
    | c          |   %ecx, %cx, %cl   |
    | d          |   %edx, %dx, %dl   |
    | S          |   %esi, %si        |
    | D          |   %edi, %di        |
    +------------+--------------------+
    
  2. Memory operand constraint (m)

    When the operands are in memory, any operations performed on them occur directly in the memory location, as opposed to register constraints, which first store the value in a register to be modified and then write it back to the memory location. But register constraints are usually used only when they are absolutely necessary for an instruction or they significantly speed up the process. Memory constraints can be used most efficiently in cases where a C variable needs to be updated inside "asm" and you really don’t want to use a register to hold its value. For example, the value of idtr is stored in the memory location loc. Because sidt writes to loc, it is listed as an output operand:

    asm("sidt %0\n" : "=m"(loc));

  3. Matching (digit) constraints

    In some cases, a single variable may serve as both the input and the output operand. Such cases may be specified in "asm" by using matching constraints.

    asm ("incl %0" :"=a"(var):"0"(var));

    We saw similar examples in the operands subsection as well. In this example of matching constraints, the register %eax is used for both the input and the output operand. The input var is read into %eax, and the updated %eax is stored in var again after the increment. "0" here specifies the same constraint as the 0th output operand. That is, it specifies that the input instance of var must be placed in %eax as well. This constraint can be used:

    • In cases where input is read from a variable or the variable is modified and modification is written back to the same variable.
    • In cases where separate instances of input and output operands are not necessary.

    The most important effect of using matching constraints is that they lead to the efficient use of available registers.
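
The three constraint kinds above can be sketched side by side. This is a minimal illustration, assuming GCC (or Clang) targeting x86 or x86-64 with the default AT&T syntax; the function names are hypothetical and not part of any library.

```c
/* "r": GCC picks any general-purpose register for each operand. */
static inline int copy_via_register(int in)
{
    int out;
    __asm__ ("movl %1, %0" : "=r" (out) : "r" (in));
    return out;
}

/* "m": the instruction works on the variable's memory location directly.
   *p is listed as output and as input because addl reads and writes it. */
static inline void add_in_memory(int *p, int n)
{
    __asm__ ("addl %2, %0" : "=m" (*p) : "m" (*p), "r" (n) : "cc");
}

/* "0": the input must live in the same place as output operand 0 (%eax). */
static inline int increment(int var)
{
    __asm__ ("incl %0" : "=a" (var) : "0" (var) : "cc");
    return var;
}
```

Calling increment(9) should yield 10, and add_in_memory(&x, 3) should add 3 to x without the template naming any register.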

Some other constraints used are:
  1. "m" : A memory operand is allowed, with any kind of address that the machine supports in general.
  2. "o" : A memory operand is allowed, but only if the address is offsettable, i.e., adding a small offset to the address gives a valid address.
  3. "V" : A memory operand that is not offsettable. In other words, anything that would fit the "m" constraint but not the "o" constraint.
  4. "i" : An immediate integer operand (one with constant value) is allowed. This includes symbolic constants whose values will be known only at assembly time.
  5. "n" : An immediate integer operand with a known numeric value is allowed. Many systems cannot support assembly-time constants for operands less than a word wide. Constraints for these operands should use "n" rather than "i".
  6. "g" : Any register, memory or immediate integer operand is allowed, except for registers that are not general registers.

The following constraints are x86-specific:

  1. "r" : Register operand constraint; see the table given above.
  2. "q" : Registers a, b, c or d.
  3. "I" : Constant in range 0 to 31 (for 32-bit shifts).
  4. "J" : Constant in range 0 to 63 (for 64-bit shifts).
  5. "K" : 0xff.
  6. "L" : 0xffff.
  7. "M" : 0, 1, 2, or 3 (shifts for lea instruction).
  8. "N" : Constant in range 0 to 255 (for out instruction).
  9. "f" : Floating point register
  10. "t" : First (top of stack) floating point register
  11. "u" : Second floating point register
  12. "A" : Specifies the "a" or "d" registers. This is primarily useful for 64-bit integer values intended to be returned with the "d" register holding the most significant bits and the "a" register holding the least significant bits.
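
As a rough illustration of two entries from the x86 list, the sketch below uses "I" to force an immediate shift count and "q" to obtain a register with a byte-sized name for sete. It assumes GCC or Clang on x86 or x86-64 with AT&T syntax; both function names are made up for this example.

```c
/* "I": the count must be a compile-time constant in 0..31, so it is
   emitted as an immediate ($4) instead of being loaded into a register. */
static inline unsigned int shift_left_4(unsigned int x)
{
    __asm__ ("shll %1, %0" : "=r" (x) : "I" (4), "0" (x) : "cc");
    return x;
}

/* "q": sete needs an 8-bit register operand, so the flag is a byte
   variable placed in a register that has a byte-sized name. */
static inline int is_zero(int x)
{
    unsigned char flag;
    __asm__ ("testl %1, %1\n\t"
             "sete %0"
             : "=q" (flag)
             : "r" (x)
             : "cc");
    return flag;
}
```

Passing a non-constant expression to the "I" operand would be rejected at compile time, which is the point of the constraint.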


6.2 Constraint Modifiers.

For more precise control over the effects of constraints, GCC provides constraint modifiers. The most commonly used constraint modifiers are:
  1. "=" : Means that this operand is write-only for this instruction; the previous value is discarded and replaced by output data.
  2. "&" : Means that this operand is an earlyclobber operand, which is modified before the instruction is finished using the input operands. Therefore, this operand may not lie in a register that is used as an input operand or as part of any memory address. An input operand can be tied to an earlyclobber operand if its only use as an input occurs before the early result is written.

The list and explanation of constraints is by no means complete. Examples can give a better understanding of the use and usage of inline asm. In the next section we’ll see some examples; there we’ll find more about clobber-lists and constraints.
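
Before moving on, here is a minimal sketch of why "&" matters, assuming GCC or Clang on x86 or x86-64 with AT&T syntax. The function name is hypothetical. The output is written by the first instruction while the second input is still waiting to be read, so the output must not share a register with it.

```c
static inline int add_early_clobber(int a, int b)
{
    int out;
    __asm__ ("movl %1, %0\n\t"   /* out is written here ...          */
             "addl %2, %0"       /* ... while b has yet to be read   */
             : "=&r" (out)       /* "&" keeps out away from a and b  */
             : "r" (a), "r" (b)
             : "cc");
    return out;
}
```

Without the "&", the compiler would be free to place out and b in the same register, and the movl would then overwrite b before the addl reads it.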


7. Some Useful Recipes.

Now that we have covered the basic theory of GCC inline assembly, we shall concentrate on some simple examples. It is always handy to write inline asm functions as macros. We can see many asm functions in the kernel code (/usr/src/linux/include/asm/*.h).

  1. First we start with a simple example. We’ll write a program to add two numbers.



    #include <stdio.h>

    int main(void)
    {
            int foo = 10, bar = 15;
            __asm__ __volatile__("addl  %%ebx,%%eax"
                                 :"=a"(foo)
                                 :"a"(foo), "b"(bar)
                                 );
            printf("foo+bar=%d\n", foo);
            return 0;
    }
    


    Here we require GCC to store foo in %eax and bar in %ebx, and we also want the result in %eax. The ’=’ sign shows that it is an output register. We can also add an integer to a variable in another way.



     __asm__ __volatile__(
                          "   lock       ;\n"
                          "   addl %1,%0 ;\n"
                          : "=m"  (my_var)
                          : "ir"  (my_int), "m" (my_var)
                          :                                 /* no clobber-list */
                          );
    


    This is an atomic addition. We can remove the instruction ’lock’ to remove the atomicity. In the output field, "=m" says that my_var is an output and it is in memory. Similarly, "ir" says that my_int is either an immediate integer or a value held in some register (recall the table we saw above). No registers are in the clobber list.
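
Wrapped in a function, the atomic addition might look like the sketch below. It assumes GCC or Clang on x86 or x86-64 with AT&T syntax; atomic_add is a hypothetical name, and "cc" is listed because addl changes the flags.

```c
static inline void atomic_add(int *counter, int amount)
{
    __asm__ __volatile__ ("lock; addl %1, %0"
                          : "=m" (*counter)                 /* written in memory */
                          : "ir" (amount), "m" (*counter)   /* immediate or register; old value */
                          : "cc");
}
```

When the function is inlined with a constant argument, the "i" alternative lets GCC emit the amount as an immediate; otherwise it falls back to a register.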
  2. Now we’ll perform some action on some registers/variables and compare the value.



     __asm__ __volatile__(  "decl %0; sete %1"
                          : "=m" (my_var), "=q" (cond)
                          : "m" (my_var) 
                          : "memory"
                          );
    


    Here, the value of my_var is decremented by one, and if the resulting value is 0, the variable cond is set. We can add atomicity by placing the prefix "lock;\n\t" at the start of the assembler template.

    In a similar way we can use "incl %0" instead of "decl %0", so as to increment my_var.

    Points to note here are that (i) my_var is a variable residing in memory. (ii) cond is in one of the registers eax, ebx, ecx and edx; the constraint "=q" guarantees it. Since sete writes a single byte, cond should be a byte-sized variable. (iii) memory is in the clobber list, i.e., the code is changing the contents of memory.
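
A self-contained version of this recipe could be sketched as follows, assuming GCC or Clang on x86 or x86-64 with AT&T syntax. The name dec_and_test is chosen for illustration only.

```c
/* Decrement *counter and report whether it reached zero. */
static inline int dec_and_test(int *counter)
{
    unsigned char cond;   /* sete stores one byte, so use a byte variable */
    __asm__ __volatile__ ("decl %0; sete %1"
                          : "=m" (*counter), "=q" (cond)
                          : "m" (*counter)
                          : "memory", "cc");
    return cond;
}
```

Starting from a counter of 2, the first call returns 0 and the second returns 1.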
  3. How do we set or clear a bit in a variable? That is our next recipe.


    __asm__ __volatile__(   "btsl %1,%0"
                          : "=m" (ADDR)
                          : "Ir" (pos), "m" (ADDR)
                          : "cc"
                          );
    


    Here, the bit at the position ’pos’ of the variable at ADDR (a memory variable) is set to 1. We can use ’btrl’ instead of ’btsl’ to clear the bit. The constraint "Ir" of pos says that pos is either an immediate constant in the range 0-31 (an x86-dependent constraint) or a value in a register. That is, we can set or clear any bit from the 0th to the 31st of the variable at ADDR. As the condition codes will be changed, we add "cc" to the clobber list.
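
The bit recipe can be packaged as a pair of small functions. This is a sketch under the assumption of GCC or Clang on x86 or x86-64 with AT&T syntax; set_bit and clear_bit are illustrative names, and the word is also listed as an "m" input because the instructions read it before modifying it.

```c
/* Set bit pos (0..31) of *word. */
static inline void set_bit(unsigned int *word, int pos)
{
    __asm__ __volatile__ ("btsl %1, %0"
                          : "=m" (*word)
                          : "Ir" (pos), "m" (*word)
                          : "cc");
}

/* Clear bit pos (0..31) of *word. */
static inline void clear_bit(unsigned int *word, int pos)
{
    __asm__ __volatile__ ("btrl %1, %0"
                          : "=m" (*word)
                          : "Ir" (pos), "m" (*word)
                          : "cc");
}
```

Callers are expected to keep pos within 0..31; with a register operand the instruction would otherwise address bits outside the word.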
  4. Now we look at a more complicated but useful function: string copy.


    static inline char * strcpy(char * dest,const char *src)
    {
    int d0, d1, d2;
    __asm__ __volatile__(  "1:\tlodsb\n\t"
                           "stosb\n\t"
                           "testb %%al,%%al\n\t"
                           "jne 1b"
                         : "=&S" (d0), "=&D" (d1), "=&a" (d2)
                         : "0" (src),"1" (dest) 
                         : "memory");
    return dest;
    }
    


    The source address is stored in esi and the destination in edi, and then the copy starts; when we reach the terminating 0, copying is complete. The constraints "&S", "&D" and "&a" say that the registers esi, edi and eax are early clobber registers, i.e., their contents will change before the asm statement completes. It is also clear here why memory is in the clobber list.

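    We can see a similar function which moves a block of double words. Notice that the function is declared as a macro.


    #define mov_blk(src, dest, numwords) \
    do {                                                            \
            const void *s_ = (src);                                 \
            void *d_ = (dest);                                      \
            unsigned long n_ = (numwords);                          \
            __asm__ __volatile__ (                                  \
                           "cld\n\t"                                \
                           "rep\n\t"                                \
                           "movsl"                                  \
                           : "+S" (s_), "+D" (d_), "+c" (n_)        \
                           :                                        \
                           : "memory"                               \
                           );                                       \
    } while (0)
    

    Here the contents of the registers ecx, esi and edi change as side effects of the block movement. GCC does not accept a clobber that names a register already used by an input operand, so instead of listing them in the clobber list we declare the three operands as read-write with the "+" modifier, using local copies so that the caller's expressions are left untouched. "memory" is in the clobber list because the destination block is modified.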
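
Returning to the string copy at the start of this recipe: its dummy int outputs match pointer inputs, which only lines up on 32-bit x86. A variant that keeps the pointers in pointer-typed read-write operands might look like the sketch below. It assumes GCC or Clang on x86 or x86-64 with AT&T syntax and relies on the "+" modifier, which marks an operand as both read and written; asm_strcpy is a hypothetical name chosen to avoid clashing with the C library.

```c
static inline char *asm_strcpy(char *dest, const char *src)
{
    char *d = dest;        /* advanced by stosb via (e/r)di */
    const char *s = src;   /* advanced by lodsb via (e/r)si */
    char tmp;              /* the byte in flight, held in %al */
    __asm__ __volatile__ ("1:\tlodsb\n\t"
                          "stosb\n\t"
                          "testb %%al,%%al\n\t"
                          "jne 1b"
                          : "+S" (s), "+D" (d), "=&a" (tmp)
                          :
                          : "memory", "cc");
    return dest;
}
```

The sketch relies on the direction flag being clear, which the usual calling conventions guarantee on function entry.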
  5. In Linux, system calls are implemented using GCC inline assembly. Let us look at how a system call is implemented. All the system calls are written as macros (linux/unistd.h). For example, a system call with three arguments is defined as a macro as shown below.



    #define _syscall3(type,name,type1,arg1,type2,arg2,type3,arg3) \
    type name(type1 arg1,type2 arg2,type3 arg3) \
    { \
    long __res; \
    __asm__ volatile (  "int $0x80" \
                      : "=a" (__res) \
                      : "0" (__NR_##name),"b" ((long)(arg1)),"c" ((long)(arg2)), \
                        "d" ((long)(arg3))); \
    __syscall_return(type,__res); \
    }
    


    Whenever a system call with three arguments is made, the macro shown above is used to make the call. The syscall number is placed in eax, then each parameter in ebx, ecx and edx. Finally, "int $0x80" is the instruction which makes the system call work. The return value can be collected from eax.

    Every system call is implemented in a similar way. Exit is a single-parameter syscall; let’s see what its code looks like. It is shown below.



    {
            asm("movl $1,%%eax\n\t"        /* SYS_exit is 1 */
                "xorl %%ebx,%%ebx\n\t"     /* Argument is in ebx, it is 0 */
                "int  $0x80"               /* Enter kernel mode */
                : : : "eax", "ebx");       /* both registers are overwritten */
    }
    


    The syscall number of exit is 1, and here its parameter is 0. So we arrange for eax to contain 1 and ebx to contain 0, and with int $0x80 the exit(0) is executed. This is how exit works.
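
The same pattern can be tried with a harmless call such as getpid. The sketch below is an assumption-laden illustration rather than how any particular kernel header does it: it assumes Linux with GCC or Clang, takes the syscall number from SYS_getpid in <sys/syscall.h>, and on x86-64 uses the syscall instruction (which overwrites rcx and r11) in place of int $0x80. The name raw_getpid is hypothetical.

```c
#include <sys/syscall.h>   /* SYS_getpid */
#include <unistd.h>        /* getpid(), handy for comparison */

static inline long raw_getpid(void)
{
    long res;
#if defined(__x86_64__)
    __asm__ volatile ("syscall"
                      : "=a" (res)                 /* result comes back in rax */
                      : "0" ((long)SYS_getpid)     /* syscall number goes in rax */
                      : "rcx", "r11", "memory");
#elif defined(__i386__)
    __asm__ volatile ("int $0x80"
                      : "=a" (res)                 /* result comes back in eax */
                      : "0" ((long)SYS_getpid)     /* syscall number goes in eax */
                      : "memory");
#else
#error "this sketch targets x86 and x86-64 only"
#endif
    return res;
}
```

The value returned by raw_getpid() should match what the C library's getpid() reports for the same process.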
