Monday, November 19, 2012

Assembly

Check out the System V x86-64 ABI (or the analogous documents for other architectures, assuming they exist) for all sorts of pertinent information on assembly coding on x86-64 chips.  Additionally, running gcc -S blah.c will generate the assembly for your program (in blah.s) instead of an executable.  I found this very insightful.


Here are some tips.

There are two major portions to an assembly file: the assembler directives and the instructions.  The directives tell the assembler what to do while it builds the output, and the instructions (which the assembler encodes as opcodes) are what is physically run by the CPU.  So directives include things like .long to emit integer values into your executable and .data to indicate that the values you are emitting need to be in the data section of the executable.  The instructions are more of what you expect when you think about assembly: move, add, call, etc.  They say exactly what you want the CPU to do.
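A quick way to see both halves side by side is to run gcc -S on a tiny C file.  The sketch below is only an illustration (the names jabber and next_jabber are made up for this example), and the exact output depends on your compiler and flags.  With gcc on x86-64 Linux I would expect the global to show up under a .data directive with a .long 42, and the function to show up under .text as a handful of instructions.

```c
/* Initialized global: typically emitted with directives such as
   .data and .long 42 rather than with any instructions. */
int jabber = 42;

/* Function body: this is where the actual instructions
   (a load, an add, a ret) end up, in the .text section. */
int next_jabber(void)
{
    return jabber + 1;
}
```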


By the way, the assembler directives include a lot of surprisingly sophisticated features.  There are a bunch of interesting options (macros, conditionals, repeat blocks) that seem like they would be very useful if you were actually programming directly in assembly.  Honestly, I didn't expect to see an almost fully fledged language when I started looking through what was available.


Beware of the .align directive.  As far as I can tell, its behavior depends heavily on the circumstances in which it is used.  Not only is it system dependent, it also depends on the object file format you are generating: on some targets the argument is a byte count, and on others it is a power of two.  There are two unambiguous alternatives: .balign, which always takes a byte count, and .p2align, which always takes a power of two.  Each one pins down one of the behaviors you might see with .align and behaves the same way everywhere.  However, supposedly they are not supported by every assembler.


For calling conventions check out the ABI document.  Reverse engineering this by hand isn't too hard (just gcc -S a function with a bunch of parameters), but the ABI makes things nice and explicit.  Also, notice that when you call a variable parameter (variadic) function, the number of vector registers used for floating point arguments needs to be placed into %al, the low byte of %rax (an upper bound is enough).
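To watch the variadic rule happen, a small C function is enough.  This is just a sketch: sum_doubles and call_sum are made-up names, and the exact assembly depends on your compiler and flags.  Running gcc -S on it should show a small constant being moved into %eax (which sets %al) right before the call to sum_doubles.

```c
#include <stdarg.h>

/* Hypothetical helper: adds up `count` doubles passed as
   variadic arguments. */
double sum_doubles(int count, ...)
{
    va_list args;
    double total = 0.0;

    va_start(args, count);
    for (int i = 0; i < count; i++) {
        total += va_arg(args, double);
    }
    va_end(args);
    return total;
}

/* Call site: the three doubles travel in vector registers, so under
   the System V x86-64 ABI the caller sets %al to the number of vector
   registers used (an upper bound is allowed) before the call. */
double call_sum(void)
{
    return sum_doubles(3, 1.0, 2.0, 3.0);
}
```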


You might notice that some assembly code uses labels like this:

leaq    wocky(%rip), %rdi

And some assembly code looks like this:

leaq    wocky, %rdi

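As far as I can tell, both end up loading the address of wocky into %rdi when you run the code.  The difference is that the first form is position independent (the address is computed relative to the instruction pointer, %rip) and the second is position dependent (an absolute address is baked into the instruction).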


Dereferencing an address held in a register is done like this:

(%rbp)

If you need an offset from that address you can do something like:

-4(%rbp)


If you need to load an address into a register, use the leaq instruction (load effective address).  It computes the address without touching the memory it points to.

leaq    wocky(%rip), %rdi

This is how you could implement something like the & operator (address, not bitwise and) in the C programming language.
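Going the other direction, here is a C sketch of the same idea.  The function names are invented for illustration, and the output will vary with compiler and flags, but with gcc on x86-64 Linux I would expect address_of_wocky to compile down to roughly a leaq wocky(%rip), %rax followed by a ret.

```c
/* A global, named after the label used in the snippets above. */
int wocky = 7;

/* Returns the address of the global.  No memory is read here;
   the compiler only needs to compute where wocky lives. */
int *address_of_wocky(void)
{
    return &wocky;
}

/* Reads the global back through the pointer, which is the
   dereference counterpart to taking the address. */
int read_wocky(void)
{
    return *address_of_wocky();
}
```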


Calling or jumping through a function pointer is done by prefixing the operand with a *.  Kind of like the following:

call    *%rax
jmp    *%rax

Or if the address is somewhere on your stack:

call    *-8(%rbp)
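A C function that calls through a function pointer is an easy way to get the compiler to produce these indirect forms for you.  The names below (int_fn, add_one, apply) are made up for this sketch, and which register gets used is up to the compiler.  In my experience an unoptimized gcc build emits an indirect call through a register, while an optimized build may turn the whole body of apply into an indirect jmp.

```c
/* Hypothetical callback type: takes an int, returns an int. */
typedef int (*int_fn)(int);

/* An ordinary function we can point at. */
int add_one(int x)
{
    return x + 1;
}

/* Indirect call: the target address is only known at run time,
   so the compiler has to emit a call (or jmp) with a * operand. */
int apply(int_fn fn, int value)
{
    return fn(value);
}
```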


Normally your functions will look something like this:
    
    .text
    .globl wocky
wocky:
    pushq   %rbp
    movq    %rsp, %rbp
    subq    $N, %rsp
    # code
    leave
    ret

"subq    $N, %rsp" makes room on the stack for the local variables inside of the wocky function.  Obviously, you don't need to do this if you don't have any local variables, but it is also unnecessary if you are optimizing your assembly to have tail calls (and you are in a function that can be optimized to be a tail call ... and actually *is* being optimized to be a tail call) or if your function doesn't call any other functions.  In that last case there is no reason to reserve stack space if no one else is going to be stomping on it: the System V ABI sets aside a 128 byte "red zone" below %rsp that such a function can use for its locals without moving %rsp.
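You can compare the two cases by running gcc -S on a pair of C functions like the ones below.  This is only a sketch with made-up names, and the result depends heavily on compiler version and optimization level.  On an unoptimized x86-64 Linux build I would expect leaf_sum to store its local at a negative offset from %rbp with no subq at all, and caller to start with a subq because it calls another function.

```c
/* Leaf function: it calls nothing, so its local can sit in the
   stack space just below %rsp without %rsp being adjusted. */
int leaf_sum(int a, int b)
{
    int total = a + b;
    return total;
}

/* Non-leaf function: the calls to leaf_sum push return addresses,
   so the locals need space that has actually been reserved. */
int caller(int a, int b)
{
    int first = leaf_sum(a, b);
    int second = leaf_sum(b, a);
    return first + second;
}
```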



As for me, I'm probably going to play around with assembly and perhaps implement a toy language or two.  However, I'm not sure that I'm really in the mood to try to play catch up with the likes of Mike Pall (check out LuaJIT ... it's kind of frightening what he's doing with it) or the PyPy team or the JVM team.  A technology like LLVM actually sounds really attractive to me because I can get low level implementation details (they support tail calls!) on multiple platforms and the whole thing is supported by people smarter and more dedicated than me.  So I can get better, faster, more correct code without having to spend a *lot* of time focusing on hardware details.

Either way though, I feel that poking around in assembly is still useful even in our modern era of managed languages and ultra IDEs.  I think understanding how the universe in which we work functions helps us to avoid certain mistakes and fully utilise the capabilities available to us.
