.struct directive of GNU Assembler - How to instantiate a class instance?

I'm trying to create an object-oriented class using GNU assembly for educational purposes. I have many questions regarding the use of the .struct directive:

  1. It is said that this directive switches assembly to the absolute section. Why is it named .struct then? Does it have anything to do with C's struct?

  2. What is the difference between using:

.set    Object.data, 0
.set    Object.func_pointer, 8

and using

    .struct 0
Object: 
Object.data:
    .struct Object + 8
Object.func_pointer:
    .struct Object.func_pointer + 8
Object_size = . - Object

Is this actually the way to use the .struct directive, or is it a misuse? To be honest, I don't even know what I'm doing with the .struct directive. It looks useless to me.

  3. The full code I provided below allocates one instance of the Object class statically. If I want to instantiate the class on demand using a constructor function, do I need to put my object on the stack, and how would that be done? Using .bss I can name my instance (for example test_object), but I'm not sure whether I can name a stack address.

Full code so you can understand what I'm talking about: (GNU as, Linux x64)

    .data
msg:    .asciz  "Hello world!\n"
.set    msglen, . - msg


# CLASS struct
# This class struct only holds pointers to functions available for the class
    .struct 0
Object: 
# Class variables
Object.data:
    .struct Object + 8
# Class pointers
Object.func_pointer:
    .struct Object.func_pointer + 8
Object_size = . - Object

# Allocate Object_size bytes in the bss section to store the object instance
    .bss
test_object:
    .space  Object_size



#----------------------------------------------------------------------------
# Code starts here
    .text
#----------------------------------------------------------------------------

# Simple hello world function used as the target
hellofunc:
    push    %rbp
    mov     %rsp, %rbp
    mov     $1, %rax
    mov     $1, %rdi
    mov     $msg, %rsi
    mov     $msglen, %rdx
    syscall
    leave
    ret

# Initialize the Object instance by:
#   Using the statically allocated test_object (in .bss) as the instance
#   Storing the value given in %rdi into Object.data
#   Storing the address of hellofunc into Object.func_pointer
initiate:
    # Prologue
    push    %rbp
    mov     %rsp, %rbp
    # Make %rdx point at the current object instance
    mov     $test_object, %rdx
    # Load hellofunc address onto test_object.func_pointer
    mov     $hellofunc, %rax
    mov     %rax, Object.func_pointer(%rdx)
    # Load data from %rdi onto test_object.data
    mov     %rdi, Object.data(%rdx)
    # Test call to see if test_object.func_pointer is properly loaded
    call    *Object.func_pointer(%rdx)
    # Epilogue
    leave
    ret

#---------------------------------------------------------------------------
# Main function
    .globl  _start
#---------------------------------------------------------------------------
_start:
    mov     $0xDEADBEEF, %rdi
    call    initiate

    # return(0);
    mov     $60, %rax
    mov     $0, %rdi
    syscall

linux
struct
memory-management
x86-64
gnu-assembler
asked on Stack Overflow May 20, 2021 by stanle

1 Answer

I'm pretty sure all you're ever going to get out of any assembly language struct syntax is symbolic names for offsets, and a way to define them using the same .space directives that you'd normally use to reserve space for one static instance of a struct in the BSS. That's what NASM gives you, for example, and GAS's is even more rudimentary than that.

So yes, it's like a struct foo {int x; char c;}; definition in C: you use syntax that looks like declaring (reserving space for) global variables, but instead you're just defining the layout of a struct. (Except that in GAS you can only use .space and similar directives that reserve room without storing a value, so there is no distinguishing one int from an array of 4 chars, for example.)

But in GAS, it's so primitive that yes, it's just one way to separately set a bunch of assemble-time constants with names like Object.func_pointer. You could do exactly the same thing with Object.func_pointer = 8, as well.
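To make that equivalence concrete, here is a minimal sketch (GNU as, AT&T syntax, x86-64). The names A.* and B.* are hypothetical, chosen only for this illustration: one layout is defined through the absolute section, the other as plain constants, and both are then used as displacements from a base register.

```asm
# Layout defined by switching to the absolute section.
# (A.* names are hypothetical, for illustration only.)
    .struct 0
A.data:
    .space  8
A.func_pointer:
    .space  8
A_size = .                      # 16: current offset in the absolute section

# The same layout written as plain assemble-time constants.
# (B.* names are hypothetical, for illustration only.)
    .set    B.data, 0
    .set    B.func_pointer, 8
    .set    B_size, 16

    .text
load_both:
    # %rdi is assumed to hold a pointer to an instance.
    # Both operands are just "8(%rdi)" once the symbols are resolved.
    mov     A.func_pointer(%rdi), %rax
    mov     B.func_pointer(%rdi), %rax
    ret
```

Assembling this and disassembling the object file (for example with objdump -d) should show the two mov instructions encoded identically, since the assembler only ever sees the constant 8 in both cases.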


What is the difference between using: (.set) and (.struct)

Nothing. That's all it's doing for you. There isn't anything like a concept of a struct object, it's just one part of the tools that you can use to get symbolic names for offsets. They chose to call it .struct because that's one use-case.

However, you're repeating yourself unnecessarily by using .struct repeatedly (the example in the GAS manual does this, too; I don't know why). AFAIK you do have to repeat the struct name in each label (because GAS .struct syntax isn't very sophisticated), but you can use .space instead of having to name the previous field.

    .struct 0
Object: 
#    .space 4
Object.obdata:
    .space 8
Object.func_pointer:
    .space 8
Object_size = . - Object

.text
mov $Object.obdata, %eax
mov $Object.func_pointer, %ecx

gcc -c foo.s && objdump -drwC -Mintel foo.o

...
   0:   b8 00 00 00 00          mov    eax,0x0
   5:   b9 08 00 00 00          mov    ecx,0x8

The names you choose to define after switching to the absolute section are 100% up to you, and not associated with any overall name for the struct.


NASM's struct support is a bit more sophisticated: you can define a layout in one spot, and use those member names when statically initializing an instance of it (e.g. in the .data section). The Stack Overflow question "Nasm - access struct elements by value and by address" has an example. Note that it involves endstruc and iend directives, because you are actually defining a struct with members, not just separately setting a bunch of assemble-time constants with names like Object.func_pointer.


Anything else like "constructors" you'll have to do yourself. e.g. writing the members into memory somewhere that you've decided is now an instance of a struct object.

This is assembly language: there is no compiler, so any instructions you want in the final machine code you have to write explicitly. (Unless you were using MASM, which magically adds instructions to your code in a few cases. GAS certainly doesn't; it was primarily designed to assemble compiler-generated asm. Deciding to emit instructions for a "constructor" at a certain point is something a compiler does internally, not via any special asm syntax. The same goes for hand-written asm.)
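As a rough sketch of what "doing it yourself" can look like, the program below reserves one instance on the stack and passes its address to a hand-written constructor. Object_init and its register convention (%rdi = instance pointer, %rsi = data value) are assumptions made for this illustration, not something GAS provides; the layout and hellofunc follow the question.

```asm
# Sketch only: GNU as, AT&T syntax, x86-64 Linux, no libc.
    .struct 0
Object:
Object.data:
    .space  8
Object.func_pointer:
    .space  8
Object_size = . - Object            # 16

    .section .rodata
msg:    .ascii  "Hello world!\n"
    .set    msglen, . - msg

    .text
# write(1, msg, msglen)
hellofunc:
    mov     $1, %eax                # __NR_write
    mov     $1, %edi                # fd 1 = stdout
    lea     msg(%rip), %rsi
    mov     $msglen, %edx
    syscall
    ret

# Hypothetical "constructor".
#   %rdi = pointer to Object_size bytes of storage (stack, .bss, heap...)
#   %rsi = value to store in Object.data
Object_init:
    mov     %rsi, Object.data(%rdi)
    lea     hellofunc(%rip), %rax
    mov     %rax, Object.func_pointer(%rdi)
    ret

    .globl  _start
_start:
    sub     $Object_size, %rsp      # reserve one instance on the stack
    mov     %rsp, %rdi              # its address plays the role of "this"
    mov     $0xDEADBEEF, %esi       # 32-bit write zero-extends into %rsi
    call    Object_init
    call    *Object.func_pointer(%rsp)  # call through the stack instance
    add     $Object_size, %rsp      # release the instance

    mov     $60, %eax               # __NR_exit
    xor     %edi, %edi              # status 0
    syscall
```

Assuming the file is saved as obj.s (a hypothetical name), as obj.s -o obj.o && ld obj.o -o obj should build it, and running ./obj should print the hello message once. A stack instance has no symbol the way test_object does in .bss: its "name" is whatever register holds its address, or an offset from %rsp or %rbp, which you can label with .set just like the member offsets.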

answered on Stack Overflow May 20, 2021 by Peter Cordes • edited May 20, 2021 by Peter Cordes

User contributions licensed under CC BY-SA 3.0