XMM register storing

Question

I am trying to store an XMM register to a specific memory location, such as address 4534342.

Example:

What have I done?

I know that the XMM registers hold 128-bit values, so my program needs 16 bytes of memory for the store. In addition, since the store requires aligned memory, I have allocated 31 bytes and then found an aligned address within that block. That should prevent any exceptions from being thrown.

What am I trying to do, visually?

Mem Adr | Contents (binary)
4534342 | 0000 0000 0000 0000  ; I want to pass in address 4534342, and the
4534346 | 0000 0000 0000 0000  ; inline assembly will store the register's
4534348 | 0000 0000 0000 0000  ; contents from there straight down to
4534350 | 0000 0000 0000 0000  ; address 4534350
4534352 | 0000 0000 0000 0000
4534354 | 0000 0000 0000 0000
4534356 | 0000 0000 0000 0000
4534358 | 0000 0000 0000 0000

Setup

cyg_uint8 *start_value;      //pointer to the first value of the allocated block                    
cyg_uint32 end_address;      //aligned address location value
cyg_uint32 *aligned_value;   //pointer to the value at the end_address
start_value = xmm_container; //get the pointer to the allocated block
end_address = (((unsigned int) start_value) + 0x0000001f) & 0xFFFFFFF8; //find aligned memory
aligned_value =  ((cyg_uint32*)end_address);  //create a pointer to get the first value of the block

Debug statements BEFORE assembly call to ensure function

printf("aligned_value: %d\n", (cyg_uint32) aligned_value);
printf("*aligned_value: %d\n", *aligned_value);

Assembly Call

__asm__("movdqa %%xmm0, %0\n" : "=m"(*aligned_value)); //assembly call

Debug statements AFTER assembly call to ensure function

printf("aligned_value: %d\n", (cyg_uint32) aligned_value);
printf("*aligned_value: %d\n", *aligned_value);

The output from printf [FAILURE]

aligned_value: 1661836 //Looks good!

*aligned_value: 0 //Looks good!

aligned_value: -1 //Looks wrong :(

//then program gets stuck

Basically, am I doing this process correctly? Why do you think it is getting stuck?

Thank you for your time and effort.

gcc

assembly

x86

inline-assembly

asked on Stack Overflow Jul 11, 2012 by Mathew Kurian • edited Jul 12, 2012 by Mathew Kurian

1 Answer

I don't think your alignment logic is correct if you want a 16-byte aligned address.

Just do the math, it's easy:

(0 + 0x1f) & 0xFFFFFFF8 = 0x18 ; 0x18-0=0x18 unused bytes, 0x1F-0x18=7 bytes left
(1 + 0x1f) & 0xFFFFFFF8 = 0x20 ; 0x20-1=0x1F unused bytes, 0x1F-0x1F=0 bytes left
...
(8 + 0x1f) & 0xFFFFFFF8 = 0x20 ; 0x20-8=0x18 unused bytes, 0x1F-0x18=7 bytes left
(9 + 0x1f) & 0xFFFFFFF8 = 0x28 ; 0x28-9=0x1F unused bytes, 0x1F-0x1F=0 bytes left
...
(0xF + 0x1f) & 0xFFFFFFF8 = 0x28 ; 0x28-0xF=0x19 unused bytes, 0x1F-0x19=6 bytes left
(0x10 + 0x1f) & 0xFFFFFFF8 = 0x28 ; 0x28-0x10=0x18 unused bytes, 0x1F-0x18=7 bytes left
(0x11 + 0x1f) & 0xFFFFFFF8 = 0x30 ; 0x30-0x11=0x1F unused bytes, 0x1F-0x1F=0 bytes left
...
(0x18 + 0x1f) & 0xFFFFFFF8 = 0x30 ; 0x30-0x18=0x18 unused bytes, 0x1F-0x18=7 bytes left
(0x19 + 0x1f) & 0xFFFFFFF8 = 0x38 ; 0x38-0x19=0x1F unused bytes, 0x1F-0x1F=0 bytes left
...
(0x1F + 0x1f) & 0xFFFFFFF8 = 0x38 ; 0x38-0x1F=0x19 unused bytes, 0x1F-0x19=6 bytes left

First, to get all zeroes in the 4 least significant bits the mask should be 0xFFFFFFF0.

Next, you're overflowing the 31-byte buffer if you calculate the aligned address this way. Your math skips 0x18 to 0x1F bytes from the start of the buffer, which leaves only 0 to 7 bytes of space. That isn't sufficient to store 16 bytes.
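
The hand calculation can also be checked mechanically. Below is a minimal C sketch, not code from the question: the names question_align, bytes_left and max_bytes_left are made up for illustration, and uintptr_t is assumed in place of the question's unsigned int cast so the arithmetic also holds on 64-bit targets.

```c
#include <assert.h>
#include <stdint.h>

#define BUF_SIZE 31u  /* size of the buffer allocated in the question */

/* The rounding used in the question: add 0x1F, then clear only the low 3 bits. */
uintptr_t question_align(uintptr_t start)
{
    return (start + 0x1Fu) & ~(uintptr_t)0x7u;
}

/* Bytes of the 31-byte buffer that remain at and after the computed address. */
unsigned bytes_left(uintptr_t start)
{
    uintptr_t aligned = question_align(start);
    assert(aligned >= start);              /* the formula only rounds upward */
    uintptr_t skipped = aligned - start;   /* always 0x18..0x1F */
    return skipped >= BUF_SIZE ? 0u : (unsigned)(BUF_SIZE - skipped);
}

/* Largest bytes_left() over start addresses 0..31, which covers every
   possible value of the low bits that the mask touches. */
unsigned max_bytes_left(void)
{
    unsigned best = 0;
    for (uintptr_t start = 0; start < 32; ++start) {
        unsigned left = bytes_left(start);
        if (left > best)
            best = left;
    }
    return best;
}
```

Under these assumptions max_bytes_left() never reaches 16, so a 16-byte store through that pointer always runs past the end of the buffer, and question_align(0) is only 8-byte aligned.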

For correct 16-byte alignment you should write this:

end_address = (((unsigned int)start_value) + 0xF) & 0xFFFFFFF0;
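
As a rough sketch of how the corrected rounding could be wrapped up with a bounds check: the helper names align_up_16 and aligned_slot are hypothetical, and uintptr_t is an assumption standing in for the unsigned int cast above so that the pointer is not truncated on 64-bit builds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round p up to the next multiple of 16 (unchanged if already aligned). */
uintptr_t align_up_16(uintptr_t p)
{
    return (p + 0xFu) & ~(uintptr_t)0xFu;
}

/* Return a 16-byte-aligned pointer to 16 usable bytes inside buf,
   or NULL if buf is too small to hold them after alignment. */
void *aligned_slot(unsigned char *buf, size_t buf_size)
{
    uintptr_t start = (uintptr_t)buf;
    size_t skipped = (size_t)(align_up_16(start) - start);  /* 0..15 */

    if (buf_size < skipped + 16)
        return NULL;
    return buf + skipped;
}
```

With a 31-byte buffer, at most 15 bytes are skipped, so 16 bytes always remain after the aligned address. That is why 31 bytes is enough once the mask and the added constant are both corrected.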

answered on Stack Overflow Jul 11, 2012 by Alexey Frunze

User contributions licensed under CC BY-SA 3.0