Square root [ASM]


I'm having a hard time understanding how to perform this operation.

    float squared( float num )
           push   ebp
           mov    ebp, esp
           sub    esp, num
           xorps  xmm0, xmm0
           movss  dword ptr 4[ebp], xmm0
           movss  xmm0, dword ptr num[ebp]
           mulss  xmm0, dword ptr num[ebp]
           movss  dword ptr 8[ebp], xmm0
           fld    dword ptr 4[ebp]
           sqrtss xmm0, ebp
           movss  ebp, xmm0
           mov    esp,  ebp
           pop    ebp
           ret    0

I've worked in C / C++ for a while now, and I've always wanted to really dig into how inline assembly works, but I'm having some problems when executing the code.

When I call this from my main function with a value and print the root, I get an error:

Exception thrown at 0x00000000 in Test.exe: 0xC0000005: Access violation executing location 0x00000000. occurred

Any ideas?

asked on Stack Overflow Jan 14, 2020 by thatfox • edited Jan 15, 2020 by Ross Ridge

2 Answers


The most fundamental issue with this code is that you wrote your own function prologue and epilogue. You have to do that when you write .ASM files entirely by hand, but you must not do it when you write "inline" assembly embedded in C; there, you have to let the compiler handle the stack. This is the most likely reason the program is crashing. It also means that all of your attempts to access the num argument will instead access some unrelated stack slot, so even if your code didn't crash, it would operate on garbage input.

As pointed out in comments on the question, you also have a bunch of nonsensical instructions in there, e.g. sqrtss xmm0, ebp (sqrtss cannot take integer register arguments). This should have caused the compiler to reject the program, but if it instead produced nonsensical machine code, that could also cause a crash.

And (also as pointed out in comments on the question) I'm not sure what mathematical function this code would compute in the hypothetical scenario where each machine instruction does something like what you meant it to do, but it definitely isn't the square root.

Correct MSVC-style inline assembly to implement single-precision floating point square root, using the SSE sqrtss instruction, would look something like this, I think. (Not tested. Since this is Win32 rather than Win64, an implementation using fsqrt instead might be more appropriate, but I don't know how to do that off the top of my head.)

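    float square_root(float radicand) {
        float result;
        __asm {
            sqrtss xmm0, radicand   ; xmm0 = sqrt(radicand)
            movss  result, xmm0     ; copy to a C variable so it can be returned
        }
        return result;
    }
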
... Or you could just #include <math.h> and use sqrtf and save yourself the trouble.
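
If the point is specifically to get a sqrtss instruction without hand-writing any stack handling, an SSE intrinsic is a middle ground between inline assembly and sqrtf. The following is only a minimal sketch, assuming an x86 target and a compiler that ships <xmmintrin.h> (MSVC, GCC and Clang all do); the name square_root_sse is made up for illustration and does not come from the question.

```c
#include <xmmintrin.h>  /* SSE intrinsics: _mm_set_ss, _mm_sqrt_ss, _mm_cvtss_f32 */

/* Hypothetical helper (name is an assumption, not from the question):
   single-precision square root via the sqrtss instruction, with the
   compiler managing registers, the stack frame, and the return value. */
float square_root_sse(float radicand)
{
    __m128 v = _mm_set_ss(radicand);  /* radicand in the low lane, upper lanes zeroed */
    v = _mm_sqrt_ss(v);               /* sqrtss: square root of the low lane only */
    return _mm_cvtss_f32(v);          /* extract the low lane as a float */
}
```

Because the compiler picks the registers and emits the prologue and epilogue itself, this avoids the stack problems described above while still mapping to the same instruction.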

answered on Stack Overflow Jan 14, 2020 by zwol • edited Jan 15, 2020 by zwol

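I think using the x87 fsqrt instruction directly will work (num is a float, so it is loaded as a dword):

    fld   dword [num]   ; push the 32-bit float onto the x87 stack
    fsqrt               ; st(0) = sqrt(st(0))

answered on Stack Overflow Jan 18, 2020 by AAA11112345678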

User contributions licensed under CC BY-SA 3.0