Why do these out-of-range array indices work in C++ (DevStudio)?

#include <windows.h> 
#include <stdio.h>

WCHAR *HiveName[4] = {L"HKCR", L"HKCU", L"HKLM", L"HKU"};

int wmain( INT argc, WCHAR **argv )
{
    for ( DWORD i = 0x80000000; i < 0x80000004; i++ )
        wprintf(L"%lu %s\n", i, HiveName[i]);
    return 0;
}

Output:

2147483648 HKCR
2147483649 HKCU
2147483650 HKLM
2147483651 HKU

Why does it work?

c++
arrays
asked on Stack Overflow Apr 26, 2018 by Vince_Fatica • edited Apr 26, 2018 by ZarNi Myo Sett Win

1 Answer


First, as Some Programmer Dude says, out-of-bounds array indexing is undefined behavior. This means that the ISO C++ standard places no requirements on what the program does, so the compiler is allowed to emit anything at all. The compiler could even emit a virus that encrypts your hard drive, and it would still be a standards-compliant compiler.

That having been said, I have some speculation about what may be happening.

On Windows, x86 user-space processes can use virtual addresses from 0x00000000 to 0x7fffffff. Addresses from 0x80000000 upward are reserved for the kernel by default, although there are ways to raise the user-space limit as high as 3 GB. In any case, it seems the limit on any particular allocation is 2 GB, so there's absolutely no way that an index of 0x80000000 or higher could point to a validly allocated object. The compiler is then free to emit code on the assumption that i must somehow be less than 0x80000000.

In this case there may not be any real "optimization." One version of the MSVC compiler outputs the following for the array indexing operation:

push    DWORD PTR wchar_t ** HiveName[esi*4]

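Here, esi contains the index, i. The addressing mode multiplies it by 4, which is sizeof(wchar_t*), the size of one array element. In 32-bit arithmetic this product wraps around modulo 2^32: 0x80000000 * 4 is 0x200000000, which truncates to 0. The most significant bit of the index is always thrown out, so indices 0x80000000 through 0x80000003 happen to address the same elements as indices 0 through 3.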
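The wraparound can be checked with a little arithmetic. The sketch below is not what the compiler emits. It only models a 32-bit scaled-index computation, assuming 4-byte pointers as on x86, and scaled_offset is a hypothetical helper name that does not appear in the original code.

```cpp
#include <cstdint>

// Hypothetical helper (an assumption for illustration, not from the post):
// models the byte offset computed by a 32-bit scaled index such as esi*4.
// The product is formed in 64 bits and then truncated to 32 bits, which is
// the same as wrapping modulo 2^32.
constexpr std::uint32_t scaled_offset(std::uint32_t index, std::uint32_t elem_size)
{
    return static_cast<std::uint32_t>(static_cast<std::uint64_t>(index) * elem_size);
}

// Assumed element size: sizeof(wchar_t*) on 32-bit x86.
constexpr std::uint32_t kElemSize = 4;

// The indices used in the question map to the offsets of elements 0..3.
static_assert(scaled_offset(0x80000000u, kElemSize) == 0u,  "same offset as index 0");
static_assert(scaled_offset(0x80000001u, kElemSize) == 4u,  "same offset as index 1");
static_assert(scaled_offset(0x80000002u, kElemSize) == 8u,  "same offset as index 2");
static_assert(scaled_offset(0x80000003u, kElemSize) == 12u, "same offset as index 3");
```

The static_assert lines compile only if the wrapped offsets equal those of indices 0 through 3, which matches the output in the question. This explains why one particular 32-bit build appears to work. The indexing is still undefined behavior, and a 64-bit build or a different compiler may behave differently.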

answered on Stack Overflow Apr 26, 2018 by Jack C.

User contributions licensed under CC BY-SA 3.0