I am trying to write to Extended Control Register 0 (xcr0) on an x86_64 Debian 7 virtual machine. My approach is a kernel module (so CPL = 0) with some inline assembly. However, I keep getting a general protection fault (#GP) when I try to execute the xsetbv instruction.

The init function of my module first checks that the osxsave bit is set in control register 4 (cr4), and sets it if it isn't. Then I read the xcr0 register using xgetbv. This works fine and (in the limited testing I have done) yields the value 0b111. I would like to set the MPX bits, bndreg and bndcsr, which are bits 3 and 4 (0-indexed), so I OR them in and write 0b11111 back to xcr0 with xsetbv. The code for this last part is as follows.
```c
unsigned long xcr0;          /* extended control register */
unsigned long bndreg = 0x8;  /* bit 3 in xcr0 (BNDREG state) */
unsigned long bndcsr = 0x10; /* bit 4 in xcr0 (BNDCSR state) */

/* ... checking cr4 for osxsave and reading xcr0 ... */

if (!(xcr0 & bndreg))
    xcr0 |= bndreg;
if (!(xcr0 & bndcsr))
    xcr0 |= bndcsr;

/* ... xcr0 is now 0b11111 ... */

/*
 * Write the changes to xcr0; zero the high 32 bits
 * because they are reserved.
 */
unsigned long new_xcr0 = xcr0 & 0xffffffff;
__asm__ volatile (
    "mov $0, %%ecx\n\t"      /* %ecx selects the xcr to write */
    "xor %%rdx, %%rdx\n\t"   /* zero %rdx (high half of the new value) */
    "xsetbv"                 /* write edx:eax into xcr0 */
    :
    : "a" (new_xcr0)         /* input */
    : "ecx", "rdx"           /* clobbered */
);
```
By looking at the trace from the general protection fault, I determined that the xsetbv instruction is the problem. However, if I don't manipulate xcr0 and just read its value and write it back unchanged, things work fine. Looking at the Intel manual and this site, I found several reasons for a #GP, but none of them seem to match my situation. The reasons are listed below, each with my explanation for why it most likely doesn't apply.
If the current privilege level is not 0 --> I use a kernel module, so CPL = 0.
If an invalid xcr is specified in %ecx --> 0 is in %ecx, which is valid and worked for xgetbv.
If the value in edx:eax sets bits that are reserved in the xcr specified by ecx --> according to the Intel manual and Wikipedia, the bits I am setting are not reserved.
If an attempt is made to clear bit 0 of xcr0 --> I printed out xcr0 before setting it, and it was 0b111, so bit 0 remains set.
If an attempt is made to set xcr0[2:1] to 0b10 --> I printed out xcr0 before setting it, and it was 0b111, so those bits remain 0b11.
Thank you in advance for any help discovering why this #GP is happening.
Peter Cordes was right: it was a problem with my hypervisor. I am using VMware Fusion for virtualization, and after a lot of digging on the internet I found the following statement from VMware:
Memory protection extensions (MPX) were introduced in Intel Skylake generation CPUs and provided hardware support for bound checking. This feature will not be supported in Intel CPUs beginning with the Ice Lake generation.
Starting with ESXi 6.7 P02 and ESXi 7.0 GA, in order to minimize disruptions during future upgrades, VMware will no longer expose MPX by default to VMs at power-on. A VM configuration option can be used to continue exposing MPX.
The solution VMware proposed was to edit the virtual machine's .vmx file with the following directive.
cpuid.enableMPX = "TRUE"
After I did this, things worked and I was able to use xsetbv to enable the bndreg and bndcsr bits of xcr0.
When using VMware to expose CPU features from the host to the guest under more normal conditions (i.e., when the feature isn't plagued by deprecation), you can mask the bits of cpuid leaves by adding a directive of the following form to the VM's .vmx file:

cpuid.<leaf>.<register> = "<value>"
So, for example, if we assume that SMAP can be exposed this way, we would want to set bit 20 of cpuid leaf 7's ebx:
cpuid.7.ebx = "----:----:---1:----:----:----:----:----"
Colons are optional and only ease reading of the string; ones and zeros override the default settings, and dashes leave the default setting alone.
/proc/cpuinfo on the VM doesn't list mpx in the flags (it does list xsave, though). My host does have MPX support. I am running Linux kernel version 3.19, which supports MPX, and I already have a binary compiled with MPX (the bnd instructions etc. are all there when I objdump it). The problem is that those instructions get treated as NOPs. I thought the process I described above would fix this and let the CPU recognize MPX.
It would enable MPX if you ran it on a machine that supported MPX. (Assuming your code is correct.)
The virtual x86 CPU your VM is running on does not, according to its own virtualized CPUID, so it's not surprising at all that this faults. The hypervisor might be doing this manually in a VMEXIT, emulating xsetbv and checking the changes against the virtualized xcr0.
If you want to use features your HW has but your VM doesn't support, in general you have to run on bare metal instead. Or find a different VM that does expose the feature to the guest.
Note that MPX introduces new architectural state (the bnd registers) that has to be saved/restored on context switches. If your hypervisor doesn't want to do that, that would be one reason to disable MPX. (I think it can be saved/restored as part of xsave, but it does make the save area slightly larger.) I haven't looked at MPX much; it might be something the hypervisor would have to deal with in vmexits so that bounds checking doesn't apply to the hypervisor itself... If so, that would be a major inconvenience.