I have the following manually written IL method, which converts an unsigned 32-bit integer to a signed 64-bit integer:
.method public hidebysig static int64 ToInt64(uint32 'value') cil managed
{
.maxstack 2
ldarg.0
ldc.i4.0
brtrue.s return // Never taken.
conv.u8
return:
ret
}
Note that because the value on the top of the evaluation stack is always 0, the conditional branch is never taken.
When I pass the value 4294967295 (the maximum value of uint32) to the method, it returns -1 rather than the expected 4294967295. To me, this suggests that the conv.u8 is being skipped and that a sign-extending conversion to int64 is taking place instead.
However, if I pass the same arguments to a modified method which removes the impossible conditional branch...
.method public hidebysig static int64 ToInt64(uint32 'value') cil managed
{
.maxstack 1
ldarg.0
conv.u8
ret
}
...it returns the expected 4294967295.
What is even more interesting is that if, instead of removing the branch, I add a conv.u4 instruction immediately preceding the conv.u8 instruction...
.method public hidebysig static int64 ToInt64(uint32 'value') cil managed
{
.maxstack 2
ldarg.0
ldc.i4.0
brtrue.s return
conv.u4
conv.u8
return:
ret
}
...it also returns 4294967295.
I can't for the life of me figure out why the inclusion of an always-false conditional branch changes the result of the method, nor why a conv.u4 instruction operating on a value that is already a 32-bit integer would affect its execution. To my (limited) understanding of CIL and the CLR, all variations of the method should be valid (though perhaps unverifiable) in the eyes of the CLR and should produce the same result.
Is there some aspect of how the IL is executed that I am missing? Or have I stumbled upon some sort of runtime bug?
One thing I did note is that section III.1.7.5 of ECMA-335 ("Backward branch constraints") states the following:
It shall be possible, with a single forward-pass through the CIL instruction stream for any method, to infer the exact state of the evaluation stack at every instruction (where by “state” we mean the number and type of each item on the evaluation stack).
Given that when the method returns, the evaluation stack could hold either a 32-bit integer or a 64-bit integer depending on whether the branch is taken, could this be taken to imply that the method is invalid CIL and that the runtime is simply doing its best to handle it anyway? Or do the constraints mentioned in this section not apply as this is a forward branch?
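To make that constraint concrete, here is a toy single-forward-pass stack simulator (a sketch of my own, not anything the runtime actually does) walking the branching version of the method. The state recorded for the return: label when brtrue.s is processed (one int32) disagrees with the state on the fall-through path after conv.u8 (one int64):

```python
# Toy single-forward-pass stack tracking over the branching ToInt64.
# A branch records the stack state at its target; a later fall-through
# into the same label must match the recorded state.

def track(instructions):
    stack = []
    label_states = {}
    for op, arg in instructions:
        if op == "ldarg.0":
            stack.append("int32")           # uint32 loads as int32 on the stack
        elif op == "ldc.i4.0":
            stack.append("int32")
        elif op == "brtrue.s":
            stack.pop()                     # the condition is consumed
            label_states[arg] = list(stack) # record state at the branch target
        elif op == "conv.u8":
            stack.pop()
            stack.append("int64")
        elif op == "label":
            if arg in label_states and label_states[arg] != stack:
                return ("mismatch", label_states[arg], list(stack))
    return ("ok", [], list(stack))

method = [
    ("ldarg.0", None),
    ("ldc.i4.0", None),
    ("brtrue.s", "return"),
    ("conv.u8", None),
    ("label", "return"),
    ("ret", None),
]

print(track(method))  # ('mismatch', ['int32'], ['int64'])
```

Inserting a conv.u4 changes nothing in this model (int32 stays int32), but for the real JIT it at least makes the fall-through type explicit at the merge point.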
Below is a sample console app targeting the netcoreapp2.2 runtime which reproduces the behavior.
.assembly extern System.Runtime
{
.publickeytoken = ( B0 3F 5F 7F 11 D5 0A 3A )
.ver 4:2:1:0
}
.assembly extern System.Console
{
.publickeytoken = ( B0 3F 5F 7F 11 D5 0A 3A )
.ver 4:1:1:0
}
.assembly ConsoleApp1
{
// [assembly: CompilationRelaxations(CompilationRelaxations.NoStringInterning)]
.custom instance void [System.Runtime]System.Runtime.CompilerServices.CompilationRelaxationsAttribute::.ctor(valuetype [System.Runtime]System.Runtime.CompilerServices.CompilationRelaxations) = (
01 00 08 00 00 00 00 00 )
// [assembly: RuntimeCompatibility(WrapNonExceptionThrows = true)]
.custom instance void [System.Runtime]System.Runtime.CompilerServices.RuntimeCompatibilityAttribute::.ctor() = (
01 00 01 00 54 02 16 57 72 61 70 4e 6f 6e 45 78
63 65 70 74 69 6f 6e 54 68 72 6f 77 73 01 )
// [assembly: Debuggable(DebuggingModes.IgnoreSymbolStoreSequencePoints)]
.custom instance void [System.Runtime]System.Diagnostics.DebuggableAttribute::.ctor(valuetype [System.Runtime]System.Diagnostics.DebuggableAttribute/DebuggingModes) = (
01 00 02 00 00 00 00 00 )
// [assembly: TargetFramework(".NETCoreApp,Version=v2.2", FrameworkDisplayName = "")]
.custom instance void [System.Runtime]System.Runtime.Versioning.TargetFrameworkAttribute::.ctor(string) = (
01 00 18 2e 4e 45 54 43 6f 72 65 41 70 70 2c 56
65 72 73 69 6f 6e 3d 76 32 2e 32 01 00 54 0e 14
46 72 61 6d 65 77 6f 72 6b 44 69 73 70 6c 61 79
4e 61 6d 65 00 )
.hash algorithm 0x00008004 // SHA1
.ver 1:0:0:0
}
.module ConsoleApp1.dll
.imagebase 0x10000000
.file alignment 0x00000200
.stackreserve 0x00100000
.subsystem 0x0003 // IMAGE_SUBSYSTEM_WINDOWS_CUI
.corflags 0x00000001 // COMIMAGE_FLAGS_ILONLY
.class public auto ansi abstract sealed beforefieldinit ConsoleApp1.Program extends [System.Runtime]System.Object
{
.method public hidebysig static int64 ToInt64(uint32 'value') cil managed
{
.maxstack 1
ldarg.0
conv.u8
ret
}
.method public hidebysig static int64 ToInt64_Branch(uint32 'value') cil managed
{
.maxstack 2
ldarg.0
ldc.i4.0
brtrue.s return
conv.u8
return:
ret
}
.method public hidebysig static int64 ToInt64_ConvU4(uint32 'value') cil managed
{
.maxstack 1
ldarg.0
conv.u4
conv.u8
ret
}
.method public hidebysig static int64 ToInt64_Branch_ConvU4(uint32 'value') cil managed
{
.maxstack 2
ldarg.0
ldc.i4.0
brtrue.s return
conv.u4
conv.u8
return:
ret
}
.method public hidebysig static void Main() cil managed
{
.maxstack 2
.entrypoint
ldstr "ToInt64(uint.MaxValue): {0}"
ldc.i4.m1
call int64 ConsoleApp1.Program::ToInt64(uint32)
box [System.Runtime]System.Int64
call void [System.Console]System.Console::WriteLine(string, object)
ldstr "ToInt64_Branch(uint.MaxValue): {0}"
ldc.i4.m1
call int64 ConsoleApp1.Program::ToInt64_Branch(uint32)
box [System.Runtime]System.Int64
call void [System.Console]System.Console::WriteLine(string, object)
ldstr "ToInt64_ConvU4(uint.MaxValue): {0}"
ldc.i4.m1
call int64 ConsoleApp1.Program::ToInt64_ConvU4(uint32)
box [System.Runtime]System.Int64
call void [System.Console]System.Console::WriteLine(string, object)
ldstr "ToInt64_Branch_ConvU4(uint.MaxValue): {0}"
ldc.i4.m1
call int64 ConsoleApp1.Program::ToInt64_Branch_ConvU4(uint32)
box [System.Runtime]System.Int64
call void [System.Console]System.Console::WriteLine(string, object)
ret
}
}
When compiled using Microsoft.NETCore.ILAsm with the Release configuration and run, it outputs the following to the console:
ToInt64(uint.MaxValue): 4294967295
ToInt64_Branch(uint.MaxValue): -1
ToInt64_ConvU4(uint.MaxValue): 4294967295
ToInt64_Branch_ConvU4(uint.MaxValue): 4294967295
You should have checked section III.3.18 (brtrue):

Verifiable code requires the type-consistency of the stack, locals and arguments for every possible path to the destination instruction

And then:

The operation of CIL sequences that meet the correctness requirements, but which are not verifiable, might violate type safety

As you yourself have observed, you're not ensuring the integrity of the stack on every path that reaches the return: label.