IL Instructions not exposed by C#

What IL instructions are not exposed by C#?

I'm referring to instructions like sizeof and cpblk - there's no class or command that executes these instructions (sizeof in C# is computed at compile time, not at runtime AFAIK).

Others?

EDIT: The reason I'm asking this (and hopefully this will make my question a little more valid) is because I'm working on a small library which will provide the functionality of these instructions. sizeof and cpblk are already implemented - I wanted to know what others I may have missed before moving on.

EDIT2: Using Eric's answer, I've compiled a list of instructions:

  • Break
  • Jmp
  • Calli
  • Cpobj
  • Ckfinite
  • Prefix[1-7]
  • Prefixref
  • Endfilter
  • Unaligned
  • Tailcall
  • Cpblk
  • Initblk

There were a number of other instructions which were not included in the list, which I'm separating because they're basically shortcuts for other instructions (compressed to save time and space):

  • Ldarg[0-3]
  • Ldloc[0-3]
  • Stloc[0-3]
  • Ldc_[I4_[M1/S/0-8]/I8/R4/R8]
  • Ldind_[I1/U1/I2/U2/I4/U4/I8/R4/R8]
  • Stind_[I1/I2/I4/I8/R4/R8]
  • Conv_[I1/I2/I4/I8/R4/R8/U4/U8/U2/U1]
  • Conv_Ovf_[I1/I2/I4/I8/U1/U2/U4/U8]
  • Conv_Ovf_[I1/I2/I4/I8/U1/U2/U4/U8]_Un
  • Ldelem_[I1/I2/I4/I8/U1/U2/U4/R4/R8]
  • Stelem_[I1/I2/I4/I8/R4/R8]
YellPika asked Aug 18 '11 16:08

2 Answers

I'm referring to instructions like sizeof and cpblk - there's no class or command that executes these instructions (sizeof in C# is computed at compile time, not at runtime AFAIK).

This is incorrect. sizeof(int) will be treated as the compile-time constant 4, of course, but there are plenty of situations (all in unsafe code) where the compiler relies upon the runtime to determine the memory size of a structure. Consider, for example, a structure that contains two pointers. It would be 8 bytes on a 32-bit machine but 16 bytes on a 64-bit machine. In those circumstances the compiler will generate the sizeof opcode.
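A minimal sketch of the two-pointer case described above. The names TwoPointers and SizeofDemo are made up for illustration, and the code assumes it is compiled with unsafe code enabled (for example the /unsafe compiler switch):

```csharp
using System;

// Hypothetical struct for illustration: two raw pointers, so its size
// depends on the pointer width of the process it runs in.
unsafe struct TwoPointers
{
    public void* First;
    public void* Second;
}

static class SizeofDemo
{
    // sizeof on a user-defined struct requires an unsafe context. The result
    // cannot be a single compile-time constant, because it differs between
    // 32-bit and 64-bit processes.
    public static unsafe int SizeOfTwoPointers()
    {
        return sizeof(TwoPointers);
    }
}
```

Calling SizeofDemo.SizeOfTwoPointers() should give the same value as 2 * IntPtr.Size in whichever process runs it.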

Others?

I don't have a list of all the opcodes we don't produce -- I have never had a need to build such a list. However, off the top of my head I can tell you that there is no way to generate a "call indirect" (calli) instruction in C#; we are occasionally asked for that feature as it would improve performance of certain interop scenarios.

UPDATE: I just grepped the source code to produce a list of opcodes we definitely do produce. They are:

add
add_ovf
add_ovf_un
and
arglist
beq
beq_s
bge
bge_s
bge_un
bge_un_s
bgt
bgt_s
bgt_un
bgt_un_s
ble
ble_s
ble_un
ble_un_s
blt
blt_s
blt_un
blt_un_s
bne_un
bne_un_s
box
br
br_s
brfalse
brfalse_s
brtrue
brtrue_s
call
callvirt
castclass
ceq
cgt
cgt_un
clt
clt_un
constrained
conv_i
conv_ovf_i
conv_ovf_i_un
conv_ovf_u
conv_ovf_u_un
conv_r
conv_r_un
conv_u
div
div_un
dup
endfinally
initobj
isinst
ldarg
ldarg_
ldarg_s
ldarga
ldarga_s
ldc_i
ldc_r
ldelem
ldelem_i
ldelem_r
ldelem_ref
ldelem_u
ldelema
ldfld
ldflda
ldftn
ldind_i
ldind_r
ldind_ref
ldind_u
ldlen
ldloc
ldloc_
ldloc_s
ldloca
ldloca_s
ldnull
ldobj
ldsfld
ldsflda
ldstr
ldtoken
ldvirtftn
leave
leave_s
localloc
mkrefany
mul
mul_ovf
mul_ovf_un
neg
newarr
newobj
nop
not
or
pop
readonly
refanytype
refanyval
rem
rem_un
ret
rethrow
shl
shr
shr_un
sizeof
starg
starg_s
stelem
stelem_i
stelem_r
stelem_ref
stfld
stind_i
stind_r
stind_ref
stloc
stloc_s
stobj
stsfld
sub
sub_ovf
sub_ovf_un
switch
throw
unbox_any
volatile
xor

I'm not going to guarantee that that's all of them, but that is certainly most of them. You can then compare that against a list of all the opcodes and see what is missing.

Eric Lippert answered Sep 17 '22 12:09


Based on Eric's answer, here are some I have spotted. Where I can see a reason I have indicated it; where I cannot, I freely speculate. Feel free to point out if those speculations are wrong.

Break

Signals the Common Language Infrastructure (CLI) to inform the debugger that a break point has been tripped.

You would do this by calling System.Diagnostics.Debugger.Break(). This appears not to use the instruction directly; instead it uses a BreakInternal() method baked into the CLR.

Cpblk and Cpobj

Cpblk copies a specified number of bytes from a source address to a destination address. Cpobj copies the value type located at the address of an object (type &, * or native int) to the address of the destination object (type &, * or native int).

I presume these were added for C++/CLI (previously Managed C++), but that is purely speculation on my part. They may also be present in certain system calls but not generated normally by the compiler and provide some scope for unsafe fun and games.
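One way a helper library could expose cpblk is to emit it at runtime with System.Reflection.Emit. This is only a sketch under that assumption: the BlockCopier class and CopyBlock delegate are hypothetical names, not the asker's actual library, and the caller is responsible for passing valid, non-overlapping buffers.

```csharp
using System;
using System.Reflection.Emit;

static class BlockCopier
{
    // Hypothetical delegate shape: destination, source, number of bytes.
    public delegate void CopyBlock(IntPtr destination, IntPtr source, uint byteCount);

    public static readonly CopyBlock Copy = Build();

    static CopyBlock Build()
    {
        var method = new DynamicMethod(
            "CopyBlock",
            typeof(void),
            new[] { typeof(IntPtr), typeof(IntPtr), typeof(uint) },
            typeof(BlockCopier).Module,
            true); // skipVisibility

        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0); // destination address
        il.Emit(OpCodes.Ldarg_1); // source address
        il.Emit(OpCodes.Ldarg_2); // byte count
        il.Emit(OpCodes.Cpblk);   // pops all three and copies the bytes
        il.Emit(OpCodes.Ret);

        return (CopyBlock)method.CreateDelegate(typeof(CopyBlock));
    }
}
```

A call such as BlockCopier.Copy(dst, src, 4) would then copy four bytes between two unmanaged buffers, for example ones obtained from Marshal.AllocHGlobal.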

Endfilter

Transfers control from the filter clause of an exception back to the Common Language Infrastructure (CLI) exception handler.

C# did not support exception filtering when this was written (filters arrived later, in C# 6, as catch ... when). The VB compiler doubtless makes use of this though.

Initblk

Initializes a specified block of memory at a specific address to a given size and initial value.

I am going to speculate again that this is potentially useful in unsafe code and C++/CLI.
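For illustration, initblk can be reached from C# by emitting it into a dynamic method. The BlockInitializer class and InitBlock delegate below are hypothetical names for this sketch, and it assumes the caller passes a valid writable address:

```csharp
using System;
using System.Reflection.Emit;

static class BlockInitializer
{
    // Hypothetical delegate shape: start address, fill value, number of bytes.
    public delegate void InitBlock(IntPtr address, byte value, uint byteCount);

    public static readonly InitBlock Fill = Build();

    static InitBlock Build()
    {
        var method = new DynamicMethod(
            "InitBlock",
            typeof(void),
            new[] { typeof(IntPtr), typeof(byte), typeof(uint) },
            typeof(BlockInitializer).Module,
            true); // skipVisibility

        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0); // start address
        il.Emit(OpCodes.Ldarg_1); // byte value to write
        il.Emit(OpCodes.Ldarg_2); // number of bytes
        il.Emit(OpCodes.Initblk); // pops all three and fills the block
        il.Emit(OpCodes.Ret);

        return (InitBlock)method.CreateDelegate(typeof(InitBlock));
    }
}
```

BlockInitializer.Fill(buffer, 0xAB, 4) would set four bytes at buffer to 0xAB, much like memset in C.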

Jmp

Exits current method and jumps to specified method.

I will speculate that this sort of trampolining may be useful to those wanting to avoid tail calls. Perhaps the DLR makes use of it?

Tailcall

Performs a postfixed method call instruction such that the current method's stack frame is removed before the actual call instruction is executed.

Discussed in depth elsewhere; currently the C# compiler does not emit this opcode.

Unaligned

Indicates that an address currently atop the evaluation stack might not be aligned to the natural size of the immediately following ldind, stind, ldfld, stfld, ldobj, stobj, initblk, or cpblk instruction.

C# (and the CLR) makes quite a few guarantees concerning the aligned nature of much of its resulting code and data. It is not surprising that this is not emitted, but I can see why it would be included.

Unbox

Converts the boxed representation of a value type to its unboxed form.

The C# compiler prefers to use the Unbox_Any instruction exclusively for this purpose. Since it was added to the instruction set in the 2.0 release, I presume it makes generics either feasible or much simpler. At that point, using it throughout the code for everything, generics or otherwise, was either safer, simpler or quicker (or some combination of all).


Footnote:

Prefix1, Prefix2, Prefix3, Prefix4, Prefix5, Prefix6, Prefix7, Prefixref

Infrastructure. This is a reserved instruction.

These are not instructions as such. Some IL instructions are longer than others, and the variable-length ones start with a prefix byte that is never valid on its own, which keeps parsing unambiguous. The prefix opcodes are reserved for that purpose so they are not used elsewhere. Doubtless someone implementing a switch-statement-based parser for an IL sequence would appreciate these, since they could trap them and maintain state.
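A rough sketch of how a parser might treat the lead byte. In the current encoding, two-byte opcodes begin with 0xFE (the value reserved as Prefix1), for example cpblk is FE 17. The IlOpcodeReader class is a hypothetical name, and the sketch assumes a byte stream of operand-free opcodes only, which real method bodies are not:

```csharp
using System;
using System.Collections.Generic;

static class IlOpcodeReader
{
    // Lead byte shared by all two-byte opcodes in the current encoding.
    const byte TwoByteLead = 0xFE;

    // Simplifying assumption: no opcode in the stream carries an operand,
    // so every byte is either a one-byte opcode or part of a two-byte one.
    public static List<int> ReadOpcodes(byte[] il)
    {
        var opcodes = new List<int>();
        int i = 0;
        while (i < il.Length)
        {
            int first = il[i++];
            if (first == TwoByteLead)
            {
                if (i >= il.Length)
                    throw new ArgumentException("Stream ends after a lead byte.");
                // Combine lead byte and second byte into one opcode value.
                opcodes.Add((first << 8) | il[i++]);
            }
            else
            {
                opcodes.Add(first);
            }
        }
        return opcodes;
    }
}
```

Given the bytes 00 FE 17 2A (nop, cpblk, ret), the reader yields three opcode values rather than four, because FE is consumed together with the byte that follows it.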

ShuggyCoUk answered Sep 21 '22 12:09