In IL, a label is a name followed by a colon, i.e. ":". It gives us the ability to jump from one part of the code to another, unconditionally. We have constantly been seeing labels in the IL code generated by the disassembler, for example IL_0011 and IL_001d.
The words preceding the colon are labels. In the program given below, we have created a label called a2 in the abc function. The instruction br facilitates jumping to any label in the program, whenever desired.
The function abc demonstrates this concept. In this function, the code bypasses the instruction ldc.i4.s 30. Therefore, the return value is displayed as 20, and not 30. Thus, IL uses the br instruction to jump unconditionally to any part of the code. (The instruction br takes a 4-byte branch offset, whereas br followed by .s, i.e. br.s, takes a 1-byte offset. The same explanation applies to every instruction tagged with .s.)
The br instruction is one of the key pivots on which IL revolves.
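The following is a minimal sketch of such a program, not the original listing. The names abc and a2 come from the text; the assembly name sketch, the class name Sketch and the Main method are assumptions made only so that the file can be assembled with ilasm.

```il
.assembly extern mscorlib {}
.assembly sketch {}

.class private auto ansi Sketch extends [mscorlib]System.Object
{
  .method public static void Main() cil managed
  {
    .entrypoint
    .maxstack 1
    call int32 Sketch::abc()
    call void [mscorlib]System.Console::WriteLine(int32)
    ret
  }

  .method public static int32 abc() cil managed
  {
    .maxstack 1
    ldc.i4.s 20      // push 20
    br a2            // unconditional jump over the next instruction
    ldc.i4.s 30      // bypassed, never executed
  a2:
    ret              // returns the 20 that is still on the stack
  }
}
```

When assembled and run, this sketch displays 20.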
We have initialized the static variable to the value true in our C# program.
• Static variables, if they are fields, are initialized in the static constructor .cctor. This is shown in the above example.
• Local variables, on the other hand, are initialized in the function that they are present in.
Here, surprisingly, the value 1 is placed on the stack in the static constructor using the ldc instruction. Even though the field i has been defined to be of type bool in both C# and IL, there is no sign of the values true or false.
Next, stsfld is used to initialize the static variable i to the value 1, even though the variable is of type bool. This shows that IL supports a data type called bool, but does not recognise the words true and false. Thus, in IL, the bool values true and false are simply aliases for the numbers 1 and 0 respectively.
The keywords true and false are artefacts introduced by C# to make the life of programmers easier. Since IL does not support these artefacts directly, it uses the numbers 1 and 0 instead.
The instruction ldsfld places the value of a static variable on the stack. The brfalse instruction removes the topmost value from the stack and examines it. If it finds the number 0, it interprets it as FALSE; if it finds a non-zero number such as 1, it interprets it as TRUE.
In this example, the value it finds on the stack is 1 or TRUE and hence, it does not jump to the label IL_0011. When disassembling, ildasm gives every label a name beginning with IL_, followed by the offset of the instruction.
The instruction brfalse means "jump to the label if FALSE". This differs from br, which always results in a jump. Thus, brfalse is called a conditional jump instruction.
There is no instruction in IL that provides the functionality of the if statement. The if statement of C# gets converted to branch instructions in IL. None of the assemblers that we have worked with support high-level concepts like the if construct.
It can be appreciated from what we have just learnt that it is imperative to gain mastery over IL. 
This will help one to differentiate between the concepts that are a part of IL and those that have been introduced by the designers of the programming languages.
It is significant to note that a .NET programming language can only offer features that can ultimately be expressed in IL. Thus, the importance of familiarising oneself with the various concepts that IL supports cannot be overemphasised.
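The pattern described above can be sketched as follows. This is an illustrative reconstruction rather than the disassembler's actual output: the field i is taken from the text, while the assembly name sketch, the class name Sketch, the label skip and the printed string are assumptions.

```il
.assembly extern mscorlib {}
.assembly sketch {}

.class private auto ansi Sketch extends [mscorlib]System.Object
{
  .field public static bool i

  .method private hidebysig specialname rtspecialname static
          void .cctor() cil managed
  {
    .maxstack 1
    ldc.i4.1                  // "true" is simply the number 1
    stsfld bool Sketch::i     // initialize the static field
    ret
  }

  .method public static void Main() cil managed
  {
    .entrypoint
    .maxstack 1
    ldsfld bool Sketch::i     // push the value of the field
    brfalse.s skip            // jump only if the value is 0
    ldstr "hi"
    call void [mscorlib]System.Console::WriteLine(string)
  skip:
    ret
  }
}
```

Since the field holds 1, brfalse.s does not jump and the sketch prints hi.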
An if-else statement is extremely simple to comprehend in a programming language, but it is rather baffling in IL. IL checks whether the value on the stack is 1 or 0.
• If the value on the stack is 1, as in this case, it calls the WriteLine function with the parameter "hi", and then jumps to the label IL_001d using the unconditional jump instruction br.
• If the value on the stack is 0, the code jumps to IL_0013 and the WriteLine function prints false.
Thus, to implement an if-else construct in IL, a conditional and an unconditional jump are required. The complexity of the IL code increases dramatically if we use multiple if-else statements.
You can now appreciate the intelligence level of the people who write compilers.
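A hand-written sketch of the if-else shape described above is given here. The label names else_part and done are assumptions standing in for the IL_ labels of the disassembly, and the constant 1 stands in for the value of the bool field.

```il
.assembly extern mscorlib {}
.assembly sketch {}

.class private auto ansi Sketch extends [mscorlib]System.Object
{
  .method public static void Main() cil managed
  {
    .entrypoint
    .maxstack 1
    ldc.i4.1                  // the condition; 1 stands in for the field's value
    brfalse.s else_part       // conditional jump to the else part
    ldstr "hi"
    call void [mscorlib]System.Console::WriteLine(string)
    br.s done                 // unconditional jump over the else part
  else_part:
    ldstr "false"
    call void [mscorlib]System.Console::WriteLine(string)
  done:
    ret
  }
}
```

With the condition set to 1 the sketch prints hi; changing ldc.i4.1 to ldc.i4.0 makes it print false instead.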
The C# programming language can complicate life. In an inner set of braces, we cannot create a variable that has already been created in an outer set. The above C# program is syntactically correct since the braces are at the same level.
In IL, life is comparatively hassle free. The two i's become two separate variables, V_0 and V_1. Thus, IL does not impose any of these restrictions on variables.
On seeing the disassembled code, you will comprehend why programmers do not write IL code for a living. Even a simple while loop gets converted into IL code of stupendous complexity.
For a while construct, an unconditional jump is first made to the label IL_000c, which is at the end of the function. Here, the value of the static variable i is loaded on the stack.
The next instruction, brtrue, does the reverse of what the instruction brfalse does. It is implemented as follows:
• If the uppermost value on the stack, i.e. the value of the field i, is 1 (or any other non-zero value), it jumps to label IL_0002. Then the value "hi" is put on the stack and the WriteLine function is called.
• If the stack value is 0, the program will move on to the ret instruction.
The above program, as you may have noticed, does not intend to stop. It continues to flow like a perennial stream of water originating from a gigantic glacier.
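The loop layout described above can be sketched as shown here. Unlike the program in the text, this variation clears the field inside the loop so that it terminates after one pass; that change, the class name Sketch and the labels body and check are assumptions made for illustration.

```il
.assembly extern mscorlib {}
.assembly sketch {}

.class private auto ansi Sketch extends [mscorlib]System.Object
{
  .field public static bool i

  .method public static void Main() cil managed
  {
    .entrypoint
    .maxstack 1
    ldc.i4.1
    stsfld bool Sketch::i     // i = true
    br.s check                // jump straight to the condition
  body:
    ldstr "hi"
    call void [mscorlib]System.Console::WriteLine(string)
    ldc.i4.0
    stsfld bool Sketch::i     // assumption: cleared so the sketch stops
  check:
    ldsfld bool Sketch::i
    brtrue.s body             // loop again while i is non-zero
    ret
  }
}
```

This sketch prints hi once and then returns. Removing the two lines that store 0 in the field gives the endless loop described in the text.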
IL does not have an operator for adding two numbers. The add instruction has to be used instead.
The add instruction requires the two numbers that are to be added to be available on the stack first. Therefore, the ldsfld instruction places the value of the static variable i on the stack, and ldc places the constant value 3 on the stack. The add instruction then adds them up and places the resultant sum on the stack. It also removes the two numbers that were used in the addition from the stack.
Most instructions in IL remove from the stack the parameters that they operate upon, once the instruction has been executed.
The instruction stsfld is used to store the resultant sum of the addition in the static variable i. The rest of the code simply displays the value of the variable i.
There is no equivalent for the ++ operator in IL. It gets converted to the instruction ldc.i4.1 followed by add, i.e. the constant 1 is added to the variable. In the same vein, to multiply two numbers, the mul instruction is used; to subtract, sub is used, and so on. They all have their equivalents in IL. The code following it remains the same.
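A small sketch of the addition and of the ++ operator is given below. The starting value 6 of the field i and the class name Sketch are assumptions; the text does not state the original value.

```il
.assembly extern mscorlib {}
.assembly sketch {}

.class private auto ansi Sketch extends [mscorlib]System.Object
{
  .field public static int32 i

  .method public static void Main() cil managed
  {
    .entrypoint
    .maxstack 2
    ldc.i4.6
    stsfld int32 Sketch::i    // i = 6 (assumed starting value)

    ldsfld int32 Sketch::i    // push i
    ldc.i4.3                  // push the constant 3
    add                       // pop both, push the sum 9
    stsfld int32 Sketch::i    // i = 9

    ldsfld int32 Sketch::i    // i++ : push i,
    ldc.i4.1                  //       push 1,
    add                       //       add them
    stsfld int32 Sketch::i    // i = 10

    ldsfld int32 Sketch::i
    call void [mscorlib]System.Console::WriteLine(int32)
    ret
  }
}
```

With these assumed values, the sketch displays 10.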
We shall now delve into how IL handles the comparison operators. Let us consider the statement j > 16 in C#. IL first pushes the value of j on the stack, followed by the constant value 16. It then executes the instruction cgt, which is being introduced for the first time in our source code. This instruction checks if the value pushed first is larger than the value pushed second. If so, it puts the value 1 (TRUE) on the stack, or else it puts the value 0 (FALSE) on the stack. This value is then stored in the variable i. Using the WriteLine function, a bool output is produced, hence we see True displayed.
In the same vein, the < operator gets converted to the instruction clt, which checks if the value pushed first is smaller than the value pushed second. Thus, we can see that IL has its own set of logical instructions to handle the basic logical operations internally.
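As a minimal sketch, the two comparisons can be tried out directly with constants. The value 19 for j and the class name Sketch are assumptions made for illustration.

```il
.assembly extern mscorlib {}
.assembly sketch {}

.class private auto ansi Sketch extends [mscorlib]System.Object
{
  .method public static void Main() cil managed
  {
    .entrypoint
    .maxstack 2
    ldc.i4.s 19               // j (assumed value)
    ldc.i4.s 16
    cgt                       // 19 > 16, so 1 is pushed
    call void [mscorlib]System.Console::WriteLine(bool)

    ldc.i4.s 19
    ldc.i4.s 16
    clt                       // 19 < 16 does not hold, so 0 is pushed
    call void [mscorlib]System.Console::WriteLine(bool)
    ret
  }
}
```

The sketch prints True followed by False.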
The operator == is the EQUALITY operator. It also needs the two operands that are to be checked for equality to be placed on the stack. It thereafter uses the ceq instruction to check for equality. If they are equal, it places the value 1 (TRUE) on the stack, and if they are not equal, it places the value 0 (FALSE) on the stack. The ceq instruction is an integral part of the logical instruction set of IL.
The implementation of the "less than or equal to" (i.e. <=) and the "greater than or equal to" (i.e. >=) operators is a little more complex. They both actually have two conditions rolled into one.
In the case of <=, IL first uses the cgt instruction to check if the first number is greater than the second one. If so, it returns the value 1, or else it returns the value 0. Next, the value 0 is placed on the stack, and the ceq instruction compares the result of cgt with this 0. This reverses the result of cgt: a <= b is evaluated as "not (a > b)". The >= operator is handled in the same manner, using clt in place of cgt.
Let us try to decipher the above IL code from a slightly different perspective. We are comparing the value 19 with 16. In this case, the instruction cgt will put the value 1 on the stack, since 19 is greater than 16. The value 0 is put on the stack using the instruction ldc. The ceq will compare the value 1 returned by the instruction cgt and the value 0 that was put on the stack by the instruction ldc. Since these two values are not equal, ceq will return 0 or FALSE on the stack.
Let us change the value of the field j in the static constructor to 1. Now, since the number 1 is not greater than 16, the cgt instruction will place the value FALSE or 0 on the stack. Thereafter, another 0 is placed on the stack by the ldc instruction. Now, when the instruction ceq compares the two values, since they are both 0, it returns TRUE.
Now, if we change the value of j to 16, the cgt instruction will return a FALSE because 16 is not greater than 16. Thereafter, since the value 0 is placed on the stack by the instruction ldc, both the values passed to the instruction ceq will be 0. Since a 0 is equal to a 0, the value returned will be 1 or TRUE.
If you have not understood the above explanation, remove the lines ldc.i4.0 and ceq from the source code and observe the output.
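The three cases discussed above (19, 1 and 16) can be checked with a small sketch. The helper method le16, its parameter name and the class name Sketch are hypothetical; the instruction sequence cgt, ldc.i4.0, ceq is the one described in the text.

```il
.assembly extern mscorlib {}
.assembly sketch {}

.class private auto ansi Sketch extends [mscorlib]System.Object
{
  // hypothetical helper: returns j <= 16
  .method public static bool le16(int32 j) cil managed
  {
    .maxstack 2
    ldarg.0
    ldc.i4.s 16
    cgt                       // 1 if j > 16, else 0
    ldc.i4.0
    ceq                       // compare with 0, which reverses the result
    ret
  }

  .method public static void Main() cil managed
  {
    .entrypoint
    .maxstack 1
    ldc.i4.s 19
    call bool Sketch::le16(int32)
    call void [mscorlib]System.Console::WriteLine(bool)
    ldc.i4.1
    call bool Sketch::le16(int32)
    call void [mscorlib]System.Console::WriteLine(bool)
    ldc.i4.s 16
    call bool Sketch::le16(int32)
    call void [mscorlib]System.Console::WriteLine(bool)
    ret
  }
}
```

The sketch prints False, True and True, matching the three cases worked through above.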
The "not equal to" operator, i.e. !=, is the reverse of ==. It uses two ceq instructions. The first ceq instruction is used to check whether the values on the stack are equal. If they are equal, it returns TRUE; if they are not equal, it returns FALSE.
The second ceq compares the result of the earlier ceq with a FALSE. If the result of the first ceq is TRUE, the final answer is FALSE, and vice versa.
This is truly an ingenious way of negating a value!
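The double ceq can be sketched as follows. The helper method ne, its parameter names and the sample values are assumptions made for illustration.

```il
.assembly extern mscorlib {}
.assembly sketch {}

.class private auto ansi Sketch extends [mscorlib]System.Object
{
  // hypothetical helper: returns a != b
  .method public static bool ne(int32 a, int32 b) cil managed
  {
    .maxstack 2
    ldarg.0
    ldarg.1
    ceq                       // 1 if a == b, else 0
    ldc.i4.0
    ceq                       // compare with FALSE to negate the result
    ret
  }

  .method public static void Main() cil managed
  {
    .entrypoint
    .maxstack 2
    ldc.i4.s 19
    ldc.i4.s 16
    call bool Sketch::ne(int32, int32)
    call void [mscorlib]System.Console::WriteLine(bool)
    ldc.i4.s 16
    ldc.i4.s 16
    call bool Sketch::ne(int32, int32)
    call void [mscorlib]System.Console::WriteLine(bool)
    ret
  }
}
```

The sketch prints True for 19 and 16, and False for 16 and 16.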
We shall now refocus on the while loop after the slight digression into conditional statements. This diversion was essential because we use conditional statements in loops such as the while loop. A while loop containing a condition is slightly complex.
Let us go straight to label IL_0018, which is at the end of the zzz function in the IL code. The condition is present here. The value of i (i.e. 1) is placed on the stack. Next, the constant 2 is placed on the stack.
If you revisit the C# code, the condition in the while statement is i <= 2. The instruction ble.s combines the work of the two instructions cgt and brfalse. This instruction checks whether the first value, i.e. the variable i, is less than or equal to the second. If so, it instructs the program to jump to the label IL_0002. If not, the program moves to the next instruction.
Thus, instructions like ble make our life simpler because we do not have to use the instructions cgt and brfalse anymore.
In C#, the condition of a while construct is present at the top, but in IL the code for the condition is present at the bottom. On conversion to IL, the code to be executed for the duration of the while construct is placed above the code for the condition.
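The layout of such a loop can be sketched by hand as follows. Here i is assumed to be a local variable V_0, and the labels body and check stand in for the IL_ labels of the disassembly; these names, like the class name Sketch, are assumptions.

```il
.assembly extern mscorlib {}
.assembly sketch {}

.class private auto ansi Sketch extends [mscorlib]System.Object
{
  .method public static void Main() cil managed
  {
    .entrypoint
    .maxstack 2
    .locals init (int32 V_0)
    ldc.i4.1
    stloc.0                   // i = 1
    br.s check                // the condition sits at the bottom
  body:
    ldloc.0
    call void [mscorlib]System.Console::WriteLine(int32)
    ldloc.0
    ldc.i4.1
    add
    stloc.0                   // i++
  check:
    ldloc.0
    ldc.i4.2
    ble.s body                // jump back while i <= 2
    ret
  }
}
```

The sketch prints 1 and 2, and then falls through to ret once i becomes 3.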
It has been oft repeated that the while and the for constructs provide the same functionality, and can be interchanged.
In the for loop, the code up to the first semicolon is to be executed only once. Hence, the initialisation of the variable i is placed outside the loop. Then, we unconditionally jump to label IL_001e to check whether the value of i is less than 2 or not. If TRUE, the code jumps to label IL_0008, which is the beginning of the code of the for statement.
The value of i is printed using the WriteLine function. Thereafter, the value of the variable i is increased by one and the condition is checked once again.
The difference between a do while and a while in a C# program lies in the position at which the condition gets checked.
• In a do while, the condition gets checked at the end of the loop. This means that the code contained in it will get called at least once.
• In a while, the condition is checked at the beginning of the loop. Hence, the code may never get executed.
In either case, we place the value 1 on the stack and initialise the variable i or V_1.
• In the while loop, we first jump to label IL_000e, where the condition checked is whether the variable is "less than or equal to 2". If TRUE, we jump to label IL_0004.
• In the do while loop, first the Write function is called and then the rest of the code contained in the {} braces is executed. On reaching the last line of the code within the braces, the condition is checked.
Thus, it is easier to write a do-while loop in IL than a while loop, since the condition is a simple check at the end of the loop.
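A do-while loop can be sketched in the same style. The only structural difference from a while loop is that there is no initial jump to the condition. The local V_0, the label body and the class name Sketch are assumptions, and WriteLine is used here for readable output.

```il
.assembly extern mscorlib {}
.assembly sketch {}

.class private auto ansi Sketch extends [mscorlib]System.Object
{
  .method public static void Main() cil managed
  {
    .entrypoint
    .maxstack 2
    .locals init (int32 V_0)
    ldc.i4.1
    stloc.0                   // i = 1
  body:                       // no jump to the condition: the body runs first
    ldloc.0
    call void [mscorlib]System.Console::WriteLine(int32)
    ldloc.0
    ldc.i4.1
    add
    stloc.0                   // i++
    ldloc.0
    ldc.i4.2
    ble.s body                // condition checked at the end of the loop
    ret
  }
}
```

The sketch prints 1 and 2. If the initial value were larger than 2, the body would still run once before the condition fails.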
A break statement facilitates an exit from a for loop, while loop, do-while loop etc.
As usual, we jump to the label IL_0014, where the value of variable V_0 or i is placed on the stack. Then, we place the condition value 10 on the stack and check whether i is less than or equal to 10, using the instruction ble.s.
If so, we get into the loop at label IL_0004. We again place the value of the variable i on the stack and place the value 2 of the if statement on the stack. Then, we use the bne.un instruction, which is a combination of the ceq and the brfalse instructions.
If the variable V_0 is equal to 2, the break statement ensures an exit from the loop by jumping to the ret statement at label IL_0019 using the instruction br.s.
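The jumps involved in a break can be sketched as follows. The loop is assumed to start at 1, and the labels body, next, check and done are hypothetical stand-ins for the IL_ labels mentioned above.

```il
.assembly extern mscorlib {}
.assembly sketch {}

.class private auto ansi Sketch extends [mscorlib]System.Object
{
  .method public static void Main() cil managed
  {
    .entrypoint
    .maxstack 2
    .locals init (int32 V_0)
    ldc.i4.1
    stloc.0                   // i = 1 (assumed starting value)
    br.s check
  body:
    ldloc.0
    ldc.i4.2
    bne.un.s next             // skip the break unless i == 2
    br.s done                 // break: leave the loop
  next:
    ldloc.0
    call void [mscorlib]System.Console::WriteLine(int32)
    ldloc.0
    ldc.i4.1
    add
    stloc.0                   // i++
  check:
    ldloc.0
    ldc.i4.s 10
    ble.s body                // loop while i <= 10
  done:
    ret
  }
}
```

The sketch prints only 1, because the loop is abandoned as soon as i becomes 2.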
A continue statement takes control to the end of the for loop. When the if statement results in true, the program will jump to the end of the loop, bypassing the WriteLine function. The code will then resume execution at label IL_0010, where the value of the variable V_0 is incremented by 1.
The main difference between the break and the continue statements is as follows:
• In a break statement, the program jumps out of the loop.
• In a continue statement, the program jumps to the end of the loop, bypassing the remaining statements.
A goto statement could also have been used to achieve the same functionality. Thus, the break, continue and goto statements, on conversion to IL, are all transformed into the same br instruction.
The program demonstrates that a goto statement of C# is simply translated into a br instruction in IL.
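For comparison with break, a continue can be sketched like this. The loop bounds 1 to 3 and the labels body, print, step and check are assumptions. The only change from a break is the target of the br.s instruction, which now points at the increment instead of the end of the function.

```il
.assembly extern mscorlib {}
.assembly sketch {}

.class private auto ansi Sketch extends [mscorlib]System.Object
{
  .method public static void Main() cil managed
  {
    .entrypoint
    .maxstack 2
    .locals init (int32 V_0)
    ldc.i4.1
    stloc.0                   // i = 1 (assumed starting value)
    br.s check
  body:
    ldloc.0
    ldc.i4.2
    bne.un.s print            // skip the continue unless i == 2
    br.s step                 // continue: jump to the increment
  print:
    ldloc.0
    call void [mscorlib]System.Console::WriteLine(int32)
  step:
    ldloc.0
    ldc.i4.1
    add
    stloc.0                   // i++
  check:
    ldloc.0
    ldc.i4.3
    ble.s body                // loop while i <= 3
    ret
  }
}
```

The sketch prints 1 and 3; the value 2 is skipped.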
Using a goto is considered inappropriate in languages like C#, but its equivalent in IL, the br instruction, is extensively utilised for implementing various constructs like the if statement, loops etc. Thus, what is taboo in a programming language is extremely useful in IL.
This example illustrates a for statement. We have created a variable j in the function Main and a variable i in the for statement. This variable i is visible only in the for loop in C#. Thus, this variable has a limited scope.
But on conversion to IL, all the variables are given the same scope. This is because the concept of variable scoping is alien to IL. Therefore, it is up to the C# compiler to enforce the rules of variable scoping. We can therefore conclude that all the local variables of a function have the same scope or visibility in IL.