Introduction and Series Recap
- In Part 1, you looked at the basic structure of code generated surrounding a function call.
- In Part 2, you looked at calling conventions and studied the code generated for two popular calling conventions, __stdcall and __cdecl.
In this article, you will look at local variables and how they come into play.
Typical functions have the following: code, arguments, local variables, and return values. Although the code for the function is compiled in and is unchangeable at runtime, the rest are all dynamic entities and as such are all occupants of the stack for a thread.
- Arguments for the function: As seen in Parts 1 and 2, the arguments to a function are passed in via the stack.
- Return location to the caller code: As seen in Parts 1 and 2, the return location is pushed on the stack when a call is executed so when the function does return from its body, the processor knows where to continue code execution.
- Local variables used by the function: You haven't seen this yet, but suffice it to say for now that these also go on the stack. The simple reason is that a local variable is a dynamic entity that comes into existence and has meaning only when the function is called.
- Temporary data used by the compiler: In addition to variables allocated explicitly in code, the compiler itself may need some wiggle room for its temporary values. For example, the compiler might need to use a particular register for some operation. So, it would push the current contents of the register onto the stack for safekeeping, perform the operation, and pop the value back off the stack to restore the register.
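To make the point about local variables concrete, here is a minimal C sketch. The function name each_call_has_own_local and the constant MAX_LEVEL are invented for this illustration. It recurses a few levels and checks that every call that is still active owns a separate copy of its local variable, which is what you would expect if locals come into existence per call.

```c
#include <stddef.h>

#define MAX_LEVEL 3

/* Hypothetical demo function. Each active call gets its own `local`.
   `caller_local` points at the copy owned by the call one level up
   (NULL for the outermost call). Returns 1 if every level had
   separate storage and every caller's copy was left untouched. */
int each_call_has_own_local(int level, const int *caller_local)
{
    int local = level * 10;                   /* created when this call starts */
    int separate = (caller_local != &local);  /* both objects are alive here */

    if (caller_local != NULL)                 /* the caller's copy keeps its value */
        separate = separate && (*caller_local == (level - 1) * 10);

    if (level < MAX_LEVEL)                    /* go one call deeper */
        separate = separate && each_call_has_own_local(level + 1, &local);

    return separate;
}
```

Calling each_call_has_own_local(0, NULL) should return 1, because two objects that are alive at the same time cannot share an address.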
A stack frame is a block of the stack that contains the data required by a function. Each function call results in a new block of stack memory being set aside and prepared so that the function can do its job. Creating the stack frame is a process in which both caller and callee take part. The caller starts it by pushing the function arguments (note that only the caller knows what to pass in), followed by a call into the function. From this point, the callee code takes over. It further extends the stack frame and prepares the local variables (note that only the callee knows what variables it uses). From that point onwards, the stack frame is set up for the function body to operate on.
Each function call results in the creation of a stack frame (with the minimum being the address to return to). So, if funcA calls funcB and funcB calls funcC, three stack frames are set up, one on top of another. When a function returns, its frame becomes invalid. A well-behaved function acts only on its own stack frame and does not trespass on another's. Once it goes outside its legal bounds, unpredictable and disastrous behavior can occur. Note that when funcA calls into, say, funcB, the stack frame for funcA is frozen: it doesn't grow or shrink anymore because control has been passed to funcB (again, assuming funcB is well behaved and uses only its own stack frame).
However, in the context of funcB, the stack frame can be dynamic; it can grow and shrink depending on what is happening inside the function. In other words, the topmost stack frame is dynamic in nature and can contract and expand as the function executes. The furthest it may contract is back to the stack position it had when the function body started. If it shrinks any further, it is trespassing on the stack frame established by the caller, and the consequences can be bad.
This is pictorially shown below. The scenario is funcA calling into funcB which in turn calls into funcC.
- Block 1 shows the stack frame when the processor EIP is within the body of funcA. Note the solid lower boundary, indicating that anything below it is forbidden as far as funcA goes. Similarly, note the dotted boundary at the top of the frame, indicating that this top edge can expand and contract dynamically as the function executes. Note also that, because this is the topmost function executing in that thread, any memory above it is available for use (of course, within the limits of the stack size for the thread).
- Block 2 shows how the stack frames look when the processor EIP is within the body of funcB. Note that the stack frame for funcA is now frozen in the state it had when it made the call into funcB. This is indicated in red, with a solid boundary at the top indicating that, when funcB does return, the stack pointer MUST be at that point.
- Block 3 shows how the stack frames look when the processor EIP is within the body of funcC. Note that the stack frame boundary of funcB is now at a lower point than it was in block 2. This reflects the fact that, during the execution of funcB, the compiler-generated code might have pushed and popped some data on the stack. So, at the time funcC was called, the stack pointer was at a different location than it was when the block 2 snapshot was taken.
- Blocks 4 and 5 show how the stack frames are invalidated and are restored to their original state when funcC and funcB have just returned respectively.
- In all this, note that the red region, although marked as forbidden, is not really enforced by the compiler or the processor. It is just a boundary that you have to visualize: if for some reason it is trespassed, there is a problem with the code.
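The idea of a frozen caller frame and a boundary that the running function must not cross can be modeled in a few lines of C. This is only a toy model with invented names (ToyStack, toy_enter, run_scenario); a real processor does no such checking, which is exactly why the boundary has to be visualized. The sketch records the stack position at function entry and flags any pop that would go past it.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_SLOTS 32

/* Toy stack that grows toward index 0 (illustrative, not real hardware). */
typedef struct {
    uint32_t slots[STACK_SLOTS];
    size_t   top;         /* index of the newest slot; STACK_SLOTS when empty */
    size_t   floor;       /* value of `top` when the running function was entered */
    int      trespassed;  /* set once a pop reaches into the caller's frame */
} ToyStack;

void toy_init(ToyStack *s)
{
    s->top = STACK_SLOTS;
    s->floor = STACK_SLOTS;
    s->trespassed = 0;
}

void toy_push(ToyStack *s, uint32_t value)
{
    if (s->top > 0)
        s->slots[--s->top] = value;
}

void toy_pop(ToyStack *s)
{
    if (s->top >= s->floor)
        s->trespassed = 1;   /* would shrink past the entry position */
    else
        s->top++;
}

/* Marks the start of a new function body: everything already on the
   stack now belongs to callers and is treated as frozen. */
void toy_enter(ToyStack *s) { s->floor = s->top; }

/* funcA pushes two values and "calls" funcB, which pushes and pops once
   and then performs `extra_pops` additional pops. Returns the trespass flag. */
int run_scenario(int extra_pops)
{
    ToyStack s;
    toy_init(&s);
    toy_push(&s, 10);        /* funcA's data */
    toy_push(&s, 20);
    toy_enter(&s);           /* funcB starts; funcA's frame is frozen */
    toy_push(&s, 30);        /* funcB's frame grows ... */
    toy_pop(&s);             /* ... and shrinks back to its starting point */
    for (int i = 0; i < extra_pops; i++)
        toy_pop(&s);         /* any further pop lands in funcA's frame */
    return s.trespassed;
}
```

With no extra pops, run_scenario reports no trespass; a single extra pop is enough to cross into funcA's frame.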
To summarize, stack frames are set up to support the function bodies. A call to a function is preceded by the caller code pushing the parameters onto the stack, followed by the callee code preparing the stack for its local variables. When the callee code returns, it has to make sure the stack pointer is back at the value it had when the callee was entered. This ensures that the caller's stack frame is restored.
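The bookkeeping in this summary can be traced with plain arithmetic on a stack pointer value. The sketch below is an assumption-laden model, not compiler output: EspTrace and trace_call are made-up names, every argument is taken to be 4 bytes wide, and the stack grows toward lower addresses as on 32-bit x86.

```c
#include <stdint.h>

/* Stack pointer values observed at key moments of one simulated call. */
typedef struct {
    uint32_t before_call;    /* caller's ESP before it pushes anything */
    uint32_t at_entry;       /* ESP on entry to the callee (return address on top) */
    uint32_t before_ret;     /* ESP just before the callee returns */
    uint32_t after_cleanup;  /* ESP once the arguments have been removed */
} EspTrace;

EspTrace trace_call(uint32_t esp, uint32_t arg_count, uint32_t local_bytes)
{
    EspTrace t;
    t.before_call = esp;

    esp -= 4 * arg_count;    /* caller pushes each 4-byte argument */
    esp -= 4;                /* the call instruction pushes the return address */
    t.at_entry = esp;

    esp -= local_bytes;      /* callee reserves space for its locals */
    esp -= 4;                /* a temporary push inside the body ... */
    esp += 4;                /* ... and the matching pop */
    esp += local_bytes;      /* callee releases its locals */
    t.before_ret = esp;

    esp += 4;                /* the return pops the return address */
    esp += 4 * arg_count;    /* arguments removed: by the caller in __cdecl,
                                by the callee's return in __stdcall */
    t.after_cleanup = esp;
    return t;
}
```

For any inputs, at_entry should equal before_ret and before_call should equal after_cleanup, which is the restoration property described above.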
In Parts 1 and 2, you saw how function arguments are passed. These are passed via the stack. To be precise, the caller pushes them onto the stack and then issues a call instruction with the address of the function to be called. If the stack contents are checked right before the function code is executed, you see that the topmost item on the stack is the return address, followed by the function arguments. That was the data from the caller.
How about callee data? Specifically, how about the function's local variables? Their lifetime matches that of the function body, so they are perfect candidates to be allocated on the stack. This is in fact the case: the compiler DOES allocate local variable space right on the stack. This is one of the first things the callee code generated by the compiler will do. It will allocate enough space to accommodate all local variables within the function body. The following figure shows this setup. Note how responsibility for setting up the stack frame is divided between different parts of the code.
The picture above represents the stack frame setup for a function call. It is interesting to note that the return address location acts as a kind of anchor point for accessing function data. To access the arguments, the function body has to traverse down the stack (toward higher addresses) from the location where the return address is stored, and to access the local variables, it has to traverse up the stack (toward lower addresses) from that location. In fact, typical compiler-generated code for the function does exactly this. The compiler dedicates a register called EBP (Base Pointer) to this purpose; another name for it is the frame pointer. Typically, as the first thing in the function body, the compiler-generated code pushes the current EBP value onto the stack and sets EBP to the current ESP. Once this is done, EBP points at the saved EBP value, and the return address sits at EBP+4. So, in any part of the function code, argument 1 is at EBP+8, argument 2 is at EBP+12 (decimal, assuming 4-byte arguments), and local variables are at EBP-n.
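The EBP-relative offsets can be checked on a small simulated machine. What follows is a sketch under stated assumptions, not real x86 emulation: the Machine type, its helpers, simulate_call, and the return address value 0x401000 are all invented for illustration, and arguments are assumed to be 4 bytes wide and pushed right to left.

```c
#include <stdint.h>

#define MEM_WORDS 64

/* Toy 32-bit stack machine, illustrative only. */
typedef struct {
    uint32_t mem[MEM_WORDS];  /* stack memory, one 4-byte slot per element */
    uint32_t esp;             /* byte address of the top of the stack */
    uint32_t ebp;             /* byte address held in the frame pointer */
} Machine;

static uint32_t load(const Machine *m, uint32_t addr) { return m->mem[addr / 4]; }
static void store(Machine *m, uint32_t addr, uint32_t v) { m->mem[addr / 4] = v; }
static void push(Machine *m, uint32_t v) { m->esp -= 4; store(m, m->esp, v); }
static uint32_t pop(Machine *m) { uint32_t v = load(m, m->esp); m->esp += 4; return v; }

/* Simulates a two-argument call whose body adds the arguments through a
   local variable, addressing everything relative to EBP. */
uint32_t simulate_call(uint32_t arg1, uint32_t arg2)
{
    Machine m = { .esp = MEM_WORDS * 4, .ebp = 0 };

    /* Caller side */
    push(&m, arg2);                        /* arguments go on right to left */
    push(&m, arg1);
    push(&m, 0x401000u);                   /* made-up return address */

    /* Callee prologue */
    push(&m, m.ebp);                       /* push ebp */
    m.ebp = m.esp;                         /* mov ebp, esp */
    m.esp -= 4;                            /* sub esp, 4: room for one local */

    /* Function body */
    uint32_t a = load(&m, m.ebp + 8);      /* argument 1 */
    uint32_t b = load(&m, m.ebp + 12);     /* argument 2 */
    store(&m, m.ebp - 4, a + b);           /* local variable at EBP-4 */
    uint32_t result = load(&m, m.ebp - 4);

    /* Callee epilogue */
    m.esp = m.ebp;                         /* mov esp, ebp: discard locals */
    m.ebp = pop(&m);                       /* pop ebp: restore caller's frame pointer */
    (void)pop(&m);                         /* the return consumes the return address */
    return result;
}
```

Here simulate_call(3, 4) reads 3 from EBP+8 and 4 from EBP+12, stores their sum at EBP-4, and returns 7.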
Call Stack Reconstruction Using Frame Pointers
If frame pointers (in other words, a dedicated EBP) are used, there is an interesting side effect. Note that the first thing compiler-generated code does on entering a function is push the current EBP onto the stack. The EBP value that was pushed is in fact the frame pointer of the calling function. Once it is pushed, EBP is set to the frame pointer for the called function. So, if you stop execution in a function and check the EBP value, it is the frame pointer of that function. If you check the contents of the frame pointer location, you find the frame pointer of the previous function (in other words, the calling function). And if you take the value in that location and check what it points to, you find the frame pointer of the function that called that one, and so on. So, by using the EBP values pushed on the stack, you can traverse through all the frames step by step. Interesting, isn't it? At any given time, you can trace back and see what all the frames were, which indirectly means you know what parameters were passed to each of the functions on the way to the point where execution stopped. Take some time to digest this.
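If it helps the digestion, the frame-by-frame traversal can be tried out on a simulated stack. This is a sketch, not a debugger: Machine, FrameInfo, enter, walk_frames, and demo_walk are hypothetical names, each function takes a single 4-byte argument, and the outermost frame is assumed to have saved an EBP of 0 so that the walk knows where to stop.

```c
#include <stdint.h>

#define MEM_WORDS 64

/* Toy 32-bit stack machine, illustrative only. */
typedef struct {
    uint32_t mem[MEM_WORDS];  /* stack memory, one 4-byte slot per element */
    uint32_t esp;             /* byte address of the top of the stack */
    uint32_t ebp;             /* byte address held in the frame pointer */
} Machine;

typedef struct {
    uint32_t frame_ptr;       /* EBP value for this frame */
    uint32_t first_arg;       /* argument found at EBP+8 */
} FrameInfo;

static uint32_t load(const Machine *m, uint32_t addr) { return m->mem[addr / 4]; }

static void push(Machine *m, uint32_t value)
{
    m->esp -= 4;
    m->mem[m->esp / 4] = value;
}

/* Models a one-argument call followed by the callee's prologue. */
static void enter(Machine *m, uint32_t arg)
{
    push(m, arg);             /* caller pushes the argument */
    push(m, 0);               /* return address slot (placeholder value) */
    push(m, m->ebp);          /* push ebp: saves the caller's frame pointer */
    m->ebp = m->esp;          /* mov ebp, esp */
}

/* Follows the chain of saved EBP values from the current frame outward
   and returns the number of frames visited. */
int walk_frames(const Machine *m, FrameInfo *out, int max_frames)
{
    int count = 0;
    uint32_t fp = m->ebp;
    while (fp != 0 && count < max_frames) {
        out[count].frame_ptr = fp;
        out[count].first_arg = load(m, fp + 8);
        fp = load(m, fp);     /* the slot at EBP holds the caller's EBP */
        count++;
    }
    return count;
}

/* Builds funcA(1) -> funcB(2) -> funcC(3), then walks back from funcC. */
int demo_walk(FrameInfo *out, int max_frames)
{
    Machine m = { .esp = MEM_WORDS * 4, .ebp = 0 };
    enter(&m, 1);
    enter(&m, 2);
    enter(&m, 3);
    return walk_frames(&m, out, max_frames);
}
```

The walk visits three frames, innermost first, and recovers the arguments 3, 2, and 1 in that order.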
Now, wouldn't it be cool if you could, in addition to what parameters were passed, know the functions that were called? After all, what good is knowing the parameters if you don't know what functions they are meant for? Well, it turns out that information is available too. Referring to the figure above, you see that the return address into the caller is stored at the frame pointer + 4 bytes. Bingo. You can construct the whole story now. If you stop execution at any time (say, in funcC), you know from the current EBP what function called you (because the return address into funcB was pushed at EBP+4). You also know where the stack frame of the function that called you is (in other words, the frame pointer of funcB is stored at EBP+0). From this, you know the return address that your caller in turn has to return to (in other words, you know that funcB has to return to funcA). From the stack frame of funcB, you know what parameters were passed into funcB, and so on. All this is pictorially shown below: