Building the Right Environment to Support AI, Machine Learning and Deep Learning
In this article series, I would like to reveal the inner workings of function calls. What happens behind the scenes, why the code generated is the way it is, and reveal other interesting tidbits as you go along. The article will be specific to Microsoft compilers and I'll be using Visual Studio 2005, although the concepts might apply well for prior IDEs. Also, all these apply to systems running Intel x86 architecture CPU.
- Basic knowledge of assembly, x86 architecture, and registers is needed. A nice primer can be found here.
- Basic knowledge of C/C++ programming is needed.
- Familiarity with Visual Studio 2005 IDE.
- Enthusiasm and a few mugs of coffee.
Functions, a Simplistic View
I will not explain what a function is. However, I would like to outline the basic structure. A function has a body that contains the code that performs the function (pun unintended). A function can take optional arguments. A function can optionally return something. Although all this seems obvious, it must be noted that when some code wants to use a function, it is basically establishing a contract with the function. It is like saying,
And the function sum will respond like this.
Sum: Okay. I have integer 1 and integer 2. I add them. I have the result. Now, take the result.
What I just described here, in programming jargon, is called the Calling convention. It is the protocol that Me and Sum agree on to communicate with each other. Delving further into this, let me put some technical terms behind this conversation because you will be using those again and again further down.
- Me will be referred to as the caller. It is the piece of code that makes a call to a function.
- Sum will be referred to as the callee. It is the piece of code that represents the function itself.
- Integer 1 and integer 2 will be referred to as arguments.
- Result will be referred to as the return value.
Functions, a Slightly Detailed View
With the technical terms established, it's time to go a little deeper. A PC understands only the language of the processor (machine code). Hence, any code that you write has to be compiled to that friendly format before the processor can execute. This is done by compiling and linking the code to generate an EXE using your favorite development tool like Visual C++ (to keep matters simple, I'll confine this to EXEs). The generated EXE itself can have more than just executable code. It could have data or resources. The .text section in the EXE image contains the executable code. When an EXE is executed, the operating system does the necessary tasks to start executing the machine code that is compiled and present in the .text section. This is called the main thread. An EXE can spawn other threads of its own. However, the main thread is what governs the lifetime of the execution of the EXE. When the main thread has exited, so does the process.
Now, each thread in an EXE is running an independent code path undisturbed by what is happening with other threads. In essence, each thread has what is called a pointer to the next instruction to execute. In x86 land, this is called EIP register. The EIP simply points to a location in the compiled executable that has an instruction for the CPU. The processor simply keeps loading the instructions at the location pointed to by the EIP and executes it. EIP is not explicitly modifiable, but is updated indirectly when one of the following occurs:
- The processor has finished executing an instruction. An instruction can have multiple bytes of operation code. However, the processor knows how many bytes each instruction takes and thus is able to advance the EIP by the right amount after each instruction.
- A call instruction is executed.
- A ret instruction is executed.
What about the data the code operates on? The data can be data that are local to the function body or are outside it. Those that are outside a function body (global variables andstatic variables), in most cases go to specific section in the executable (.data). Any variables local to function bodies are actually created on a dedicated area called a stack. A stack is a memory area reserved by the operating system per thread. The stack expands and shrinks as functions get called and functions return. It is the place where arguments for a function are stored as well as the local variables.
A simple layout of the EXE in memory and has just one thread running is shown below. It is representative of a simple console app with main and sum functions.
When the exe is executed, the operating system loader maps the EXe (and its dependent DLLs) to a 4GB sandbox. During its lifetime, the EXE does everything within this sandbox, having no effect on other running processes. All that happens within this EXE simply references various locations within this 4GB space. The code itself is an area in this 4GB space. The stack, similarly, is an area. So are resources, dynamically allocated memory, and so forth.
This is what happens when the EXE is executed. The Windows loader loads the EXE and maps it to a 4GB sandbox (green area). It then loads any dependent DLLs, and so on. It creates a main thread, reserves a stack for this thread (purple area), and sets the instruction pointer (EIP) to the location of the entry point within the EXE's .text section. The information for the entry point (the code that starts execution of the EXE) information is available in the EXE's header. This is where program execution begins for the main thread. In a simple case, this is simply the location of (address of) the main function, indicated in figure (although, in reality, this could very well be the CRT startup code that initializes the C runtime for the thread before transferring control to main). There is another register used by the processor that points to the stack. This is called ESP. Before beginning main thread execution, the ESP also is set to point to the bottom of the stack (indicated in figure).
When sum is called and execution is within the sum function, the EIP and ESP pointer values would have gotten updated like below. Note the shift in EIP from main to sum, and the shift of ESP down to accommodate arguments passed into sum.