Function Calls, Part 1 (the Basics)

Introduction

In this article series, I would like to reveal the inner workings of function calls: what happens behind the scenes, why the generated code is the way it is, and other interesting tidbits along the way. The article is specific to Microsoft compilers and I'll be using Visual Studio 2005, although the concepts should apply equally to earlier versions of the IDE. Also, everything here applies to systems running an Intel x86 architecture CPU.

Prerequisites

  • Basic knowledge of assembly, x86 architecture, and registers is needed. A nice primer can be found here.
  • Basic knowledge of C/C++ programming is needed.
  • Familiarity with Visual Studio 2005 IDE.
  • Enthusiasm and a few mugs of coffee.

Functions, a Simplistic View

I will not explain what a function is. However, I would like to outline the basic structure. A function has a body that contains the code that performs the function (pun unintended). A function can take optional arguments. A function can optionally return something. Although all this seems obvious, it must be noted that when some code wants to use a function, it is basically establishing a contract with the function. It is like saying,

Me: Hey, function sum! I am going to call you and pass you two integers and you give me back the sum of those integers. Okay?

And the function sum will respond like this.

Sum: Okay. I have integer 1 and integer 2. I add them. I have the result. Now, take the result.

What I just described here is, in programming jargon, called the calling convention. It is the protocol that Me and Sum agree on to communicate with each other. Before delving further into this, let me put some technical terms behind this conversation because you will be using them again and again further down.

  • Me will be referred to as the caller. It is the piece of code that makes a call to a function.
  • Sum will be referred to as the callee. It is the piece of code that represents the function itself.
  • Integer 1 and integer 2 will be referred to as arguments.
  • Result will be referred to as the return value.
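As a minimal sketch of these four terms in C++ (the function name `caller` is made up for illustration and is not taken from the article's project):

```cpp
// Callee: the function being called. It receives two arguments
// and hands back a return value.
int sum(int a, int b)
{
    return a + b;
}

// Caller: the piece of code that makes the call.
// (The name `caller` is hypothetical, chosen only for this sketch.)
int caller()
{
    int result = sum(1, 2);  // 1 and 2 are the arguments
    return result;           // result holds the return value of sum
}
```

Here `caller` plays the part of Me and `sum` plays the part of Sum in the conversation above.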

Functions, a Slightly Detailed View

With the technical terms established, it's time to go a little deeper. A PC understands only the language of the processor (machine code). Hence, any code that you write has to be compiled to that format before the processor can execute it. This is done by compiling and linking the code to generate an EXE using your favorite development tool, such as Visual C++ (to keep matters simple, I'll confine this to EXEs). The generated EXE can contain more than just executable code; it can also have data or resources. The .text section in the EXE image contains the executable code. When an EXE is executed, the operating system does the necessary tasks to start executing the compiled machine code present in the .text section. The thread the operating system creates to run this code is called the main thread. An EXE can spawn other threads of its own. However, in a typical C/C++ program, the main thread governs the lifetime of the EXE: when main returns, the C runtime ends the process, taking any other threads with it.

Now, each thread in an EXE runs an independent code path, undisturbed by what is happening in other threads. In essence, each thread has a pointer to the next instruction to execute. In x86 land, this is the EIP register. The EIP simply points to a location in the compiled executable that holds an instruction for the CPU. The processor keeps loading the instruction at the location pointed to by the EIP and executing it. EIP cannot be written to directly the way a general-purpose register can; it is updated indirectly, for example when one of the following occurs:

  • The processor has finished executing an instruction. An instruction can be several bytes long. However, the processor knows how many bytes each instruction takes and thus is able to advance the EIP by the right amount after each instruction.
  • A call instruction is executed.
  • A ret instruction is executed.

What about the data the code operates on? The data can be local to the function body or live outside it. Data outside a function body (global variables and static variables) in most cases goes to a specific section in the executable (.data). Variables local to function bodies are created in a dedicated area called the stack. A stack is a memory area reserved by the operating system per thread. The stack expands and shrinks as functions get called and return. It is the place where the arguments for a function are stored, as well as its local variables.
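The following sketch makes the distinction visible by recording where a global and two locals end up. All names (`g_value`, `Addresses`, `inspect`, and so on) are hypothetical, and the actual addresses vary from build to build and run to run.

```cpp
#include <cstdint>

int g_value = 42;  // global: stored in a data section of the EXE image

struct Addresses {
    std::uintptr_t global_var;   // address of g_value
    std::uintptr_t outer_local;  // address of a local in the outer function
    std::uintptr_t inner_local;  // address of a local in the nested call
};

static std::uintptr_t as_number(const void* p)
{
    return reinterpret_cast<std::uintptr_t>(p);
}

// Nested call: its local is created on the stack while the caller's
// local is still alive, so the two occupy different stack locations.
static void inner(Addresses& out)
{
    int inner_local = 2;
    out.inner_local = as_number(&inner_local);
}

// Outer function: records the address of the global and of its own
// local, then calls inner() so a second stack location is recorded.
Addresses inspect()
{
    Addresses out{};
    int outer_local = 1;
    out.global_var  = as_number(&g_value);
    out.outer_local = as_number(&outer_local);
    inner(out);
    return out;
}
```

Printing the three fields in a debug build on x86 would typically show the two locals close to each other and far from the global, with the nested local at the lower address because the x86 stack grows toward lower addresses. The language itself does not guarantee that ordering.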

A simple layout of an EXE in memory with just one thread running is shown below. It is representative of a simple console app with main and sum functions.

When the EXE is executed, the operating system loader maps the EXE (and its dependent DLLs) into a 4GB sandbox, the virtual address space of the process. During its lifetime, the EXE does everything within this sandbox, having no effect on other running processes. All that happens within this EXE simply references various locations within this 4GB space. The code itself is an area in this 4GB space. The stack, similarly, is an area. So are resources, dynamically allocated memory, and so forth.

This is what happens when the EXE is executed. The Windows loader loads the EXE and maps it into a 4GB sandbox (green area). It then loads any dependent DLLs, and so on. It creates a main thread, reserves a stack for this thread (purple area), and sets the instruction pointer (EIP) to the location of the entry point within the EXE's .text section. The location of the entry point (the code that starts execution of the EXE) is available in the EXE's header. This is where program execution begins for the main thread. In a simple case, this is the address of the main function, indicated in the figure (although, in reality, it could very well be the CRT startup code that initializes the C runtime before transferring control to main). There is another register used by the processor that points to the stack. This is called ESP. Before main thread execution begins, the ESP is also set to point to the bottom of the stack (indicated in the figure).

When sum is called and execution is within the sum function, the EIP and ESP values will have been updated as shown below. Note the shift in EIP from main to sum, and the shift of ESP down to accommodate the arguments passed into sum.

Functions, a Disassembly View

Enough theory. It's time to see something in action. For the purposes of this article, you will use only the Debug configuration, because the Release configuration optimizes the code so heavily that certain pieces of it can be difficult to decipher.

  • Fire up Visual Studio 2005. Choose Win32 as the project type and choose the Win32 Console Application template. Enter a name, say sum. Click Finish to create the project.
  • Press Alt+F7 to invoke the project properties. Now, to make learning easy, turn off certain settings that cause the compiler to emit some code that will make it harder to understand the core concepts for this article. I will try to address the implications of these settings at a future time.
    • Go to Configuration Properties->C/C++->General. Here, set Debug Information Format to Program Database (/Zi).
    • Go to Configuration Properties->C/C++->Code Generation. Here, set Basic Runtime Checks to Default.
    • Go to Configuration Properties->Linker->General. Here, set Enable Incremental Linking to No.
    • Click OK.
  • Modify the code as shown below and put a breakpoint on line 13 (place the caret on line 13 and press F9 to set a breakpoint):
  • [step1.png]

  • Press F7 to do a build.
  • Press F5 to start debugging. The program execution stops at line 13 now.
  • Press Alt+5. This now brings up the registers window.
  • Press Alt+6. This brings up a memory watch window.
  • Place the caret on line 13, right-click, and choose Go To Disassembly.
  • In the disassembly view, right-click again and make sure you have the following checked:
    • Show Address
    • Show Source Code
    • Show Code Bytes

Now comes the dissection. Refer to the picture below:

[step2.png]

Points to be noted from here:

  • Remember the EIP (instruction pointer) for the thread that I talked about? Note the value of the register in the registers window (circled). It is exactly the address at which execution is paused; the EIP is pointing to the next instruction to be executed.
  • Now, in your debugger, press F10 to execute that instruction. Watch the EIP now; it has been incremented by 2 automatically. Why the magic number 2? It is exactly the number of bytes of machine code the previous instruction used (bytes 6A 02 at 0x00401014). Note that in all this, there was no explicit instruction to update EIP. It was automatic.
  • Step one more time (F10) so you now are pointing to the call 00401000 line. Again note the EIP. It has been incremented by 2 again.
  • Look at the next line of code. A call instruction is basically a call to another location. It transfers the flow of execution to the location specified. In the code above, that location is 0x00401000. Before going further, note the address of the instruction after the call, 0x40101D in the sample above. Once the called function returns, the program flow has to come back to this address.
  • You now are interested in entering the sum function via the call instruction. To do this, you have to press F11 (stepping into a function). Note that as soon as you execute the call instruction to enter sum, the EIP is updated accordingly.
  • The flow is now inside the sum function. Now, in the memory watch window, type in esp for address. By doing so, you are saying that you are interested in looking at the contents of memory location at the value held by register ESP. When I do so, I get a result like what's shown below:
  • [step3.png]

  • Say that the sum function has finished executing; control somehow has to return to address 0x40101D. How does this happen? The answer lies in the contents pointed to by ESP. As soon as a call is made to an address, the processor pushes the return address onto the stack automatically. So, if you inspect the memory at ESP as the first thing on entering the function, the very first DWORD is the address at which execution continues when a ret instruction is issued.
  • Now, continue single stepping and stop just before executing the ret instruction (address 0x0040100A in example above). Where do you expect the execution to continue? You guessed it right. If you check the ESP location contents again like before, you should see value 0x40101D if all is well.
  • Press F10 and immediately control is transferred to address 0x0040101D.
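The address arithmetic in the steps above can be checked with a short sketch. It uses only the addresses quoted in the walkthrough. The second push being a 2-byte `push 1` and the call using the common 5-byte `E8 rel32` encoding are assumptions inferred from the quoted EIP values, and the helper names are made up for illustration.

```cpp
#include <cstdint>

// Value EIP holds after executing an instruction at `address` that is
// `length` bytes long and does not transfer control.
constexpr std::uint32_t next_eip(std::uint32_t address, std::uint32_t length)
{
    return address + length;
}

// A near call encoded as E8 rel32 stores a signed displacement measured
// from the address of the instruction that follows the call.
constexpr std::int32_t call_displacement(std::uint32_t call_address,
                                         std::uint32_t target)
{
    return static_cast<std::int32_t>(
        static_cast<std::int64_t>(target) -
        static_cast<std::int64_t>(next_eip(call_address, 5)));
}

constexpr std::uint32_t kPush2  = 0x00401014;           // 6A 02, from the article
constexpr std::uint32_t kPush1  = next_eip(kPush2, 2);  // assumed 2-byte push
constexpr std::uint32_t kCall   = next_eip(kPush1, 2);  // call 00401000
constexpr std::uint32_t kReturn = next_eip(kCall, 5);   // instruction after the call

static_assert(kCall == 0x00401018, "two 2-byte pushes precede the call");
static_assert(kReturn == 0x0040101D, "matches the return address in the article");
static_assert(call_displacement(kCall, 0x00401000) == -0x1D,
              "target lies 0x1D bytes before the return address");
```

Under these assumptions, the code bytes of the call would read E8 E3 FF FF FF, since -0x1D stored as a little-endian 32-bit value is E3 FF FF FF. Compare this with the Show Code Bytes column in your own disassembly window rather than taking it as given.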

Summary

To summarise, this is what you have learned so far.

  • Each thread has its own instruction pointer value that is always kept current. This is represented by the EIP register.
  • Each thread has its own stack for holding function arguments, local variables, return addresses, and so on. The current top of this stack is tracked by the ESP register.
  • Functions are called by issuing call instructions to the processor.
  • To return from a function, the ret instruction is used.
  • A call instruction implicitly does the following: it pushes the return location (the address of the instruction following the call) onto the stack (pointed to by ESP). It then updates the EIP to the called location, and execution continues from there.
  • A ret instruction implicitly does the following: it pops the DWORD at the location pointed to by ESP into EIP. Execution continues from there with the new value of EIP.
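The last two points can be modelled with a toy sketch. It is not an emulator; it only mimics how push, call, and ret move ESP and EIP. The type `ToyCpu`, the function `replay_call_to_sum`, and the starting ESP value 0x0012FF80 are made up for illustration, and the second push being `push 1` is an assumption.

```cpp
#include <cstdint>
#include <map>

struct ToyCpu {
    std::uint32_t eip;  // address of the next instruction
    std::uint32_t esp;  // address of the top of the stack
    std::map<std::uint32_t, std::uint32_t> memory;  // DWORDs keyed by address

    // push: the x86 stack grows downward, so make room first, then store.
    void push(std::uint32_t value, std::uint32_t length)
    {
        esp -= 4;
        memory[esp] = value;
        eip += length;
    }

    // call: push the address of the following instruction, then jump.
    void call(std::uint32_t target, std::uint32_t length)
    {
        esp -= 4;
        memory[esp] = eip + length;
        eip = target;
    }

    // ret: pop the DWORD at ESP into EIP.
    void ret()
    {
        eip = memory.at(esp);
        esp += 4;
    }
};

// Replays the sequence from the walkthrough, starting at the first push.
ToyCpu replay_call_to_sum()
{
    ToyCpu cpu{0x00401014, 0x0012FF80, {}};  // initial ESP is hypothetical
    cpu.push(2, 2);           // push 2            -> EIP 0x00401016
    cpu.push(1, 2);           // push 1 (assumed)  -> EIP 0x00401018
    cpu.call(0x00401000, 5);  // pushes 0x0040101D -> EIP 0x00401000
    return cpu;
}
```

After `replay_call_to_sum()`, the DWORD at `esp` is 0x0040101D, which mirrors what the memory window shows on entering sum. Calling `ret()` on the result puts that value back into `eip` and moves `esp` up by 4.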

Acknowledgements

My sincere thanks to Paul McKenzie for his input and guidance with this article series.


