Function Calls, Part 2 (Stack and Calling Conventions)

Introduction

In Part 1 of this article series, you looked at the basics of a function call from the perspective of the generated code. In the process, you became familiar with two registers used by processors in the x86 family: EIP (the instruction pointer) and ESP (the stack pointer). In this part, you will take a more detailed look at the stack and its relevance to function calls. Along the way, you will explore the two most commonly used calling conventions in Windows C/C++ programming.

Stack. What Is It?

The stack is a memory area (within the 4GB process address space) dedicated to a thread, in which the thread stores data it needs for execution. The data could be things like local variables used by your code, temporary variables used by the compiler, function arguments, and such. The stack gets its name because, as far as the processor is concerned, it behaves like a stack of cards: when you add an item, it always goes on the top, and when you remove an item, you always remove it from the top. In technical terms, it is a LIFO (Last In, First Out) data structure.

As explained in Part 1, a dedicated stack is made available to each thread by the system. The size of the stack defaults to 1MB, but it can be overridden by the process' image header value if there is one. It can also be overridden by specifying a value explicitly in a call to CreateThread or _beginthreadex APIs.

At any time, the processor needs to know where the top of the stack is. This top of the stack is what the ESP register points to. Unlike the EIP register (which cannot be explicitly modified), the ESP register can be implicitly modified by the processor, or explicitly modified by instructions.

For example,

  • A push instruction implicitly decrements the ESP value by 4 and then stores the specified 32-bit value at the new ESP location. It is like putting a card on the top of a stack of cards. The stack has now grown by 4 bytes.
  • A pop instruction copies the 32 bits at the current ESP location into the target and then implicitly increments the value of ESP by 4. Using the card analogy, it is like removing a card from the top of the stack. The stack has now shrunk by 4 bytes.
  • Instructions such as mov esp, [source] or sub esp, [value] explicitly modify the ESP value, effectively changing the location of the top of the stack. These are explicit because the ESP register is specified in the instruction as the target.
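The push and pop behavior described above can be modeled in a few lines of portable C. This is only a sketch, not how the processor is implemented: SimStack, sim_init, sim_push, and sim_pop are hypothetical names, and the 64-byte array stands in for the thread's real stack.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STACK_BYTES 64u

/* Hypothetical model of a thread's stack: a byte array plus an ESP offset.
   ESP starts at the high end because the x86 stack grows toward lower addresses. */
typedef struct {
    uint8_t  mem[STACK_BYTES];
    uint32_t esp;               /* byte offset of the current top of the stack */
} SimStack;

static void sim_init(SimStack *s)
{
    memset(s->mem, 0, sizeof s->mem);
    s->esp = STACK_BYTES;       /* empty stack: ESP sits just past the array */
}

/* push: decrement ESP by 4, then store the 32-bit value at the new top. */
static void sim_push(SimStack *s, uint32_t value)
{
    assert(s->esp >= 4);
    s->esp -= 4;
    memcpy(&s->mem[s->esp], &value, sizeof value);
}

/* pop: load the 32-bit value at the top, then increment ESP by 4. */
static uint32_t sim_pop(SimStack *s)
{
    uint32_t value;
    assert(s->esp + 4 <= STACK_BYTES);
    memcpy(&value, &s->mem[s->esp], sizeof value);
    s->esp += 4;
    return value;
}
```

Pushing 2 and then 1 moves esp from 64 to 56 in this model, and two pops return 1 and then 2 while esp climbs back to 64, which is the LIFO behavior of the card analogy.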

How Are Parameters Passed to a Function?

Simple answer. Via the stack. The caller of the function knows what to pass; the caller knows how many parameters to pass. So, if a caller calls a function sum that takes two ints as parameters like below,

int sum(int argument1, int argument2)

the caller, in broad terms, does this:

  • It pushes the two parameters onto the stack via two push instructions. As a result, the stack pointer (ESP) is implicitly decremented by 2 * 4 bytes; in other words, the stack has grown by 8 bytes and its top now sits at an address 8 bytes lower.
  • It issues a call instruction to the function's address. As a result of this, the stack pointer (ESP) is now implicitly decremented by another 4 bytes because the act of issuing a call instruction implicitly pushes the location of the instruction following the call onto the stack.

How Does the Function Retrieve the Parameters?

Simple answer again. Via the stack. Once inside the function sum, before any instruction has executed, this is what the stack looks like. The current ESP value (top of the stack) points to the location holding the address at which execution continues after the sum function returns. If you go down the stack by 4 bytes (in other words, the contents at ESP + 4), you find one of the parameters to the sum function. Go down another 4 bytes (in other words, the contents at ESP + 8) and you find the other parameter.

As you see, once inside the function, the code has all that it needs to execute.
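To make the callee's view concrete, here is a small C model of the stack at the moment sum is entered. It is a sketch under stated assumptions: CallFrame, push32, read_at_esp, and enter_sum are hypothetical names, FAKE_RETURN_ADDRESS is a placeholder value, and the arguments are pushed last one first (the order used by the conventions discussed later in this article).

```c
#include <assert.h>
#include <stdint.h>

enum { STACK_WORDS = 16 };

/* Hypothetical stand-in for the address of the instruction after the call. */
#define FAKE_RETURN_ADDRESS 0x0040101Du

/* What the callee sees: stack memory plus the byte offset held in ESP. */
typedef struct {
    uint32_t words[STACK_WORDS];
    uint32_t esp;   /* byte offset; always a multiple of 4 in this model */
} CallFrame;

static void push32(CallFrame *f, uint32_t value)
{
    assert(f->esp >= 4);
    f->esp -= 4;
    f->words[f->esp / 4] = value;
}

/* Read the 32-bit value at ESP + byte_offset, as the callee would. */
static uint32_t read_at_esp(const CallFrame *f, uint32_t byte_offset)
{
    uint32_t index = (f->esp + byte_offset) / 4;
    assert(index < STACK_WORDS);
    return f->words[index];
}

/* Caller side of sum(argument1, argument2): push the arguments (assumed
   order: last one first), then let "call" push the return address. */
static CallFrame enter_sum(uint32_t argument1, uint32_t argument2)
{
    CallFrame f = { {0}, STACK_WORDS * 4 };
    push32(&f, argument2);
    push32(&f, argument1);
    push32(&f, FAKE_RETURN_ADDRESS);   /* done implicitly by the call instruction */
    return f;
}
```

After enter_sum(1, 2), reading at ESP + 0 yields the placeholder return address, ESP + 4 yields 1, and ESP + 8 yields 2.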

Calling Conventions

In this context, I need to talk about calling conventions. A calling convention is the protocol for passing arguments to functions. It is an agreement between the caller and the callee. What you just saw in the "How Are Parameters Passed to a Function?" and "How Does the Function Retrieve the Parameters?" sections is such a protocol. That was a general view. But, if you are working in Microsoft land, there are specific calling conventions. The most notable of those are:

  • __cdecl
  • __stdcall
  • thiscall

In this part, you will dig into the mechanism of the __cdecl and __stdcall calling conventions and learn how the stack and the compiled code look in each case. This will be all hands-on.

__cdecl

The details of the __cdecl calling convention are described in Microsoft's documentation. The ones relevant to the dissection are:

  • Argument-passing order: Right to left
  • Stack-maintenance responsibility: The calling function pops the arguments from the stack

Argument Passing order in __cdecl

Argument-passing order refers to the order in which the arguments are pushed on the stack by the caller. In the case of __cdecl, it is right to left; the last parameter is pushed first, followed by the last but one, and so on until the first parameter is pushed. Once this is done, the call instruction for the function is issued.

From the stack's perspective, it means this: If you look at the stack first thing inside the callee, before any instruction is executed, the 4 bytes at ESP hold the return location, the 4 bytes at (ESP + 4) hold the first parameter, the 4 bytes at (ESP + 8) hold the second parameter, and so on.

  • Fire up Visual Studio 2005. Choose Win32 as the project type and choose the Win32 Console Application template. Enter a name, say sum. Click finish to create the project.
  • Press Alt+F7 to invoke project properties. Now, to make learning easy, turn off certain settings that cause the compiler to emit extra code that would make it harder to follow the core concepts of this article. I will try to address the implications of these settings at a future time.
    • Go to Configuration Properties->C/C++->Advanced. Here, set Calling Convention to __cdecl.
    • Go to Configuration Properties->C/C++->General. Here, set Debug Information Format to Program Database(/Zi).
    • Go to Configuration Properties->C/C++->Code Generation. Here, set Basic Runtime Checks to Default.
    • Go to Configuration Properties->Linker->General. Here, set Enable Incremental Linking to No.
    • Click OK.
  • Modify the code as shown below and put a breakpoint on line 13 (place a caret on line 13 and press F9 to put in a breakpoint):
  • [step1.png]

  • Press F7 to do a build.
  • Press F5 to start debugging. The program execution stops at line 13 now.
  • Press Alt+5. This now brings up the registers window.
  • Press Alt+6. This brings up a memory watch window.
  • Place the caret on line 13, right-click, and choose go to disassembly.
  • In the disassembly view, right-click again and make sure you have the following checked:
    • Show Address
    • Show Source Code
    • Show Code Bytes
    The disassembly is like what you see below:

    [step2.png]

    Note the argument passing. As the calling convention used is __cdecl, you see that parameters are in fact pushed in the right to left order.
  • Immediately following this is the call instruction.
  • Note the value of ESP. Now, execute the first push instruction (in other words, push 2) using F10. Note that the ESP value has decremented by 4.
  • Press F10 again and note that the ESP value is further decremented by 4.
  • Now, press F11 to enter the callee (sum). The disassembly is like below:
  • [step3.png]

  • If you now type ESP into the Memory window's address box, you see that the ESP + 4 location in fact holds Arg1 and ESP + 8 holds Arg2. In other words, the rightmost argument to the function is found at the highest memory address.
  • Press F5 to finish debugging.

Stack-maintenance responsibility in __cdecl

Per the specification, under the __cdecl calling convention, it is the caller's responsibility to maintain the stack. What exactly does this maintenance mean? You know that before calling the function, the caller pushes the arguments onto the stack. The effect is that the stack has grown as a result of calling the function. When the function has finished and returned, the stack pointer has to be restored to its previous value so that, when you reference variables and arguments in function _tmain, they are found at the right offsets again. This act of bringing the ESP value back to what it was before the call, once the function returns, is the stack maintenance. You will understand this in the context of the example.

  • Press F5 to start debugging. The program execution stops at line 13 now.
  • Press Alt+5. This now brings up the registers window.
  • Press Alt+6. This brings up a memory watch window.
  • Place a caret on line 13, right-click, and choose go to disassembly.
  • In the disassembly view, right-click again and make sure you have the following checked:
    • Show Address
    • Show Source Code
    • Show Code Bytes
    The disassembly is like below:

    [step4.png]

    Note the ESP value at this time (you haven't started pushing arguments yet). For me, it is 0x12FF64. Right after the call instruction, the first thing that has to happen is to restore the ESP value to this value you noted.
  • Press F10 to step through the two pushes and the call instructions and stop. Observe the value of ESP now.
  • [step5.png]

    In my example, it is 0x12FF5C. It is not the same as what it was before, which means it needs some maintenance.
  • Now, look at the line of code that the debugger is currently at. It is
  • add         esp,8
    If you do the math, you see that you get 12FF5Ch + 8h = 12FF64h, the right value. This instruction is exactly the stack maintenance code. It adds the needed correction to ESP so that ESP is back to where it was. This piece of code is present not in the callee's code but in the caller's code; this is exactly what __cdecl mandates: the calling function pops the arguments from the stack. Even though it is not literally a pop instruction, the end result is the same; ESP is now back to where it was.
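The ESP arithmetic of a __cdecl call can be replayed in plain C. The sketch below tracks only the register value, not the stack contents; Cpu, do_push, do_call, do_ret, do_add_esp, and the two cdecl_esp_* functions are hypothetical names used for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Tracks only the ESP register; the stack contents do not matter here. */
typedef struct { uint32_t esp; } Cpu;

static void do_push(Cpu *cpu)                { cpu->esp -= 4; } /* push imm32 */
static void do_call(Cpu *cpu)                { cpu->esp -= 4; } /* call pushes the return address */
static void do_ret(Cpu *cpu)                 { cpu->esp += 4; } /* plain ret pops the return address */
static void do_add_esp(Cpu *cpu, uint32_t n) { cpu->esp += n; } /* add esp,n */

/* ESP as the caller sees it right after the callee's ret,
   before any caller-side cleanup has run. */
static uint32_t cdecl_esp_after_return(uint32_t esp_before, unsigned arg_count)
{
    Cpu cpu = { esp_before };
    unsigned i;

    assert(esp_before >= 4u * arg_count + 4u);
    for (i = 0; i < arg_count; ++i) {
        do_push(&cpu);
    }
    do_call(&cpu);
    do_ret(&cpu);
    return cpu.esp;
}

/* ESP after the caller's "add esp, 4 * arg_count" cleanup. */
static uint32_t cdecl_esp_after_cleanup(uint32_t esp_before, unsigned arg_count)
{
    Cpu cpu = { cdecl_esp_after_return(esp_before, arg_count) };
    do_add_esp(&cpu, 4u * arg_count);
    return cpu.esp;
}
```

Starting from 0x12FF64 with two arguments, cdecl_esp_after_return gives 0x12FF5C and cdecl_esp_after_cleanup gives 0x12FF64, matching the values observed in the debugger.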

Implications of __cdecl

  • Because stack maintenance is the caller's responsibility, a caller that calls __cdecl functions from 100 different places in its code needs stack maintenance code after every one of those calls, even if the same function is being called each time. This implies a bigger size of generated code.
  • Because stack maintenance is the caller's responsibility, the __cdecl calling convention allows for variable-length argument lists. In function calls with a variable number of arguments, only the caller knows how many parameters are being passed. Hence, the __cdecl convention is a good fit for this scenario.
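As an illustration of the second point, here is a minimal variadic function in standard C. sum_n is a hypothetical helper, not part of the article's sample project. The callee cannot tell on its own how many values were pushed, so the caller passes the count explicitly; with MSVC on 32-bit x86, variadic functions like this one use the __cdecl convention.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical helper: adds `count` int arguments. Only the caller knows how
   many values it passed, so the count travels as an explicit first argument. */
static int sum_n(int count, ...)
{
    va_list args;
    int total = 0;
    int i;

    assert(count >= 0);
    va_start(args, count);
    for (i = 0; i < count; ++i) {
        total += va_arg(args, int);   /* fetch the next int argument */
    }
    va_end(args);
    return total;
}
```

A call such as sum_n(2, 1, 2) pushes three values and a call such as sum_n(4, 10, 20, 30, 40) pushes five; in each case only the call site knows how many bytes have to be removed afterward.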

__stdcall

The details of the __stdcall calling convention are described in Microsoft's documentation. The ones relevant to our dissection are:

  • Argument-passing order: Right to left
  • Stack-maintenance responsibility: Called function pops its own arguments from the stack

Argument Passing order in __stdcall

The argument-passing order refers to the order in which the arguments are pushed on the stack by the caller. In the case of __stdcall, it is right to left, meaning the last parameter is pushed first, followed by the last but one, and so on until the first parameter is pushed. Once this is done, the call instruction for the function is issued.

From the stack's perspective, it means this: If you look at the stack first thing inside the callee, before any instruction is executed, the 4 bytes at ESP hold the return location, the 4 bytes at (ESP + 4) hold the first parameter, the 4 bytes at (ESP + 8) hold the second parameter, and so on.

  • Fire up Visual Studio 2005. Choose Win32 as the project type and choose the Win32 Console Application template. Enter a name, say sum. Click Finish to create the project.
  • Press Alt+F7 to invoke project properties. Now, to make learning easy, turn off certain settings that cause the compiler to emit extra code that would make it harder to follow the core concepts of this article. I will try to address the implications of these settings at a future time.
    • Go to Configuration Properties->C/C++->Advanced. Here, set Calling Convention to __stdcall.
    • Go to Configuration Properties->C/C++->General. Here, set Debug Information Format to Program Database(/Zi).
    • Go to Configuration Properties->C/C++->Code Generation. Here, set Basic Runtime Checks to Default.
    • Go to Configuration Properties->Linker->General. Here, set Enable Incremental Linking to No.
    • Hit OK.
  • Modify the code like below and put a breakpoint on line 13 (place a caret on line 13 and hit F9 to put in a breakpoint):
  • [step11.png]

    Do not be stumped by the __cdecl specified for _tmain. It is needed because the CRT code requires _tmain to have the __cdecl calling convention.
    The Calling Convention project setting applies only to those functions for which a calling convention hasn't been explicitly specified. So, sum will use the project setting, which is __stdcall.
  • Press F7 to do a build.
  • Press F5 to start debugging. The program execution stops at line 13 now.
  • Press Alt+5. This now brings up the registers window.
  • Press Alt+6. This brings up a memory watch window.
  • Place a caret on line 13, right-click, and choose go to disassembly.
  • In the disassembly view, right-click again and make sure you have the following checked:
    • Show Address
    • Show Source Code
    • Show Code Bytes
    The disassembly is like below:

    [step12.png]

    Note the argument passing. As the calling convention used is __stdcall, you see that parameters are in fact pushed in the right to left order.
  • Immediately following this is the call instruction.
  • Note the value of ESP. Now, execute the first push instruction (in other words, push 2) using F10. Note that the ESP value has decremented by 4.
  • Press F10 again and note that the ESP value is further decremented by 4.
  • Now, press F11 to enter the callee (sum). The disassembly is like below:
  • [step13.png]

  • If you now type ESP into the Memory window's address box, you see that the ESP + 4 location in fact holds Arg1 and ESP + 8 holds Arg2. In other words, the rightmost argument to the function is found at the highest memory address.
  • Press F5 to finish debugging.

Stack-maintenance responsibility in __stdcall

Per the specification, under the __stdcall calling convention, it is the callee's responsibility to maintain the stack. What exactly does this maintenance mean? You know that before calling the function, the caller pushes the arguments onto the stack. The effect is that the stack has grown as a result of calling the function. When the function has finished and returned, the stack pointer has to be restored to its previous value so that, when you reference variables and arguments in function _tmain, they are found at the right offsets again. This act of bringing the ESP value back to what it was before the call, once the function returns, is the stack maintenance. You will understand this in the context of the example.

  • Press F5 to start debugging. The program execution stops at line 13 now.
  • Press Alt+5. This now brings up the registers window.
  • Press Alt+6. This brings up a memory watch window.
  • Place a caret on line 13, right-click, and choose go to disassembly.
  • In the disassembly view, right-click again and make sure you have the following checked:
    • Show Address
    • Show Source Code
    • Show Code Bytes
    The disassembly is like below:

    [step14.png]

    Note the ESP value at this time (you haven't started pushing arguments yet). For me, it is 0x12FF64. Right after the call instruction, the first thing that has to happen is to restore the ESP value to this value you noted.
  • Press F10 to step through the two pushes and F10 again to step over the call instruction and stop. Observe the value of ESP now.
    In my example, it is 0x12FF64. Well, it is the same as what it was before. This means the callee (that is, sum) has made sure it is restored to the right value.
  • To look at how this happened, you have to now look at the callee code.
  • Press F5 to finish debugging and press F5 again and step into the call instruction by pressing F11. The disassembly is like below:
  • [step15.png]

  • Single step by pressing F10 all the way and stop at this line:
  • ret 8
  • Look at the value of ESP. This is what I see:
  • [step15.png]

    You see that ESP is pointing to the value 0x0040101d. This is the return address of the function call. Beyond that, there are 8 bytes occupied by the arguments. To restore the stack to 0x12FF64, ESP has to be advanced by 0x12FF64 - 0x12FF58 (the current ESP value), which is equal to 12.
    Now, look at the ret 8 instruction. This is what the ret xx instruction does: it pops the return address at ESP (0x0040101d in this case) and resumes execution there, and then it removes a further xx bytes from the stack. Popping the return address implicitly increments ESP by 4 bytes; the 8 specified for the ret instruction then increments ESP by another 8 bytes, effectively popping 12 bytes. And there you are; you are at ESP = 0x12FF64. The stack maintenance is done. And the callee did it all, just like the __stdcall convention mandates.
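The effect of ret xx on ESP comes down to one line of arithmetic, shown here as a small C function. esp_after_ret is a hypothetical name; the operand is modeled as a 16-bit value because the ret instruction takes a 16-bit immediate.

```c
#include <assert.h>
#include <stdint.h>

/* Value of ESP after "ret n": the 4-byte return address is popped first,
   then n more bytes of arguments are discarded. */
static uint32_t esp_after_ret(uint32_t esp_at_ret, uint16_t n)
{
    assert(esp_at_ret <= UINT32_MAX - 4u - n);   /* no wrap-around */
    return esp_at_ret + 4u + n;
}
```

With the values from the walkthrough, esp_after_ret(0x12FF58, 8) yields 0x12FF64, while a plain ret (n = 0) would leave ESP at 0x12FF5C, which is the __cdecl situation seen earlier.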

Implications of __stdcall

  • Because stack maintenance is the callee's responsibility, the maintenance code exists once, in the callee, and isn't duplicated even if the function is called from 100 different places in the caller's code. This implies a smaller size of generated code compared to __cdecl.
  • Because stack maintenance is the callee's responsibility and the callee has to know exactly how much stack space is used for arguments, functions using the __stdcall convention cannot take a variable number of arguments.

Side by side comparison of __cdecl and __stdcall

Calling Convention   __cdecl               __stdcall
Caller code:         (1,1) [callerc.png]   (1,2) [callers.png]
Callee code:         (2,1) [calleec.png]   (2,2) [callees.png]

Calling Convention Mismatch Impacts

You have seen the mechanism of calling conventions. Now, it would be interesting to see what impact, if any, a calling convention mismatch has on the thread. Mismatches are unlikely if the caller and callee code are both compiled into the same executable, because the compiler can enforce matching conventions at compile time. However, when the function is implemented in an external module, say a DLL, it is quite possible that a header declaring the function, say int sum(int a, int b), is used to compile both the DLL and the caller. If the DLL is compiled with a default convention of __stdcall in its project settings and the calling code is compiled with a default of __cdecl, there is a potential for a mismatch because nothing in the function signature tells the compiler which calling convention to use.
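One common way to take the project default out of the picture is to state the convention in the shared declaration itself. The following is a sketch only: SUM_CALL and check_sum are hypothetical names, and the macro expands to nothing outside MSVC on 32-bit x86 so that the fragment still compiles with other compilers.

```c
#include <assert.h>

/* SUM_CALL is a hypothetical macro name. With MSVC targeting 32-bit x86 it
   expands to __stdcall; elsewhere it expands to nothing. */
#if defined(_MSC_VER) && defined(_M_IX86)
#  define SUM_CALL __stdcall
#else
#  define SUM_CALL
#endif

/* In the shared header: the convention is part of the declaration, so the
   project-level default of whoever includes the header no longer matters. */
int SUM_CALL sum(int a, int b);

/* In the DLL's source file. */
int SUM_CALL sum(int a, int b)
{
    return a + b;
}

/* In the caller: an ordinary call; the compiler emits the call sequence
   that matches the declared convention. */
static void check_sum(void)
{
    assert(sum(1, 2) == 3);
}
```

Because both sides now see the same explicit convention, the compiler generates matching caller and callee code regardless of each project's Calling Convention setting.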

What would happen in such a case?

Taking these scenarios one at a time....

What happens when the caller assumes a __stdcall convention for a __cdecl function?

The scenario is like item (1,2) for the caller, with the callee implemented like item (2,1) in the comparison chart above. In the proper case, once the code has returned from (2,1), the caller would do the stack maintenance as in (1,1). However, because the caller has wrongly assumed __stdcall, it is not going to do any stack maintenance. The effect is that the stack pointer is not restored and the stack stays larger than it should be. Although this may not seem harmful, it is dangerous because the stack grows needlessly on every such call and could lead to undesired consequences; for example, stack overflow.

Let me implement a simple example so that you can appreciate the dangers of what such a calling convention mismatch can do. Update the code to something like below:

#include "stdafx.h"

typedef int (__stdcall *SUMFUNC)(int a, int b);

int __cdecl sum(int a, int b)
{
   return a + b;
}

SUMFUNC fn = (SUMFUNC)&sum;
int __cdecl _tmain(int argc, _TCHAR* argv[])
{
   int nRet = (*fn)(1,2);

   return 0;
}

You have made the code slightly convoluted to force the compiler to cause the mismatch. Note the following:

  • The callee code (sum) is __cdecl. So, it is not going to do stack maintenance.
  • The caller uses a typedef to fake a __stdcall style call into sum. So the compiler is led to think that the function call is __stdcall, and it is not going to emit any stack maintenance code either. Perfect for your analysis.

Now, to analyze it.

  • Put a breakpoint on the first line in _tmain and press F5 to stop at that point. Now, look at the disassembly. It will appear like below:
  • [stoc1.png]

    Note the value of ESP. It shows as 12FF64 for me.
  • Now, press F10 three times to go past the call instruction. Check the ESP. It shows as 12FF5C for me, like below:
  • [stoc2.png]

  • The drop itself is expected because the called function is a __cdecl one and leaves the arguments on the stack; the dangerous part is that the caller does not have any maintenance code to restore ESP. (Ideally, it should have been bumped up by 8 bytes to account for the two 4-byte arguments pushed.)
  • Pause here for a while and digest the fact that ESP has not been restored and the stack is now bloated by 8 bytes. Got it? Now, consider this situation. If you were to call this function repeatedly, each call would lead to an 8-byte creep. With more arguments, the creep is faster. Ultimately, the stack is going to reach its limit and the thread will run out of stack space for no good reason.
  • Just change the code to call the function in a loop like below. Press F5 to see what I am saying:
  • [stover.png]

  • Change the typedef to use __cdecl like below and see that all is well.
  • typedef int (__cdecl *SUMFUNC)(int a, int b);
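To get a feel for how quickly the creep exhausts a stack, the mismatch can be simulated on a plain ESP value. This is a rough sketch: simulate_creep is a hypothetical name, and the model ignores guard pages and everything else that occupies a real thread's stack.

```c
#include <assert.h>
#include <stdint.h>

/* Counts how many mismatched calls fit before the next call would run past
   stack_limit. esp is the starting stack pointer, arg_bytes is the number of
   argument bytes pushed (and never removed) per call. */
static uint32_t simulate_creep(uint32_t esp, uint32_t stack_limit, uint32_t arg_bytes)
{
    uint32_t calls = 0;

    assert(esp >= stack_limit && arg_bytes > 0);
    /* Each call needs room for its arguments plus the 4-byte return address. */
    while (esp - stack_limit >= arg_bytes + 4u) {
        esp -= arg_bytes;   /* caller pushes the arguments            */
        esp -= 4u;          /* call pushes the return address         */
        esp += 4u;          /* callee is __cdecl: plain ret           */
        /* caller was compiled for __stdcall: no "add esp" follows    */
        ++calls;
    }
    return calls;
}
```

Under these assumptions, an otherwise empty 1MB stack (1048576 bytes) leaking 8 bytes per call is used up after 131071 calls, which a tight loop reaches almost instantly.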

What happens when the caller assumes a __cdecl convention for a __stdcall function?

The scenario is like item (1,1) for the caller, with the callee implemented like item (2,2) in the comparison chart above. In the proper case, when the code has returned from (2,2), the caller would do NO stack maintenance because the callee has already done it. However, because the caller has wrongly assumed the callee to be __cdecl, it is going to do stack maintenance as well. The effect is that the stack pointer is overcorrected by duplicate stack maintenance in both caller and callee. ESP therefore ends up 8 bytes past where it should be in the chart's two-argument example (12 bytes for the three-argument code below). This means that, when another function is called, the pushes land on stack space the caller is still using and overwrite it.

Implement a simple example to appreciate the dangers of what such a calling convention mismatch can do. Update the code to something like below:

#include "stdafx.h"

typedef int (__cdecl *SUMFUNC)(int a, int b,int c);

int __stdcall sum(int a, int b,int c)
{
   return ( a + b + c);
}

SUMFUNC fn = (SUMFUNC)&sum;

int __cdecl _tmain(int argc, _TCHAR* argv[])
{
   (*fn)(1,2,3);
   sum(100,200,300);
   return 0;
}
  • Put a breakpoint on the line (*fn)(1,2,3);. Now, press F5. Stop and note down the value of argc by hovering your mouse over it. It will be 1.
  • Now, execute the next two lines by pressing F10 and stop on the line return 0. Again, hover the mouse over the argc variable and note the value. Surprisingly, the value has changed to 300 even without a single line of code to do it explicitly.
  • The explanation for this goes like this:
    • When the first line has finished executing ( (*fn)(1,2,3) ), the stack pointer has gone past the location it should be pointing to and now points into the area that holds the _tmain function's arguments, around the argc location.
    • Now, when you execute the next statement ( sum(100,200,300) ), the rightmost argument, 300, gets pushed into that area, effectively corrupting the argc value.
  • This is called stack corruption. Once in this state, it is hard to say what will happen because it all depends on the values in the affected memory locations, and it is just a matter of time before you start seeing weird errors.
  • As an exercise, you can see what is going on in disassembly.
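Before opening the disassembly, the sequence can be replayed on a simulated stack. The sketch below rests on an assumed frame layout that the article does not show (saved EBP on top, then the return address, then argc, then argv); Sim, push32, and argc_after_mismatch are hypothetical names and the 0x... constants are placeholders.

```c
#include <assert.h>
#include <stdint.h>

enum { WORDS = 16 };

typedef struct {
    uint32_t mem[WORDS];   /* simulated stack memory, one slot per 4 bytes */
    uint32_t esp;          /* byte offset of the top of the stack */
} Sim;

static void push32(Sim *s, uint32_t v)
{
    assert(s->esp >= 4 && s->esp <= WORDS * 4);
    s->esp -= 4;
    s->mem[s->esp / 4] = v;
}

/* Runs the two calls from _tmain and returns the final value of the argc slot.
   Assumed layout at the start of _tmain: saved EBP at the top of the stack,
   then the return address, then argc, then argv. */
static uint32_t argc_after_mismatch(void)
{
    Sim s = { {0}, WORDS * 4 };
    uint32_t argc_slot;

    push32(&s, 0xAAAAAAAAu);   /* argv (placeholder) */
    push32(&s, 1);             /* argc */
    argc_slot = s.esp / 4;
    push32(&s, 0xBBBBBBBBu);   /* return address into the CRT (placeholder) */
    push32(&s, 0xCCCCCCCCu);   /* saved EBP (placeholder) */

    /* (*fn)(1,2,3): the caller believes __cdecl, the callee is __stdcall. */
    push32(&s, 3); push32(&s, 2); push32(&s, 1);
    push32(&s, 0x0040101Du);   /* call pushes the return address */
    s.esp += 4 + 12;           /* callee: ret 0Ch */
    s.esp += 12;               /* caller: add esp,0Ch (the duplicate cleanup) */

    /* sum(100,200,300): correctly matched, but ESP is already 12 bytes too high. */
    push32(&s, 300); push32(&s, 200); push32(&s, 100);
    push32(&s, 0x00401030u);   /* call pushes the return address */
    s.esp += 4 + 12;           /* callee: ret 0Ch */

    return s.mem[argc_slot];
}
```

In this model the first push of the second call (the rightmost argument) lands in the argc slot, and the next two pushes land on the saved return address and saved EBP, which hints at why such a program tends to fail soon afterward.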

Summary

To summarise, this is what you have learned so far.

  • The stack is a dedicated area assigned for a thread to do its work on. It is used as a place to keep function arguments, return addresses, local variables, and convenience variables.
  • There are different calling conventions used. You learned about the mechanics of two popular conventions, __stdcall and __cdecl.
  • Mismatch of calling conventions can lead to subtle, hard to find bugs. You discovered these by looking at examples of code that lead to stack overflow and stack corruption due to wrong assumption of calling convention.

Acknowledgements

My sincere thanks to Paul McKenzie for his inputs and guidance with this article series.


