An Introduction to Assembly Language: Part I

Introduction

Most programmers shy away from Assembler (or assembly language). People tend to consider it a very difficult language to understand and use. Moreover, anyone who knows how to use it tends to be regarded with some reverence by other programmers in the community.

This tutorial is the first in a series that will attempt to dismiss these misgivings about Assembler and demonstrate that, in actual fact, it's not difficult to use; quite the contrary. It will demonstrate the tools that greatly simplify the authoring of Assembler, and show you how to integrate these with Visual Studio.

First, what is Assembler? Put quite simply, it's the language of the processor. You can't get any lower level than this (except for possibly the actual byte values of the instructions). Every command (mov, add, and so forth) is translated by an Assembler program directly into a number that, in turn, is fed to the processor on execution.
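To make "translated directly into a number" concrete, here is a minimal C++ sketch that hand-encodes three 32-bit x86 instructions into their byte values. The opcodes used (B8 for mov eax, imm32; 05 for add eax, imm32; C3 for ret) are standard x86 encodings, but the helper names are hypothetical and exist only for illustration. A real assembler is free to pick a shorter encoding for the same instruction, for example an 8-bit immediate form of add.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: appends a 32-bit immediate in little-endian order,
// which is how x86 stores multi-byte operands.
void AppendImm32(std::vector<std::uint8_t>& code, std::uint32_t value)
{
   for (int shift = 0; shift < 32; shift += 8)
   {
      code.push_back(static_cast<std::uint8_t>((value >> shift) & 0xFF));
   }
}

// Hypothetical helper: returns one valid byte encoding of
//    mov eax, 1
//    add eax, 100
//    ret
std::vector<std::uint8_t> AssembleExample()
{
   std::vector<std::uint8_t> code;

   code.push_back(0xB8);      // mov eax, imm32
   AppendImm32(code, 1);

   code.push_back(0x05);      // add eax, imm32
   AppendImm32(code, 100);

   code.push_back(0xC3);      // ret

   return code;
}
```

The eleven bytes returned are the numbers the processor would actually be fed; the mnemonics are just a readable way of writing them.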

The advantage of using Assembler over other languages is speed. Sheer, raw, unmitigated speed. Even with the modern compiler's ability to optimise code, the code that it produces would have trouble competing with the same written and optimised by hand in Assembler.

Now, Assembler isn't for every job. It is possible to write whole applications in Assembler, but with C++ and the other high-level languages available today, it'd be masochistic to do so. For the vast majority of applications, the speed of C++ and even the .NET languages is quite acceptable.

Where Assembler comes into its own is when speed is essential; for instance, in graphics applications. For writing small, incredibly fast functions addressing large blocks of memory, Assembler can't be beaten. Bitmap manipulation is a typical example of where a knowledge of how to use Assembler can reap huge rewards.
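As an illustration of the kind of routine meant here, the sketch below is a plain C++ loop that inverts the colour channels of a block of 32-bit pixels. The function name and the pixel layout (0xAARRGGBB, with the top byte left untouched) are assumptions made for the example. It is the sort of small, memory-bound function you would write and time in C++ first, before deciding whether a hand-written Assembler version is worth the effort.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical example: inverts the red, green and blue channels of each pixel.
// Assumed layout is 0xAARRGGBB; the top (alpha) byte is preserved.
void InvertPixels(std::uint32_t* pixels, std::size_t count)
{
   for (std::size_t i = 0; i < count; ++i)
   {
      pixels[i] ^= 0x00FFFFFFu;   // flip the low 24 bits only
   }
}
```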

So, now that I've covered what Assembler is and what its advantages are, how do you use it?

Writing Assembler in Visual C++

The first way that you can write Assembler in C++ is by using __asm blocks:

DWORD Function(DWORD dwValue)
{
   __asm
   {
      mov eax, dwValue
      add eax, 100
      mov dwValue, eax
   }

   return dwValue;
}

Here, you can see a few Assembler instructions enclosed in an __asm block. The C++ compiler will translate these directly into their machine language codes.

Don't worry about what it means. Just know that 'add' is an add instruction and 'mov' is a move instruction that moves values about. The code above just adds 100 to the value passed in.

So, that's it, you might have thought. Well, you'd be wrong. Writing code in __asm blocks is all right for small chunks of Assembler, but if you want to write any functions of length, it starts to become somewhat tedious. For instance, an 'if..else' statement would look something like this:

DWORD Function2(DWORD dwValue1, DWORD dwValue2)
{
   DWORD dwValue3 = 0;

#if 0
   // this is the C++ equivalent of the Assembler below
   if (dwValue1 == dwValue2)
   {
      dwValue3 = 1;
   }
   else
   {
      dwValue3 = 2;
   }
#endif

   __asm
   {
      mov eax, dwValue1

      ; this is the test of the values, i.e. if dwValue1 == dwValue2
      cmp eax, dwValue2

      ; jump to 'Else' if they are not equal
      jne Else

      mov eax, 1
      jmp EndIf

   Else:
      mov eax, 2

   EndIf:
      mov dwValue3, eax
   }

   return dwValue3;
}

Again, don't worry too much about the actual instructions, but can you imagine having to do something like this for every if statement?

Using __asm code blocks is like writing C++ using Notepad and the command prompt. Yes, you can do it, but it is a lot more time-consuming than using a dedicated tool for the job.

So, what dedicated tools are there? Well, there's the Microsoft Macro Assembler (MASM). Yes, you did hear right. Microsoft has an Assembler. In fact, they've had one since the early 1980s; it's just that not many people know about it. And what's more, it's available free of charge as part of the MASM32 package. And, even better, it'll produce .obj files that are compatible with the Visual C++ linker. It'll even produce .obj files containing debug information so you can step through your code.

The package is called MASM32 and is available at http://www.masm32.com/.

I strongly recommend that you download and install it now before proceeding.

So, what's so good about it?

First, it comes with an extensive set of help files available in the \masm32\help directory.

Second, it has macros defined for just about every single job you might want to do. Not only that, but if you use the macros, you know you'll always be using the most efficient method of performing each task.

For example, look at the preceding code written using MASM:

Function2 proc dwValue1:DWORD, dwValue2:DWORD

   mov eax, dwValue1

   .if eax == dwValue2
      mov eax, 1
   .else
      mov eax, 2
   .endif

   ; the result is returned to the caller in eax
   ret

Function2 endp

I think that you'll agree that this is a lot more readable.

So, now that you've downloaded and installed the assembler, I'll take you through a step-by-step guide of adding an Assembler file to an existing C++ project, what to enter into the project settings to compile it, and how to access functions written in Assembler in C++.

Adding an Assembler File to Visual C++

This walkthrough is for Visual Studio .NET 2002, but applies equally to Visual C++ 6 or Visual Studio .NET 2003.

First, you need to include the MASM bin folder in the searched directories for applications.

Go to Tools/Options/Projects/Visual C++ Directories. Add the 'bin' folder, which is under the path where you installed MASM; for instance, if it was installed to C:\MASM32, add C:\MASM32\bin.

[IMAGE3.jpg]

Move this directory to the bottom of the list. This is very important. The folder also contains a linker called link.exe, and if the folder is not at the bottom of the list, that linker will be used instead of Visual Studio's default linker.

Now, create a console application.

Add a file to the project called 'test.asm'. Right-click on the 'source files' folder in Solution Explorer, and go to Add/New Item.

[IMAGE1.jpg]

Enter 'test.asm', press Return, and a new file called 'test.asm' should open for you.

Enter the following code into the file:

.486
.model flat, stdcall
option casemap :none

.code

TestProc proc dwValue:DWORD

   mov eax, dwValue
   add eax, 100
   ret

TestProc endp

end

This code adds 100 to the input value (dwValue) and returns the result.
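A plain C++ equivalent can be handy for checking the results of the Assembler routine. The sketch below uses a hypothetical name, TestProcReference, and assumes the same behaviour as the add instruction above: a 32-bit unsigned addition that wraps around on overflow.

```cpp
#include <cstdint>

// Hypothetical reference version of TestProc, written in portable C++.
// Unsigned 32-bit arithmetic wraps around on overflow, as 'add eax, 100' does.
std::uint32_t TestProcReference(std::uint32_t dwValue)
{
   return dwValue + 100u;
}
```

Comparing the two functions over a handful of inputs, including values close to the 32-bit maximum, is a cheap way to catch mistakes in the hand-written version.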

Note: All the MASM assembler files in this series start with the four directives shown at the top (.486, .model, option casemap, and .code) and finish with end. Your code goes between the .code and end lines.

Now, you have to configure the build properties for this file. Make sure you're in Debug build and right-click on the file in Solution Explorer and select 'properties'.

Select 'custom build step/general' and enter the following for the 'command line':

ml /c /coff /Zi /Fo"$(OutDir)\$(InputName).asm.obj" "$(InputFileName)"

Now, put the following into the 'Outputs':

$(OutDir)\$(InputName).asm.obj

For example:

[IMAGE2.jpg]

The release build settings are exactly the same except that the command line doesn't have the /Zi option. This is the 'generate debug info' option, so it should be removed for release builds; for example:

ml /c /coff /Fo"$(OutDir)\$(InputName).asm.obj" "$(InputFileName)"
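To see what these command lines expand to, the sketch below builds the same strings in C++ for a given output directory and file. It assumes that $(OutDir) is the build output folder (for instance, Debug), that $(InputName) is the file name without its extension, and that $(InputFileName) is the name with the extension. BuildMlCommand is a hypothetical helper for illustration only; it constructs the text and does not run ml.

```cpp
#include <string>

// Hypothetical helper that mirrors the custom build step above.
// outDir    stands in for $(OutDir), e.g. "Debug"
// inputName stands in for $(InputName), e.g. "test" for test.asm
// debugInfo adds /Zi, as in the Debug configuration
std::string BuildMlCommand(const std::string& outDir,
                           const std::string& inputName,
                           bool debugInfo)
{
   std::string command = "ml /c /coff ";

   if (debugInfo)
   {
      command += "/Zi ";
   }

   command += "/Fo\"" + outDir + "\\" + inputName + ".asm.obj\" ";
   command += "\"" + inputName + ".asm\"";

   return command;
}
```

For test.asm in a Debug build, this gives ml /c /coff /Zi /Fo"Debug\test.asm.obj" "test.asm", and the Outputs setting names the same Debug\test.asm.obj file so that the linker picks it up.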

Of course, this can be put into a macro or an addin to automate the process. I have shown the steps involved here so that it can apply equally to all versions of Developer Studio currently in use.

Now, look at how to call functions written in Assembler from C++.

Calling Assembler Code from C++

The above assembler code defines a function called 'TestProc' with an input of a DWORD (in other words, a 32-bit value). You must give C++ a declaration of this function for it to be called. In the .cpp file that contains your application's main function, put the following at the top, under the #includes:

extern "C" unsigned int __stdcall TestProc(unsigned int dwValue);
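The declaration has to match the '.model flat, stdcall' line in the .asm file because the linker pairs the two by symbol name. On 32-bit x86, the stdcall convention decorates a C function name with a leading underscore and a trailing @ followed by the size of the parameters in bytes, and extern "C" stops the compiler from applying C++ name mangling on top of that. The sketch below computes that decorated name; StdcallDecoratedName is a hypothetical helper, and it assumes the caller has already rounded each parameter up to a multiple of four bytes.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: builds the 32-bit stdcall decorated name for a C function.
// parameterBytes is the total size of the parameters, e.g. 4 for one DWORD.
std::string StdcallDecoratedName(const std::string& name, std::size_t parameterBytes)
{
   return "_" + name + "@" + std::to_string(parameterBytes);
}
```

For TestProc with its single DWORD parameter, this gives _TestProc@4. If the linker reports an unresolved external with a different decoration, the calling convention or the extern "C" in the declaration is the first thing to check.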

Put the following code into the main function:

int _tmain(int argc, _TCHAR* argv[])
{
   unsigned int dwValue = 100;
   unsigned int dwReturn = TestProc(dwValue);

   printf("%u\n", dwReturn);   // %u because dwReturn is unsigned
   getchar();

   return 0;
}

When running this application, you will see that the result of 200 is returned, as expected. Not only that, but if you put a breakpoint on the line calling TestProc, you will be able to step into this method (F11) and step through the Assembler. In fact, you can even set breakpoints in the Assembler code.

Conclusion

You have seen how to create an Assembler file, compile it, and call it from C++ code. In the next part of this series, I will start to cover the actual instructions that make up assembly language, and cover subjects such as registers.

Links to Parts II and III

Part II of this introduction can be found here.

Part III of this introduction can be found here.



About the Author

David McClarnon

He first encountered Windows programming using Visual C++/MFC version 1.5 on Windows 3.11 a very long time ago. He is now a contract developer specialising in .NET/native interop with p/invoke.
