Explore the Intricacies Involved in Compiling C/C++ with GCC

There are many C/C++ compilers available on the market, and many of them are free. GCC, the GNU Compiler Collection, is one such compiler. It is commonly found on Linux systems and is also available for Windows and other platforms. It is a product of the free software movement pioneered by Richard Stallman. However, GCC per se is not a single compiler, but an integrated collection of compilers for several major programming languages, including C and C++ as well as Objective-C, Fortran, Java, Ada, and so forth. Over the years, GCC has matured into a reliable tool for every C/C++ programmer. GCC is versatile in the sense that programmers have extensive command-line options for controlling the compilation process. This article delves into some of the key aspects of the compilation process with GCC. But, before that, let's get into a bit of history and refresh our understanding of compilation and the linking process in general.

Glimpse of the Past

It all started with the GNU Project (1984), pioneered by Richard Stallman, a software freedom activist. He wanted to create an OS similar to Unix that was not only free but also open source, because back then Unix was not free. Creating an OS like Unix requires a C compiler, and no free one was available either. So, the decision was made to start from scratch. It is like wanting to build a house and finding that no tools are available: you create the tools first, and then build the house. The story echoes the creation of Unix at Bell Labs. It was massive work, no doubt, but successful: the first version of GCC was released in 1987. In 1992, with the GCC 2.0 release, C++ was incorporated into the GCC family. Since then, GCC has become an important tool for every C/C++ programmer.

Note: C was developed before C++, and C++ was designed to be largely compatible with it, although C is not a strict subset of C++. C was developed by Dennis Ritchie and C++ by Bjarne Stroustrup. Here, whenever we talk of C/C++, we mean both. In a nutshell, their basic difference lies in the philosophy or coding technique used in programming. C is procedural and C++ supports object-oriented programming. It is not that you cannot implement object-oriented features in C or write procedural programs in C++, but the inherent strength of each lies in the principles of its language design. So, C++ is the better fit for object-oriented programming, whereas C is not.

Compilation and Linking Process

Every C/C++ program goes through a compilation and linking process to become an executable. C/C++ source code is nothing but plain text, and the .c or .cpp files are simple text files. These files are given to the compiler, which transforms them into executables, or binaries. Compilation is a complicated process that combines several components such as a preprocessor, parser, semantic analyzer, linker, and so on. Here, however, we treat the compiler as one broad step and the linker as another. Let's run through a simple program and see the story behind the scenes.

C program file name: prog1.c

 1. #include<stdio.h>
 2. int main()
 3. {
 4.    printf("Show Me!");
 5.    return 0;
 6. }

C++ program file name: prog2.cpp

#include<iostream>
int main()
{
   std::cout<<"Show Me!";
   return 0;
}

The printf function is an inbuilt library function declared in the header file stdio.h. As a result, whenever we call printf (or any other library function), the (relevant) header file must be included. The #include directive literally copies the entire header file stdio.h (including many other function declarations that we have not used in the program) into our program before beginning the compilation process. The part of the compiler responsible for doing this initial job is called the preprocessor. So, #include is a preprocessor directive.

Once the entire stdio.h file is available, the compiler can validate the call to the printf function (found at Line 4) against the declaration found in stdio.h. The header file is also a text file and does not contain the actual implementation, or definition, of the printf function. There is a difference between a declaration and a definition. A declaration is the prototype of a function: it gives the function's name, the number and types of its parameters, and its return type. A definition, on the other hand, is the body of the function, that is, what it actually does. For example, the printf function writes formatted text to the standard output (normally the terminal), and its prototype is:

int printf(const char *format, ...);

However, the actual implementation or definition of the printf function is written in the C standard I/O Library. For the program to call printf at run time, that definition must also be linked into the executable. But the C standard I/O Library is an external element (external to the example program above). So, the compiler leaves an unresolved reference, something like "call to printf," in its output during the compilation phase and lets another program resolve it.

There is a separate program called the linker. The linker takes up the task of resolving references marked as unresolved, such as "call to printf," and tries to find their definitions in the available libraries. The linker basically does the patchwork of binding the user's program with the external libraries. The C standard I/O library is somewhat different in this context, though, because it is one of the most used libraries. As a result, most C implementations include it in the runtime library, which the linker searches by default, so a call to any function of the runtime library is resolved without extra options. For any other external library, the linker has to be told where to look before it can knit the scattered implementations into one whole set of machine instructions.

If you are aware of compilation and linking, you are probably also aware of another piece of software called the loader. A discussion of the loader is a different story. Suffice it to say that it is responsible for loading the binary program into main memory and preparing it for execution, among other things.

Command and Control

GCC generally treats a file with the extension .c as a C source code file and one with .cc, .cpp, .CPP, .c++, .C, or .cxx as a C++ source code file. Given a source file, GCC does the preprocessing, compilation, assembly, and linking. It provides many options to control the process. There is another compiler command called g++ that comes with GCC. The g++ command assumes every source file given to it is a C++ file regardless of the extension and treats it as a program that adheres to C++ rules.

Controlling Compilation

    • To compile and assemble a source file but not link it. This produces an object file with a .o extension, such as prog1.o:
gcc -c prog1.c
    • To compile but not assemble. This produces an assembly code file with a .s extension, such as prog1.s. For example, the source file and the assembly file created by the command are shown below.
gcc -S prog1.c -o prog1.s
prog1.c:

#include<stdio.h>
int main()
{
   printf("Show Me!");
   return 0;
}

prog1.s:

   .file      "prog1.c"
   .section   .rodata
.LC0:
   .string    "Show Me!"
   .text
   .globl     main
   .type      main, @function
main:
.LFB0:
   .cfi_startproc
   pushq     %rbp
   .cfi_def_cfa_offset 16
   .cfi_offset 6, -16
   movq      %rsp, %rbp
   .cfi_def_cfa_register 6
   movl      $.LC0, %edi
   movl      $0,    %eax
   call      printf
   movl      $0,    %eax
   popq      %rbp
   .cfi_def_cfa  7, 8
   ret
   .cfi_endproc
.LFE0:
   .size     main, .-main
   .ident    "GCC: (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4"
   .section .note.GNU-stack,"",@progbits

If you are interested in seeing the assembly code of a C program, this command comes in quite handy.

  • If you want to see the source code just after preprocessing, much as -S lets you see the assembly, the command is:
    gcc -E prog1.c | less

    The output is generally quite large and is sent to standard output (the terminal) by default. However, it can be redirected to a file as follows:

    gcc -E prog1.c > prog1.pre
  • If you want to place the output in a named file, regardless of whether it is an assembly file, a preprocessed file, or an executable file, use the -o option:
    gcc -v prog1.c -o prog1

    Creates an executable file named prog1. The -v option stands for verbose; it makes gcc show the steps of the compilation as well.

    gcc -v -S prog1.c -o prog1.s

    Creates an assembly file named prog1.s.

    gcc -v -E prog1.c -o prog1.pre

    Creates a preprocessed file named prog1.pre.

  • If you want to control the dialect of the language, such as the ANSI standard (also known as ISO C90), C99, or C11, the commands to compile are as follows:
    gcc -ansi prog1.c -o prog1
    gcc -std=c11 prog1.c -o prog1

  • To enable the common warnings covered by -Wall, as well as additional warnings that -Wall does not check, the command is as follows:
    gcc -Wall -Wextra -std=c11 prog1.c -o prog1

Controlling Preprocessing

Preprocessing takes place before compilation proper: the preprocessor handles every directive in the code that begins with a '#' sign. There are options to control how the preprocessor behaves. For example, you can predefine a macro on the gcc command line.

#include<stdio.h>
int main()
{
   printf("The value of MAX is %d", MAX);
   return 0;
}

Observe that MAX is not defined in the code. The preceding code can be compiled as:

gcc -D MAX=10 prog1.c -o prog1

This is equivalent to writing #define MAX 10 in the source code.

To cancel a previous definition of a name, whether it is built in or was provided with the -D option, use the -U option. (The example program above would then fail to compile, because MAX is left undefined.)

gcc -U MAX prog1.c -o prog1

Controlling Linking and Directory Search

In the linking process, gcc looks for all the unresolved symbols in library files. You can direct the linker to search specific external libraries with the -l option.

To link against the POSIX threads library, for example, use -lpthread:

gcc thread_prog.c -o thread_prog  -lpthread

To link against the Xlib graphics routine library, use -lX11:

gcc graphics_prog.c -o graphics_prog  -lX11

If you want to control gcc's search path for header and library files in the course of compilation, it may be written as:

gcc -v prog1.c -o prog1 -I /usr/share/myheader_files -L /user/share/mylibrary_files

gcc will now also search for header files in the directory given with -I and for library files in the directory given with -L.

Some Other Frequently Used Options

To compile a single C file with debugging symbols for gdb and with warnings enabled, the command is:

gcc -g -Wall prog1.c -o prog1

To compile a single C++ file with optimization and warnings enabled, the command is:

g++ -O -Wall prog2.cpp -o prog2

To compile multiple source files into a single executable:

g++ prog2.cpp prog3.cpp prog4.cpp -o myapp

Or, it can be done separately as follows:

g++ -c prog2.cpp
g++ -c prog3.cpp
g++ -c prog4.cpp
g++ prog2.o prog3.o prog4.o -o myapp

A comprehensive description of all the options available with gcc or g++ can be obtained with the command:

man gcc

or

man g++

A brief summary can be obtained with

gcc --help

or

g++ --help

Conclusion

This article is not a comprehensive guide to using gcc, but it covers the basic compilation options along with their descriptions. gcc is quite versatile when it comes to controlling the compilation process. The gcc manual that comes with Linux is exhaustive, and it is easy to get lost in it, especially for novice programmers. In the course of programming, you will run into situations where the simple default settings are not good enough, so it pays to learn how to work with gcc sooner rather than later. This article covers the options that are used most frequently; as you grow as a programmer, the gcc manual is your ultimate guide. Happy coding!



About the Author

Manoj Debnath

manojdebnath@fastmail.fm
