The C-ompilation Process
A compiler converts a C program into an executable. This is a multi-stage process, and we are going to analyze it with a powerful tool called GCC, the “GNU Compiler Collection”. GCC is an integrated distribution of compilers for several major programming languages. These languages currently include C, C++, Objective-C, Objective-C++, Fortran, Ada, D, Go, and BRIG.
What goes on inside the compilation process?
There are four phases for a C program to become an executable:
- Preprocessing
- Compilation
- Assembly
- Linking
Let us explain what is happening behind the scenes every time we run the gcc command on a .c file.
First, we are going to create a C file using a text editor such as Vim or Emacs and save it as hello_world.c
#include <stdio.h>

int main(void)
{
    printf("Hello, world\n");
    return (0);
}
In order to obtain all intermediate files, we’ll use the command below:
$ gcc -Wall -save-temps hello_world.c -o hello_world
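Then four new files are generated:
- hello_world.i (preprocessed source)
- hello_world.s (assembly code)
- hello_world.o (object code)
- hello_world (executable)
Each of these files is the result of one stage in the compilation process. Let’s analyze them!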
Preprocessing is the first phase through which the source code passes. This phase includes:
- Removal of Comments
- Expansion of Macros
- Expansion of the included files.
- Conditional compilation
The preprocessed output is stored in hello_world.i. Looking inside this file, we can see that the preprocessor produces the contents of the stdio.h header file joined with the contents of our hello_world.c file, with any comments stripped out.
The second step is compilation: the hello_world.i file is compiled to produce another intermediate output called hello_world.s. In this stage the preprocessed code is translated into assembly instructions. We can still understand the result, because assembly is an intermediate, human-readable language.
Opening hello_world.s in a text editor such as Emacs shows that the file is written in assembly language, which the assembler can understand.
The third stage is assembly: an assembler translates the assembly instructions into object code. The output consists of actual instructions to be run by the target processor. At this phase, the code is converted into machine language. Let’s view this file using $ emacs hello_world.o
As we can see, this is totally unreadable for us.
Linking is the last stage in the compilation process. Here every function call is matched with its definition. The linker will arrange the pieces of object code so that functions in some pieces can successfully call functions in other ones. It will also add pieces containing the instructions for library functions used by the program. In the case of the “Hello, world” program, the linker will add the object code for the printf function.
The result of this stage (and the other three stages) is an executable program. In this case the final file is named hello_world. When gcc is run without the -o option, the file is named a.out by default, so consider going through the manual before using the gcc command.