C++ Compiling and Linking
The Process
$ g++ -o prog1 prog1.cpp
- Preprocessor
  - Copies the contents of included header files into the source file being compiled
  - Replaces symbolic constants defined with #define with their values
  - Use the -E option to stop after preprocessing: g++ -E prog1.cpp -o prog1.ii
- Compiled into Assembly
  - The expanded source code from the preprocessor is compiled into assembly language for the target platform
  - Use the -S option to stop after compiling: g++ -S prog1.cpp saves the output to prog1.s
- Assembly Code into Object Code
  - The assembly language source is assembled into object code (machine code in binary form)
  - Use the -c option to stop after assembly: g++ -c prog1.cpp saves the output to prog1.o
- Object Code Linked
  - The generated object code is linked with other object files and with the libraries that provide any library functions used
  - An executable is produced
Basic Example
Here is a basic C++ program. It clearly does not do much.
File: prog.cpp
int sum(int a, int b) {
    return a + b;
}
int main(int argc, char* argv[]) {
    int c = sum(1, 2);
    return 0;
}
Preprocess only:
$ g++ -E prog.cpp -o prog.ii
# 1 "prog.cpp"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "/usr/include/stdc-predef.h" 1 3 4
# 1 "<command-line>" 2
# 1 "prog.cpp"
int sum(int a, int b) {
    return a + b;
}
int main(int argc, char* argv[]) {
    int c = sum(1, 2);
    return 0;
}
To Assembly:
$ g++ -S prog.cpp -o prog.s
.file "prog.cpp"
.text
.globl _Z3sumii
.type _Z3sumii, @function
_Z3sumii:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movl %edi, -4(%rbp)
movl %esi, -8(%rbp)
movl -4(%rbp), %edx
movl -8(%rbp), %eax
addl %edx, %eax
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size _Z3sumii, .-_Z3sumii
.globl main
.type main, @function
main:
.LFB1:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
subq $32, %rsp
movl %edi, -20(%rbp)
movq %rsi, -32(%rbp)
movl $2, %esi
movl $1, %edi
call _Z3sumii
movl %eax, -4(%rbp)
movl $0, %eax
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE1:
.size main, .-main
.ident "GCC: (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609"
.section .note.GNU-stack,"",@progbits
To Object Code:
The result can be viewed with nm, which lists the symbols in an object file.
$ g++ -c prog.cpp -o prog.o
$ nm prog.o
0000000000000014 T main
0000000000000000 T _Z3sumii
C++ "mangles" function symbol names by encoding the parameter types into the name. This is how it supports function overloading (several functions with the same name but different parameters).
_Z3sumii is the mangled name of sum, and the "ii" portion corresponds to its int, int parameters.
We can see the de-mangled versions by passing the -C option to nm.
$ nm -C prog.o
0000000000000014 T main
0000000000000000 T sum(int, int)
To Link and Run:
Link and run (note that in this simple example nothing is linked beyond the standard startup code and C library, and nothing is output):
$ g++ prog.cpp -o a.out
$ ./a.out # outputs nothing but does not break so the executable is working
$ nm -C a.out
0000000000601030 B __bss_start
0000000000601030 b completed.7594
0000000000601020 D __data_start
0000000000601020 W data_start
0000000000400410 t deregister_tm_clones
0000000000400490 t __do_global_dtors_aux
0000000000600e18 t __do_global_dtors_aux_fini_array_entry
0000000000601028 D __dso_handle
0000000000600e28 d _DYNAMIC
0000000000601030 D _edata
0000000000601038 B _end
0000000000400594 T _fini
00000000004004b0 t frame_dummy
0000000000600e10 t __frame_dummy_init_array_entry
00000000004006f0 r __FRAME_END__
0000000000601000 d _GLOBAL_OFFSET_TABLE_
w __gmon_start__
00000000004005a4 r __GNU_EH_FRAME_HDR
0000000000400390 T _init
0000000000600e18 t __init_array_end
0000000000600e10 t __init_array_start
00000000004005a0 R _IO_stdin_used
w _ITM_deregisterTMCloneTable
w _ITM_registerTMCloneTable
0000000000600e20 d __JCR_END__
0000000000600e20 d __JCR_LIST__
w _Jv_RegisterClasses
0000000000400590 T __libc_csu_fini
0000000000400520 T __libc_csu_init
U __libc_start_main@@GLIBC_2.2.5
00000000004004ea T main
0000000000400450 t register_tm_clones
00000000004003e0 T _start
0000000000601030 D __TMC_END__
00000000004004d6 T sum(int, int)