# C++ Compiling and Linking

## The Process

```shell
$ g++ -o prog1 prog1.cpp
```

1. Preprocessor
    * Copies the contents of included header files into the source code file being compiled
    * Replaces symbolic constants defined with `#define` with their values
    * Use the `-E` option to stop after preprocessing: `g++ -E prog1.cpp -o prog1.ii`
2. Compiled into Assembly
    * The expanded source code from the preprocessor is compiled into assembly language for the target platform
    * Use the `-S` option to stop after compiling: `g++ -S prog1.cpp` will save to `prog1.s`
3. Assembly Code into Object Code
    * The assembly language source code is assembled into object code (binary machine code)
    * Use the `-c` option to stop after assembly: `g++ -c prog1.cpp` will save to `prog1.o`
4. Object Code Linked
    * The generated object code is linked with other object code files, including those for any library functions used
    * An executable is produced

## Basic Example

Here is a basic C++ program. It clearly does not do much.

File: `prog.cpp`

```cpp
int sum(int a, int b) {
    return a + b;
}

int main(int argc, char* argv[]) {
    int c = sum(1, 2);
    return 0;
}
```

### Preprocess only:

```shell
$ g++ -E prog.cpp -o prog.ii
```

```cpp
# 1 "prog.cpp"
# 1 ""
# 1 ""
# 1 "/usr/include/stdc-predef.h" 1 3 4
# 1 "" 2
# 1 "prog.cpp"
int sum(int a, int b) {
    return a + b;
}

int main(int argc, char* argv[]) {
    int c = sum(1, 2);
    return 0;
}
```

### To Assembly:

```shell
$ g++ -S prog.cpp -o prog.s
```

```text
    .file   "prog.cpp"
    .text
    .globl  _Z3sumii
    .type   _Z3sumii, @function
_Z3sumii:
.LFB0:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    movl    %edi, -4(%rbp)
    movl    %esi, -8(%rbp)
    movl    -4(%rbp), %edx
    movl    -8(%rbp), %eax
    addl    %edx, %eax
    popq    %rbp
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
.LFE0:
    .size   _Z3sumii, .-_Z3sumii
    .globl  main
    .type   main, @function
main:
.LFB1:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    subq    $32, %rsp
    movl    %edi, -20(%rbp)
    movq    %rsi, -32(%rbp)
    movl    $2, %esi
    movl    $1, %edi
    call    _Z3sumii
    movl    %eax, -4(%rbp)
    movl    $0, %eax
    leave
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
.LFE1:
    .size   main, .-main
    .ident  "GCC: (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609"
    .section    .note.GNU-stack,"",@progbits
```

### To Object Code:

Viewed with `nm`, which lists the symbols in object files.

```shell
$ g++ -c prog.cpp -o prog.o
$ nm prog.o
0000000000000014 T main
0000000000000000 T _Z3sumii
```

C++ "mangles" function symbol names, encoding each function's parameter types into its symbol. This is how it supports function overloading (several functions with the same name but different parameters). `_Z3sumii` is the mangled name of `sum`, and the "ii" portion corresponds to its `int, int` parameters. We can see the demangled names by adding the `-C` option to `nm`.

```shell
$ nm -C prog.o
0000000000000014 T main
0000000000000000 T sum(int, int)
```

### To Link and Run:

Linked and run (note that nothing beyond the default runtime is linked, and nothing is output, in this simple example):

```shell
$ g++ prog.cpp -o a.out
$ ./a.out
# outputs nothing, but runs without error, so the executable is working
```

```shell
$ nm -C a.out
0000000000601030 B __bss_start
0000000000601030 b completed.7594
0000000000601020 D __data_start
0000000000601020 W data_start
0000000000400410 t deregister_tm_clones
0000000000400490 t __do_global_dtors_aux
0000000000600e18 t __do_global_dtors_aux_fini_array_entry
0000000000601028 D __dso_handle
0000000000600e28 d _DYNAMIC
0000000000601030 D _edata
0000000000601038 B _end
0000000000400594 T _fini
00000000004004b0 t frame_dummy
0000000000600e10 t __frame_dummy_init_array_entry
00000000004006f0 r __FRAME_END__
0000000000601000 d _GLOBAL_OFFSET_TABLE_
                 w __gmon_start__
00000000004005a4 r __GNU_EH_FRAME_HDR
0000000000400390 T _init
0000000000600e18 t __init_array_end
0000000000600e10 t __init_array_start
00000000004005a0 R _IO_stdin_used
                 w _ITM_deregisterTMCloneTable
                 w _ITM_registerTMCloneTable
0000000000600e20 d __JCR_END__
0000000000600e20 d __JCR_LIST__
                 w _Jv_RegisterClasses
0000000000400590 T __libc_csu_fini
0000000000400520 T __libc_csu_init
                 U __libc_start_main@@GLIBC_2.2.5
00000000004004ea T main
0000000000400450 t register_tm_clones
00000000004003e0 T _start
0000000000601030 D __TMC_END__
00000000004004d6 T sum(int, int)
```
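
Returning to the name mangling seen in the object code step, the sketch below shows why it is needed for overloading. It assumes a hypothetical file `overload.cpp`, which is not part of the original example. The second `sum` overload and the `sum_c` function are made-up names added only for illustration. There is no `main` because a file compiled with `-c` does not need one.

```cpp
// overload.cpp (hypothetical file name, used only for illustration)

// Two overloads share the name `sum`. Their parameter types differ,
// so the compiler has to emit a distinct symbol for each one.
int sum(int a, int b) {
    return a + b;
}

double sum(double a, double b) {
    return a + b;
}

// extern "C" requests C linkage, which turns off mangling for this
// function. `sum_c` is a made-up name for this sketch.
extern "C" int sum_c(int a, int b) {
    return a + b;
}
```

Compiling this with `g++ -c overload.cpp -o overload.o` and running `nm overload.o` should, on a GCC/Linux toolchain like the one used above, list `_Z3sumii` and `_Z3sumdd` for the two overloads and a plain `sum_c` for the C-linkage function. Other platforms may use a different mangling scheme.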