Interesting things about ASCII

Reference: https://cdn.cs50.net/2015/x/psets/0/pset0/pset0.html

  • 0 is 0000000, the NUL character, which C also uses as the string termination character (written \0)
  • 127 is 1111111, the DEL character: on paper tape, a mistake could be overwritten by punching holes in all seven positions, i.e. 1111111
  • 1 to 31, the 31 characters after NUL, are control characters
  • 32 is a space, followed by several other punctuation characters
  • 48 to 57 are the decimal digits, where 48 is 0 and is represented by 0110000 in binary, so that the last 4 bits of each digit follow its binary equivalent
  • 65 to 90 are the upper case characters where 65 is A and represented by 1000001 in binary
  • 97 to 122 are the lower case characters, where 97 is a and is represented by 1100001 in binary, so that conversion between upper case and lower case can be done by flipping just one bit
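
The bit patterns above can be tried out directly in C. The following is a minimal sketch; the names ASCII_CASE_BIT, is_ascii_letter, flip_case and digit_value are made up for illustration and assume plain ASCII input.

```c
#include <stdbool.h>

/* 'A' is 1000001 and 'a' is 1100001: only the bit with value 32 (0x20) differs.
   ASCII_CASE_BIT is a hypothetical name for that bit. */
#define ASCII_CASE_BIT 0x20

/* True if c is an ASCII letter, upper or lower case. */
bool is_ascii_letter(char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}

/* Flip the case of an ASCII letter by toggling one bit.
   Any other character is returned unchanged. */
char flip_case(char c)
{
    return is_ascii_letter(c) ? (char)(c ^ ASCII_CASE_BIT) : c;
}

/* Value of a decimal digit character: the low 4 bits of '0'..'9' are 0..9.
   Returns -1 if c is not a digit. */
int digit_value(char c)
{
    return (c >= '0' && c <= '9') ? (c & 0x0F) : -1;
}
```

For example, flip_case('A') gives 'a' and digit_value('7') gives 7.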

Compilers

For clang there are 4 major steps done by the compiler:

  1. Preprocessing
  2. Compilation
  3. Assembling
  4. Linking

Preprocessing

The preprocessor handles the preprocessor directives, i.e. lines beginning with a #, e.g. #include and #define. To stop clang after the preprocessing stage, pass the -E flag.

clang -E hello.c > hello2.c
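
To see what the preprocessor does with #define, a small file like the sketch below can be run through clang -E. The macros ROWS and SQUARE and both functions are hypothetical examples, not part of any real hello.c.

```c
/* Hypothetical macros: the preprocessor replaces each use with its body
   before the compiler proper ever sees the code. */
#define ROWS 4
#define SQUARE(x) ((x) * (x))

/* After preprocessing, the return statement reads: return ((4) * (4)); */
int cells_in_grid(void)
{
    return SQUARE(ROWS);
}

/* The parentheses in the macro body matter: SQUARE(n + 1) expands to
   ((n + 1) * (n + 1)), not to n + 1 * n + 1. */
int square_of_next(int n)
{
    return SQUARE(n + 1);
}
```

In the preprocessed output the #define lines are gone and every use of ROWS and SQUARE has been replaced by plain C text.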

Compilation

The compiler proper translates the preprocessed C program into assembly code. To stop clang after the compilation stage, pass the -S flag. Assembly code is very processor specific.

clang -S hello2.c       # outputs hello2.s

Assembling

The assembler takes assembly code and outputs machine code. Pass clang the -c flag to only go as far as assembling.

clang -c hello2.s       # or clang -c hello2.c, or clang -c hello.c

This is more of a translation than a compilation and it is relatively easy to translate from assembly to machine code with a translation table.

Linking

Finally, linking combines all object files into one file which can be executed. This is very system dependent. By specifying a series of .o files to clang, it won't need to preprocess, compile or assemble these files again. If the source code uses functions from a library such as the math library (declared in math.h), the library can be linked in using -lm:

clang hello2.o -lm -o hello2

Note that -l flags should come after the source or object files that use them, so in practice they go at the end of the command.

Libraries

C libraries are typically distributed in two parts: a header file declaring all the functions in the library, and a binary file containing the compiled implementations of those functions. Source code is not usually included. To use a library in another program, #include the header file in the source code and then link in the binary at the linking stage.

Header files for standard libraries are usually found in /usr/include and are included using angle brackets e.g. #include <stdio.h>.

Header files for other libraries are included using quotes e.g. #include "myotherlibrary.h".
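
The split between header and implementation can be sketched in a single file. Everything here is hypothetical: the functions clamp and brightness_percent are invented for illustration, and in a real library the three parts would live in separate files.

```c
/* ---- What would live in myotherlibrary.h: declarations only ---- */
#ifndef MYOTHERLIBRARY_H
#define MYOTHERLIBRARY_H

int clamp(int value, int low, int high);   /* hypothetical library function */

#endif /* MYOTHERLIBRARY_H */

/* ---- What would live in the compiled library binary: the implementation ---- */
int clamp(int value, int low, int high)
{
    if (value < low)
    {
        return low;
    }
    if (value > high)
    {
        return high;
    }
    return value;
}

/* ---- What a program using the library would contain ---- */
/* In a separate file this part would begin with: #include "myotherlibrary.h" */
int brightness_percent(int raw)
{
    return clamp(raw, 0, 100);
}
```

The calling code only needs the declaration to compile; the implementation is only needed once the linker runs.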

Clang

A typical call to build a source file is:

clang -ggdb3 -O0 -std=c99 -Wall -Werror mario.c -lcs50 -lm -o mario
  • clang is the name of the compiler (gcc is an alternative)
  • -ggdb3 includes debugging information in the binary so that the program can be debugged with e.g. gdb
  • -O0 turns off optimisation, which keeps the compiled code close to the source while debugging
  • -std=c99 compiles the code according to the C99 standard
  • -Wall enables a broad set of common warning messages
  • -Werror treats warnings as errors, so the build fails if there are any warnings (the default behaviour without this is to still build the code provided there are no errors)
  • mario.c is the name of the source code file to compile
  • -lxxx links in any other libraries required, e.g. -lcs50 links in the CS50 library and -lm links in the math library
  • -o mario specifies that the executable should be named mario

Precedence

  • Inner Parentheses Outward
  • i++, i--
  • *x, &x, ++i, --i
  • *, /, %
  • +, -
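
A few of these rules can be checked with a short C sketch. The function names below are hypothetical and exist only to isolate one rule each.

```c
/* Multiplication binds tighter than addition: 2 + 3 * 4 is 2 + (3 * 4). */
int mul_before_add(void)
{
    return 2 + 3 * 4;
}

/* Parentheses are evaluated first and override the default order. */
int parens_first(void)
{
    return (2 + 3) * 4;
}

/* Postfix ++ binds tighter than unary *, so *p++ means *(p++):
   read the current element, then advance the pointer. */
int read_then_advance(void)
{
    int values[] = {10, 20, 30};
    int *p = values;
    int first = *p++;      /* first is 10, p now points at 20 */
    return first + *p;     /* 10 + 20 */
}

/* (*p)++ increments the value pointed to, not the pointer itself. */
int increment_target(void)
{
    int values[] = {10, 20, 30};
    int *p = values;
    (*p)++;                /* values[0] becomes 11 */
    return values[0];
}
```

Here mul_before_add() returns 14 while parens_first() returns 20, and read_then_advance() returns 30 while increment_target() returns 11.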