CS50 - Week 3
Asymptotic Notation / Big O Notation
O() defines the upper bound of number of operations
O(n^2)
- bubble sort takes (n^2 - n)/2 operations to sort n elements, which is approx n^2
O(n)
- finding the biggest number in a list of size n takes at most n operations
O(logn)
- binary search on a sorted list takes logn operations (i.e. divide and conquer)
O(1)
- constant time - this indicates a constant number of operations independent of n
Ω() defines the lower bound of the number of operations
Ω(n)
- bubble sort takes a minimum of n operations if the list is already sorted
Ω(1)
- binary search takes a minimum of 1 operation if we end up at the item on the first attempt
Θ() can be used where O() and Ω() are the same.
e.g. selection sort without any kind of optimisation - i.e. it keeps looking for the smallest element each time and moving it to the end of the sorted portion; even if the list is already sorted it still takes approx n^2 steps, so it is Θ(n^2).
Bogosort (aka stupid sort) shuffles a set of cards, then checks if they are sorted ad infinitum.
Some O(n^2) algorithms may be faster than O(n) algorithms for small data sets, but as the data set grows, the O(n^2) algorithm will always end up taking longer and can really slow down your program.
GDB
run - r
- run the program / restart the program
break [line/function] - b
[line/function] - set a breakpoint
b main - set a breakpoint on function main
b prog.c:14 - set a breakpoint on line 14
continue - c
- continue and break on next break point
next - n
- step over
step - s
- step into
list - l
- print the surrounding code - repeat the command to keep listing more of the program
l 12 - show the program around line 12
print [variable] - p
variable - print a variable
print [variable] = [value] - set a variable e.g. p i = 2 will set i to 2
info locals - prints local vars
display [variable] - prints the variable's value after every step for the rest of the session
[Enter] - pressing Enter on its own executes the previous command again
make xxx - make the program (having updated e.g. a source file)
disable - disable all breakpoints
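A rough example session using the commands above, assuming a program called buggy with a local variable i (both hypothetical names):
make buggy
gdb ./buggy
(gdb) b main          # stop at the start of main
(gdb) r               # run until the breakpoint is hit
(gdb) n               # step over the next line (repeat as needed)
(gdb) p i             # inspect the current value of i
(gdb) info locals     # inspect all local variables
(gdb) c               # continue to the next breakpoint or to the end
(gdb) q               # quit gdb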
Linear Search
The only way of finding something in an unsorted array: check each element in turn until the target is found or the end of the array is reached. Running time is O(n), with a best case of Ω(1) if the target happens to be the first element.
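A minimal sketch in C (linear_search, n and target are illustrative names; returns the index of the target, or -1 if it isn't there):
// Check each element in turn until we find the target
int linear_search(int values[], int n, int target)
{
    for (int i = 0; i < n; i++)
    {
        if (values[i] == target)
        {
            return i;
        }
    }
    // Target is not in the array
    return -1;
}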
Binary Search
Binary search searches for an item in a sorted list by first checking the middle element. If this isn't what we are looking for, check if our element is bigger or smaller than this element. If bigger, ignore the left half of the list and repeat the same steps on the right side. If smaller, ignore the right half of the list and repeat the same steps on the left side.
Elements either need to be sorted first OR need to be kept sorted at all times.
Running time is O(logn) [where the base doesn't matter for this problem] and constant time of Ω(1) for the best case scenario where you find the element first time.
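A minimal iterative sketch in C (binary_search is an illustrative name; assumes values[] is already sorted in ascending order and returns the index of the target, or -1):
int binary_search(int values[], int n, int target)
{
    int low = 0;
    int high = n - 1;
    while (low <= high)
    {
        // Check the middle element of the remaining range
        int mid = low + (high - low) / 2;
        if (values[mid] == target)
        {
            return mid;
        }
        else if (values[mid] < target)
        {
            // Target is bigger, so ignore the left half
            low = mid + 1;
        }
        else
        {
            // Target is smaller, so ignore the right half
            high = mid - 1;
        }
    }
    return -1;
}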
Binary search tree
An easy way to keep all elements sorted at all times is to use a binary search tree.
- The left subtree of a node only contains values which are less than or equal to a node's value.
- The right subtree of a node only contains values which are greater than or equal to a node's value.
- Both left and right subnodes are also binary search trees.
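A sketch of how such a tree might be represented and searched in C (node is an illustrative struct name):
#include <stdbool.h>
#include <stdlib.h>

typedef struct node
{
    int value;
    struct node *left;
    struct node *right;
}
node;

// Returns true if target is in the tree rooted at root
bool search(node *root, int target)
{
    if (root == NULL)
    {
        return false;
    }
    else if (target < root->value)
    {
        // Smaller values can only be in the left subtree
        return search(root->left, target);
    }
    else if (target > root->value)
    {
        // Larger values can only be in the right subtree
        return search(root->right, target);
    }
    else
    {
        return true;
    }
}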
Uses
- Find the minimum by going all the way to the left
- Find the maximum by going all the way to the right
- Find any number by going to the left or right according to if the number is less than or greater than the current node
- Print out all elements by performing an in order traversal:
- print everything out in the left subtree
- print out the node itself
- print everything out in the right subtree
- Insert an element by comparing to the current node, going left or right until we get to a leaf node, then add the element either to the left or right subtree of this node (a sketch of insertion and in-order printing follows this list)
- Delete an element
- if an element is a leaf node, it can be deleted
- if an element has only one child node we need to connect the child node to the parent of the element before deleting the element
- if an element has two child nodes, first find the node with the next largest value, swap these two, then delete the element (note that the next-largest node can't have two children itself, otherwise its left child would be a closer candidate for the next largest value)
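A sketch of in-order printing and insertion in C, reusing the node struct from the earlier sketch (here values equal to a node are sent to the left - an arbitrary choice):
#include <stdio.h>
#include <stdlib.h>

// Print every value in ascending order: left subtree, then the node, then the right subtree
void print_tree(node *root)
{
    if (root == NULL)
    {
        return;
    }
    print_tree(root->left);
    printf("%i\n", root->value);
    print_tree(root->right);
}

// Insert a value by walking left or right until we fall off the tree, then attach a new node there
node *insert(node *root, int value)
{
    if (root == NULL)
    {
        node *n = malloc(sizeof(node));
        if (n != NULL)
        {
            n->value = value;
            n->left = NULL;
            n->right = NULL;
        }
        return n;
    }
    if (value <= root->value)
    {
        root->left = insert(root->left, value);
    }
    else
    {
        root->right = insert(root->right, value);
    }
    return root;
}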
Considerations
There are many ways to build a binary search tree, some better than others. A decent tactic is just to shuffle the input before building the tree.
Bubble Sort
- Sort elements by comparing first two elements in the list and swapping if not in order, then compare second and third elements in list etc.
- Sort is so called since large elements bubble up to the top quickly
- After n iterations, the largest n elements will be sorted - this can be used to optimise the sort so that the last element is not checked after the first iteration, the last 2 elements are not checked after the second iteration, etc.
- Worst case scenario is that the list is backwards, in which case n-1 iterations are needed
- Best case scenario is that the list is already sorted and only 1 iteration is needed
- Runtime is O(n^2) but fine if you have a small list / want a simple implementation
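A sketch in C with both optimisations above - each pass ignores the already-sorted tail, and the sort stops early if a full pass makes no swaps (bubble_sort and n are illustrative names):
#include <stdbool.h>

void bubble_sort(int values[], int n)
{
    // After each pass the largest remaining element has bubbled to the end,
    // so each pass can stop one element earlier than the last
    for (int end = n - 1; end > 0; end--)
    {
        bool swapped = false;
        for (int i = 0; i < end; i++)
        {
            if (values[i] > values[i + 1])
            {
                // Swap adjacent elements that are out of order
                int tmp = values[i];
                values[i] = values[i + 1];
                values[i + 1] = tmp;
                swapped = true;
            }
        }
        // No swaps means the list is already sorted
        if (!swapped)
        {
            return;
        }
    }
}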
Insertion Sort
- Sort elements by dividing into two halves, a sorted half and an unsorted half. Gradually move elements from the unsorted half to the sorted half until they are sorted.
- At first all elements are in the unsorted half
- Then take the left most element and make this the start and end of the sorted half
- Take the first element of the unsorted portion, compare it to the element(s) in the sorted portion and insert it in the correct place
- Now take the new first element of the unsorted portion and compare it to the last element of the sorted portion; if it is smaller, shift that element up one place. If it is also smaller than the next sorted element, keep shifting until the correct position is found, then insert it.
- Continue until the whole list is sorted, always inserting elements to keep the sorted portion sorted
- Runtime is O(n^2) with best case Ω(n)
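A sketch in C (insertion_sort and n are illustrative names):
void insertion_sort(int values[], int n)
{
    // values[0 .. i-1] is the sorted portion, values[i .. n-1] is unsorted
    for (int i = 1; i < n; i++)
    {
        int element = values[i];
        int j = i - 1;
        // Shift larger sorted elements up one place to make room
        while (j >= 0 && values[j] > element)
        {
            values[j + 1] = values[j];
            j--;
        }
        // Drop the element into the gap
        values[j + 1] = element;
    }
}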
Selection Sort
- As with insertion sort we start with a list of unsorted elements and one by one move them to a sorted list until all elements are in the sorted list and therefore sorted
- Instead of taking left most elements and inserting them in the correct place in the sorted half, we instead select the elements from the unsorted half carefully e.g. select the smallest element each time and move this to the sorted half.
- When moving to the sorted half it can simply be appended
- Elements can be moved by swapping the next smallest element with the left most element of the unsorted half
- Run time is O(n^2) with best case also of Ω(n^2) since it still needs to check every element even in a sorted list (i.e. on each pass through the array we only look for the smallest element and move it to its correct position - there is no checking if everything is already in position)
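A sketch in C (selection_sort and n are illustrative names):
void selection_sort(int values[], int n)
{
    for (int i = 0; i < n - 1; i++)
    {
        // Find the smallest element in the unsorted portion values[i .. n-1]
        int min = i;
        for (int j = i + 1; j < n; j++)
        {
            if (values[j] < values[min])
            {
                min = j;
            }
        }
        // Swap it with the left-most element of the unsorted portion,
        // which appends it to the sorted portion
        int tmp = values[i];
        values[i] = values[min];
        values[min] = tmp;
    }
}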
Quick Sort
Tends to be quicker than bubble sort and insertion sort, but not always - still O(n^2) in worst case scenarios, but more typically has a run time of O(n log n).
Take the right most element - this becomes the pivot. We now need to work up the list comparing each element to the pivot, putting elements smaller than the pivot on one side and elements larger than the pivot on the other side. Finally, place the pivot in the middle of these two sides. The pivot is now in the right place. Recursively do the same again for each side of the pivot. If one side only has one element, this side is now sorted.
In the best case the pivot ends up in the middle of the list of elements and we truly divide and conquer. In the worst case the pivot is always the smallest or largest element, so one side is empty and we keep comparing against all the remaining elements.
To improve this, we can cut down on the probability that the pivot is high or low by selecting multiple pivots and choosing the median value.
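A sketch in C using the right-most element as the pivot, as described above (quick_sort is an illustrative name; call it with quick_sort(values, 0, n - 1)):
void quick_sort(int values[], int lo, int hi)
{
    // A sub-list of one element (or none) is already sorted
    if (lo >= hi)
    {
        return;
    }

    // Partition: move everything smaller than the pivot to the left side
    int pivot = values[hi];
    int i = lo;
    for (int j = lo; j < hi; j++)
    {
        if (values[j] < pivot)
        {
            int tmp = values[i];
            values[i] = values[j];
            values[j] = tmp;
            i++;
        }
    }

    // Place the pivot between the two sides - it is now in its final position
    int tmp = values[i];
    values[i] = values[hi];
    values[hi] = tmp;

    // Recursively sort each side of the pivot
    quick_sort(values, lo, i - 1);
    quick_sort(values, i + 1, hi);
}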
Arrays and functions
Declare a function which takes an array as a parameter using:
void sort(int values[]);
When passing a one-dimensional array like this, we don't need to specify the array's size inside the square brackets.
Call this function using:
int haystack[10];
// .. fill the array
sort(haystack);
Arrays are effectively passed by reference here: the array name becomes a pointer to its first element, so changes made inside the function are visible to the caller.
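A small sketch showing the effect (double_all and the extra length parameter n are illustrative, not from the lecture): the changes made inside the function persist in main.
#include <stdio.h>

// Doubles every element of the array it is given
void double_all(int values[], int n)
{
    for (int i = 0; i < n; i++)
    {
        values[i] *= 2;
    }
}

int main(void)
{
    int haystack[3] = {1, 2, 3};
    double_all(haystack, 3);
    // Prints 2 4 6 - the caller's array was changed inside the function
    printf("%i %i %i\n", haystack[0], haystack[1], haystack[2]);
}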