CS50 - Week 6
Arrays
Arrays allow us to store elements of the same data type contiguously in memory. They have random access which is crucial to algorithms like binary search, but their size is fixed.
Linked list
Each node in a linked list allows us to store the element as well as a pointer to the next element. This allows the list to grow and shrink dynamically, but takes longer to access.
Hash Tables
Hash tables have the best of both worlds. Hash tables are arrays coupled with a function, the hash function. This function takes a piece of data, the key and outputs an integer, or hash value. This hash value maps the key to an index in the hash table. This determines where in the hash table to store and search for a key.
A simple hash function for strings is to store in an array of length 26 according to the first letter.
Collisions happen when different keys hash to the same value. E.g. apple and ant.
- Linear probing is one way to resolve this. This method simply places the key in the next available slot. This can lead to clustering with popular keys. Search times start degrading to O(n).
- Seperate chaining introduces linked lists with the hash table being an array of pointers. The worst case look up time now becomes O(n/k) where k is the size of the hash table assuming uniform distribution.
It is important to choose a hash function which minimises the chances of collisions occurring.
Good hash functions:
- make use of all information given by a key,
catandcaterpillarshould hash to different values - distribute output uniformly
- generate very differnent values for similar keys
- are fast (will be called frequently and don’t want this to impact performance gained by using a hash table)
Queues
A queue can be stored in an data type which stores elements in order such as an array or a linked list. We also need several methods to add an item to the end of the queue, enqueue, remove an item from the head of the queue, dequeue. Methods to return the size of the queue and if the queue is currently empty are also useful.
typedef struct
{
int head; // stores the index of the element currently at the head of the queue
int length;
int elements[CAPACITY];
}
queue;
bool dequeue(queue* q, int* element)
{
if (q->length > 0)
{
*element = q->elements[q->head];
q->head = (q->head + 1) % CAPACITY;
q->length--;
return true;
}
return false;
}
bool enqueue(queue* q, int element)
{
if (length < capacity)
{
int tail = (q->head + q->length) % CAPACITY;
q->elements[tail] = element;
q->length++;
return true;
}
return false;
}
Trees
Nodes can have parents and children. A leaf node doesn’t have any children and is at the outside edge of the tree. The root node is at the top of the tree and has no parent.
Binary trees
A binary tree is a specific sort of tree in which each node has at most 2 children.
struct node { int data; struct node* left; struct node* right; };
Binary search trees
A binary tree which is order to enable searching. The values in a node’s right sub tree are all greater than that node’s value and the values in a node’s left sub tree are all less than that node’s value.
Balance tree
A tree which has a minimal depth relative to the number of nodes.
bool search(node* root, int value)
{
while (root != NULL)
{
if (root->data == value)
return true;
else if (root-> data > value)
root = root->left;
else
root = root->right;
}
return false;
}
Dictionaries
A dictionary maps keys (usually strings) to values such as ints, chars, pointers to objects etc. This can be implemented using arrays, linked list, hash table, binary tree etc - whichever is most efficent.
Tries
struct node
{
// data
// may be boolean, definitions, pointer to an object etc
// pointers to other nodes
struct node* children[26];
}
The keys are never explicity held in the data structure. Rather values are held at the end of each key e.g. in the 4th node in the chain for the word “bath” the letter h spot would contain the value for “bath”.
Search and insert operations are proportional to the length of the word e.g. 4 operations for a word of length 4. In contrast tries take up a lot of space. Remember that even with large capacities available nowadays, optimisations are often in place to access memory in certain areas, so it pays to keep things as compact as possible.
Tries are uniquely useful for autocompletion e.g. as in google search.
- ← Previous
CS50 - Week 5 - Next →
CS50 - Week 7