diff --git a/cheatsheets/cs2040s/finals.md b/cheatsheets/cs2040s/finals.md
index 52086b2..89a1c7a 100644
--- a/cheatsheets/cs2040s/finals.md
+++ b/cheatsheets/cs2040s/finals.md
@@ -2,12 +2,16 @@
| Algorithm | Best Time Complexity | Average Time Complexity | Worst Time Complexity | InPlace | Stable |
| --------- | ------------------------ | -------------------------- | --------------------- | ------- | ------ |
| Bubble | $O(n)$ comp, $O(1)$ swap | $O(n^2)$ comp, swap | $O(n^2)$ comp, swap | Yes | Yes |
-| Selection | $O(n^2)$, $O(1)$ swap | $O(n^2)$ comp, $O(n)$ swap | $O(n^2)$, O(n) swap | Yes |
+| Selection | $O(n^2)$, $O(1)$ swap | $O(n^2)$ comp, $O(n)$ swap | $O(n^2)$, $O(n)$ swap | Yes | No |
| Insertion | $O(n)$, $O(1)$ swap | $O(n^2)$ comp, swap | $O(n^2)$ comp, swap | Yes | Yes |
| Merge | $O(n \log(n))$ | $O(n \log(n))$ | $O(n \log(n))$ | No | Yes |
| Quick | $O(n \log(n))$ | $O(n \log(n))$ | $O(n^2)$ | Yes | No |
| Counting | $O(n + k)$ | $O(n + k)$ | $O(n + k)$ | No | Yes |
| Radix | $O(nk)$ | $O(nk)$ | $O(nk)$ | No | Yes |
+
+Counting: $O(N + k)$, where $k$ is the range of key values
+
+Radix: $O(w \times (N + k))$: treat each item to sort as a string of $w$ digits and sort digit by digit from the rightmost (using stable counting sort); $k$ is the radix
### Bubble
Invariant: After `i` iterations, the largest `i` elements are sorted at the back of the array
```java
@@ -21,7 +25,6 @@ static void sort(int[] input) {
      if (input[i] > input[i+1]) swap(input, i, i+1);
    }
  }
}
```
-
### Selection
Invariant: After `i` iterations, the smallest `i` elements are sorted to the front of the array
```java
@@ -35,10 +38,8 @@ static void sort(int[] input) {
```
### Insertion
Invariant: At the end of the kth iteration, the leftmost `k` elements are sorted (but not necessarily in their final positions)
-
1. Outer loop executes N-1 times
2.
   Inner loop runs $O(n)$ times if the array is reverse sorted
-
```java
static void sort(int[] arr) {
  // iterate through each element in the array
@@ -79,15 +80,274 @@ void quickSort(array A, int low, int high) {
}
```
## Linked List
-
+- Stack: Last In First Out, push and pop from head
+- Queue: First In First Out, push to tail, pop from head
## Binary Heap
-## Union Find
+- Height: $\lfloor\log_2 N\rfloor$, left child: $2n$, right child: $2n+1$, parent: $\lfloor n/2 \rfloor$
+- Min comparisons: $n-1$, min swaps: 0
+- Max comparisons: $2n - 2(\text{number of 1 bits}) - 2(\text{trailing 0 bits})$
+- Max swaps: repeatedly integer-divide $n$ by 2 until the value reaches 1, then sum all the quotients
+
+```pseudocode
+insert(v) {
+  A.append(v)
+  bubbleUp(A.length - 1)
+}
+bubbleUp(i) {
+  while (i > 1 and A[i] > A[parent(i)]) {
+    swap(A[i], A[parent(i)]);
+    i = parent(i);
+  }
+}
+bubbleDown(i) {
+  while (i < A.length)
+    if A[i] < (L = index of larger of i's children) {
+      swap(A[i], A[L]); i = L
+    } else break
+}
+extractMax() {
+  A[1] = A[A.length-1]; A.length--;
+  bubbleDown(1)
+}
+optimised heapify() {
+  for (i = A.length/2; i >= 1; --i) bubbleDown(i);
+}
+```
+- Change value at index
+  - Update value
+  - If new value > old value, then bubbleUp, else bubbleDown
+- Root element is the largest, 2nd largest element is a child of the root, 3rd largest doesn't have to be
+
+### Invariants
+- Every vertex is greater than every value in its children's subtrees (max heap)
+- Must be a complete binary tree
+- Height is always $\lfloor\log_2 N\rfloor$, as every level except the last has to be filled
+## Union Find $O(\alpha(N))$
+- A tree of height $h$ has at least $2^h$ nodes (with union by rank)
+- Keep track of rank[i], an upper bound on the height of the subtree rooted at vertex i
+- `findSet(i)`: From vertex i, recursively go up the tree until the root of the tree.
+- Path compression is applied after each call of `findSet`
+- `isSameSet(i,j)`: Check if `findSet(i) == findSet(j)`
+- `unionSet(i,j)`: Shorter tree is joined to the taller tree (rank comes into play)
+```python
+def union(a, b):
+    ap = findSet(a)
+    bp = findSet(b)
+    if rank[ap] < rank[bp]: parent[ap] = bp
+    elif rank[bp] < rank[ap]: parent[bp] = ap
+    else: parent[bp] = ap; rank[ap] += 1
+```
## Hash Table
+The hash table size (the modulus M) should be a large prime about 2x the expected number of keys
+```python
+def hash_string(s):
+    h = 0
+    for c in s: h = (h * 26 + ord(c)) % M
+    return h
+```
+- Open Addressing: all keys in a single array, collisions resolved by probing alternative addresses
+- Closed Addressing (separate chaining): append collided keys to an auxiliary list per slot
+### Linear
+`i = (h(v) + k*1) % M`
+- On deletion, mark the slot with a DELETED tombstone so probing knows the values after it are still reachable
+- If the values tend to cluster, linear probing performs badly
+### Quadratic
+`i = (h(v) + k*k) % M`
+- One issue is that probing might cycle forever without reaching an empty slot
+- If load factor $\alpha < 0.5$ and M is prime, then we can always find an empty slot
+### Double Hashing
+`i = (h(v) + k*h2(v)) % M`, where h2 usually uses a smaller prime than M.
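+The probing schemes above can be sketched as a tiny open-addressing table with double hashing; the table size `M = 11` and the secondary hash `h2(v) = 7 - v % 7` are illustrative assumptions, not values from the notes:
+```python
+M = 11                      # table size: a prime (illustrative assumption)
+DELETED = object()          # tombstone marker for lazy deletion
+table = [None] * M
+
+def h(v): return v % M                 # primary hash
+def h2(v): return 7 - (v % 7)          # secondary hash: nonzero and coprime to M
+
+def insert(v):
+    for k in range(M):                 # probe sequence: (h(v) + k*h2(v)) % M
+        i = (h(v) + k * h2(v)) % M
+        if table[i] is None or table[i] is DELETED:
+            table[i] = v               # reuse empty slots and tombstones
+            return i
+    raise RuntimeError("table full")
+
+def search(v):
+    for k in range(M):
+        i = (h(v) + k * h2(v)) % M
+        if table[i] is None: return -1  # truly empty slot: v cannot be further along
+        if table[i] == v: return i      # found; tombstones are probed past
+    return -1
+```
+For example, 4 and 15 collide under the primary hash (both map to slot 4 mod 11), but the probe sequence places them in different slots.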
## Binary Search Trees
+- Height: between $O(\log_2 N)$ (balanced) and $O(N)$ (degenerate chain)
+- BST Property: For vertex X, all vertices in the left subtree of X are strictly smaller than X and all vertices in the right subtree are strictly greater than X
+- `minHeight=floor(log2(n))`
+- `maxHeight=n-1`
+
+```python
+def search(node, target):
+    if node == None: return None
+    if node.val == target: return node
+    if node.val < target: return search(node.right, target)
+    return search(node.left, target)
+def left(node): return node if node.left == None else left(node.left)
+def successor(node):
+    if node.right != None: return left(node.right)
+    p, T = node.parent, node
+    while p != None and T == p.right:
+        T, p = p, p.parent
+    return p  # None if node is the maximum
+def remove(node, key):
+    if node == None: return None
+    if node.val < key:
+        node.right = remove(node.right, key)
+    elif node.val > key:
+        node.left = remove(node.left, key)
+    elif node.left == None and node.right == None:  # leaf
+        node = None
+    elif node.left == None:  # only right child
+        node.right.parent = node.parent
+        node = node.right
+    elif node.right == None:  # only left child
+        node.left.parent = node.parent
+        node = node.left
+    else:  # both children exist, find successor
+        succ = successor(node)
+        node.val = succ.val  # replace this value with the successor's
+        node.right = remove(node.right, succ.val)  # delete the successor
+    return node
+```
+```python
+def validBst(node, minV, maxV):
+    if not node: return True
+    if node.val <= minV or node.val >= maxV:
+        return False
+    left = validBst(node.left, minV, node.val)
+    right = validBst(node.right, node.val, maxV)
+    return left and right
+```
## BBST AVL Tree
+- each node augmented with height
+- height: $h < 2\log_2(N)$
+- `height=-1` (if empty tree), `height=max(left.h, right.h)+1`, computed at the end of each `insert(v)` or `remove(v)` operation
+- `BF(n)=left.h-right.h`
+- height balanced IF $|$left.h $-$ right.h$| \leq 1$
+```python
+def h(node): return node.height if node else -1
+def rotateLeft(node):
+    if node.right == None: return node
+    w = node.right
+    w.parent = node.parent
+    node.parent = w
+    node.right = w.left  # w's left subtree becomes node's right subtree
+    if w.left != None: w.left.parent = node
+    w.left = node
+    node.height = max(h(node.left), h(node.right)) + 1
+    w.height = max(h(w.left), h(w.right)) + 1
+    return w
+
+def balance(node):
+    bal = h(node.left) - h(node.right)
+    if bal == 2:  # left heavy
+        bal2 = h(node.left.left) - h(node.left.right)
+        if bal2 < 0: node.left = rotateLeft(node.left)  # left-right case
+        node = rotateRight(node)
+    elif bal == -2:  # right heavy
+        bal2 = h(node.right.left) - h(node.right.right)
+        if bal2 > 0: node.right = rotateRight(node.right)  # right-left case
+        node = rotateLeft(node)
+    return node
+def insert(node, val):
+    if node == None: return Node(val)
+    if node.key < val:
+        node.right = insert(node.right, val)
+        node.right.parent = node
+    if node.key > val:
+        node.left = insert(node.left, val)
+        node.left.parent = node
+    node.height = max(h(node.left), h(node.right)) + 1
+    node = balance(node)
+    return node
+```
+### Invariant
+- Every node is height balanced: $|$left.h $-$ right.h$| \leq 1$
## Graph Structures
+### Trees
+- V vertices, E = V-1 edges, acyclic, 1 unique path between any pair of vertices
+- Root a tree by running BFS/DFS from the chosen root
+```python
+def isTree(AL, directed=True):
+    def dfs(node, visited, parent):
+        visited[node] = True
+        for neighbour in AL[node]:
+            if not visited[neighbour]:
+                if dfs(neighbour, visited, node): return True
+            elif neighbour != parent: return True
+        return False
+    n = len(AL)
+    visited = [False]*n
+    if dfs(0, visited, -1):
+        return False  # cycle detected
+    if False in visited:
+        return False  # unconnected
+    edges = sum(len(nbs) for nbs in AL.values())
+    if directed:
+        if edges != n-1: return False
+    else:
+        if edges != 2*(n-1): return False
+    return True
+```
+### Complete Graph
+- $V$ vertices, $E=\frac{V\times(V-1)}{2}$ edges
+```python
+def isComplete(AL):
+    n = len(AL)
+    for i in range(n):
+        for j in range(i+1, n):
+            if j not in AL[i]: return False
+    return True
+```
+### Bipartite
+- V vertices that can be split into 2 sets, with no edge between members of the same set
+- no odd length cycle
+- Complete Bipartite: every vertex from one set connected to every vertex from the other set.
+```python
+def isBipartite(AL):
+    n = len(AL)
+    colors = [-1]*n
+    for start in range(n):  # in case the graph isn't connected
+        if colors[start] == -1:
+            q = deque([(start, 0)])  # (vertex, color)
+            colors[start] = 0
+            while q:
+                curr, color = q.popleft()
+                for neighbour in AL[curr]:
+                    if colors[neighbour] == -1:
+                        colors[neighbour] = 1-color  # flip color
+                        q.append((neighbour, 1-color))
+                    elif colors[neighbour] == color:
+                        return False
+    return True
+```
+### DAG
+- Directed, no cycle
+```python
+def isDag(AL):
+    def dfs(node, visited, stack):
+        visited[node] = True
+        stack[node] = True
+        for nb in AL[node]:
+            if not visited[nb]:
+                if dfs(nb, visited, stack):
+                    return True  # pass it down
+            elif stack[nb]:
+                return True  # back edge found
+        stack[node] = False
+        return False
+    n = len(AL)
+    visited = [False]*n
+    stack = [False]*n
+    for i in range(n):
+        if not visited[i]:
+            if dfs(i, visited, stack):
+                return False
+    return True
+```
+```python
+def isDagBfs(AL):
+    n = len(AL)
+    inDeg = [0]*n
+    for nbs in AL:
+        for nb in nbs:
+            inDeg[nb] += 1
+    q = deque([v for v in range(n) if inDeg[v] == 0])  # queue of 0 in-degree
+    while q:
+        curr = q.popleft()
+        for nb in AL[curr]:
+            inDeg[nb] -= 1
+            if inDeg[nb] == 0: q.append(nb)
+    return all(i == 0 for i in inDeg)  # all inDeg == 0
+```
+### Storage
+- Adjacency Matrix - Check existence of an edge in $O(1)$.
+  Storage: $O(V^2)$
+- Adjacency List - Storage $O(V+E)$
+- Edge List - Storage $O(E)$
## Graph Traversal
-### Topo Sorting
-#### Lexographic Kahn's Algorithm $O(V\log(V)+E)$
+- Reachability Test: DFS/BFS and check if the target is visited
+- ID CCs: run DFS from a vertex; every vertex with visited=true is in the same CC
+- Count CCs: for all u in V, if unvisited, increment CCCount, then DFS from u
+### Lexicographic Kahn's Algorithm (Topo Sort) $O(V\log(V)+E)$
+Non-lexicographic variant is $O(V+E)$
```python
from heapq import heappush, heappop
@@ -114,7 +374,21 @@ def topoSort(AL):
  return res
```
## Single Source Shortest Path
+```python
+def relax(u, v, weight):
+    if dist[v] > dist[u] + weight:
+        dist[v] = dist[u] + weight
+        path[v] = u
+        return True
+    return False
+```
+- On unweighted graphs: BFS
+- On graphs without negative weights: Dijkstra
+- On graphs without negative weight cycles: Modified Dijkstra
+- On Trees: BFS/DFS
+- On DAGs: DP
### Bellman Ford Algorithm $O(V\times E)$
+- Can detect negative weight cycles
```python
def bellman_ford(AL, numV, start):
  dist = [INF for _ in range(numV)]
@@ -138,11 +412,38 @@ def bellman_ford(AL, numV, start):
    print(start, u, dist[u])
```
### Breadth First Search
+Instead of the standard visited array, use a dist[] array with all distances initialised to infinity, then set the source's distance to 0. When exploring edge (u, v): if dist[v] = inf, set dist[v] = dist[u]+1
### Dijkstra's Algorithm $O((V+E)\log V)$
-#### Modified Dijkstra's Algorithm
-### Dynamic Programming
+- BFS with a priority queue
+```python
+def dijkstra(AL, start):
+    V = len(AL)
+    dist = [float('inf') for u in range(V)]
+    dist[start] = 0
+    pq = []
+    heappush(pq, (0, start))
+    while pq:
+        d, u = heappop(pq)
+        if d > dist[u]: continue  # stale (lazy-deleted) entry
+        for v, w in AL[u]:
+            if dist[u]+w >= dist[v]: continue
+            dist[v] = dist[u]+w
+            heappush(pq, (dist[v], v))
+```
### DFS O(V)
+DFS can be used to solve SSSP on a weighted tree.
+Since there is only 1 unique path that connects the source to any other vertex, that path is the shortest path
+### Dynamic Programming O(V+E)
+SSSP on a DAG. Find the topo sort and relax edges in topo order.
+```python
+order = kahn_toposort()  # O(V+E)
+while not order.empty():
+    u = order.popleft()
+    for v, w in AL[u]: relax(u, v, w)
+```
## Minimum Spanning Tree
+- Tree that connects all vertices of G with minimum total edge weight
### Kruskal's Algorithm $O(E\log(V))$
+- Sort edges, then loop over them, greedily taking the next edge that does not create a cycle. If edge weights are bounded, counting sort brings the complexity down to $O(E)$
```python
def kruskalMST(EL, numV, numE): # edge list has (weight, from, to)
@@ -159,7 +460,9 @@ def kruskalMST(EL, numV, numE): # edge list has (weight, from, to)
    UF.unionSet(u, v)
  print(cost)
```
-### Prim's Algorithm
+### Prim's Algorithm $O(E \log(V))$
+- Start from a vertex and push all its edges to the PQ
+- Pop the cheapest edge; if its head vertex has not been visited, take that edge, push the new vertex's edges, and repeat
```python
def prims(AL, numV):
  pq = []
@@ -179,3 +482,13 @@ def prims(AL, numV):
      if not taken[v]: heappush(pq, (w, v))
  print(cost)
```
+## Miscellaneous
+- make a graph where the vertices are a pair
+- define the edges as the following
+  - for each , make a directed edge to
+  - for each , make a directed edge to
+  - (still undefined: what is the weight?)
+- run multi-source shortest path, where the sources are and
+  - (still undefined: what shortest path algorithm to be used here?)
+- take the shortest path to as the answer
+  - (is this correct? any other vertex to consider?)
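+On an unweighted state graph, the multi-source step above can be sketched as a single BFS seeded with every source at distance 0 (first visit is the shortest); the adjacency list and source set below are hypothetical placeholders, not the actual construction from this section:
+```python
+from collections import deque
+
+def multi_source_bfs(AL, sources):
+    """Shortest hop-distance from the nearest source to every vertex."""
+    dist = [float('inf')] * len(AL)
+    q = deque()
+    for s in sources:          # seed every source at distance 0
+        dist[s] = 0
+        q.append(s)
+    while q:
+        u = q.popleft()
+        for v in AL[u]:
+            if dist[v] == float('inf'):   # first visit is the shortest
+                dist[v] = dist[u] + 1
+                q.append(v)
+    return dist
+```
+For example, on the path graph 0-1-2-3-4 with sources {0, 4}, the middle vertex ends up at distance 2 from its nearest source.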