## Sorting

| Algorithm | Best Time Complexity | Average Time Complexity | Worst Time Complexity | In-Place | Stable |
| --------- | ------------------------ | -------------------------- | --------------------- | -------- | ------ |
| Bubble | $O(n)$ comp, $O(1)$ swap | $O(n^2)$ comp, swap | $O(n^2)$ comp, swap | Yes | Yes |
| Selection | $O(n^2)$ comp, $O(1)$ swap | $O(n^2)$ comp, $O(n)$ swap | $O(n^2)$ comp, $O(n)$ swap | Yes | No |
| Insertion | $O(n)$ comp, $O(1)$ swap | $O(n^2)$ comp, swap | $O(n^2)$ comp, swap | Yes | Yes |
| Merge | $O(n \log n)$ | $O(n \log n)$ | $O(n \log n)$ | No | Yes |
| Quick | $O(n \log n)$ | $O(n \log n)$ | $O(n^2)$ | Yes | No |
| Counting | $O(n + k)$ | $O(n + k)$ | $O(n + k)$ | No | Yes |
| Radix | $O(nk)$ | $O(nk)$ | $O(nk)$ | No | Yes |

Counting: $O(n + k)$, where $k$ is the range of the key values.

Radix: $O(w \times (n + k))$: treat each item as a string of $w$ digits and sort digit by digit from the rightmost, using stable counting sort as the subroutine; $k$ is the radix.

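The Counting row above assumes the usual stable, out-of-place implementation. A minimal Python sketch (not from the original notes), assuming integer keys in `0..k-1`:

```python
def counting_sort(arr, k):
    # count occurrences of each key, then prefix-sum to get final positions
    count = [0] * k
    for x in arr:
        count[x] += 1
    for i in range(1, k):
        count[i] += count[i - 1]
    out = [0] * len(arr)
    for x in reversed(arr):      # reversed pass keeps equal keys in order (stable)
        count[x] -= 1
        out[count[x]] = x
    return out
```
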
### Bubble

Invariant: After `i` iterations, the largest `i` elements are sorted at the back of the array.

```java
private static void swap(int[] arr, int a, int b) {
    int tmp = arr[a]; arr[a] = arr[b]; arr[b] = tmp;
}

static void sort(int[] input) {
    int N = input.length;
    for (; N > 0; N--) {
        // each pass bubbles the largest remaining element to index N-1
        for (int i = 0; i < N - 1; i++) {
            if (input[i] > input[i + 1]) swap(input, i, i + 1);
        }
    }
}
```

### Selection

Invariant: After `i` iterations, the smallest `i` elements are sorted at the front of the array.

```java
static void sort(int[] input) {
    for (int i = 0; i < input.length; i++) {
        // find the smallest element in the unsorted suffix input[i..]
        int smallest = i;
        for (int j = i; j < input.length; j++) {
            if (input[smallest] > input[j]) smallest = j;
        }
        swap(input, i, smallest);
    }
}
```

### Insertion

Invariant: At the end of the `k`th iteration, the leftmost `k` elements are sorted relative to each other (but not necessarily in their final positions).

1. Outer loop executes N-1 times
2. Inner loop runs O(N) times per iteration if the array is reverse sorted

```java
static void sort(int[] arr) {
    // iterate through each element and insert it into the sorted prefix
    for (int i = 1; i < arr.length; i++) {
        int tmp = arr[i];
        if (tmp > arr[i - 1]) continue;   // already in place
        int j = i;
        for (; j > 0; j--) {
            arr[j] = arr[j - 1];          // shift the larger element to the right
            if (tmp > arr[j - 1]) break;
        }
        arr[j] = tmp;
    }
}
```

### Merge

1. $O(N \log N)$ performance guarantee, no matter the original ordering of the input (see the sketch below)
2. Not in-place
3. Needs $O(N)$ extra space for the merge step

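The notes give no merge sort code; a minimal Python sketch of the standard top-down version (function name is illustrative):

```python
def merge_sort(arr):
    # split, recursively sort both halves, then merge the two sorted halves
    if len(arr) <= 1:
        return arr
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])
    right = merge_sort(arr[mid:])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:          # <= keeps the sort stable
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]    # append whichever tail is left over
```
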
### Quick Sort

```java
static int partition(int[] A, int i, int j) {
    int p = A[i];                         // pivot is the first element
    int m = i;                            // S1 and S2 are empty
    for (int k = i + 1; k <= j; k++) {    // explore the unknown region
        // random tie-breaking on equal keys guards against many-duplicate inputs
        if (A[k] < p || (A[k] == p && Math.random() < 0.5)) {
            m++;
            swap(A, k, m);                // exchange the 2 values
        }
    }
    swap(A, i, m);                        // put the pivot in the centre
    return m;                             // return index of pivot
}

static void quickSort(int[] A, int low, int high) {
    if (low >= high) return;
    int m = partition(A, low, high);
    quickSort(A, low, m - 1);
    quickSort(A, m + 1, high);
}
```

## Linked List

- Stack: Last In First Out; push and pop from the head
- Queue: First In First Out; push to the tail, pop from the head (see the sketch below)

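A minimal Python sketch of a singly linked list that supports both uses (class and method names are illustrative, not from the notes):

```python
class Node:
    def __init__(self, val):
        self.val = val
        self.next = None

class LinkedList:
    def __init__(self):
        self.head = None
        self.tail = None

    def push_front(self, val):          # stack push: O(1) at the head
        node = Node(val)
        node.next = self.head
        self.head = node
        if self.tail is None: self.tail = node

    def push_back(self, val):           # queue enqueue: O(1) at the tail
        node = Node(val)
        if self.tail: self.tail.next = node
        else: self.head = node
        self.tail = node

    def pop_front(self):                # stack pop / queue dequeue: O(1) at the head
        if self.head is None: return None
        node = self.head
        self.head = node.next
        if self.head is None: self.tail = None
        return node.val
```
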
## Binary Heap

- Height: $\lfloor \log_2 N \rfloor$; left child: $2n$, right child: $2n+1$, parent: $\lfloor n/2 \rfloor$ (1-indexed array)
- Min comparisons: $n-1$; min swaps: $0$
- Max comparisons: $2n - 2(\text{number of 1 bits in } n) - 2(\text{number of trailing 0 bits in } n)$
- Max swaps: repeatedly integer-divide $n$ by 2 until reaching 1, then sum the quotients

```pseudocode
insert(v) {
    A.append(v)
    bubbleUp(A.length - 1)
}

bubbleUp(i) {                  // shift the value up while it beats its parent
    while (i > 1 and A[i] > A[parent(i)]) {
        swap(A[i], A[parent(i)])
        i = parent(i)
    }
}

extractMax() {
    A[1] = A[A.length - 1]     // move the last leaf to the root
    A.length--
    bubbleDown(1)
}

bubbleDown(i) {                // shift the value down while a child beats it
    while (i has a child) {
        L = index of the larger of i's children
        if (A[i] >= A[L]) break
        swap(A[i], A[L]); i = L
    }
}

heapify() {                    // optimised O(n) bottom-up build
    for (i = A.length / 2; i >= 1; --i) bubbleDown(i)
}
```

- Change value at index: update the value; if the new value is greater than the old one, bubbleUp, otherwise bubbleDown (see the sketch below)
- Root element is the largest; the 2nd largest element is a child of the root; the 3rd largest need not be

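A minimal Python sketch of that update on a 1-indexed, array-based max heap (names are illustrative, not from the notes):

```python
def update_key(A, i, new_val):
    # A is 1-indexed: A[0] is unused, children of i are 2i and 2i+1
    old_val = A[i]
    A[i] = new_val
    if new_val > old_val:                      # value grew: may now beat its parent
        while i > 1 and A[i] > A[i // 2]:
            A[i], A[i // 2] = A[i // 2], A[i]
            i //= 2
    else:                                      # value shrank: may now lose to a child
        n = len(A) - 1
        while 2 * i <= n:
            L = 2 * i
            if L + 1 <= n and A[L + 1] > A[L]: L += 1   # pick the larger child
            if A[i] >= A[L]: break
            A[i], A[L] = A[L], A[i]
            i = L
```
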
### Invariants

- Every vertex is greater than (or equal to, with duplicates) every value in its children's subtrees
- Must be a complete binary tree
- Height is always $\lfloor \log_2 N \rfloor$, as every level except possibly the last is completely filled

## Union Find $O(\alpha(N))$

- A tree of height $h$ needs at least $2^h$ nodes (with union by rank), so height is $O(\log N)$
- Keep track of `rank[i]`, an upper bound on the height of the subtree rooted at vertex `i`
- `findSet(i)`: from vertex `i`, recursively go up the tree until the root; path compression is applied after each call of `findSet`
- `isSameSet(i, j)`: check if `findSet(i) == findSet(j)`
- `unionSet(i, j)`: the shorter tree is joined under the taller tree (this is where rank comes into play)

```python
def findSet(a):
    if parent[a] != a:
        parent[a] = findSet(parent[a])       # path compression
    return parent[a]

def union(a, b):
    ap = findSet(a)
    bp = findSet(b)
    if ap == bp: return
    if rank[ap] < rank[bp]: parent[ap] = bp  # attach the shorter tree under the taller
    elif rank[bp] < rank[ap]: parent[bp] = ap
    else: parent[bp] = ap; rank[ap] += 1     # equal rank: the new root grows by one
```

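Kruskal's algorithm further down calls a `UnionFind` class with `isSameSet`/`unionSet` methods; a minimal sketch of that interface as assumed here (not spelled out in the notes):

```python
class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def findSet(self, i):
        if self.parent[i] != i:
            self.parent[i] = self.findSet(self.parent[i])  # path compression
        return self.parent[i]

    def isSameSet(self, i, j):
        return self.findSet(i) == self.findSet(j)

    def unionSet(self, i, j):
        ip, jp = self.findSet(i), self.findSet(j)
        if ip == jp: return
        if self.rank[ip] < self.rank[jp]: ip, jp = jp, ip  # make ip the taller root
        self.parent[jp] = ip                               # shorter tree goes under taller
        if self.rank[ip] == self.rank[jp]: self.rank[ip] += 1
```
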
## Hash Table

The hash table size (the modulus $M$) should be a large prime, roughly 2x the expected number of keys.

```python
def hash_string(s):
    # polynomial rolling hash, base 26, kept within [0, M) at every step
    total = 0
    for c in s: total = (total * 26 + ord(c)) % M
    return total
```

- Open Addressing: all keys kept in a single array; collisions are resolved by probing alternative addresses
- Closed Addressing (separate chaining): collided keys are appended to an auxiliary data structure, e.g. a list per bucket

### Linear

`i = (h(v) + k*1) % M`

- On deletion, mark the slot with a DELETED sentinel rather than emptying it, so probing knows the values after it are still reachable (see the sketch below)
- If the keys tend to cluster, linear probing performs badly (primary clustering)

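A minimal sketch of open addressing with linear probing and a DELETED tombstone; `EMPTY`, `DELETED`, and the fixed `M` are illustrative assumptions:

```python
M = 11                                     # table size: a prime
EMPTY, DELETED = None, "<deleted>"
table = [EMPTY] * M

def probe(v, k):
    return (hash(v) + k) % M               # linear probing sequence

def insert(v):
    for k in range(M):
        i = probe(v, k)
        if table[i] in (EMPTY, DELETED):   # tombstoned slots can be reused
            table[i] = v
            return i
    raise RuntimeError("table full")

def search(v):
    for k in range(M):
        i = probe(v, k)
        if table[i] == EMPTY: return -1    # a truly empty slot ends the probe
        if table[i] == v: return i         # DELETED slots are skipped, not stopped at
    return -1

def delete(v):
    i = search(v)
    if i != -1: table[i] = DELETED         # tombstone keeps later keys reachable
```
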
### Quadratic

`i = (h(v) + k*k) % M`

- The issue is that the probe sequence may cycle forever without finding an empty slot
- If $\alpha < 0.5$ and M is prime, then we can always find an empty slot

### Double Hashing

`i = (h(v) + k*h2(v)) % M`, where h2 is usually based on a smaller prime than M and must never evaluate to 0.

## Binary Search Trees

- Height: from $O(\log_2 N)$ (balanced) up to $O(N)$ (degenerate)
- BST Property: for a vertex X, all vertices in the left subtree of X are strictly smaller than X and all vertices in the right subtree are strictly greater than X
- `minHeight = floor(log2(n))`
- `maxHeight = n - 1`

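The snippets below assume nodes with `val`, `left`, `right`, and `parent` fields; a minimal sketch of such a `Node` class and of BST insertion (not from the notes):

```python
class Node:
    def __init__(self, val, parent=None):
        self.val = val
        self.parent = parent
        self.left = None
        self.right = None

def bst_insert(node, val):
    # standard BST insert: walk down, attach a new leaf, maintain parent pointers
    if node is None: return Node(val)
    if val > node.val:
        node.right = bst_insert(node.right, val)
        node.right.parent = node
    elif val < node.val:
        node.left = bst_insert(node.left, val)
        node.left.parent = node
    return node
```
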
```python
def search(curr, target):
    if curr is None: return None
    if curr.val == target: return curr
    if curr.val < target: return search(curr.right, target)
    return search(curr.left, target)

def leftmost(node):  # minimum of the subtree rooted at node
    return node if node.left is None else leftmost(node.left)

def successor(node):
    if node.right is not None: return leftmost(node.right)
    p, T = node.parent, node
    while p is not None and T == p.right:  # walk up until we arrive from a left child
        T, p = p, p.parent
    return p  # None if node is the maximum

def remove(node, key):
    if node is None: return None
    if node.val < key: node.right = remove(node.right, key)
    elif node.val > key: node.left = remove(node.left, key)
    elif node.left is None and node.right is None:
        node = None                           # leaf: just drop it
    elif node.left is None:
        node.right.parent = node.parent       # only a right child: bypass
        node = node.right
    elif node.right is None:
        node.left.parent = node.parent        # only a left child: bypass
        node = node.left
    else:                                     # both children exist: find the successor
        succ = successor(node)
        node.val = succ.val                   # replace this value with the successor's
        node.right = remove(node.right, succ.val)  # then delete the successor
    return node
```

```python
def validBst(node, minV, maxV):
    if not node: return True
    if node.val <= minV or node.val >= maxV:
        return False
    left = validBst(node.left, minV, node.val)
    right = validBst(node.right, node.val, maxV)
    return left and right
```

## BBST AVL Tree

- Each node is augmented with its height
- Height: $h < 2\log_2(N)$
- `height=-1` (empty tree), `height=max(left.h, right.h)+1`, recomputed at the end of each `insert(v)` or `remove(v)` operation
- `BF(n)=left.h-right.h`
- Height balanced IF $|$left.h $-$ right.h$|\leq 1$

```python
def h(node): return node.height if node else -1

def rotateLeft(node):
    if node.right is None: return node
    w = node.right
    w.parent = node.parent
    node.parent = w
    node.right = w.left                    # w's left subtree becomes node's right subtree
    if w.left is not None: w.left.parent = node
    w.left = node
    node.height = max(h(node.left), h(node.right)) + 1
    w.height = max(h(w.left), h(w.right)) + 1
    return w

def balance(node):
    bal = h(node.left) - h(node.right)
    if bal == 2:                           # left heavy
        bal2 = h(node.left.left) - h(node.left.right)
        if bal2 < 0: node.left = rotateLeft(node.left)     # left-right case
        node = rotateRight(node)
    elif bal == -2:                        # right heavy
        bal2 = h(node.right.left) - h(node.right.right)
        if bal2 > 0: node.right = rotateRight(node.right)  # right-left case
        node = rotateLeft(node)
    return node

def insert(node, val):
    if node is None: return Node(val)
    if node.val < val:
        node.right = insert(node.right, val)
        node.right.parent = node
    elif node.val > val:
        node.left = insert(node.left, val)
        node.left.parent = node
    node.height = max(h(node.left), h(node.right)) + 1
    node = balance(node)
    return node
```

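`rotateRight` is used in `balance` but not written out in the notes; assuming it mirrors `rotateLeft`, a minimal sketch:

```python
def rotateRight(node):
    if node.left is None: return node
    w = node.left
    w.parent = node.parent
    node.parent = w
    node.left = w.right                    # w's right subtree becomes node's left subtree
    if w.right is not None: w.right.parent = node
    w.right = node
    node.height = max(h(node.left), h(node.right)) + 1
    w.height = max(h(w.left), h(w.right)) + 1
    return w
```
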
### Invariant

- Every vertex is height-balanced ($|$left.h $-$ right.h$|\leq 1$), maintained by rotations after every insert/remove

## Graph Structures

### Trees

- V vertices, E = V-1 edges, acyclic, exactly 1 unique path between any pair of vertices
- Root a tree by running BFS/DFS from the chosen root

```python
def isTree(AL, directed=True):
    def dfs(node, visited, parent):
        visited[node] = True
        for neighbour in AL[node]:
            if not visited[neighbour]:
                if dfs(neighbour, visited, node): return True
            elif neighbour != parent: return True   # revisiting a non-parent: cycle
        return False

    n = len(AL)
    visited = [False]*n
    if dfs(0, visited, -1):
        return False                    # cycle detected
    if False in visited:
        return False                    # unconnected
    edges = sum(len(nbrs) for nbrs in AL)   # AL is a list of neighbour lists
    if directed:
        if edges != n-1: return False
    else:                               # undirected: each edge is counted twice
        if edges != 2*(n-1): return False
    return True
```

### Complete Graph

- $V$ vertices, $E=\frac{V\times(V-1)}{2}$ edges

```python
def isComplete(AL):
    n = len(AL)
    for i in range(n):
        for j in range(i+1, n):
            if j not in AL[i]: return False   # some pair is not directly connected
    return True
```

### Bipartite

- V vertices that can be split into 2 sets such that there is no edge between members of the same set
- Equivalently: no odd-length cycle
- Complete Bipartite: every vertex of one set is connected to every vertex of the other set

```python
from collections import deque

def isBipartite(AL):
    n = len(AL)
    colors = [-1]*n
    for start in range(n):              # in case the graph isn't connected
        if colors[start] == -1:
            q = deque([(start, 0)])     # (vertex, color)
            colors[start] = 0
            while q:
                curr, color = q.popleft()
                for neighbour in AL[curr]:
                    if colors[neighbour] == -1:
                        colors[neighbour] = 1-color   # flip color
                        q.append((neighbour, 1-color))
                    elif colors[neighbour] == color:
                        return False
    return True
```

### DAG

- Directed, no cycle

```python
def isDag(AL):
    def dfs(node, visited, stack):
        visited[node] = True
        stack[node] = True             # node is on the current recursion stack
        for n in AL[node]:
            if not visited[n]:
                if dfs(n, visited, stack):
                    return True        # pass it down
            elif stack[n]:
                return True            # back edge found
        stack[node] = False
        return False

    n = len(AL)
    visited = [False]*n
    stack = [False]*n
    for i in range(n):
        if not visited[i]:
            if dfs(i, visited, stack):
                return False
    return True
```

```python
from collections import deque

def isDagBfs(AL):
    n = len(AL)
    inDeg = [0]*n
    for nbs in AL:
        for nb in nbs:
            inDeg[nb] += 1
    q = deque([v for v in range(n) if inDeg[v] == 0])  # queue of 0 in-degree vertices
    while q:
        curr = q.popleft()
        for nb in AL[curr]:
            inDeg[nb] -= 1
            if inDeg[nb] == 0: q.append(nb)
    return all(i == 0 for i in inDeg)  # every vertex processed => no cycle
```

### Storage

- Adjacency Matrix: check existence of an edge in $O(1)$; storage $O(V^2)$
- Adjacency List: storage $O(V+E)$
- Edge List: storage $O(E)$

## Graph Traversal

- Reachability Test: DFS/BFS from the source and check if the target is visited
- Identify CCs: run DFS from a vertex; every vertex marked visited is in the same CC
- Count CCs: for all u in V, if u is unvisited, increment the CC count, then DFS from u (see the sketch below)

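A minimal sketch of the CC count on an adjacency list, using an iterative DFS (not from the notes):

```python
def countCCs(AL):
    n = len(AL)
    visited = [False] * n
    count = 0
    for u in range(n):
        if visited[u]: continue
        count += 1                 # u starts a new connected component
        stack = [u]
        visited[u] = True
        while stack:               # iterative DFS over this component
            curr = stack.pop()
            for v in AL[curr]:
                if not visited[v]:
                    visited[v] = True
                    stack.append(v)
    return count
```
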
### Lexicographic Kahn's Algorithm (Topo Sort) $O(V\log(V)+E)$

The non-lexicographic variant (plain queue instead of a priority queue) is $O(V+E)$.

```python
from heapq import heappush, heappop

def topoSort(AL):
    V = len(AL)
    in_deg = [0] * V                # build the in-degree of each vertex
    for u in range(V):
        for w in AL[u]:
            in_deg[w] += 1
    q = []                          # push all vertices with in-degree 0 to be processed
    for i in range(V):
        if in_deg[i] == 0: heappush(q, i)
    result = []
    count = 0                       # to check for cycles
    while q:
        u = heappop(q)              # normal push/pop for the non-lexicographic variant
        result.append(u)
        count += 1
        for w in AL[u]:
            in_deg[w] -= 1
            if in_deg[w] == 0: heappush(q, w)
    if count != V:
        return []                   # cycle detected: not all vertices were processed
    return result
```

## Single Source Shortest Path

```python
def relax(u, v, weight):
    if dist[v] > dist[u] + weight:   # the path to v via u is shorter
        dist[v] = dist[u] + weight
        path[v] = u                  # remember the predecessor
        return True
    return False
```

- On unweighted graphs: BFS
- On graphs without negative weight edges: Dijkstra
- On graphs without negative weight cycles: Modified Dijkstra
- On a tree: BFS/DFS
- On a DAG: DP (relax in topological order)

### Bellman Ford Algorithm $O(V\times E)$

- Can detect negative weight cycles

```python
def bellman_ford(AL, V, start):
    INF = float('inf')
    dist = [INF for _ in range(V)]
    dist[start] = 0
    for _ in range(V-1):                      # at most V-1 rounds of relaxation
        modified = False
        for u in range(V):
            if dist[u] != INF:
                for v, w in AL[u]:
                    if dist[u]+w >= dist[v]: continue
                    dist[v] = dist[u]+w
                    modified = True
        if not modified: break                # early exit: nothing was relaxed
    hasNegativeCycle = False
    for u in range(V):                        # one more pass: still relaxable => negative cycle
        if dist[u] != INF:
            for v, w in AL[u]:
                if dist[v] > dist[u]+w: hasNegativeCycle = True
    if hasNegativeCycle: print("Negative cycles")
    for u in range(V):
        print(start, u, dist[u])
```

### Breadth First Search

Replace the standard visited array with a dist[] array whose entries start at infinity, and set the source distance to 0. When exploring an edge (u, v), if dist[v] is still infinite, set dist[v] = dist[u] + 1 (see the sketch below).

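A minimal sketch of that BFS-based SSSP on an unweighted adjacency list (not from the notes):

```python
from collections import deque

def bfs_sssp(AL, start):
    n = len(AL)
    dist = [float('inf')] * n             # dist doubles as the visited marker
    dist[start] = 0
    q = deque([start])
    while q:
        u = q.popleft()
        for v in AL[u]:
            if dist[v] == float('inf'):   # first visit: shortest in number of edges
                dist[v] = dist[u] + 1
                q.append(v)
    return dist
```
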
### Dijkstra's Algorithm $O((V+E)\log V)$

- BFS with a priority queue

```python
from heapq import heappush, heappop

def dijkstra(AL, start):
    V = len(AL)
    dist = [float('inf') for u in range(V)]
    dist[start] = 0
    pq = []
    heappush(pq, (0, start))
    while pq:
        d, u = heappop(pq)
        if d > dist[u]: continue          # outdated (lazily deleted) entry
        for v, w in AL[u]:
            if dist[u]+w >= dist[v]: continue
            dist[v] = dist[u]+w
            heappush(pq, (dist[v], v))    # push the improved distance
    return dist
```

### DFS O(V)

DFS can solve SSSP on a weighted tree: since there is only 1 unique path connecting the source to any other vertex, that path is the shortest path (see the sketch below).

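A minimal sketch on a weighted tree stored as an adjacency list of `(v, w)` pairs (not from the notes):

```python
def dfs_tree_sssp(AL, start):
    dist = [float('inf')] * len(AL)
    def dfs(u, parent, d):
        dist[u] = d                        # the unique root-to-u path is the shortest
        for v, w in AL[u]:
            if v != parent:
                dfs(v, u, d + w)
    dfs(start, -1, 0)
    return dist
```
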
### Dynamic Programming O(V+E)

SSSP on a DAG: find a topological order and relax the outgoing edges of each vertex in that order.

```python
order = kahn_toposort()   # O(V+E), e.g. the Kahn's variant above, returning a deque
while order:
    u = order.popleft()
    for v, w in AL[u]: relax(u, v, w)
```

## Minimum Spanning Tree

- A subgraph that is a tree, connects all vertices of G, and has minimum total edge weight

### Kruskal's Algorithm $O(E\log(V))$

- Sort the edges, then loop over them and greedily take the next edge that does not cause a cycle

If the edge weights are bounded, counting sort can be used instead, bringing the complexity down to $O(E)$.

```python
def kruskalMST(EL, numV, numE):  # edge list entries are (weight, from, to)
    EL.sort()                    # sort by weight
    UF = UnionFind(numV)
    count = 0                    # number of edges taken so far
    cost = 0
    for i in range(numE):
        if count == numV-1: break        # MST is complete
        w, u, v = EL[i]
        if not UF.isSameSet(u, v):       # taking this edge creates no cycle
            count += 1
            cost += w
            UF.unionSet(u, v)
    print(cost)
```

### Prim's Algorithm $O(E \log(V))$

- Start from a vertex and push all of its edges into a PQ
- If the cheapest edge in the PQ leads to an unvisited vertex, take that edge and repeat from that vertex

```python
from heapq import heappush, heappop

def prims(AL, numV):
    pq = []
    taken = [0 for i in range(numV)]
    cost = 0
    count = 0
    taken[0] = 1                               # start from vertex 0
    for v, w in AL[0]: heappush(pq, (w, v))    # push its edges into the heap
    while pq:
        if count == numV-1: break              # MST is complete
        w, u = heappop(pq)
        if not taken[u]:                       # cheapest edge reaching a new vertex
            taken[u] = 1
            cost += w
            count += 1
            for v, w in AL[u]:
                if not taken[v]: heappush(pq, (w, v))
    print(cost)
```

## Miscellaneous

- Make a graph where the vertices are pairs <rectangle, horizontal/vertical>
- Define the edges as follows:
  - for each <rectangle, vertical>, make a directed edge to <neighboring rectangle in the vertical direction, horizontal>
  - for each <rectangle, horizontal>, make a directed edge to <neighboring rectangle in the horizontal direction, vertical>
  - (still undefined: what is the weight?)
- Run multi-source shortest path, where the sources are <starting rectangle, vertical> and <starting rectangle, horizontal>
  - (still undefined: what shortest path algorithm to be used here?)
- Take the shortest path to <target rectangle, vertical> as the answer
  - (is this correct? any other vertex to consider?)

### Jetpack solution (DFS)

- Add the initial position to the queue
- While the queue is not empty:
  - add "up" if it has not been visited
  - add "down" if it has not been visited
- Backtrack on the results to generate the path

### Conquest (Dijkstra)

- Create a PQ of all places I can currently visit
- While the PQ is not empty:
  - if the army at the front of the PQ > my current army, break
  - if I've been here before, continue
  - add the current place's army to my army
  - mark the current place as visited
  - add the current place's neighbours to the heap
- Return the army size