#import "@preview/ilm:1.4.1": *

#show: ilm.with(
  title: [CS3223 Database System Implementation],
  author: "Yadunand Prem",
  table-of-contents: none,
)
#set text(lang: "en", font: ("SF Pro Display"))
#show raw: set text(font: "SF Mono")

= Reference

= Lecture 1
- #link("/Users/yadunut/Documents/2 NUS/S7 CS3223/lect00-intro.pdf", "intro")
- #link("/Users/yadunut/Documents/2 NUS/S7 CS3223/lect01-storage.pdf", "storage")
== Disk Access Timings
- Seek time: time to move the arm so that the disk head is positioned on the target track
- Rotational delay: time spent waiting for the target block to rotate under the head
- Transfer time: time to actually read/write the data

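
As a rough worked example of how the three components add up, the sketch below computes the access time for one block. The function names and all of the numbers (8 ms seek, 7200 RPM, 0.1 ms transfer) are illustrative assumptions, not figures from the lecture. The average rotational delay is taken to be half of one full rotation.

```python
def avg_rotational_delay_ms(rpm: float) -> float:
    """Average rotational delay: half of one full rotation, in milliseconds."""
    ms_per_rotation = 60_000.0 / rpm
    return ms_per_rotation / 2


def access_time_ms(seek_ms: float, rpm: float, transfer_ms: float) -> float:
    """Access time = seek time + rotational delay + transfer time."""
    return seek_ms + avg_rotational_delay_ms(rpm) + transfer_ms


# Hypothetical disk (assumed values): 8 ms seek, 7200 RPM, 0.1 ms per block.
total = access_time_ms(seek_ms=8.0, rpm=7200, transfer_ms=0.1)
print(f"{total:.2f} ms")  # prints "12.27 ms"
```

With these assumed numbers the mechanical parts (seek and rotation) dominate, which is why the later sections care so much about how many pages are touched.
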
== Storage Manager
- Data is read and written in blocks (aka pages)
=== Buffer Manager
- Buffer pool: main memory allocated to the DBMS, partitioned into page-sized units called frames
- Clients are the query planner, etc.
  - Can request that a disk page be fetched into the buffer
  - Can release a page that they hold in the buffer
- A page is dirty if it has been modified and not yet written back to disk
- Variables maintained for each frame
  - Pin count: number of clients using the frame (initial: 0)
  - Dirty: is the frame dirty? (initial: false)
- Initial: all frames are free
- When a client requests a page $p$:
  - Is $p$ in some frame $f$?
    - If yes, increment the pin count of $f$ (aka *pinning*) and return the address of frame $f$
    - If no, is the free list empty?
      - If no, move some frame $f'$ from the free list to the buffer pool, set $"pin count" = 1$, read $p$ into $f'$, return the address of frame $f'$
      - If yes, pick a frame $f'$ with $"pin count" = 0$ for replacement and set $"pin count" = 1$. Is the dirty flag of $f'$ true?
        - If yes, write the page in $f'$ to disk
        - Read $p$ into $f'$, return the address of frame $f'$
- Buffer Manager replacement policy
  - Random
  - FIFO: First In First Out
  - MRU: Most Recently Used
  - *LRU: Least Recently Used* - Most commonly used
    - Temporal locality: if a page is accessed, it is likely to be accessed again soon
    - Spatial locality: if a page is accessed, its neighbours are also likely to be accessed
    - Built with a queue of pointers to frames with $"pin count" = 0$
  - Clock: LRU variant (aka second chance)
    - Keep track of a referenced bit for each frame
    - When a frame's pin count is 0, check its referenced bit
      - If 0, replace the frame
      - If 1, set the referenced bit to 0 and move the frame to the end of the queue
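
The request and release steps above can be sketched in a few lines. This is a minimal sketch, not the course's reference implementation: the class and method names (`BufferManager`, `request`, `release`) are hypothetical, the "disk" is assumed to be a plain dict from page id to data, and replacement is LRU over the unpinned frames.

```python
from collections import OrderedDict


class Frame:
    def __init__(self):
        self.page_id = None  # disk page currently held, if any
        self.pin_count = 0   # number of clients using the frame
        self.dirty = False   # modified but not yet written back?
        self.data = None


class BufferManager:
    def __init__(self, num_frames, disk):
        self.disk = disk                          # assumption: dict page_id -> data
        self.frames = [Frame() for _ in range(num_frames)]
        self.free_list = list(range(num_frames))  # initially all frames are free
        self.page_table = {}                      # page_id -> frame index
        self.lru = OrderedDict()                  # unpinned frames, least recent first

    def request(self, page_id):
        if page_id in self.page_table:            # p is already in some frame f
            idx = self.page_table[page_id]
            self.lru.pop(idx, None)               # a pinned frame is not replaceable
        elif self.free_list:                      # free list is not empty
            idx = self.free_list.pop()
            self._load(idx, page_id)
        else:                                     # replace a frame with pin count 0
            if not self.lru:
                raise RuntimeError("all frames are pinned")
            idx, _ = self.lru.popitem(last=False)
            victim = self.frames[idx]
            if victim.dirty:                      # write back before reuse
                self.disk[victim.page_id] = victim.data
            del self.page_table[victim.page_id]
            self._load(idx, page_id)
        self.frames[idx].pin_count += 1           # pinning
        return self.frames[idx]

    def release(self, page_id, dirty=False):
        idx = self.page_table[page_id]
        frame = self.frames[idx]
        if frame.pin_count == 0:
            raise RuntimeError("page is not pinned")
        frame.pin_count -= 1
        frame.dirty = frame.dirty or dirty
        if frame.pin_count == 0:                  # now a replacement candidate
            self.lru[idx] = True

    def _load(self, idx, page_id):
        frame = self.frames[idx]
        frame.page_id, frame.data = page_id, self.disk[page_id]
        frame.pin_count, frame.dirty = 0, False
        self.page_table[page_id] = idx


# Usage with a two-frame pool and three hypothetical pages.
disk = {"A": "a0", "B": "b0", "C": "c0"}
bm = BufferManager(num_frames=2, disk=disk)
a = bm.request("A")
a.data = "a1"
bm.release("A", dirty=True)
bm.request("B")
bm.release("B")
bm.request("C")   # no free frame left: A is the LRU victim and is written back
print(disk["A"])  # prints "a1"
```
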
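
The Clock policy can be sketched as a hand sweeping over a fixed array of frames. The names (`ClockReplacer`, `pick_victim`) are hypothetical, and the sketch assumes that the referenced bit is turned on when a frame's pin count drops to 0, which the notes above do not state explicitly. Advancing the hand past a frame plays the role of moving it to the end of the queue.

```python
class ClockReplacer:
    """Second-chance victim selection over a fixed array of frames."""

    def __init__(self, num_frames):
        self.pin_count = [0] * num_frames
        self.referenced = [False] * num_frames
        self.hand = 0  # index of the frame the clock hand points at

    def pin(self, idx):
        self.pin_count[idx] += 1

    def unpin(self, idx):
        self.pin_count[idx] -= 1
        if self.pin_count[idx] == 0:
            # Assumption: the referenced bit turns on when the pin count hits 0.
            self.referenced[idx] = True

    def pick_victim(self):
        n = len(self.pin_count)
        # Two full sweeps suffice: the first may only clear referenced bits.
        for _ in range(2 * n):
            idx = self.hand
            self.hand = (self.hand + 1) % n
            if self.pin_count[idx] > 0:
                continue                      # pinned frames are skipped
            if self.referenced[idx]:
                self.referenced[idx] = False  # second chance
            else:
                return idx                    # unpinned and not referenced
        return None                           # every frame is pinned


clock = ClockReplacer(3)
for i in range(3):
    clock.pin(i)
clock.unpin(0)
clock.unpin(2)
print(clock.pick_victim())  # prints 0: frames 0 and 2 each lose their bit first
```

Compared with a true LRU queue, this only needs one bit per frame and a single hand position, at the cost of approximating recency.
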
=== File
- Abstraction
  - Each relation is a file of records
  - Each record is a tuple, with a RID (Record ID) or TID (Tuple ID) as a unique identifier
  - CRUD operations on file
- Organization: method of arranging records in a file
  - Heap: unordered
  - Sorted: ordered on some search key
  - Hashed: records are located in blocks via a hash function

==== Heap Implementation
- Linked list implementation
  - Problem: finding a data page requires a linear search, so accessing $n$ pages takes $O(n)$ time
- Page directory implementation
  - Directory of pages, where each directory entry stores whether the page is free or occupied (1 bit + pointer)
  - 1 page can contain many directory entries, so searching for a page is much faster
  - Problem: if the directory is in main memory, it can be accessed in $O(1)$ time, but if not, it takes $O(n)$ time to access
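
To make the difference concrete, here is a small sketch that counts how many pages each approach reads while looking for a page with free space. Everything in it is an assumption for illustration: the `Page` class, the function names, and the figure of 500 directory entries per directory page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Page:
    page_id: int
    has_free_space: bool
    next: Optional["Page"] = None  # link to the next page in the heap file


def find_free_page_linked(head):
    """Linked-list heap file: follow the chain, reading one data page per hop."""
    pages_read = 0
    page = head
    while page is not None:
        pages_read += 1
        if page.has_free_space:
            return page.page_id, pages_read
        page = page.next
    return None, pages_read


def find_free_page_directory(directory, entries_per_page):
    """Page-directory heap file: scan (free bit, page id) entries only."""
    for i, (is_free, page_id) in enumerate(directory):
        if is_free:
            return page_id, i // entries_per_page + 1  # directory pages read
    return None, -(-len(directory) // entries_per_page)  # ceiling division


# Hypothetical file of 1000 pages where only the last one has free space.
pages = [Page(i, has_free_space=(i == 999)) for i in range(1000)]
for prev, nxt in zip(pages, pages[1:]):
    prev.next = nxt
directory = [(p.has_free_space, p.page_id) for p in pages]

print(find_free_page_linked(pages[0]))           # prints (999, 1000)
print(find_free_page_directory(directory, 500))  # prints (999, 2)
```

Both searches are linear in the number of entries, but the directory version touches far fewer pages because many entries fit in one directory page.
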
==== Fixed-Length Page Format
- RID = (Page ID, Slot Number)
- Packed organization: store records in contiguous slots
  - Invariant: all records have to be contiguous
  - Problem: if a record is deleted,
    - the records after it need to be shifted, which is expensive
    - the slot numbers of the shifted records change, which requires updating every RID that refers to them
- Unpacked organization: use a bit array to keep track of free slots
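
The two organizations can be compared with a short sketch. The class names (`PackedPage`, `UnpackedPage`) and their methods are hypothetical, and records are plain Python values rather than fixed-length byte strings.

```python
class PackedPage:
    """Records kept contiguous in slots 0..n-1; deleting shifts later records."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.records = []

    def insert(self, record):
        if len(self.records) >= self.capacity:
            raise ValueError("page full")
        self.records.append(record)
        return len(self.records) - 1  # slot number

    def delete(self, slot):
        del self.records[slot]        # later records shift down by one slot

    def get(self, slot):
        return self.records[slot]


class UnpackedPage:
    """Slots stay fixed; a bit array records which slots are occupied."""

    def __init__(self, capacity):
        self.slots = [None] * capacity
        self.occupied = [False] * capacity  # the bit array

    def insert(self, record):
        slot = self.occupied.index(False)   # raises ValueError if the page is full
        self.slots[slot], self.occupied[slot] = record, True
        return slot

    def delete(self, slot):
        self.slots[slot], self.occupied[slot] = None, False

    def get(self, slot):
        return self.slots[slot] if self.occupied[slot] else None


packed, unpacked = PackedPage(4), UnpackedPage(4)
for page in (packed, unpacked):
    for record in ("a", "b", "c"):
        page.insert(record)
    page.delete(0)

print(packed.get(1))    # prints "c": the RID that pointed at "b" is now stale
print(unpacked.get(1))  # prints "b": slot numbers are unaffected by the delete
```

In the unpacked page the freed slot 0 is simply marked free in the bit array and is reused by the next insert.
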