写在前面

一些话

对数据库存储引擎的结构比较感兴趣,所以看了CMU 15-445这门课,想要加深对存储引擎的理解,也想自己从零到一亲手实现一个项目,用实践检验自己的编程水平,同时可以丰富简历的内容。

本人主要学习的是Java这门语言,但这门课的lab是用C++实现的,很喜欢这门课,所以挑战一下用Java实现这些lab。

写这篇博客是为了整理思路,方便以后复习,互联网上用java写这个lab的项目寥寥无几,也希望可以为想用Java完成这门课的小伙伴提供一些思路。

本人水平很差,如有错误或更好的实现方法请务必在评论区指出来,不胜感激。

课程相关资源

我看的是2019年的课,实验做的是2020年的

课程视频可以在油管和B站找到,这里我推荐simniso这个团队,他们翻译得很好

实验一说明地址:Project #1 - Buffer Pool | CMU 15-445/645 :: Intro to Database Systems (Fall 2020)

实验官方Git hub地址:cmu-db/bustub: The BusTub Relational Database Management System (Educational) (github.com)

Buffer Pool缓冲池

课程视频里讲得很详细,可以配合《MySQL是怎样运行的》这本书的第17章食用

下面讲一下我理解的Buffer Pool的结构

  1. Buffer Pool由若干个页组成,这些页是磁盘中的页的复制版(称为缓冲页frame)
  2. 每一个缓冲页对应一个控制块(我称之为block)
  3. Buffer Pool中有若干条由控制块组成的链表:free链表,flush链表,LRU链表,……

具体这些结构是干嘛的,下面用到的时候再说吧

TASK #1 - LRU REPLACEMENT POLICY

任务描述:实现Buffer Pool中的LRU(Least Recently Used)链表

数据结构中的LRU

LRU链表原理没什么好说的,可以先去力扣做一下这道题:146. LRU 缓存 - 力扣(LeetCode)

Buffer Pool中的LRU

因为缓冲池的大小是有限的,所以在空间不够时,利用LRU结构将内存中不常用的页释放掉,为新加载的页腾出空间

思路

参考《MySQL是怎样运行的》第282页

当访问某个页时,可以按照下面的方式处理LRU链表:

  • 如果该页不在Buffer Pool中,在把该页加载到缓冲池中时,把对应的控制块塞到LRU链表的头部
  • 如果该页已经在Buffer Pool中,则直接把对应的控制块移动到LRU链表的头部

实验任务

  • Victim(T*) : Remove the object that was accessed the least recently compared to all the elements being tracked by the Replacer, store its contents in the output parameter and return True. If the Replacer is empty return False.
  • Pin(T) : This method should be called after a page is pinned to a frame in the BufferPoolManager. It should remove the frame containing the pinned page from the LRUReplacer.
  • Unpin(T) : This method should be called when the pin_count of a page becomes 0. This method should add the frame containing the unpinned page to the LRUReplacer.
  • Size() : This method returns the number of frames that are currently in the LRUReplacer.

Block

1
2
3
4
5
6
7
8
9
10
11
12
import static common.Config.INVALID_PAGE_ID;
public class Block {
public Page frame;
Block lru_next;
Block lru_prev;
int frameId;
boolean isDirty;
int pinCount;
public Block(){
frameId = INVALID_PAGE_ID;
}
}

1. Pin和Unpin

Pin:标记当前页正在被使用,从LRU中暂时移除该页对应的控制块

Unpin:当前页已经使用完毕,加到LRU头部

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
public void pin(Block block) {
synchronized (this) {
block.pinCount++;
size--;//暂时从链表中移除
removeBlock(block);
}
}
public void unpin(Block block) {
synchronized (this) {
if(size() == capacity) throw new RuntimeException("Out of memory");
size++;
block.pinCount--;
addToHead(block);
}
}

题外话

为什么都是操作控制块block而不直接操作缓冲页frame呢?

个人理解如果不用控制块,而是把控制信息放在缓冲页中,等到把缓冲页刷回磁盘时,还要把这些无用的控制信息删掉,就很麻烦

所以用控制块保存控制信息,操作更简单,而且减少了每个缓冲页的额外数据的大小

而且控制块可以重复利用,同一个控制块不同时期可以对应不同的页

2. Victim函数实现

Victim是用来找到一个最不常用的缓冲页(LRU末尾的控制块对应的缓冲页)

1
2
3
4
5
6
7
8
public Block victim() {
synchronized (this) {
Block block = tail.lru_prev;
if(block.pinCount > 0) throw new RuntimeException("All the pages are pinned");
moveToHead(block);
return block;
}
}

size()没啥好说的就不说了

TASK #2 - BUFFER POOL MANAGER

Buffer Pool上面介绍过了

实验任务

  • FetchPageImpl(page_id)
  • NewPageImpl(page_id)
  • UnpinPageImpl(page_id, is_dirty)
  • FlushPageImpl(page_id)
  • DeletePageImpl(page_id)
  • FlushAllPagesImpl()

1. FlushPageImpl(page_id) & DeletePageImpl(page_id)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
//调用该方法时要确保pinCount == 0
public void flushPg(int pageId){//将指定的页的内容刷回磁盘
synchronized (this){
Block block = idMap.get(pageId);
if(idMap.containsKey(pageId) && block.isDirty){
diskManager.writePage(pageId, block.frame.getData());
flushList.remove(block);
block.isDirty = false;
}
}
}

public void deletePg(Block block){
// 0. Make sure you call DeallocatePage!
// 1. Search the page table for the requested page (P).
// 1. If P does not exist, return true.
// 2. If P exists, but has a non-zero pin-count, return false. Someone is using the page.
// 3. Otherwise, P can be deleted. Remove P from the page table, reset its metadata and return it to the free list.
synchronized (this){
if(idMap.containsValue(block)) {
if(block.pinCount > 0) throw new RuntimeException("Someone is using the page");
if(block.isDirty) flushPg(block);
block.frame = null;
idMap.remove(pageId);
freeList.add(block);
}
}
}

2. NewPageImpl(page_id)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
//在buffer pool中新建一个页(磁盘中某个页的复制)
public Page newPg(int pageId) {
// 0. Make sure you call AllocatePage!
// 1. If all the pages in the buffer pool are pinned, return nullptr.
// 2. Pick a victim page P from either the free list or the replacer. Always pick from the free list first.
// 3. Update P's metadata, zero out memory and add P to the page table.
// 4. Set the page ID output parameter. Return a pointer to P.
synchronized (this){
Block block = findFreeBlock();
byte[] data = new byte[PAGE_SIZE];
diskManager.readPage(pageId,data);
block.frame = new Page(pageId, data);
block.frameId = pageId;
block.pinCount = 1;
idMap.put(pageId, block);
return block.frame;
}
}

//找一个空闲的缓冲页
//如果没有空闲缓冲页就刷一个没用的旧页作为空闲的缓冲页
public Block findFreeBlock(){
if(!freeList.isEmpty()){//如果有空闲的缓冲页
return freeList.remove(0);
}else {//如果没有就找一个
Block block = lruList.victim();
deletePg(block.frameId);
return block;
}
}

3. FetchPageImpl(page_id)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
//获取指定的页
public Page fetchPg(int pageId){
// 1. Search the page table for the requested page (P).
// 1.1 If P exists, pin it and return it immediately.
// 1.2 If P does not exist, find a replacement page (R) from either the free list or the replacer.
// Note that pages are always found from the free list first.
// 2. If R is dirty, write it back to the disk.
// 3. Delete R from the page table and insert P.
// 4. Update P's metadata, read in the page content from disk, and then return a pointer to P.
synchronized (this){
if(pageId == INVALID_PAGE_ID) return null;
//如果缓冲池里有这个页
if(idMap.containsKey(pageId)){
freeList.remove(pageId);
lruList.pin(idMap.get(pageId));
return idMap.get(pageId).frame;
} else{
return newPg(pageId);
}
}
}

4. UnpinPageImpl(page_id, is_dirty)

1
2
3
4
5
6
7
8
9
10
public void unpinPg(int pageId, boolean isDirty){
if(!idMap.containsKey(pageId)) return;
Block block = idMap.get(pageId);
if(block.pinCount == 0) return;
if(isDirty){
block.isDirty = true;
flushList.add(block);
}
lruList.unpin(block);
}

5. FlushAllPagesImpl()

1
2
3
4
5
public void flushAllPg(){
for(Block block : flushList){
flushPg(block.frameId);
}
}

初始化

1
2
3
4
5
6
7
8
9
10
11
12
13
14
public BufferPoolManager(DiskManager diskManager, int bufferPoolSize) {
this.diskManager = diskManager;
lruList = new LRUReplacer(bufferPoolSize);
idMap = new HashMap<>();
freeList = new LinkedList<>();
flushList = new LinkedList<>();
//this.bufferPoolSize = bufferPoolSize;
for(int i = 0; i < bufferPoolSize; i++){
Block block = new Block();
block.frame = newPg(INVALID_PAGE_ID);
idMap.put(block.frame.getPageId(), block);
freeList.add(block);//一开始所有缓冲页都是空闲的
}
}

参考

Project #1 - Buffer Pool | CMU 15-445/645 :: Intro to Database Systems (Fall 2020)

cmu-db/bustub: The BusTub Relational Database Management System (Educational) (github.com)

[已满分]CMU数据库(15-445)实验1-BufferPoolManager - 周小伦 - 博客园 (cnblogs.com)

LAOAWILLIAM/txDB: An on-disk relational database system (In progress) (github.com)

CNYuYang/BusTub: BusTub的Java ☕ 版本! (github.com)

《MySQL是怎样运行的》第17章