BufferCacheinLSM-treeDRAMHDDCoCigetaindexstructurea-fabcdefrootmulti-paged-fa-cblockaQueriesareconducted levelbylevelbu'ffercacheIneachlevel,anindexismaintainedinmemorytomapkeystodiskblocksTheindexischeckedtolocatethediskdiskblockforthekey/keyrangesingle-pagedeltThe found disk block(s) will be loadedabcblockintobuffercacheandservethere11
a b c d e f disk Buffer Cache in LSM-tree HDD C0 C1 C2 DRAM a b c d e f a-f a-c d-f a buffer cache root multi-page block single-page block index structure get a • Queries are conducted level by level • In each level, an index is maintained in memory to map keys to disk blocks • The index is checked to locate the disk block for the key/key range • The found disk block(s) will be loaded into buffer cache and serve there 11
BufferCacheinLSM-treeDRAMHDDCoCigetaindexstructurea-fabcdefrootmulti-paged-fa-cblockaAny future request on thebuffercachesame key/key range still needto go through the indexdiskThe found disk block(s) canbeservedfromthebuffersingle-pagedeltabcblockcache directly after then12
a b c d e f disk Buffer Cache in LSM-tree HDD C0 C1 C2 DRAM a b c d e f a-f a-c d-f a buffer cache root multi-page block single-page block index structure get a • Any future request on the same key/key range still need to go through the index • The found disk block(s) can be served from the buffer cache directly after then 12
LSM-treeinduced BufferCacheinvalidationsBuffercachewritesDRAMHDDdiskCCaeecabcdefmerqesaebdbcdefbdf1.2Read buffer (buffer cache)and0.8LSM-tree write buffer (Co) are0.6separate兰0.4Frequentcompactionsforsorting0.2-Referencingaddresseschanged055006000450050006500-Cacheinvalidations=>missestime(s)13
a b c d a c b a d LSM-tree induced Buffer Cache invalidations 13 • Read buffer (buffer cache) and LSM-tree write buffer (C0) are separate • Frequent compactions for sorting – Referencing addresses changed – Cache invalidations => misses Buffer cache a e b d f a b c d e f HDD writes merges C0 C1 C2 DRAM a c e a e b d f c a c e a b c d e f disk
Existing representative solutions· Building a Key-Value store cacheE.g.rawcacheinCassandra,RocksDB,Mega-KV(VLDB2015)·Providing a Dedicated Compaction Server"Compaction management in distributedkey-value data stores(VLDB' 2015) Lazy Compaction: e.g., stepped merge“lncremental OrganizationforData Recording and Warehousing(VLDB 1997)15
Existing representative solutions • Building a Key-Value store cache – E.g. raw cache in Cassandra, RocksDB, Mega-KV (VLDB 2015) • Providing a Dedicated Compaction Server – “Compaction management in distributed key-value data stores” (VLDB’ 2015) • Lazy Compaction: e.g., stepped merge – “Incremental Organization for Data Recording and Warehousing” (VLDB’ 1997) 15