Building High-Performance and Cost-Effective Storage Systems with Flash Memory based Solid State Drives
Evolution of Storage and New Demand
• Hard disk drive (HDD)
  – Major storage device since 1956
• Merits
  – Large capacity, low cost
  – Most commonly used storage
• Mechanical nature
  – Unsatisfactory performance
  – High power consumption
• 1956: IBM 305 RAMAC computer with hard disk (5 MB, 1,200 RPM)
• 1973: IBM 3340, 35-70 MB
• 2007: Hitachi GST Deskstar 7K1000, first 1 TB drive
• Emerging and future: SSD -100x
[Figure: hard disk structure showing platter, spindle, direction of rotation, recording area, tracks 0 to c – 1, sector, read/write head, actuator, and arm. Source: B. Parhami, Computer Architecture, Memory System Design, UCSB]
[Figure: "Performance Gap", access time in cycles (0 to 6.E+06) for DRAM and disk, 1980 to 2000. Source: Bryant and O'Hallaron, "Computer Systems: A Programmer's Perspective", Prentice Hall, 2003]
High Power/Latency from Mechanical Operations
• Disk access time is dominated by mechanical operations
  – Seek (57%) + rotation (27%) + data fetch (7%) + other overhead (6%) = 97%
  – Data transfer takes only 3%
[Figure: relative size of disk I/O time components for random I/O: SCSI transfer 3%, seek 57%, rotation (RPM) 27%, internal transfer to embedded disk controller 7%, other 6%]
Source: Configuration and Capacity Planning for Solaris Servers, Sun Microsystems
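The breakdown above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the percentages are the ones on the slide, while the function names and the idea of a device that removes seek and rotation entirely are assumptions made for the example, not measurements of any real drive.

```python
# Shares of random disk I/O time, in percent, as listed on the slide.
components = {
    "seek": 57,
    "rotation": 27,
    "data fetch": 7,
    "other overhead": 6,
    "data transfer": 3,
}


def non_transfer_share(parts):
    """Percent of access time spent on everything except data transfer."""
    return sum(share for name, share in parts.items() if name != "data transfer")


def speedup_without(parts, removed):
    """Amdahl-style upper bound on speedup if the named components cost nothing.

    Assumption: the remaining components keep their original cost.
    """
    remaining = sum(share for name, share in parts.items() if name not in removed)
    return sum(parts.values()) / remaining


overhead = non_transfer_share(components)
bound = speedup_without(components, {"seek", "rotation"})
print(f"non-transfer share: {overhead}%")
print(f"upper bound on speedup without seek and rotation: {bound:.2f}x")
```

Under these assumptions, removing only the two mechanical components bounds the gain on this random-access workload, because the remaining 16% of the time is untouched.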
A Scientific Discovery Started a Revolution in Disks
• Giant magnetoresistance (GMR) was discovered in 1988
  – By Peter Gruenberg (Germany) and Albert Fert (France)
  – Giant resistance changes occur in materials made of alternating, very thin (nanometer-scale) layers when they are exposed to magnetic fields
  – This discovery laid the foundation for increasing HDD density
  – The first GMR-based commercial HDD, a 16 GB drive from IBM, appeared in 1997
  – Since 2007, HDDs of 1,000+ GB (terabytes) have been available on the market
  – Next-generation fast, high-density memory: magnetoresistive RAM
  – Gruenberg and Fert received the 2007 Nobel Prize in Physics for GMR
Evolution of the 5 Minute Rule
• First version: Jim Gray and Franco Putzolu (1987, SIGMOD)
  – Background: disk capacity is low and expensive, latency is not an issue
  – Accessing 1 KB of data on disk costs $2,000, but only $5 in main memory
  – Rule: pages referenced every 5 minutes should be memory resident
• Second version: Jim Gray and P. Shenoy (2000, ICDE)
  – Background: capacity is up 1,000x, bandwidth only 40x, very low price
  – The 5 minute rule becomes a caching rule for performance because:
  – (1) disk accesses slow by 10x per decade; (2) disk scanning time increases
• A recent version: G. Graefe (CACM, 2009)
  – Background: SSD is still expensive; disk space is almost free but slow
  – For small blocks, the 5 minute rule holds between DRAM and SSD
  – For very large blocks, the 5 minute rule holds between SSD and disks
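The rule in each version rests on a break-even calculation: a page should stay in memory when the cost of the memory holding it is lower than the cost of the disk accesses needed to re-read it. The sketch below uses the break-even formula commonly given in the five-minute-rule papers, (pages per MB of RAM / accesses per second per disk) × (price per disk drive / price per MB of RAM). The function names and the input values are illustrative assumptions for this example and do not come from the slides.

```python
def break_even_interval_seconds(pages_per_mb_ram, accesses_per_sec_per_disk,
                                price_per_disk, price_per_mb_ram):
    """Reference interval at which caching a page costs the same as re-reading it."""
    technology_ratio = pages_per_mb_ram / accesses_per_sec_per_disk
    economic_ratio = price_per_disk / price_per_mb_ram
    return technology_ratio * economic_ratio


def should_cache(reference_interval_seconds, break_even_seconds):
    """Keep a page in memory if it is re-referenced at least this often."""
    return reference_interval_seconds <= break_even_seconds


# Illustrative inputs (assumed): 8 KB pages (128 per MB), 64 random accesses
# per second per disk, a $2,000 drive, and RAM at $15 per MB.
interval = break_even_interval_seconds(128, 64, 2000.0, 15.0)
print(f"break-even interval: {interval:.0f} s ({interval / 60:.1f} min)")
print("cache a page touched every 60 s? ", should_cache(60, interval))
print("cache a page touched every 600 s?", should_cache(600, interval))
```

Swapping in device parameters and prices for a different memory/storage pair (for example, DRAM over SSD, or SSD over disk) gives the break-even interval for that pair, which is how the later versions of the rule are derived.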