Unbalanced System Improvements: A Disk Perspective

[Figure: Latencies of Cache, DRAM and Disk in CPU Cycles. x-axis: Year (1980 to 2000); y-axis: CPU Cycles (0 to 5,000,000); series: SRAM Access Time, DRAM Access Time, Disk Seek Time.]

Year    SRAM Access Time    DRAM Access Time    Disk Seek Time
1980    0.3                 0.375               87,000
1985    0.9                 1.2                 451,807
1990    0.7                 2                   560,000
1995    2.5                 11.66               1,666,666
2000    1.25                37.5                5,000,000

• Measured in CPU cycles, the disks in 2000 are 57 times "slower" than their ancestors in 1980, so the speed gap keeps widening.
• In 2000, a disk access costs about 4 million times as many CPU cycles as a cache hit.

Bryant and O'Hallaron, "Computer Systems: A Programmer's Perspective", Prentice Hall, 2003
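The two headline ratios on this slide can be re-derived from the chart's data labels. The sketch below is only a sanity check: the dictionary layout and variable names are illustrative choices, and the numbers are the cycle counts printed on the chart.

```python
# Latencies in CPU cycles, copied from the chart's data labels
# (Bryant and O'Hallaron, 2003). The dict layout is an illustrative choice.
latency_cycles = {
    1980: {"sram": 0.3, "dram": 0.375, "disk": 87_000},
    1985: {"sram": 0.9, "dram": 1.2, "disk": 451_807},
    1990: {"sram": 0.7, "dram": 2.0, "disk": 560_000},
    1995: {"sram": 2.5, "dram": 11.66, "disk": 1_666_666},
    2000: {"sram": 1.25, "dram": 37.5, "disk": 5_000_000},
}

# How much "slower" a disk seek became, in CPU cycles, from 1980 to 2000.
disk_slowdown = latency_cycles[2000]["disk"] / latency_cycles[1980]["disk"]

# Cost of a disk seek relative to an SRAM (cache) access in 2000.
disk_vs_cache = latency_cycles[2000]["disk"] / latency_cycles[2000]["sram"]

print(f"Disk seek, 2000 vs 1980: {disk_slowdown:.1f}x more cycles")
print(f"Disk seek vs cache hit in 2000: {disk_vs_cache:,.0f}x")

# Per-year gap between disk and DRAM, to show how the gap evolves.
for year in sorted(latency_cycles):
    row = latency_cycles[year]
    print(year, f"disk/DRAM = {row['disk'] / row['dram']:,.0f}x")
```

Working in CPU cycles rather than nanoseconds is what makes disks look "slower" over time: seek times improved in absolute terms, but much less than the processor clock did.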
1 to 100 Million Times Delay Today for Disk Accesses

[Figure: Memory Latency (ns), plotted up to 100,000,000 ns, across memory and storage tiers.]

• Tier 0 storage: high-end disks connected by fast switches, for transactional data
• Tier 1 storage: SATA disk arrays, for mission-critical data
• Tier 2 storage: disk arrays, for seldom-used archived data

Jeff Richardson, "Bridging the I/O Gap", The Data Center Journal, 2012
Technology Advancements in 45 Years

• Single-core CPU performance has reached its peak
  – 1971 (2,300 transistors on the Intel 4004 chip): 0.4 MHz
  – 2005 (1 billion+ transistors on the Intel Pentium D): 3.75 GHz
  – After a roughly 10,000-fold improvement, clock rates (GHz) stopped rising and then dropped
  – Further CPU improvement shows up as more cores per chip
• Increased DRAM capacity enables large working sets
  – 1971 ($400/MB) to 2014 (0.75 cent/MB): a price reduction of about 53,333 times (400 / 0.0075)
  – In-memory computing is a reality
• SSDs (flash memory) can further reduce access latency
  – Non-volatile device with limited write life (can be used as an independent disk)
  – Low power (6-8X lower than disks, 2X lower than DRAM)
  – Fast random reads (200X faster than disks, 25X slower than DRAM)
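The improvement factors on this slide follow from its endpoint values. The sketch below recomputes them as a rough check. The variable names are illustrative, the inputs are the figures stated on the slide, and the price ratio is computed from the two stated prices rather than copied from the slide.

```python
# Endpoint values as stated on the slide; names are illustrative.
clock_1971_hz = 0.4e6    # Intel 4004: 0.4 MHz
clock_2005_hz = 3.75e9   # Intel Pentium D: 3.75 GHz

dram_1971_usd_per_mb = 400.0    # $400/MB
dram_2014_usd_per_mb = 0.0075   # 0.75 cent/MB

# Clock-rate improvement, 1971 to 2005 (the slide rounds this to 10,000).
clock_gain = clock_2005_hz / clock_1971_hz

# DRAM price reduction per MB implied by the two stated prices.
dram_price_drop = dram_1971_usd_per_mb / dram_2014_usd_per_mb

# Random-read figures: SSD is 200X faster than disk and 25X slower than
# DRAM, so together they imply how DRAM compares with disk.
ssd_vs_disk = 200
dram_vs_ssd = 25
dram_vs_disk = ssd_vs_disk * dram_vs_ssd

print(f"Clock rate gain: {clock_gain:,.0f}x")
print(f"DRAM price drop: {dram_price_drop:,.0f}x")
print(f"Implied DRAM vs disk random read: {dram_vs_disk:,}x")
```

Chaining the two SSD ratios shows where flash sits between the two older technologies: much closer to DRAM than to disk for random reads.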
Data-Intensive Scalable Computing (DISC)

❑ Accessing and processing massive data sets at high speed
  ➢ An initial big data report, endorsed by industry (Intel, Google, Microsoft, Sun) and by scientists in many areas.
  ➢ Applications in science, industry, and business.
❑ Special requirements for DISC infrastructure:
  ➢ Top 500 DISC ranked by data throughput as well as FLOPS.
  ➢ Frequent interactions between parallel CPUs and distributed storage. Scalability is challenging.
  ➢ DISC is not an extension of SC; it demands new technology advancements.
Systems Comparison (courtesy of Bryant)

Conventional Computers
– Disk data stored separately
  • No support for collection or management
– Brought in for computation
  • Time consuming
  • Limits interactivity

DISC
– System collects and maintains data
  • Shared, active data set
– Computation co-located with disks
  • Faster access