The Weakening and Delayed Effects of Long Tail Distributions in Big Data Accesses
Big Data and Power Law
[Figure: number of hits to each data object plotted against the popularity rank of each data object.]
To the right (the yellow region) is the long tail of the lower 80% of objects; to the left are the few that dominate (the top 20% of objects). With limited space to store objects and limited ability to search a large volume of objects, most attention and hits must go to the top 20% of objects, ignoring the long tail.
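The 80/20 pattern described above can be checked numerically. The following is a minimal sketch, assuming hits follow an idealized Zipf-like power law (hits proportional to 1 / rank^exponent); the function names, the exponent of 1.0, and the object count are illustrative assumptions, not values from the slides.

```python
def zipf_hits(n_objects, exponent=1.0, total_hits=1_000_000):
    """Expected hits per object, ordered by popularity rank (rank 1 first).

    Assumes an idealized power law: hits(rank) is proportional to 1 / rank**exponent.
    """
    weights = [1.0 / rank ** exponent for rank in range(1, n_objects + 1)]
    scale = total_hits / sum(weights)
    return [w * scale for w in weights]


def head_share(hits, head_fraction=0.2):
    """Fraction of all hits that go to the top `head_fraction` of objects."""
    head_count = max(1, int(len(hits) * head_fraction))
    return sum(hits[:head_count]) / sum(hits)


# Hypothetical catalog of 10,000 objects (assumed size, for illustration only).
hits = zipf_hits(n_objects=10_000, exponent=1.0)
print(f"share of hits in the top 20% of objects: {head_share(hits):.2f}")
print(f"share of hits in the long tail:          {1 - head_share(hits):.2f}")
```

Under these assumptions the top 20% of objects collect the large majority of hits; lowering `exponent` shifts more of the total into the tail.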
The Change of Time (short search latency) and Space (unlimited storage capacity) for Big Data Creates Different Data Access Distributions
[Figure: two curves, labeled "Traditional long tail distribution" and "Flattened distribution after the long tail can be easily accessed".]
• The head is lowered, and the tail drops off more and more slowly
• If the flattened distribution is no longer a power law, what is it?
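One common way to test whether an access distribution is still a power law is to check how straight it looks on a log-log plot of hits against rank. The sketch below is only an illustration under stated assumptions: `loglog_fit` is a hypothetical helper, and the "flattened" curve uses a shifted rank (hits proportional to 1 / (rank + 50)) purely as a stand-in shape with a lowered head, not as the distribution the slides have in mind.

```python
import math


def loglog_fit(hits):
    """Least-squares line through (log rank, log hits).

    Returns (slope, r_squared). A pure power law gives r_squared close to 1.
    """
    xs = [math.log(rank) for rank in range(1, len(hits) + 1)]
    ys = [math.log(h) for h in hits]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, r_squared


n = 1000
# Pure power law: hits proportional to 1 / rank.
power_law = [1.0 / rank for rank in range(1, n + 1)]
# Hypothetical flattened curve (assumption): shifting the rank lowers the head.
flattened = [1.0 / (rank + 50) for rank in range(1, n + 1)]

slope_pl, r2_pl = loglog_fit(power_law)
slope_fl, r2_fl = loglog_fit(flattened)
print(f"power law: slope={slope_pl:.3f}, R^2={r2_pl:.4f}")
print(f"flattened: slope={slope_fl:.3f}, R^2={r2_fl:.4f}")
```

A noticeably lower R^2 for the flattened curve indicates that a single straight line on log-log axes no longer describes it well, which is the sense in which such a distribution stops being a power law.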
Distribution Changes in DVDs in Netflix, 2000 to 2011
[Figure: demand curves for 2000, 2005, and 2011 (predicted); vertical axis from 0% to 80%. Annotations: less demand for the top 500; more demand for the "middle"; longer tail (15% of demand comes from beyond rank 3,000, where brick-and-mortar retailers run out of inventory).]
• The growth of Netflix selections (today: 30 million US users, 40 million users total, 1/3 of the Internet's streaming traffic)
– 2000: 4,500 DVDs; 2005: 18,000 DVDs
– 2011: over 100,000 DVDs (with more demand, the long tail drops off even more slowly)
– Note: "brick-and-mortar retailers" are physical, face-to-face shops.
Amazon Case: Growth of Sales from the Changes of Time/Space
[Figure: sales in billions of dollars ($0.0 to $7.0) by year (02 to 10) for three series: Amazon North America Media Sales, Barnes & Noble Chain Store Sales, and Borders Chain Store Sales. Source: www.fonerbooks.com/booksale.htm. BN sales are shown without BN.com to contrast online vs. offline. Borders and BN fiscal year ends in Q1 2011 and is shown as 2010.]