The mythology of big data presentation

The Mythology of Big Data O’Reilly Strata Conference February 2, 2011 Mark R. Madsen http://ThirdNature.net @markmadsen

Transcript of The mythology of big data presentation

Page 1: The mythology of big data presentation

The Mythology of Big Data

O’Reilly Strata Conference February 2, 2011

Mark R. Madsen http://ThirdNature.net @markmadsen

Page 2: The mythology of big data presentation

Every technology carries within itself the seeds of its own destruction.

Page 3: The mythology of big data presentation

Code is a commodity

http://www.flickr.com/photos/ecstaticist/1120119742/

Page 4: The mythology of big data presentation

What’s the central myth underlying big data?

Page 5: The mythology of big data presentation

The myth that drove the gold rush

All we need is a fat pipe and pans working in parallel…

You change an org by acting with, through others, not alone.

Page 6: The mythology of big data presentation

Evolution of data

50s-60s: data as product

70s-80s: data as byproduct

90s-00s: data as asset

2010s+: data as substrate

The real data revolution is in business structure and processes and how they use information.

Page 7: The mythology of big data presentation

Everything is so different now…

Your grandmother, the data scientist.

Page 8: The mythology of big data presentation

Many current approaches miss the point

Using Big Data

Page 9: The mythology of big data presentation

It’s not about “big”

Using Big Data

And “big” is often not as big as you think it is.

Page 10: The mythology of big data presentation

It’s not really about data, either

Using Big Data

If there’s no process for applying information in a specific context then you are producing expensive trivia.

Page 11: The mythology of big data presentation

Where does the value in data come from?

For most of us in non-data businesses, this translates to “How can we use information to improve the decisions made in our organization?”

We need to focus on that singularly bad decision-making entity, the group.

Organizations seem to amplify innate decision-making flaws.

Page 12: The mythology of big data presentation

Decision-making realities

The operating model in senior management is primarily intuition and pattern-based.

The mode for middle management is political, bureaucratic.

New data is destabilizing, which is why you may hit a wall trying to push your data-driven agenda.

Data is contextual, so we need stories to explain how we think the world works, why my data is better than yours, and why your theory sucks.

Cognitive bias creates a morass for interpretation.

Page 13: The mythology of big data presentation

A very abstract business intelligence model

Who are the people making decisions?

Strategic

Tactical

Operational

Page 14: The mythology of big data presentation

What is the nature of their decisions?

Scope, timeframe of decision, timescale of data, data volume, breadth of data, frequency, pattern- vs. fact-based

Strategic: months; pattern-based; broad scope

Tactical: days to weeks; fact-based; moderate scope

Operational: minutes to days; rule-based; narrow scope

Axis label: Analytic complexity

Page 15: The mythology of big data presentation

The process aspect of decisions ties to people

Scope of control for people in most organizations aligns: in process, on process, over process

Strategic

Tactical

Operational

The exceptions not handled at one level due to rule / procedure / policy deficiency are escalated to the next.
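The escalation described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the handler names, the refund scenario, and the 100 threshold are all hypothetical and do not come from the talk. Each level returns a decision when its rules cover the case and returns None otherwise, which pushes the case up to the next level.

```python
from typing import Callable, Optional

# Hypothetical handlers: each returns a decision string, or None when no
# rule, procedure, or policy at that level covers the case.
Handler = Callable[[dict], Optional[str]]


def operational(case: dict) -> Optional[str]:
    # Rule-based, narrow scope: only small refunds (threshold is made up).
    if case["type"] == "refund" and case["amount"] <= 100:
        return "operational: approve refund by rule"
    return None


def tactical(case: dict) -> Optional[str]:
    # Fact-based, moderate scope: any refund, after a review.
    if case["type"] == "refund":
        return "tactical: review refund against account history"
    return None


def strategic(case: dict) -> Optional[str]:
    # Top level: whatever the levels below could not handle lands here.
    return "strategic: decide case and consider a new policy"


# Ordered from the lowest level to the highest.
LEVELS = (operational, tactical, strategic)


def decide(case: dict) -> str:
    """Try each level in order; escalate when a level has no answer."""
    for handler in LEVELS:
        outcome = handler(case)
        if outcome is not None:
            return outcome
    raise RuntimeError("no level handled the case")


print(decide({"type": "refund", "amount": 50}))
print(decide({"type": "refund", "amount": 500}))
print(decide({"type": "lawsuit"}))
```

The three calls at the end are resolved at the operational, tactical, and strategic levels respectively, because each case falls outside the rules of the level below the one that finally handles it.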

Page 16: The mythology of big data presentation

What kind of support do they have today?

Strategic

Tactical

Operational

Other people

Email, meetings

Reports, dashboards (realm of traditional BI)

The reality of most reports and dashboards is that they provide basic monitoring at best.

Page 17: The mythology of big data presentation

How and where can you apply data solutions?

Strategic

Tactical

Operational

High single value, less frequent, so improve the effectiveness of individual decisions.

Fuzzy middle ground

Low single value, frequent, can improve the efficiency or the effectiveness for large aggregate improvement.

Axis label: Analytic complexity
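The contrast between the two ends of this slide can be made concrete with a little arithmetic. The sketch below is illustrative only: every figure in it (decision counts, value per decision, improvement rates) is an invented assumption, not data from the talk. It shows how a small improvement on many low-value decisions can add up to the same aggregate gain as a larger improvement on a few high-value ones.

```python
def annual_gain(decisions_per_year: int,
                value_per_decision: float,
                improvement: float) -> float:
    """Aggregate yearly gain from improving each decision by a fraction."""
    return decisions_per_year * value_per_decision * improvement


# Strategic: few decisions, high single value (all numbers are made up).
strategic_gain = annual_gain(decisions_per_year=4,
                             value_per_decision=1_000_000,
                             improvement=0.05)

# Operational: many decisions, low single value (all numbers are made up).
operational_gain = annual_gain(decisions_per_year=2_000_000,
                               value_per_decision=5,
                               improvement=0.02)

print(f"strategic:   {strategic_gain:,.0f}")
print(f"operational: {operational_gain:,.0f}")
```

With these assumed inputs both lines come out at roughly 200,000 per year, which is the point of the slide: at the operational level the value comes from volume, not from any single decision.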

Page 18: The mythology of big data presentation

What do people do with data?

1.  Describe: use data to characterize a current or prior state of the system, for example monitoring and identifying exceptions

2.  Investigate: explore data to discover the boundaries and characteristics of a system, frame a problem or find supporting/discrediting evidence.

3.  Explain: use data and analytic methods to determine causes and effects, build models and construct stories.

4.  Predict: apply analytic models to determine possible/probable future states of the system

5.  Prescribe: use data in models to define policy, procedure, and rules for taking action, and possibly automate them

Data infrastructure and tool support for these activities in most organizations is uneven at best, decreasing as you move down.
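The five activities can be walked through on a toy dataset. This is a minimal sketch, not a method from the talk: the spend and order figures are made up, the exception threshold of 3 is arbitrary, and a straight-line fit stands in for "explain" even though a fit alone shows association, not cause.

```python
from statistics import mean

spend = [1, 2, 3, 4, 5]          # made-up daily ad spend
orders = [12, 14, 16, 18, 20]    # made-up daily orders

# 1. Describe: characterize the current state and flag exceptions.
avg_orders = mean(orders)
exceptions = [o for o in orders if abs(o - avg_orders) > 3]

# 2. Investigate: find the boundaries of the system.
bounds = (min(orders), max(orders))

# 3. Explain: fit a simple least-squares line relating spend to orders.
mx, my = mean(spend), mean(orders)
slope = (sum((x - mx) * (y - my) for x, y in zip(spend, orders))
         / sum((x - mx) ** 2 for x in spend))
intercept = my - slope * mx


# 4. Predict: apply the model to a state that has not been observed.
def predict_orders(planned_spend: float) -> float:
    return intercept + slope * planned_spend


# 5. Prescribe: turn the model into a rule for taking action.
def spend_needed(target_orders: float) -> float:
    return (target_orders - intercept) / slope


print(avg_orders, exceptions, bounds)
print(slope, intercept, predict_orders(6), spend_needed(30))
```

Each step leans on the one before it, which is one way to read the slide's closing remark: steps 1 and 2 need little more than storage and a query tool, while steps 4 and 5 need a model that someone has built, validated, and wired into a process.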

Page 19: The mythology of big data presentation

Figure: Pirolli and Card, 2005 (axes: Effort, Structure)

If you want to be a data scientist, or build software to support them, read this paper

Page 20: The mythology of big data presentation

“A toolmaker succeeds as, and only as, the users of his tools succeed with his aid. However shining the blade, however jeweled the hilt, however perfect the heft, a sword is tested only by cutting. That swordsmith is successful whose clients die of old age.”

Frederick Brooks

Page 21: The mythology of big data presentation

About the Presenter Mark Madsen is president of Third Nature, a technology research and consulting firm focused on business intelligence, analytics and performance management. Mark is an award-winning author, architect and former CTO whose work has been featured in numerous industry publications. During his career Mark received awards from the American Productivity & Quality Center, TDWI, Computerworld and the Smithsonian Institute. He is an international speaker, contributing editor at Intelligent Enterprise, and manages the open source channel at the Business Intelligence Network. For more information or to contact Mark, visit http://ThirdNature.net.

Page 22: The mythology of big data presentation

About Third Nature

Third Nature is a research and consulting firm focused on new and emerging technology and practices in business intelligence, data integration and information management. If your question is related to BI, open source, web 2.0 or data integration then you’re at the right place.

Our goal is to help companies take advantage of information-driven management practices and applications. We offer education, consulting and research services to support business and IT organizations as well as technology vendors.

We fill the gap between what the industry analyst firms cover and what IT needs. We specialize in product and technology analysis, so we look at emerging technologies and markets, evaluating the products rather than vendor market positions.