The mythology of big data presentation

Post on 15-Jun-2015

The Mythology of Big Data

O’Reilly Strata Conference February 2, 2011

Mark R. Madsen http://ThirdNature.net @markmadsen

Every technology carries within itself the seeds of its own destruction.

Code is a commodity

http://www.flickr.com/photos/ecstaticist/1120119742/

What’s the central myth underlying big data?

The myth that drove the gold rush

All we need is a fat pipe and pans working in parallel…

You change an org by acting with and through others, not alone.

Evolution of data

50s-60s: data as product

70s-80s: data as byproduct

90s-00s: data as asset

2010s+: data as substrate

The real data revolution is in business structure and processes and how they use information.

Everything is so different now…

Your grandmother, the data scientist.

Many current approaches miss the point

Using Big Data

It’s not about “big”

And “big” is often not as big as you think it is.

It’s not really about data, either

If there’s no process for applying information in a specific context then you are producing expensive trivia.

Where does the value in data come from?

For most of us in non-data businesses, this translates to “How can we use information to improve the decisions made in our organization?”

We need to focus on that singularly bad decision-making entity, the group.

Organizations seem to amplify innate decision-making flaws.

Decision-making realities

The operating model in senior management is primarily intuition and pattern-based.

The mode for middle management is political, bureaucratic.

New data is destabilizing, which is why you may hit a wall trying to push your data-driven agenda.

Data is contextual, so we need stories to explain how we think the world works, why my data is better than yours, and why your theory sucks.

Cognitive bias creates a morass for interpretation.

A very abstract business intelligence model

Who are the people making decisions?

Strategic

Tactical

Operational

What is the nature of their decisions?

Scope, time frame of decision, time scale of data, data volume, breadth of data, frequency, pattern vs fact-based

Strategic: months; pattern-based, broad scope

Tactical: days-weeks; fact-based, moderate scope

Operational: mins-days; rule-based, narrow scope

Analytic complexity

The process aspect of decisions ties to people

Scope of control for people in most organizations aligns: in process, on process, over process

Strategic

Tactical

Operational

The exceptions not handled at one level due to rule / procedure / policy deficiency are escalated to the next.

What kind of support do they have today?

Strategic

Tactical

Operational

Other people

Email, meetings

Reports, dashboards Realm of traditional BI

The reality of most reports and dashboards is that they provide basic monitoring at best.

Strategic

Tactical

Operational

How and where can you apply data solutions?

High single value, less frequent, so improve the effectiveness of individual decisions.

Fuzzy middle ground

Low single value, frequent, can improve the efficiency or the effectiveness for large aggregate improvement.

Analytic complexity

What do people do with data?

1.  Describe: use data to characterize a current or prior state of the system, for example monitoring and identifying exceptions.

2.  Investigate: explore data to discover the boundaries and characteristics of a system, frame a problem or find supporting/discrediting evidence.

3.  Explain: use data and analytic methods to determine causes and effects, build models and construct stories.

4.  Predict: apply analytic models to determine possible/probable future states of the system.

5.  Prescribe: use data in models to define policy, procedure, and rules for taking action, and possibly automate them.
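A minimal sketch of three of these activities (describe, predict, prescribe) on a toy series may make the distinction concrete. Everything here is an assumption for illustration and not from the talk: the order counts are made up, the function names are hypothetical, the "model" is a naive moving average, and the capacity rule is arbitrary.

```python
from statistics import mean, stdev

# Hypothetical daily order counts for one week (made-up illustration data).
daily_orders = [100, 104, 98, 102, 160, 101, 99]

def describe(values):
    """Describe: characterize the prior state of the system."""
    return {"mean": mean(values), "stdev": stdev(values)}

def find_exceptions(values, threshold=2.0):
    """Describe: flag values more than `threshold` sample standard
    deviations away from the mean (a simple monitoring rule)."""
    m, s = mean(values), stdev(values)
    return [v for v in values if abs(v - m) > threshold * s]

def predict_next(values, window=3):
    """Predict: naive forecast, the mean of the last `window` values."""
    return mean(values[-window:])

def prescribe(forecast, capacity):
    """Prescribe: a rule that turns a forecast into an action."""
    return "add staff" if forecast > capacity else "no change"

if __name__ == "__main__":
    print(describe(daily_orders))
    print(find_exceptions(daily_orders))
    forecast = predict_next(daily_orders)
    print(forecast, prescribe(forecast, capacity=110))
```

Investigate and explain are left out on purpose: they are exploratory and depend on context, which is hard to reduce to a few lines of code.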

Data infrastructure and tool support for these activities in most organizations is uneven at best, decreasing as you move down.

Figure: Pirolli and Card, 2005 (axes: effort, structure)

If you want to be a data scientist, or build software to support them, read this paper

“A toolmaker succeeds as, and only as, the users of his tools succeed with his aid. However shining the blade, however jeweled the hilt, however perfect the heft, a sword is tested only by cutting. That swordsmith is successful whose clients die of old age.”

Frederick Brooks 

About the Presenter

Mark Madsen is president of Third Nature, a technology research and consulting firm focused on business intelligence, analytics and performance management. Mark is an award-winning author, architect and former CTO whose work has been featured in numerous industry publications. During his career Mark received awards from the American Productivity & Quality Center, TDWI, Computerworld and the Smithsonian Institute. He is an international speaker, contributing editor at Intelligent Enterprise, and manages the open source channel at the Business Intelligence Network. For more information or to contact Mark, visit http://ThirdNature.net.

About Third Nature

Third Nature is a research and consulting firm focused on new and emerging technology and practices in business intelligence, data integration and information management. If your question is related to BI, open source, web 2.0 or data integration, then you’re at the right place.

Our goal is to help companies take advantage of information-driven management practices and applications. We offer education, consulting and research services to support business and IT organizations as well as technology vendors.

We fill the gap between what the industry analyst firms cover and what IT needs. We specialize in product and technology analysis, so we look at emerging technologies and markets, evaluating the products rather than vendor market positions.