DSD-INT 2014 - Data Science Symposium - 4th paradigm: a research perspective, Prof. Arnold Bregt,...
The fourth paradigm: a research perspective
Data Science Symposium
Arnold Bregt
Outline presentation
The fourth paradigm
Your opinion
The roles of science
● Data producer role
● Data user role
● Data governance/management role
Conclusions/reflection
A short introduction to... me (Arnold)
Geo-information Science - Wageningen University
MSc Geo-information science
Research topics of our group:
● Sensing and measuring
● Modelling and visualization
● Integrated land monitoring
● Human-space interaction
● Empowering communities
My field: Geo-information science
Paradigms in Science: a classification
Your opinion
Who has made new discoveries by only analysing data?
Who thinks we collect too much data?
Who believes that the fourth paradigm is a new paradigm?
The fourth paradigm
Data-intensive scientific discovery
(Almost) all disciplines are becoming more data-intensive
It is a hype (“Big data”)
Papers in Scopus “fourth paradigm”
Papers in Scopus “big data”
A lot of conferences
Universities and IBM
eScience in Scopus
Marie Tharp (Oceanography)
1920-2006
Seafloor mapping (1957)
Envisioning processes from 2D observations
Is it really new?
Data has always been used by science for discovery
What is new:
● Volume
● Type of data (more spatial/temporal resolution)
● Data by “accident” or “surprise”
More for less
Data by surprise
The role of Science
Data producer role
● Past
● Present
● Future
Data user role
● ...
Data governance/management role
● ....
Data producer role: Past
Collecting own data is a key part of research
Contextual knowledge of data
Owned by researcher (at least not claimed by university)
Data producer role: Present
Own data collection in addition to existing data (data for validation)
Data collection in communities (consortia)
Researchers compile collections (data selection) (example)
Data producer role: Future
More a producer of aggregated data based on existing data (meta-analysis of data)
Role of scientist as data producer will be reduced
Validation data from small experiments
Data production as a separate (specialist) activity
....
Data user role: Past
Analyse own data
Direct knowledge of data context
(even) own software for data analysis (example)
Data user role: Present
Strong increase in reuse of existing data (example)
More statistical relations (statistically different)
Less understanding of causal relations
Example
Data user role: Future
Quest for processing and visualisation algorithms
Strong increase in reuse
More “data-based” science
..
Data governance/management role: Past
Researcher manages own data
Stored in paper archives
Collections are important
Role of libraries and museums
Data governance/management role: Present
Increased attention and dedicated institutions
Data as part of publications
DANS, 3TU.Datacentrum, Research Data Netherlands
Data management plan (PhD candidates)
Data management plan
All PhD candidates must formulate a data management plan (DMP).
Chair groups are responsible
Critical issue from plan to implementation
Data governance/Management
The Availability of Research Data Declines Rapidly with Article Age
Timothy H. Vines et al. 2014, Current Biology
We examined the availability of data from 516 studies between 2 and 22 years old
The odds of a data set being reported as extant fell by 17% per year
Broken e-mails and obsolete storage devices were the main obstacles to data sharing
Policies mandating data archiving at publication are clearly needed
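To get a feel for what a 17% yearly fall in odds means, the arithmetic can be sketched as below. This is only an illustration and not the analysis from the cited paper: it assumes the decline compounds at a constant rate, and the function names and the starting odds of 1.0 (a 50% chance) are hypothetical choices made for the example.

```python
def odds_after(years: float, base_odds: float, yearly_decline: float = 0.17) -> float:
    """Odds that a data set is still extant after `years`.

    Assumes the odds shrink by the same fraction every year
    (constant compounding), which is a simplification.
    """
    return base_odds * (1.0 - yearly_decline) ** years


def odds_to_probability(odds: float) -> float:
    """Convert odds (p / (1 - p)) back to a probability p."""
    return odds / (1.0 + odds)


# Hypothetical starting point: even odds (1.0), i.e. a 50% chance.
base_odds = 1.0

for age in (0, 5, 10, 20):
    odds = odds_after(age, base_odds)
    print(f"after {age:2d} years: odds {odds:.3f}, probability {odds_to_probability(odds):.1%}")
```

Under these assumptions the odds after 20 years are only a few percent of the starting odds, which illustrates why archiving at the time of publication matters.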
Data governance/management role: Future
Selection of data to be preserved
Specialist task (in close interaction with the library) “from book to data library”
Key role for meta-data
Extent of description
Target groups                     Functions
                                  Manage   Search   Exchange   Use
Personal                          +        +        -          -
Own organisation/researchers      ++       ++       ++         ++
Other organisation/researchers    -        +++      +++        +++
Conclusions/Reflection
Data has always played a key role in science
The fourth paradigm is not new, but “scale is new”
The role of the scientist is changing from primary data collection to reuse of existing data,
which means that “data knowledge” is decreasing
The fourth paradigm
For scientists an evolution (not a revolution)