Dealing with software: the research data issues. http://dx.doi.org/10.6084/m9.figshare.1150298 26 August 2014, Dealing with Data Conference. Neil Chue Hong (@npch), Software Sustainability Institute. ORCID: 0000-0002-8876-7606 | N.ChueHong@software.ac.uk

Software Sustainability Institute

www.software.ac.uk

Dealing with software: the research data issues

http://dx.doi.org/10.6084/m9.figshare.1150298

26 August 2014, Dealing with Data Conference

Neil Chue Hong (@npch), Software Sustainability Institute

ORCID: 0000-0002-8876-7606 | N.ChueHong@software.ac.uk

Where indicated slides licensed under

Supported by Project funding from

“Re-” is the new black

The Research Cycle

Cycle: Create, Test, Interpret, Publish, Revise

Research outputs: Paper, Data, Software

Research is a continuous cycle.

When we publish we are contributing to the body of knowledge.

Research/Reuse/Reward Cycle

Reuse cycle: Index, Identify, Cite, Reward

Research cycle: Create, Test, Interpret, Publish, Revise

Reuse is also a cycle. We build our research on the work of others.

Reward mechanisms should encourage reuse.

The current process

Start research

Write software

Use software

Produce results

Publish research paper, which mentions software and data

Release data

Release software

This process is simple but does not reward production or reuse of good software and data.

It also has a long contribution cycle.

“Re-”positories

Backup | Sharing | Archiving of software

Differing roles, different repositories

Backup, sharing, archiving

Timescales, Policy, Licensing

Ingest, Metadata, Assurance

Versioning

Personal versions: v1, v2, v2a, v3, v3a

Public versions: v1, v2, v3

Why do we version?
- To indicate a change
- To allow sharing
- To confer special status

Version control systems make this easy, and the concepts of a person and an output are there but not unique.

Granularity

Algorithm, Function, Program, Library / Suite / Package

What do we define?
- Useful units of reuse

Boundary

What do we choose to identify:
- Workflow?
- Software that runs workflow?
- Software referenced by workflow?
- Software dependencies?

What’s the minimum citable part?
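
The boundary question can be made concrete with a small sketch. Assuming a hypothetical dependency graph (every name below is invented for illustration), each choice of boundary yields a different set of things that would need identifying:

```python
# Hypothetical dependency graph: each item maps to the items it directly uses.
# All names are invented for illustration.
DEPENDS_ON = {
    "workflow": ["workflow-engine", "analysis-tool"],
    "workflow-engine": ["runtime"],
    "analysis-tool": ["numeric-library"],
    "numeric-library": ["runtime"],
    "runtime": [],
}


def direct(item):
    """Software referenced directly by the item."""
    return set(DEPENDS_ON[item])


def transitive(item):
    """Everything the item depends on, directly or indirectly."""
    seen = set()
    stack = list(DEPENDS_ON[item])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(DEPENDS_ON[current])
    return seen


# Three candidate boundaries for the same piece of research.
boundaries = {
    "workflow only": {"workflow"},
    "workflow + referenced software": {"workflow"} | direct("workflow"),
    "workflow + all dependencies": {"workflow"} | transitive("workflow"),
}

for name, items in boundaries.items():
    print(f"{name}: {len(items)} item(s) to identify")
```

The wider the boundary, the more complete the record, but also the more items that each need a version and an identifier.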

Authorship

• Which authors have had what impact on each version of the software?

• Who had the largest contribution to the scientific results in a paper?

• Can micro-attribution work? Can track author, but not contribution?

http://beyond-impact.org/?p=175

OGSA-DAI projects statistics from Ohloh
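
A minimal sketch of why tracking authors is easier than tracking contribution, assuming a hypothetical commit log (the names, versions, and line counts are invented and are not the OGSA-DAI statistics): counting commits or lines per author per version is straightforward, but neither count says whose change mattered to the scientific result.

```python
from collections import Counter, defaultdict

# Hypothetical commit records: (version, author, lines_changed).
# All values are invented for illustration.
COMMITS = [
    ("v1", "alice", 400),
    ("v1", "bob", 15),
    ("v2", "alice", 20),
    ("v2", "bob", 30),
    ("v2", "carol", 5),
]


def commits_per_author(commits):
    """Count commits by each author for each version."""
    counts = defaultdict(Counter)
    for version, author, _lines in commits:
        counts[version][author] += 1
    return counts


def lines_per_author(commits):
    """Sum lines changed by each author for each version."""
    totals = defaultdict(Counter)
    for version, author, lines in commits:
        totals[version][author] += lines
    return totals


by_commits = commits_per_author(COMMITS)
by_lines = lines_per_author(COMMITS)

for version in sorted(by_commits):
    print(version, dict(by_commits[version]), dict(by_lines[version]))
```

Both measures record who touched the code; neither shows whether a five-line change was the fix a published result depended on.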

Why do we identify?
- To measure
- To restrict
- To communicate
- To include

Code as a Research Object

• What if you could assign DOIs to code easily?

• Could we make software more reusable?

• http://mozillascience.org/code-as-a-research-object-a-new-project/

• https://guides.github.com/activities/citable-code/
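
Once a release has a DOI, citing it can be as simple as formatting its metadata. A minimal sketch, assuming a hypothetical release record: the author names, title, version, and DOI below are placeholders, and the citation layout is one plausible style rather than a standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SoftwareRelease:
    """Metadata for one citable release. All example values are placeholders."""
    authors: tuple
    year: int
    title: str
    version: str
    doi: str


def format_citation(release):
    """Build a human-readable citation with a resolvable DOI link."""
    names = ", ".join(release.authors)
    return (
        f"{names} ({release.year}). {release.title} "
        f"(version {release.version}) [Software]. "
        f"https://doi.org/{release.doi}"
    )


example = SoftwareRelease(
    authors=("A. Researcher", "B. Developer"),
    year=2014,
    title="example-analysis-tool",
    version="1.2.0",
    doi="10.0000/placeholder.12345",  # placeholder, not a registered DOI
)

print(format_citation(example))
```

Tying the DOI to a specific version is the point: the citation then refers to the exact code that produced a result, not to whatever the repository contains today.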

A better process?

Start research

Identify existing software

Write software, or adapt/extend software

Use software

Produce results

Release data

Release software

Publish software paper

Publish data paper

Publish research paper, which references software and data papers

Software and data papers are needed as proxies for rewarding reuse.

But this process enables a shorter contribution cycle for data and software.

Alternative Metrics

One-click challenge

• “One-click” archiving of a significant version of software in a code repository to a suitable institutional repository

• “Suitable” repository:
  - Clear access / deposit / preservation policy
  - Adherence to standards
  - Ability to easily “transfer” in / out
  - Allows use of appropriate licenses for code
  - Sustainability of hosting organisation
  - Ability to monitor, check integrity
  - Provides permanent unique identifiers

• Proposing a hackday to make this happen
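
The “suitable repository” criteria above can be sketched as a checklist that a one-click tool might evaluate before depositing. This is only an illustration under stated assumptions: the field names are this sketch's own shorthand, the candidate values are invented, and the checksum stands in for the integrity-checking criterion.

```python
import hashlib
from dataclasses import dataclass, fields


@dataclass
class RepositoryProfile:
    """Checklist of the 'suitable repository' criteria from the slide.

    The field names are this sketch's own shorthand.
    """
    clear_policies: bool          # access / deposit / preservation policy
    adheres_to_standards: bool
    easy_transfer: bool           # transfer in / out
    allows_code_licenses: bool
    sustainable_host: bool
    integrity_monitoring: bool
    permanent_identifiers: bool


def unmet_criteria(profile):
    """Return the names of criteria the repository does not satisfy."""
    return [f.name for f in fields(profile) if not getattr(profile, f.name)]


def checksum(payload: bytes) -> str:
    """SHA-256 digest that can be recorded at deposit and re-checked later."""
    return hashlib.sha256(payload).hexdigest()


# Invented example values for a candidate repository.
candidate = RepositoryProfile(
    clear_policies=True,
    adheres_to_standards=True,
    easy_transfer=False,
    allows_code_licenses=True,
    sustainable_host=True,
    integrity_monitoring=True,
    permanent_identifiers=False,
)

print("Unmet:", unmet_criteria(candidate))

archive = b"contents of a release archive"  # stand-in for real archive bytes
recorded = checksum(archive)
print("Integrity intact:", checksum(archive) == recorded)
```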

Summary

• Software is an important output of the research cycle, and should be rewarded

• Repositories play an important role in the research cycle, including software

• But software has specific issues with regard to research data management

• Tooling is needed to lower barriers to deposit

Further information

• This presentation:
  Slides: http://dx.doi.org/10.6084/m9.figshare.1150298
  Abstract: http://dx.doi.org/10.6084/m9.figshare.1150299

• Where does it go from here: the place of software in digital repositories http://www.research.ed.ac.uk/portal/en/publications/where-does-it-go-from-here-the-place-of-software-in-digital-repositories(ab6130c6-aee6-4972-9256-8ea0eb1862c9).html

• Software Papers: improving the reusability and sustainability of scientific software http://dx.doi.org/10.6084/m9.figshare.795303

• Software Sustainability Institute http://www.software.ac.uk/

Supported by EPSRC Grant EP/H043160/1