Towards reusable experiments: making metadata while you measure

Towards Reusable Experiments: Making Metadata While You Measure Shreejoy Tripathy PhD student, Carnegie Mellon Email: [email protected] Twitter: @neuronJoy

description

Slides from my short talk at INCF 2013 (the annual neuroinformatics meeting) in Stockholm. I talk about the realities of data sharing and a proposal to make it easier through the adoption of electronic lab notebooks. The project is a collaboration between Carnegie Mellon University and Elsevier Research Data Services.

Transcript of Towards reusable experiments: making metadata while you measure

Page 1: Towards reusable experiments: making metadata while you measure

Towards Reusable Experiments: Making Metadata While You Measure

Shreejoy Tripathy

PhD student, Carnegie Mellon

Email: [email protected]

Twitter: @neuronJoy

Page 2: Towards reusable experiments: making metadata while you measure

Lots of great tools for data sharing…

Page 3: Towards reusable experiments: making metadata while you measure

Barriers to data sharing

• Social

– “What’s in it for me? How will I get credit?”

– “It’s my data, not yours”

– “The benefit to me isn’t worth the time I put into it”

– “What if I get scooped?”

• Methodological

– “How do I share data? What do I share?”

– “Going back and annotating my files to share is super time-consuming”

– Specifying file formats, data standards

– Building FTP servers and nice user interfaces

Page 4: Towards reusable experiments: making metadata while you measure

Project idea

• How can we make a standard neuroscience wet lab more data-sharing savvy?

• Incorporate structured workflows into the daily practice of a typical electrophysiology lab (the Urban Lab at CMU)

– What does it take?

– Where are points of conflict?

Page 5: Towards reusable experiments: making metadata while you measure

Key insights/motivations

1. Effective data sharing includes raw data files + experimental metadata (typically stored in a lab notebook)

SDB_MC_12_voltages.mat

Page 6: Towards reusable experiments: making metadata while you measure

Key insights/motivations

1. Share raw data files + experimental metadata

2. You know the most about an experiment when you’re performing it

Page 7: Towards reusable experiments: making metadata while you measure

Key insights/motivations

1. Share raw data files + experimental metadata

2. You know the most about an experiment when you’re performing it

3. Improved data practices should make labs more productive

Page 8: Towards reusable experiments: making metadata while you measure

Project schematic

Page 9: Towards reusable experiments: making metadata while you measure

Project schematic

Page 10: Towards reusable experiments: making metadata while you measure

Metadata data app

• Electronic lab notebook models sequential slice-electrophysiology workflow

– Replaces pen-and-paper lab notebook

Page 11: Towards reusable experiments: making metadata while you measure

Metadata data entry

• Electronic lab notebook allows structured data entry

Animal Strain

Page 12: Towards reusable experiments: making metadata while you measure

Metadata data entry

• Electronic lab notebook allows structured data entry (e.g., dropdown menus)

– Allows incorporation of semantic ontologies

• Important to strike a balance between structure and flexibility

MGI:3719486
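A minimal sketch of what structured entry with an ontology-backed dropdown could look like, alongside a free-text field for flexibility. This is not the app's actual code: the `AnimalEntry` class, its fields, the `STRAIN_VOCAB` table, the dropdown label, and the example age and note are all hypothetical. Only the identifier MGI:3719486 comes from the slide.

```python
from dataclasses import dataclass

# Hypothetical controlled vocabulary mapping dropdown labels to ontology IDs.
# "MGI:3719486" is the identifier shown on the slide; the label is a placeholder.
STRAIN_VOCAB = {
    "strain shown on slide": "MGI:3719486",
}


@dataclass
class AnimalEntry:
    """One structured notebook entry for an animal (illustrative fields only)."""

    strain_label: str  # must be one of the dropdown choices
    age_days: int      # structured numeric field
    notes: str = ""    # free-text field keeps some flexibility

    def __post_init__(self) -> None:
        # Reject values that are not in the controlled vocabulary.
        if self.strain_label not in STRAIN_VOCAB:
            raise ValueError(f"unknown strain label: {self.strain_label!r}")
        if self.age_days < 0:
            raise ValueError("age_days must be non-negative")

    @property
    def strain_id(self) -> str:
        """Ontology identifier behind the human-readable dropdown label."""
        return STRAIN_VOCAB[self.strain_label]


# Placeholder values, for illustration only.
entry = AnimalEntry("strain shown on slide", age_days=21, notes="free-text remark")
print(entry.strain_id)
```

The experimenter picks a readable label, the stored record carries the ontology identifier, and the `notes` field leaves room for anything the dropdowns do not anticipate.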

Page 13: Towards reusable experiments: making metadata while you measure

Metadata data entry

MGI:3719486

• Electronic lab notebook facilitates entry of new content, like registration of recorded neurons to brain atlas

Page 14: Towards reusable experiments: making metadata while you measure

Data integration

• Syncing of metadata app and electrophysiology data acquisition via server

– Each trace of experimental data annotated with metadata

• Currently IGOR Pro-specific; support for pClamp and other acquisition packages to be added later as needed
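The idea of annotating each trace with the metadata that was current at acquisition time can be sketched as follows. This is an assumption-laden illustration, not the project's implementation: the `Trace` class, the `annotate` function, the field names, and the placeholder values are all hypothetical, and the server sync and the IGOR Pro side are left out entirely.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Trace:
    """One acquired trace plus the metadata attached to it (hypothetical layout)."""

    trace_id: str
    samples: List[float]
    metadata: Dict[str, str] = field(default_factory=dict)


def annotate(trace: Trace, notebook_state: Dict[str, str]) -> Trace:
    """Attach a snapshot of the notebook state to the trace.

    A copy is stored so that later notebook edits do not silently
    change the annotation of traces that were already recorded.
    """
    trace.metadata = dict(notebook_state)
    return trace


# Placeholder notebook state; the values are illustrative only.
notebook_state = {"animal_strain": "MGI:3719486", "neuron_type": "placeholder type A"}
first = annotate(Trace("trace_001", [0.0, 0.1]), notebook_state)

# The experimenter moves on to another cell and updates the notebook.
notebook_state["neuron_type"] = "placeholder type B"
second = annotate(Trace("trace_002", [0.0, 0.2]), notebook_state)

print(first.metadata["neuron_type"], "|", second.metadata["neuron_type"])
```

Storing a snapshot per trace, rather than a reference to the live notebook state, is one simple way to keep earlier annotations stable as the experiment proceeds.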

Page 15: Towards reusable experiments: making metadata while you measure

Data dashboard (web-based)

Page 16: Towards reusable experiments: making metadata while you measure

Data dashboard (future-steps)

• Use collected metadata to sort experiments

– For example, by mouse strain, neuron type, animal age

• Enable in-browser analyses

– Track provenance of analyzed data back to raw data
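As a rough sketch of these future steps, the snippet below filters and sorts experiment records by metadata, then computes a summary that keeps a list of the raw files it was derived from. Everything here is hypothetical: the record layout, the file names, the `select`, `sort_by`, and `mean_with_provenance` helpers, and the toy numbers, which are placeholders and not measurements.

```python
from typing import Any, Dict, List

Experiment = Dict[str, Any]

# Toy records with made-up values, for illustration only.
experiments: List[Experiment] = [
    {"raw_file": "raw_001.mat", "neuron_type": "type A", "age_days": 20, "value": 1.0},
    {"raw_file": "raw_002.mat", "neuron_type": "type B", "age_days": 15, "value": 2.0},
    {"raw_file": "raw_003.mat", "neuron_type": "type A", "age_days": 18, "value": 3.0},
]


def select(records: List[Experiment], **criteria: Any) -> List[Experiment]:
    """Keep records whose metadata matches every given key/value pair."""
    return [r for r in records if all(r.get(k) == v for k, v in criteria.items())]


def sort_by(records: List[Experiment], key: str) -> List[Experiment]:
    """Return records ordered by one metadata field."""
    return sorted(records, key=lambda r: r[key])


def mean_with_provenance(records: List[Experiment], key: str) -> Dict[str, Any]:
    """Average one field and record which raw files the result came from."""
    if not records:
        raise ValueError("no records to analyze")
    values = [r[key] for r in records]
    return {
        "mean": sum(values) / len(values),
        "derived_from": [r["raw_file"] for r in records],
    }


type_a = sort_by(select(experiments, neuron_type="type A"), "age_days")
summary = mean_with_provenance(type_a, "value")
print(summary)
```

Carrying the `derived_from` list along with each analysis result is one lightweight way to trace an analyzed number back to the raw data behind it.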

Page 17: Towards reusable experiments: making metadata while you measure

Next steps

• Use built tools

– Populate data server with many experiments

• Is use of e-notebook too prohibitive?

– If yes, continue to iterate

– What can we ask now that we couldn’t before?

• It is much easier to ask exploratory questions, like

– How is the cell type that Shawn records different from the one that Matt records?

• Exposing data to neuroscience databases

– NIF, INCF Dataspace, neuroelectro.org

• How adaptable are these solutions for use in other labs?

• Who is going to pay for this?

Page 18: Towards reusable experiments: making metadata while you measure

Acknowledgements

• Carnegie Mellon

– Shreejoy Tripathy

– Nathan Urban

– Shawn Burton

– Rick Gerkin

– Santosh Chandrasekaran

– Matthew Geramita

• Elsevier Research Data Services

– Anita de Waard

– Mark Harviston

– Jez Alder

– Sarah Tyrchniewicz

– David Marques

– (funding!)

Page 19: Towards reusable experiments: making metadata while you measure

Next steps

• Roll out updated app to experimentalists

• Populate database with the contents of many experiments

• Flesh out Data dashboard functionality

• Investigate the new things that we can achieve given these tools

Page 20: Towards reusable experiments: making metadata while you measure

Effective data sharing is…

• Not just the experimental data file

– But also the experimental metadata: what was done? What does this variable mean? This is usually stored in PHYSICAL lab notebooks, understandable by only the experimenter

• Effective data sharing – someone who is not the person who collected the data can understand the experiment and data

Page 21: Towards reusable experiments: making metadata while you measure

App user testing

• “I don’t like the way the app forces me through a specific workflow, I want to enter experimental data when I see fit”

• “I’m not opposed to the idea of dropdowns, but I want more flexibility, more text fields”

• “When I use a lab notebook, I only write down the absolute minimum. Can the app’s fields be prepopulated with the results of an old experiment?”

Page 22: Towards reusable experiments: making metadata while you measure

What is effective data sharing?

• Effective data sharing – someone who is not the person who collected the data can understand the experiment and data

– i.e., datasets should be more or less self-describing

– >90% of data sharing use cases are an experimentalist sharing data with a future version of herself or with a labmate

Page 23: Towards reusable experiments: making metadata while you measure

Neuroinformatics successes don’t come from large-scale multi-lab data sharing

• NeuroSynth

• NeuroElectro?