Towards reusable experiments: making metadata while you measure
Shreejoy Tripathy
PhD student, Carnegie Mellon
Email: [email protected]
Twitter: @neuronJoy
Lots of great tools for data sharing…
Barriers to data sharing
• Social
– “What’s in it for me? How will I get credit?”
– “It’s my data, not yours”
– “The benefit to me isn’t worth the time I put into it”
– “What if I get scooped?”
• Methodological
– “How do I share data? What do I share?”
– “Going back and annotating my files to share is super time-consuming”
– Specifying file formats, data standards
– Building FTP servers and nice user interfaces
Project idea
• How can we make a standard neuroscience wet lab more data-sharing savvy?
• Incorporate structured workflows into the daily practice of a typical electrophysiology lab (the Urban Lab at CMU)
– What does it take?
– Where are points of conflict?
Key insights/motivations
1. Effective data sharing includes raw data files + experimental metadata (typically stored in a lab notebook)
SDB_MC_12_voltages.mat
Key insights/motivations
1. Share raw data files + experimental metadata
2. You know the most about an experiment when you’re performing it
3. Improved data practices should make labs more productive
Project schematic
Metadata app
• Electronic lab notebook models sequential slice-electrophysiology workflow
– Replaces pen-and-paper lab notebook
Metadata data entry
• Electronic lab notebook allows structured data entry (e.g., a dropdown menu for animal strain)
– Allows incorporation of semantic ontologies
• Important to strike a balance between structure and flexibility
MGI:3719486
• Electronic lab notebook facilitates entry of new content, like registration of recorded neurons to brain atlas
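As a rough illustration of structured entry backed by an ontology, the sketch below validates a dropdown choice against a small controlled vocabulary while keeping a free-text field for flexibility. It is not the app's actual code: the class name, field names, and the "example strain" label are hypothetical, and only the identifier MGI:3719486 is taken from the slide.

```python
from dataclasses import dataclass, asdict

# Hypothetical controlled vocabulary: dropdown label -> ontology identifier.
# The label is a placeholder; only the identifier itself appears on the slide.
STRAIN_VOCAB = {
    "example strain": "MGI:3719486",
}


@dataclass
class AnimalEntry:
    strain_label: str   # value picked from the dropdown
    age_days: int       # structured numeric field
    notes: str = ""     # free-text field, kept for flexibility

    def __post_init__(self):
        # Reject values that are not in the controlled vocabulary.
        if self.strain_label not in STRAIN_VOCAB:
            raise ValueError(f"unknown strain: {self.strain_label!r}")
        if self.age_days < 0:
            raise ValueError("age_days must be non-negative")

    @property
    def strain_id(self) -> str:
        # Resolve the human-readable label to its ontology identifier.
        return STRAIN_VOCAB[self.strain_label]

    def to_record(self) -> dict:
        # Flatten to a plain dict that carries both label and identifier.
        record = asdict(self)
        record["strain_id"] = self.strain_id
        return record
```

Storing the ontology identifier next to the label is one way to keep entries machine-comparable across experimenters, while the notes field leaves room for whatever the dropdowns do not cover.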
Data integration
• Syncing of metadata app and electrophysiology data acquisition via server
– Each trace of experimental data annotated with metadata
• Currently specific to IGOR Pro; support for pClamp and other acquisition packages to be added later as needed
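One way to picture the syncing step is a time-based join: each acquired trace is annotated with the most recent metadata entered at or before its acquisition time. The sketch below assumes that model; the function name, the record layout, and the use of plain numeric timestamps are illustrative assumptions, not a description of the actual server.

```python
from bisect import bisect_right


def annotate_traces(traces, snapshots):
    """Attach to each trace the latest metadata snapshot entered before it.

    traces:    list of (trace_id, acquired_at) tuples
    snapshots: list of (entered_at, metadata_dict) tuples
    Times are assumed to be comparable numbers (e.g., seconds).
    """
    snapshots = sorted(snapshots, key=lambda s: s[0])
    times = [entered_at for entered_at, _ in snapshots]
    annotated = []
    for trace_id, acquired_at in traces:
        # Index of the last snapshot with entered_at <= acquired_at.
        i = bisect_right(times, acquired_at) - 1
        # Copy the dict so later edits to one trace do not affect others.
        metadata = dict(snapshots[i][1]) if i >= 0 else None
        annotated.append(
            {"trace_id": trace_id, "acquired_at": acquired_at, "metadata": metadata}
        )
    return annotated
```

A trace recorded before any metadata was entered gets `None` here, which makes unannotated data easy to spot rather than silently mislabelled.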
Data dashboard (web-based)
Data dashboard (future steps)
• Use collected metadata to sort experiments
– E.g., by mouse strain, neuron type, animal age
• Enable in-browser analyses
– Track provenance of analyzed data back to raw data
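The kind of sorting described above could look roughly like the following once each experiment is stored as a metadata record. This is a minimal sketch: the field names (strain, neuron_type, age_days, raw_file) and the helper names are assumptions made for illustration, and the raw_file field is one simple way to keep a pointer from any result back to the raw data.

```python
from collections import defaultdict


def filter_experiments(experiments, **criteria):
    """Keep experiments whose metadata matches every given field=value pair."""
    return [
        e for e in experiments
        if all(e.get(field) == value for field, value in criteria.items())
    ]


def group_by(experiments, field):
    """Group experiments by the value of one metadata field."""
    groups = defaultdict(list)
    for e in experiments:
        groups[e.get(field)].append(e)
    return dict(groups)


def sort_by(experiments, field):
    """Return experiments ordered by one metadata field."""
    return sorted(experiments, key=lambda e: e[field])


# Hypothetical records; every value below is made up for illustration.
experiments = [
    {"strain": "strain A", "neuron_type": "type 1", "age_days": 30, "raw_file": "exp1.mat"},
    {"strain": "strain B", "neuron_type": "type 1", "age_days": 21, "raw_file": "exp2.mat"},
    {"strain": "strain A", "neuron_type": "type 2", "age_days": 25, "raw_file": "exp3.mat"},
]

strain_a = filter_experiments(experiments, strain="strain A")
by_type = group_by(experiments, "neuron_type")
youngest_first = sort_by(experiments, "age_days")
```

Because every record keeps its raw_file, any filtered or grouped subset can still be traced back to the files it came from.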
Next steps
• Use the built tools
– Populate data server with many experiments
• Is use of the e-notebook too prohibitive?
– If yes, continue to iterate
• What can we ask now that we couldn’t before?
– It is much easier to ask exploratory questions, like: how is the cell type that Shawn records different from the one that Matt records?
• Exposing data to neuroscience databases
– NIF, INCF Dataspace, neuroelectro.org
• How adaptable are these solutions for use in other labs?
• Who is going to pay for this?
Acknowledgements
• Carnegie Mellon
– Shreejoy Tripathy
– Nathan Urban
– Shawn Burton
– Rick Gerkin
– Santosh Chandrasekaran
– Matthew Geramita
• Elsevier Research Data Services
– Anita de Waard
– Mark Harviston
– Jez Alder
– Sarah Tyrchniewicz
– David Marques
– (funding!)
Next steps
• Roll out updated app to experimentalists
• Populate database with the contents of many experiments
• Flesh out Data dashboard functionality
• Investigate the new things that we can achieve given these tools
Effective data sharing is…
• Not just the experimental data files
– But also the experimental metadata: What was done? What does this variable mean? This is usually stored in physical lab notebooks that only the experimenter can understand
• Effective data sharing means that someone other than the person who collected the data can understand the experiment and the data
App user testing
• “I don’t like the way the app forces me through a specific workflow, I want to enter experimental data when I see fit”
• “I’m not opposed to the idea of dropdowns, but I want more flexibility, more text fields”
• “When I use a lab notebook, I only write down the absolute minimum. Can the app’s fields be prepopulated with the results of an old experiment?”
What is effective data sharing?
• Effective data sharing means that someone other than the person who collected the data can understand the experiment and the data
– i.e., datasets should be more or less self-describing
– >90% of data sharing use cases are an experimentalist sharing data with a future version of herself or with a labmate
Neuroinformatics successes don’t come from large-scale multi-lab data sharing
• NeuroSynth
• NeuroElectro?