Microscopy (meta)data management

Julio Mateos Langerak, Guillaume Gay

MiFoBio 2023

Julio Mateos Langerak julio.mateos-langerak@igh.cnrs.fr
Guillaume Gay guillaume.gay@lirmm.fr
omero-fbi.fr/slides/mifobio2023/main.html

Why bother?

You have to anyway

  • Publications will ask for raw data anyway
  • Funding bodies require it
  • It’s the law

“Public research has the following objectives: (…) e) Organizing open access to scientific data.”

Open & FAIR science is good for you

  • Data is easier to search and stays useful for longer
  • Better traceability: recognition of your work (including facility personnel).
  • Articles with open data are cited more often.
  • Easier collaborations, notably with image analysts.
  • Indirectly: Data re-use helps image analysis

Open science is good science

  • Peer pressure encourages attention to metadata and data quality 😋
  • FAIR data eases reproducibility
  • Clearer link between data and protocols

Ethical considerations

  • High-performance, extremely sophisticated instruments
  • Highly trained operators (you 😉)
  • ➡️ We are at a golden age for microscopy

The data you create is extremely valuable and deserves some effort towards its long-term preservation as heritage.

Definitions

Data and Metadata

In computer science, data (treated as singular, plural, or as a mass noun) is any sequence of one or more symbols; datum is a single symbol of data. Data requires interpretation to become information. (Wikipedia)

Metadata […] is “data that provides information about other data” (Wikipedia)

As scientific data is complex, good metadata is essential to contextualize and interpret data.

Open Science

Open science is the movement to make scientific research (including publications, data, physical samples, and software) and its dissemination accessible to all levels of society, amateur or professional. (Wikipedia)

UNESCO open science

Data Life Cycle

See RDM kit

Data life cycle

Data Management Plan

  • The DMP is used to think about a project’s data
  • An opportunity to talk with facility staff and plan ahead
  • An evolving document.

maDMP (machine actionable DMPs)

  • A DMP that can interact with software
  • Trigger resource booking
  • Harvest metadata
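
To make "machine actionable" more concrete, here is a minimal Python sketch. The JSON layout and the field names (`datasets`, `instrument`, `expected_size_gb`) are hypothetical, invented for this illustration and not taken from any DMP standard or tool. The only point is that software can read the plan and derive actions from it, such as a storage request or a list of instruments to book.

```python
import json

# Hypothetical maDMP fragment: every field name here is an illustrative assumption.
MADMP_JSON = """
{
  "title": "Live imaging of mitosis",
  "contact": "jane.doe@example.org",
  "datasets": [
    {"name": "confocal-timelapse", "instrument": "confocal", "expected_size_gb": 800},
    {"name": "sim-fixed-cells", "instrument": "SIM", "expected_size_gb": 150}
  ]
}
"""


def storage_request(dmp: dict) -> dict:
    """Sum the expected dataset sizes so storage can be requested before acquisition."""
    total = sum(d["expected_size_gb"] for d in dmp["datasets"])
    return {"project": dmp["title"], "storage_gb": total}


def instruments_needed(dmp: dict) -> list[str]:
    """List the distinct instruments named in the plan (candidates for booking)."""
    return sorted({d["instrument"] for d in dmp["datasets"]})


dmp = json.loads(MADMP_JSON)
print(storage_request(dmp))
print(instruments_needed(dmp))
```

In a real setting the plan would come from a DMP tool's export rather than an inline string, and the field names would follow whatever schema that tool uses.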

DMP editing tools

The FAIR principles

  • Findable
  • Accessible
  • Interoperable
  • Reusable

See meaning and details here: FAIR principles

PID

  • PID means Persistent Identifier
  • Unique “address” pointing to some data.
  • An article DOI
  • Someone’s ORCID
  • An email
  • A personal web page
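
As an illustration of how PIDs behave as structured, resolvable "addresses", the Python sketch below turns a DOI or an ORCID iD into its resolver URL and verifies the check character of an ORCID iD (ISO 7064 MOD 11-2). The function names are chosen for this example only. The DOI used is just a sample value, and the ORCID iD is the one ORCID publishes for its fictitious example researcher.

```python
def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character from the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check length, digits and checksum of an ORCID iD written with or without hyphens."""
    compact = orcid.replace("-", "")
    if len(compact) != 16 or not compact[:15].isdigit():
        return False
    return orcid_check_digit(compact[:15]) == compact[15].upper()


def pid_to_url(kind: str, value: str) -> str:
    """Prefix a DOI or ORCID iD with its public resolver."""
    resolvers = {"doi": "https://doi.org/", "orcid": "https://orcid.org/"}
    return resolvers[kind] + value


print(is_valid_orcid("0000-0002-1825-0097"))
print(pid_to_url("doi", "10.1000/182"))
```

The checksum only catches typing errors; it says nothing about whether the identifier is actually registered.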

Linked Data

  • From semantic web approaches
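
One common way to express linked data is JSON-LD, where each key is mapped to a shared vocabulary and entities are referenced by URL. The sketch below builds such a record in Python using schema.org terms (`Dataset`, `name`, `identifier`, `license`, `creator`). The dataset name and DOI are placeholders invented for the example, and the creator is ORCID's fictitious example researcher.

```python
import json

# A dataset description as JSON-LD. The "@context" tells consumers that the keys
# below are schema.org terms; "@id" links the creator to a persistent identifier.
dataset = {
    "@context": "https://schema.org/",
    "@type": "Dataset",
    "name": "Example confocal time-lapse",             # placeholder title
    "identifier": "https://doi.org/10.1234/example",   # placeholder DOI
    "license": "https://creativecommons.org/licenses/by/4.0/",
    "creator": {
        "@type": "Person",
        "@id": "https://orcid.org/0000-0002-1825-0097",
        "name": "Josiah Carberry",
    },
}

serialized = json.dumps(dataset, indent=2)
print(serialized)
```

Because the creator and the licence are URLs and not free text, another system can follow them and merge this record with what it already knows about the same person or licence.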

Licences

CC-BY

Overview of the FAIR data management landscape

Internationally

Resources

In France

For microscopy

REMBI

What is “good enough” metadata for Biological Images?

Recommended Metadata for BioImages:

REMBI metadata categories

QUAREP-LiMi

Exhaustive instrumental metadata - NBO-Q

ISA Framework

The Investigation-Study-Assay (ISA) Framework is a specification for organizing research data hierarchically

ISA hierarchy
ISA example
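
The Investigation > Study > Assay nesting can be sketched with plain Python dataclasses. This is an illustrative model only: the class and field names below are assumptions made for this example, not the API of the official ISA tools, and the project content is invented.

```python
from dataclasses import dataclass, field


@dataclass
class Assay:
    """One measurement type applied to samples, with the files it produced."""
    name: str
    files: list[str] = field(default_factory=list)


@dataclass
class Study:
    """One experimental question within the investigation."""
    title: str
    assays: list[Assay] = field(default_factory=list)


@dataclass
class Investigation:
    """The overall project, grouping related studies."""
    title: str
    studies: list[Study] = field(default_factory=list)

    def all_files(self) -> list[str]:
        """Walk the hierarchy and collect every data file."""
        return [f for s in self.studies for a in s.assays for f in a.files]


project = Investigation(
    title="Chromatin dynamics",
    studies=[
        Study(
            title="Mitotic chromosome condensation",
            assays=[
                Assay("confocal time-lapse", ["cell01.ome.tiff", "cell02.ome.tiff"]),
                Assay("SIM fixed cells", ["fixed01.ome.tiff"]),
            ],
        )
    ],
)
print(project.all_files())
```

The same three levels map naturally onto a folder tree or onto the project, dataset and image levels of an image database.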

Practice

DMP OPIDoR quick tour

Data Stewardship Wizard, structure DMP

File naming and hierarchy
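
As a starting point for the exercise, here is a minimal Python sketch of a file naming convention that can be both generated and parsed back. The convention itself (`date_project_sample_index.extension`, with underscores reserved as separators) is a hypothetical example, not a recommendation from any standard; adapt the fields to your own project.

```python
import re
from datetime import date

# Hypothetical convention: YYYYMMDD_project_sample_index.ext
# Underscores separate fields, so they are not allowed inside a field.
FIELD = r"[A-Za-z0-9-]+"
PATTERN = re.compile(
    rf"(?P<date>\d{{8}})_(?P<project>{FIELD})_(?P<sample>{FIELD})_(?P<index>\d{{3,}})\.(?P<ext>.+)"
)


def build_name(acquired: date, project: str, sample: str, index: int,
               ext: str = "ome.tiff") -> str:
    """Assemble a file name, rejecting fields that would break parsing."""
    for value in (project, sample):
        if not re.fullmatch(FIELD, value):
            raise ValueError(f"invalid characters in field: {value!r}")
    return f"{acquired:%Y%m%d}_{project}_{sample}_{index:03d}.{ext}"


def parse_name(name: str) -> dict:
    """Recover the metadata fields from a file name that follows the convention."""
    match = PATTERN.fullmatch(name)
    if match is None:
        raise ValueError(f"name does not follow the convention: {name!r}")
    return match.groupdict()


name = build_name(date(2023, 11, 6), "mitosis", "HeLa-H2B", 7)
print(name)
print(parse_name(name))
```

Putting the date first in `YYYYMMDD` form means that alphabetical sorting is also chronological sorting.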

Filling REMBI metadata

Searching ontologies

Submitting to BioImage Archive

Submitting to IDR