Migrated from Slack
- How can I document genome size estimates from:
  - flow cytometry (from pleopod and/or gill tissue of snap-frozen organisms)
  - Feulgen imaging (from pleopod or tail tissue of both snap-frozen and ethanol-preserved samples)
- Which fields should I use to publish ddRAD sequencing data? Do we have an example dataset like this please?
@Saara Hi Yi-Ming! I have to admit these are questions that I haven’t come across before. At the moment the use of the extension is described for metabarcoding and qPCR only, so I don’t have direct answers. I would also like to hear what type of information is most important for this data. Are these questions from a data provider? Would they be interested in and available to look into this together?
Possibly we can look at which fields other standards (e.g. GSC) provide for this data, and see if we can integrate those fields into the DNA derived data extension.
@ymgan Thank you so much @Saara! You asked a good question. Frankly, I don’t know what type of information is most important for this data, nor which parts of it are important for OBIS. These questions come from me because our data provider is not familiar with Darwin Core, and it is my first time handling this type of information. The data is about specimens (known taxa) with COI sequences (for phylogeny and haplotype networks), and a subset of them have ddRAD sequencing to assess population structure and connectivity of different populations of certain cryptic species around Antarctica. Sure, I can connect them with you. Thanks for the offer, I hope we can meet and find a solution together.
@sformel For the genome size estimates, I would default to EMoF using C-value (or genome size) as the measurementType. Then I would encourage publication to Genome Size DBs like:
For ddRAD, the most important thing is that the data be cross-linked with INSDC archives of the assembled seqs. It also sounds like organismID, or other organism-specific metadata, will be important for downstream users to reconstruct phylogenies and networks.
IMO, the conversation about what processed information to share for ddRAD is probably very similar to the one for MAGs. The conversation about what metadata should be shared in biodiversity platforms is still evolving (i.e. we kicked that can down the road): Publishing DNA-derived data through biodiversity data platforms
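To make the EMoF suggestion for genome size a bit more concrete, here is a minimal sketch of what the rows could look like, written in Python. The column names are standard Darwin Core / EMoF terms, but everything else is an assumption for illustration: the `specimen-001` identifier, the numeric values (placeholders, not real measurements), the `measurementID` pattern, the choice of picograms as the unit, and the idea of recording tissue and preservation in `measurementRemarks`. The helper names `genome_size_row` and `to_tsv` are hypothetical.

```python
import csv
import io

# Darwin Core / EMoF column names used in this sketch.
EMOF_FIELDS = [
    "occurrenceID", "measurementID", "measurementType", "measurementValue",
    "measurementUnit", "measurementMethod", "measurementRemarks",
]

def genome_size_row(occurrence_id, value_pg, method, tissue, preservation):
    """Build one EMoF row for a genome size estimate (hypothetical helper)."""
    return {
        "occurrenceID": occurrence_id,
        # Assumed ID pattern; any locally unique identifier would do.
        "measurementID": f"{occurrence_id}:c-value:{method.replace(' ', '-')}",
        "measurementType": "C-value",
        "measurementValue": value_pg,
        "measurementUnit": "pg",  # assumption: C-value reported in picograms
        "measurementMethod": method,
        # Assumption: tissue and preservation are kept as free-text remarks.
        "measurementRemarks": f"tissue: {tissue}; preservation: {preservation}",
    }

def to_tsv(rows):
    """Serialise EMoF rows to a tab-separated string with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=EMOF_FIELDS, delimiter="\t",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

# Placeholder values only; these are NOT real measurements.
rows = [
    genome_size_row("specimen-001", 1.23, "flow cytometry", "pleopod", "snap frozen"),
    genome_size_row("specimen-001", 1.30, "Feulgen imaging", "tail", "ethanol"),
]

emof_tsv = to_tsv(rows)
print(emof_tsv)
```

One row per specimen and method keeps the flow cytometry and Feulgen estimates separate, so downstream users can filter on `measurementMethod` or on the preservation noted in the remarks. The `occurrenceID` is what would tie each estimate back to the specimen record, and through it to organismID and any INSDC accessions.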
Would you mind also raising these in the GBIF MDT discourse? It’s cool that you have examples to work with, that will help us figure it out: For matters relating to the Metabarcoding Data Toolkit or the datatype in general - Data Publishing - GBIF community forum