INSDC aims to significantly increase the number of sequences for which the origin of the sample can be precisely located in time and space. We will achieve this by harmonising accurate geographical annotation and time-of-collection information. Our ultimate goal is to ensure spatio-temporal annotation is collected for all new incoming sequences by the end of 2022.
Over the next year, you can expect to see INSDC databases starting to put in place additional requirements for new sequence submissions. For example, in future we expect to require that all new submissions include for each sample:
The country or region where the sample was collected, using a standardised name from the controlled list at https://www.insdc.org/country.html
The collection date of the sample, recording at least the year of collection
Submitters will be able to indicate cases where these fields are not relevant, such as for an established cell line, using a controlled vocabulary.
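Submitters who want to gauge their readiness could run a pre-submission check along the following lines. This is a minimal sketch only, not an INSDC tool: the field names `country` and `collection_date`, the three-entry country set, the exemption term, and the accepted date shapes are all assumptions made for illustration. The authoritative country list is the one at the URL above, and the controlled vocabulary for non-relevant fields has not yet been specified here.

```python
import re

# Hypothetical subset of the controlled country list; the real list is at
# https://www.insdc.org/country.html and is much longer.
ALLOWED_COUNTRIES = {"Japan", "United Kingdom", "USA"}

# Hypothetical exemption term for samples with no meaningful origin
# (e.g. an established cell line); the actual vocabulary is an assumption.
EXEMPTION_TERMS = {"not applicable"}

# Assumed date shapes: YYYY, YYYY-MM or YYYY-MM-DD (the year is the minimum).
DATE_PATTERN = re.compile(
    r"^\d{4}(-(0[1-9]|1[0-2])(-(0[1-9]|[12]\d|3[01]))?)?$"
)


def check_sample(sample):
    """Return a list of problems found in one sample's metadata dict."""
    problems = []

    country = (sample.get("country") or "").strip()
    if not country:
        problems.append("country is missing")
    elif country not in ALLOWED_COUNTRIES and country not in EXEMPTION_TERMS:
        problems.append(f"country {country!r} is not in the controlled list")

    date = (sample.get("collection_date") or "").strip()
    if not date:
        problems.append("collection_date is missing")
    elif date not in EXEMPTION_TERMS and not DATE_PATTERN.match(date):
        problems.append(f"collection_date {date!r} lacks a recognisable year")

    return problems
```

For instance, `check_sample({"country": "Japan", "collection_date": "2021"})` returns an empty list, while a sample with neither field returns one problem per missing field.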
We feel this is a sensible minimal requirement: one that all submitters should be able to meet and that will improve the utility of INSDC sequence data. We will work with our user community to ensure they can be ready to make this change.
As an immediate next step, we would like user feedback as to whether these measures will prove feasible for you to implement, or whether you feel this change may be difficult to adopt in the near term. Please provide your feedback to the INSDC member database to which you normally submit:
DDBJ: please email email@example.com
ENA (EMBL-EBI): please email firstname.lastname@example.org
GenBank and SRA (NCBI): please email email@example.com
We will release further details of these changes on or before the 1st of April 2022.