Plant Communities at the Eden Project, UK, derived from soil eDNA (Q7963)

From MaRDI portal
Dataset published at Zenodo repository.

    Statements

    The project seeks to understand the potential of eDNA collected from soil for characterising plant communities. To do so, soils were sampled at the Eden Project in Cornwall, UK, within the two covered biomes, where we have a good understanding of the structure and composition of the plant communities (further quantified with above-ground plant coverage inventories). Thirty-two plots were established across 10 different plant assemblages, each of which experiences subtle differences in soil chemistry and microclimate. Each plot consists of a 2 x 2 m quadrat, with four soil aggregates collected at each corner.

    eDNA was then extracted and amplified following the methods detailed in Zinger et al. (2016) and Donald et al. (2021). The primers targeted the P6 loop of the chloroplast trnL intron [primer_fwd: GGGCAATCCTGAGCCAA, primer_rev: CCATTGAGTCTCTGCACCTATC] (Taberlet et al. 2007). Sixteen extraction, 54 sequencing, and 16 PCR controls are included to account for potential errors generated during sample processing, and a mock community (4 positive controls) of 10 known plant sequences is also included to guide filtering thresholds. PCR products were pooled and sequencing libraries were constructed using the Illumina TruSeq Nano PCR-Free kit following the supplier's instructions (Illumina Inc., San Diego, California, USA), except that the ligation product was not PCR-amplified, to limit tag-jump biases (Taberlet et al. 2018). The libraries were then sequenced on an Illumina HiSeq platform (San Diego, CA, USA). Sequencing was conducted by the GenoToul bioinformatics platform (Toulouse, France).

    The sequence data produced were processed with the OBITools package (Boyer et al. 2016) using the following steps. First, illuminapairedend was used to assemble paired-end reads. This program uses an exact alignment algorithm that takes the quality scores at all positions into account during assembly.
Subsequently, we used the ngsfilter command to identify and remove the primers and tags on each read and to assign reads to their respective samples (NGS filter file provided: ngsfilter_TRNL_PLANTS_EDEN_PROJECTb.txt). This program was run with its default parameters, tolerating two mismatches for each of the two primers and no mismatch for the tags. Following this, sequencing reads were dereplicated using the obiuniq command. The resulting data.uniq.fasta file is supplied here. Sequences were then filtered with the obigrep command to remove low-quality sequences (containing Ns or with paired-end alignment scores below 50) and sequences represented by only one read (singletons).

To remove PCR/sequencing errors as well as intraspecific variability, we built OTUs (Operational Taxonomic Units) using the sumaclust clustering algorithm (Mercier et al. 2013), which takes the most abundant sequence of each cluster as the cluster representative. OTUs were defined at a sequence similarity threshold of 95%. To assign a taxon to plant OTUs, we built a reference sequence database using the ecoPCR programme (Ficetola et al. 2010) on the European Molecular Biology Laboratory database (EMBL; release 141). OTUs were then assigned a taxonomy using the OBITools ecotag programme (Boyer et al. 2016), which performs a global alignment of each OTU sequence (the query) against each reference. The reference taxon assigned to each OTU corresponds to the Last Common Ancestor of all the best-match sequences for the query.

Datasets were subsequently filtered to remove contaminants as well as artefacts such as PCR chimeras and remaining sequencing errors, using routines implemented in the metabaR R package (Zinger et al. 2021), in R version 3.6.1 (R Core Team 2013). The filtering process consisted of four steps:
(i) A negative control-based filtering. OTUs whose maximum abundance was found in extraction/PCR negative controls were removed from the dataset, as they were likely to be reagent/aerosol contaminants, which amplify more readily in the absence of the competing DNA fragments present in biological samples.
(ii) A reference-based filtering. OTUs that are too dissimilar from sequences available in reference databases are potential chimeras generated during sequencing and amplification. In this study, we set the similarity threshold at 100%.
(iii) An abundance-based filtering. This procedure targets the incorrect assignment of a small number of sequences from true OTUs to the wrong sample, a phenomenon called tag-switching. It consists of setting an OTU's abundance to 0 in samples where its abundance represents less than 0.03% of that OTU's total abundance in the entire dataset.
(iv) A PCR-based filtering. Any PCR reaction that yielded fewer than 1000 reads was considered non-functional and removed from the dataset.

The script used to implement this filtering is provided (metabaR_Eden_Plants_100sim.html), with sequence data processed to remove contaminants, OTUs of low taxonomic resolution, and PCRs with too low a read count. The clean data are provided (eden_plant_postclean_100sim.rds).

References:
Boyer, F. et al. (2016) obitools: a unix-inspired software package for DNA metabarcoding, Molecular Ecology Resources, 16(1), pp. 176–182. doi:10.1111/1755-0998.12428.
Donald, J. et al. (2021) Multi-taxa environmental DNA inventories reveal distinct taxonomic and functional diversity in urban tropical forest fragments, Global Ecology and Conservation, 29, e01724.
Mercier, C. et al. (2013) SUMATRA and SUMACLUST: fast and exact comparison and clustering of sequences, in Programs and Abstracts of the SeqBio 2013 workshop. Abstract. Citeseer, pp. 27–29.
Taberlet, P. et al. (2007) Power and limitations of the chloroplast trnL (UAA) intron for plant DNA barcoding, Nucleic Acids Research, 35(3), e14. doi:10.1093/nar/gkl938.
Taberlet, P. et al. (2018) Environmental DNA: For Biodiversity Research and Monitoring. Oxford University Press.
R Core Team (2013) R: A language and environment for statistical computing. Vienna, Austria.
Zinger, L. et al. (2016) Extracellular DNA extraction is a fast, cheap and reliable alternative for multi-taxa surveys based on soil DNA, Soil Biology and Biochemistry, 96, pp. 16–19.
Zinger, L. et al. (2021) metabaR: An r package for the evaluation and improvement of DNA metabarcoding data quality, Methods in Ecology and Evolution. doi:10.1111/2041-210X.13552.
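
The four filtering steps in the description can be illustrated with a minimal Python sketch. This is not the metabaR API and not the authors' script (that is the provided metabaR_Eden_Plants_100sim.html). It rests on several assumptions: the function name `clean_counts`, the toy PCR and OTU names, and the read counts are all hypothetical; raw read counts are compared directly; the steps are applied in the order listed; and the 0.03% rule is read as "below 0.03% of the OTU's total abundance".

```python
from typing import Dict, Set

Counts = Dict[str, Dict[str, int]]  # PCR name -> OTU name -> read count


def clean_counts(
    counts: Counts,
    negative_controls: Set[str],
    best_identity: Dict[str, float],
    min_identity: float = 1.0,          # 100% similarity to the reference
    tagjump_threshold: float = 0.0003,  # 0.03% of an OTU's total abundance
    min_reads: int = 1000,              # minimum reads for a functional PCR
) -> Counts:
    """Hypothetical sketch of the four filtering steps on a toy read table."""
    otus = {otu for row in counts.values() for otu in row}
    samples = [p for p in counts if p not in negative_controls]
    controls = [p for p in counts if p in negative_controls]

    def peak(otu: str, pcrs) -> int:
        # Highest read count of this OTU among the given PCRs.
        return max((counts[p].get(otu, 0) for p in pcrs), default=0)

    # (i) OTUs whose maximum abundance lies in a negative control.
    contaminants = {o for o in otus if peak(o, controls) > peak(o, samples)}
    # (ii) OTUs whose best match to the reference database is too dissimilar.
    dissimilar = {o for o in otus if best_identity.get(o, 0.0) < min_identity}
    kept = sorted(otus - contaminants - dissimilar)

    # (iii) Zero out counts that fall below the tag-switching threshold,
    # relative to the OTU's total over the whole dataset (controls included).
    totals = {o: sum(row.get(o, 0) for row in counts.values()) for o in kept}
    cleaned: Counts = {}
    for pcr in samples:
        row = {}
        for o in kept:
            n = counts[pcr].get(o, 0)
            if totals[o] == 0 or n / totals[o] < tagjump_threshold:
                n = 0
            row[o] = n
        # (iv) Keep only PCRs that still hold enough reads.
        if sum(row.values()) >= min_reads:
            cleaned[pcr] = row
    return cleaned


# Toy data, invented for illustration only.
toy_counts = {
    "plot1_pcr": {"otu_a": 9000, "otu_b": 5, "otu_c": 300, "otu_d": 2500},
    "plot2_pcr": {"otu_a": 2, "otu_b": 0, "otu_c": 400, "otu_d": 1500},
    "plot3_pcr": {"otu_a": 500, "otu_b": 3, "otu_c": 100, "otu_d": 200},
    "neg_pcr": {"otu_a": 0, "otu_b": 50, "otu_c": 0, "otu_d": 1},
}
toy_identity = {"otu_a": 1.0, "otu_b": 1.0, "otu_c": 0.97, "otu_d": 1.0}

cleaned = clean_counts(toy_counts, {"neg_pcr"}, toy_identity)
print(cleaned)
```

In this toy table, otu_b is dropped as a contaminant, otu_c is dropped for its 97% identity, the two otu_a reads in plot2_pcr are zeroed as likely tag-switches, and plot3_pcr is discarded because it retains only 700 reads.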
    12 October 2021

    Identifiers
