Selection of a marker gene to construct a reference library for wetland plants, and the application of metabarcoding to analyze the diet of wintering herbivorous waterbirds

Yuzhan Yang; Aibin Zhan; Lei Cao; Fanjuan Meng; Wenbin Xu

doi:10.7717/peerj.2345

Yuzhan Yang¹, Aibin Zhan², Lei Cao², Fanjuan Meng², Wenbin Xu³

¹School of Life Sciences, University of Science and Technology of China, Hefei, Anhui, China

²Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing, China

³Anhui Shengjin Lake National Nature Reserve Administration, Chizhou, Anhui, China

Published: 2016-08-17
Accepted: 2016-07-19
Received: 2016-04-12

Academic Editor: Michael Somers

Subject Areas: Animal Behavior, Conservation Biology, Ecology, Molecular Biology, Veterinary Medicine
Keywords: Bean goose, Greater white-fronted goose, Diet analysis, trnL, Molecular reference library, Metabarcoding

Copyright: © 2016 Yang et al.
Licence: This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.

Cite this article: Yang Y, Zhan A, Cao L, Meng F, Xu W. 2016. Selection of a marker gene to construct a reference library for wetland plants, and the application of metabarcoding to analyze the diet of wintering herbivorous waterbirds. PeerJ 4:e2345 https://doi.org/10.7717/peerj.2345

The authors have chosen to make the review history of this article public.

Abstract

Food availability and diet selection are important factors influencing the abundance and distribution of wild waterbirds. To better understand changes in waterbird populations, it is essential to determine what they feed on. However, analyzing their diet can be difficult and inefficient using traditional methods such as microhistologic observation. Here, we addressed this knowledge gap by investigating the diet of greater white-fronted goose Anser albifrons and bean goose Anser fabalis, which are obligate herbivores wintering in China, mostly in the Middle and Lower Yangtze River floodplain. First, we selected a suitable and high-resolution marker gene for wetland plants that these geese would consume during the wintering period. Eight candidate genes were included: rbcL, rpoC1, rpoB, matK, trnH-psbA, trnL (UAA), atpF-atpH, and psbK-psbI. The selection was performed via analysis of representative sequences from NCBI and comparison of amplification efficiency and resolution power on plant samples collected from the wintering area. The trnL gene, amplified with the c/h primers, was finally chosen, and a local plant reference library was constructed with this gene. Then, using DNA metabarcoding, we discovered 15 food items in total from the feces of these birds. Of the 15 unique dietary sequences, 10 could be identified at species level. For greater white-fronted goose, 73% of sequences belonged to Poaceae spp., and 26% belonged to Carex spp. In contrast, almost all sequences of bean goose belonged to Carex spp. (99%). Using the same samples, microhistology provided food composition consistent with the metabarcoding results for greater white-fronted goose, whereas for bean goose it recovered 13% Poaceae. In addition, two other taxa were discovered only through microhistologic analysis. Although most of the identified taxa matched relatively well between the two methods, DNA metabarcoding gave taxonomically more detailed information. Discrepancies were likely due to biased PCR amplification in metabarcoding, low discriminating power of current marker genes for monocots, and biases in microhistologic analysis. The diet differences between the two goose species might indicate deeper ecological significance beyond the scope of this study. We concluded that DNA metabarcoding provides new perspectives for studies of herbivorous waterbird diets and inter-specific interactions, as well as new possibilities to investigate interactions between herbivores and plants. In addition, microhistologic analysis should be used together with metabarcoding methods to integrate this information.

Introduction

Wetlands are one of the most important ecosystems in nature, and they provide a variety of ecosystem services such as protection against floods, water purification, climate regulation and recreational opportunities (Brander, Florax & Vermaat, 2006). Waterbirds are typically wetland-dependent animals that rely on wetlands for abundant food and suitable habitats (Ma et al., 2010). Waterbird abundance and distribution can reflect the status of wetland structure and functions, making them important bio-indicators for wetland health (Fox et al., 2011). Among all factors affecting waterbird community dynamics, food availability is frequently considered to play one of the most important roles (Wang et al., 2013). However, suitable food resources have recently tended to decrease or even disappear due to deterioration and loss of natural wetlands (Fox et al., 2011). As a result, waterbirds are forced to abandon previous habitats and sometimes even feed in agricultural lands (Zhang et al., 2011). In addition, migratory waterbirds may aid the dispersal of aquatic plants or invertebrates by carrying and transporting them between water bodies at various spatial scales (Reynolds, Miranda & Cumming, 2015). Consequently, long-term monitoring and systematic studies of waterbird diets are essential to understand population dynamics of waterbirds, as well as to establish effective management programs for them (Wang et al., 2012).

Traditional methods for waterbird diet analysis were direct observation in the field (Swennen & Yu, 2005) or microhistologic analysis of remnants in feces and/or gut contents (James & Burney, 1997; Fox et al., 2007). While these approaches have proven useful in some cases, they are relatively labor-intensive and highly skill-dependent (Fox et al., 2007; Samelius & Alisauskas, 1999; Symondson, 2002). Applications of other methods for analyzing gut contents or feces were also restricted due to inherent limitations, as reviewed by Pompanon et al. (2012). Recently, metabarcoding methods, based on high-throughput sequencing, have provided new perspectives for diet analysis and biodiversity assessment (Taberlet et al., 2007; Creer et al., 2010). These methods provide higher taxonomic resolution and higher detectability with enormous sequence output from large-scale environmental samples, such as soil, water and feces (Shokralla, Spall & Gibson, 2012; Bohmann et al., 2014). Owing to these advantages, metabarcoding has been widely employed in the diet analysis of herbivores (Taberlet et al., 2012; Ando et al., 2013; Hibert et al., 2013), carnivores (Deagle, Kirkwood & Jarman, 2009; Shehzad et al., 2012) and omnivores (De Barba et al., 2014). However, the pitfalls of metabarcoding should not be ignored when choosing suitable techniques for new studies. For instance, many studies have shown that it is difficult to obtain quantitative data using metabarcoding (Sun et al., 2015). This drawback might result from both technical issues of the method and relevant biological features of samples (Pompanon et al., 2012).

One paramount prerequisite of metabarcoding methods is the selection of robust genetic markers and corresponding primers (Zhan et al., 2014; Zhan & MacIsaac, 2015). For diet studies of herbivores, at least eight chloroplast genes and two nuclear genes are used as potential markers for land plants (Hollingsworth, Graham & Little, 2011). Although mitochondrial cytochrome c oxidase I (COI) is extensively recommended as a standard barcode for animals, its relatively low rate of evolution in plant genomes precludes it from being an optimal barcode for plants (Wolfe, Li & Sharp, 1987; Fazekas et al., 2008). The internal transcribed spacer (ITS) is excluded due to divergence discrepancies among individuals and low reproducibility (Álvarez & Wendel, 2003). A variety of combinations and comparisons have been performed for the eight candidate genes; however, none proved equally powerful in all cases (Fazekas et al., 2008). Consequently, it is more effective to choose barcodes for a circumscribed set of species occurring in a regional community (Kress et al., 2009). Another equally important aspect of metabarcoding applications is the construction of reference libraries, which assist taxonomic assignment (Rayé et al., 2011; Xu et al., 2015). It is difficult to accurately interpret sequence reads without a reliable reference library (Elliott & Jonathan Davies, 2014).

Diet analysis is one of the central issues in waterbird research, both for deciphering waterfowl population dynamics and for interpreting inter- or intra-specific interactions of cohabitating species (Zhao et al., 2015). For instance, more than 60% of bean goose Anser fabalis and almost 40% of greater white-fronted goose Anser albifrons populations along the East Asian–Australian Flyway Route winter at the Shengjin Lake National Nature Reserve (Zhao et al., 2015). Previous studies based on microhistologic observation showed that their diets were dominated by monocotyledons, such as Carex spp. (Zhao et al., 2012) and Poaceae (Zhang et al., 2011), with a relatively small proportion of non-monocots (referred to as dicotyledons in Zhao, Cao & Fox, 2013). However, few food items could be identified to species level, mainly owing to variable tissue structures within plants, similar morphology among related species, and a high level of degradation after digestion (Zhang et al., 2011; Zhao et al., 2012; Zhao, Cao & Fox, 2013). Ambiguous identification has hindered the understanding of waterbird population dynamics and the ability to establish effective conservation plans for them.

In this study, we aimed to improve this situation using the metabarcoding method to analyze diets of these species (see flowchart in Fig. 1). By examining the efficiency of eight candidate genes (rbcL, rpoC1, rpoB, matK, trnH-psbA, trnL (UAA), atpF-atpH, and psbK-psbI), we selected robust genes and corresponding primers for reference library construction and high-throughput sequencing. Subsequently, we used the metabarcoding method to investigate diet composition of these two species based on feces collected from Shengjin Lake. Finally, we discussed and compared results from microhistology and DNA metabarcoding using the same samples to assess the utility and efficiency of these two methods.

Figure 1: Technical flowchart of this study.

DOI: 10.7717/peerj.2345/fig-1

Materials and Methods

Ethics statement

Our research work did not involve capture or any direct manipulation or disturbances of animals. We collected samples of plants and feces for molecular analyses. We obtained access to the reserve under the permission of the Shengjin Lake National Nature Reserve Administration (Chizhou, Anhui, China), which is responsible for the management of the protected area and wildlife. We were forbidden to capture or disturb geese in the field.

Study area

Shengjin Lake (116°55′–117°15′E, 30°15′–30°30′N) was established as a National Nature Reserve in 1997, aiming to protect waterbirds including geese, cranes and storks. The water level fluctuates greatly in this lake, with maximal water level of 17 m during summer (flood season) but only 10 m during winter (dry season). Due to this fluctuation, receding waters expose two large Carex spp. meadows and provide suitable habitats for waterbirds. This makes Shengjin Lake one of the most important wintering sites for migratory waterbirds (Zhao et al., 2015). Greater white-fronted goose and bean goose are the dominant herbivores wintering (from October to April) in this area, accounting for 40% and 60% of populations along the East Asian–Australian Flyway Route, respectively (Zhao et al., 2015).

Field sampling

The most common plant species that these two geese may consume were collected in May 2014 and January 2015, especially species belonging to Carex and Poaceae. Fresh and intact leaves were carefully picked, tin-packaged in the field and stored at −80 °C in the laboratory before further treatment. Morphological identification was carried out with the assistance of two botanists (Profs. Zhenyu Li and Shuren Zhang from Institute of Botany, Chinese Academy of Sciences).

All feces were collected at the reserve (Fig. 2) in January 2015. Based on previous studies and the latest waterbird survey, sites with large flocks of geese (i.e., more than 200 individuals) were chosen (Zhang et al., 2011). As soon as geese finished feeding and had defecated, fresh droppings were picked up and stored on dry ice. Droppings of bean geese were generally thicker than those of the smaller greater white-fronted geese, to the degree that they could be reliably distinguished in the field (Zhao et al., 2015). Disposable gloves were changed for each sample to avoid cross contamination. To avoid repeated sampling and to make sure samples were from different individuals, samples were collected more than 2 m apart. In total, 21 feces were collected, including 11 for greater white-fronted goose and 10 for bean goose. All samples were transported to the laboratory on dry ice and then stored at −80 °C until further analysis.

Figure 2: The location of our study area, Shengjin Lake National Nature Reserve and our sampling sites.
(Source: http://eros.usgs.gov/#).

DOI: 10.7717/peerj.2345/fig-2

Selection of molecular markers and corresponding primers

Here, we aimed to select gene markers with adequate discriminating power for our study. We evaluated eight chloroplast genes: rbcL, rpoC1, rpoB, matK, trnH-psbA, trnL (UAA), atpF-atpH, and psbK-psbI. Although Shengjin Lake harbors an array of plant species, for the candidate gene tests we focused mainly on the most likely food resources of geese (Xu et al., 2008; Zhao et al., 2015). These covered 11 genera and the family Poaceae (Table S1). First, for tests of all candidate genes, we retrieved sequences of representative species in the selected groups from GenBank (http://www.ncbi.nlm.nih.gov/nuccore). We calculated inter-specific divergence within every genus or family based on the Kimura 2-parameter model (K2P) using MEGA version 6 (Tamura et al., 2013). We also constructed UPGMA trees using MEGA and characterized species resolution as the percentage of species recovered as monophyletic in these trees (Rf). Second, primers for the genes selected from the eight candidates were used to amplify all specimens collected at Shengjin Lake and to check their amplification efficiency and universality. Third, we calculated inter-specific divergence based on the sequences obtained in the previous step. Generally, a barcode gene is considered robust when the minimal inter-specific distance exceeds the maximal intra-specific distance (i.e., a barcoding gap exists). Finally, to allow the recognition of sequences after high-throughput sequencing, both the forward and reverse primers of the selected marker gene were tagged specifically for each sample with 8-nt codes at the 5′ end (Parameswaran et al., 2007).
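The distance calculation and the barcoding-gap criterion described above can be illustrated with a minimal Python sketch. It is not the MEGA implementation used in the study: the function names `k2p_distance` and `has_barcoding_gap` are hypothetical, the sequences are assumed to be already aligned, and sites with gaps or ambiguous bases are simply skipped (an assumption of this sketch). The K2P distance is computed as d = −½ ln[(1 − 2P − Q)·√(1 − 2Q)], where P and Q are the proportions of transitions and transversions.

```python
import math
from itertools import combinations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: skip gaps and ambiguous bases
        sites += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; everything else is a transversion
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    # math.log raises ValueError for saturated (very divergent) pairs
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def has_barcoding_gap(seqs_by_species):
    """True if the minimal inter-specific distance exceeds the maximal
    intra-specific distance. Input: dict of species -> aligned sequences."""
    labelled = [(sp, s) for sp, seqs in seqs_by_species.items() for s in seqs]
    intra, inter = [], []
    for (sp_a, s_a), (sp_b, s_b) in combinations(labelled, 2):
        d = k2p_distance(s_a, s_b)
        (intra if sp_a == sp_b else inter).append(d)
    if not inter:
        raise ValueError("need sequences from at least two species")
    return min(inter) > max(intra, default=0.0)
```

With a single sequence per species there are no intra-specific pairs, so the sketch treats the maximal intra-specific distance as zero; two species sharing an identical sequence then fail the criterion.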

DNA extraction, amplification and sequencing

Two hundred milligrams of leaf tissue was used to extract total DNA from each plant sample using a modified CTAB protocol (Cota-Sanchez, Remarchuk & Ubayasena, 2006). DNA extraction from feces was carried out using the same protocol with a minor modification of the incubation time (extended to 12 h). Each fecal sample was crushed thoroughly and divided into four quarters. The DNA extracts of all four quarters were then pooled. DNA extraction was carried out in a clean room used exclusively for this study. For each batch of DNA extractions, negative controls (i.e., extraction without feces) were included to monitor possible contamination.

For plant DNA extracts, PCR amplifications were carried out in a volume of 25 µl with ∼100 ng total DNA as template, 1 U of Taq polymerase (Takara, Dalian, Liaoning Province, China), 1× PCR buffer, 2 mM Mg²⁺, 0.25 mM dNTPs, 0.1 µM forward primer and 0.1 µM reverse primer. After an initial denaturation of 4 min at 94 °C, PCR consisted of 35 cycles of 30 s at 94 °C, 30 s at 56 °C and 45 s at 72 °C, followed by a final extension of 10 min at 72 °C. We applied the same PCR conditions for all primers. All successful PCR products were sequenced by Genewiz (Suzhou, Jiangsu Province, China).

For fecal DNA extracts, PCR mixtures (25 µl) were prepared in six replicates for each sample to reduce biased amplification. Each replicate was subjected to the same amplification procedure used for plant extracts. The six replicates of each sample were pooled and purified using the Sangon PCR product purification kit (Sangon Biotech, Shanghai, China). Purified products were quantified using the NanoDrop ND-2000 UV-Vis Spectrophotometer (NanoDrop Technologies, Wilmington, Delaware, USA) to ensure that each sample contributed equally. High-throughput sequencing was performed on the Illumina MiSeq platform following the manufacturer’s instructions by BGI (Shenzhen, Guangdong Province, China). The high-throughput sequencing reads are available from NCBI’s Sequence Read Archive (Accession number: SRP070470).

Data analysis for estimating diet composition

After high-throughput sequencing, paired-end reads were merged with the fastq_mergepairs command in usearch (http://drive5.com/usearch, Edgar, 2010). Reads were then split into independent files according to unique tags using the initial process of the RDP pipeline (https://pyro.cme.msu.edu/init/form.spr). We removed sequences that (i) did not perfectly match tag and primer sequences or (ii) contained ambiguous nucleotides (Ns). Tags and primers were then trimmed using the initial process of the RDP pipeline. Further quality filtering using usearch discarded sequences with (i) a quality score below 30 (<Q30) or (ii) a length shorter than 100 bp. Unique sequences were clustered into operational taxonomic units (OTUs) at a similarity threshold of 98% (Edgar, 2013). All OTUs were assigned taxonomy with local BLAST 2.2.30+ (Altschul et al., 1990). For each sequence, a plant in the reference library was accepted as a match at thresholds of length coverage >98%, identity >98% and e-value <1.0 × 10⁻⁵⁰. If a query sequence matched two or more taxa, it was assigned to a higher taxonomic level that included all of the matched taxa.
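The assignment rule in the last two sentences can be sketched in Python as follows. This is an illustrative outline, not the authors' script: the `Hit` record, the function names, and the representation of a taxon as a (family, genus, species) tuple are all assumptions, and the sketch considers every hit that passes the thresholds rather than only the best-scoring ones.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One BLAST hit against the reference library (hypothetical record)."""
    lineage: tuple   # ordered from broad to narrow, e.g. (family, genus, species)
    coverage: float  # percent of the query length covered by the alignment
    identity: float  # percent identity
    evalue: float


def passes_thresholds(hit, min_cov=98.0, min_ident=98.0, max_evalue=1e-50):
    # Thresholds from the text: coverage >98%, identity >98%, e-value <1.0e-50
    return (hit.coverage > min_cov
            and hit.identity > min_ident
            and hit.evalue < max_evalue)


def assign_taxon(hits):
    """Return the narrowest taxon shared by all accepted hits, or None."""
    kept = [h.lineage for h in hits if passes_thresholds(h)]
    if not kept:
        return None
    shared = []
    for ranks in zip(*kept):        # walk down the ranks in parallel
        if len(set(ranks)) != 1:    # hits disagree from this rank downwards
            break
        shared.append(ranks[0])
    return shared[-1] if shared else None


# Two accepted hits from different genera of one family fall back to family.
example = assign_taxon([
    Hit(("Poaceae", "Beckmannia", "Beckmannia syzigachne"), 100.0, 100.0, 1e-60),
    Hit(("Poaceae", "Phalaris", "Phalaris arundinacea"), 100.0, 100.0, 1e-60),
])
```

A query whose accepted hits all point to one species keeps the species name, whereas hits spread over several Carex species would be reported as Carex.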

Microhistology analysis

We used the method described by Zhang et al. (2011) to perform microhistologic examination of fecal samples. Each sample was first washed with pure water and filtered with a 25-µm filter. Subsequently, the suspension was examined under a light microscope at 10× magnification for quantification statistics and at 40× magnification for species identification. We compared photos of visible fragments with an epidermis database of plants from Shengjin Lake to identify food items (Zhang et al., 2011).

Results

Selection of genes and corresponding primers and reference library construction

A total of 3,296 representative sequences were recovered from GenBank, ranging from 0 to 345 sequences per gene per taxon (Table S1). For Eleocharis and Trapa, only sequences of the rbcL and trnL genes were retrieved, which precluded a fair comparison of the efficiency and suitability of the eight candidate genes for these two taxa. For the other ten taxa, trnL, trnH-psbA, rbcL and psbK-psbI showed the largest inter-specific divergence in five, three, one, and one taxonomic groups, respectively. In addition, trnH-psbA, atpF-atpH, trnL and psbK-psbI showed the highest mean divergence in four, four, one and one taxonomic groups, respectively. However, given the small number of sequences and the limited coverage of species, the results for atpF-atpH and psbK-psbI appear less reliable than those for the other genes. Based on this comparison, trnH-psbA, trnL and rbcL were selected from the eight candidate genes. Because matK has been recommended as the standard barcode gene for Carex species (Starr, Naczi & Chouinard, 2009), which are the dominant food of herbivorous geese in our study area (Zhao et al., 2015), we also included matK as a supplementary marker.

Primers for these four genes (Table 1) were used to amplify the plants that we collected in the field. In total, we collected 88 specimens in the field, belonging to 25 families, 53 genera and 70 species (Table 2). The selected primers for trnL and rbcL successfully amplified 100% and 91% of all species, respectively, while primers for trnH-psbA and matK amplified only 71% and 43%, respectively. Therefore, we chose trnL and rbcL to test their discriminating power in our target plants.

Table 1:

Primers for candidate genes and reference library construction.

Only primers c and h were used for high-throughput sequencing, in fusion primer mode (primer + tags). The unique tags were used to differentiate PCR products pooled together for high-throughput sequencing (Parameswaran et al., 2007).

Gene	Primer	Sequence (5′-3′)
matK	matK-XF^a	TAATTTACGATCAATTCATTC
	matK-MALP^b	ACAAGAAAGTCGAAGTAT
rbcL	rbcLa-F^c	ATGTCACCACAAACAGAGACTAAAGC
	rbcLa-R^d	GTAAAATCAAGTCCACCRCG
trnH-psbA	psbA3_f^e	CGCGCATGGTGGATTCACAATCC
	trnHf_05^f	GTTATGCATGAACGTAATGCTC
trnL	c^g	CGAAATCGGTAGACGCTACG
	h^h	CCATTGAGTCTCTGCACCTATC

DOI: 10.7717/peerj.2345/table-1

Notes:

^a referred to Ford et al. (2009).
^b referred to Dunning & Savolainen (2010).
^c referred to Hasebe et al. (1994).
^d referred to Kress et al. (2009).
^e referred to Tate & Simpson (2003).
^f referred to Sang, Crawford & Stuessy (1997).
^g referred to Taberlet et al. (1991).
^h referred to Taberlet et al. (2007).

Table 2:

Plant species in the reference library.

We collected these samples from Shengjin Lake.

Species	No. of samples	Species	No. of samples
Curculigo orchioides	1	Trapella sinensis	2
Artemisia capillaris	1	Plantago asiatica	1
Artemisia selengensis	2	Alopecurus aequalis	2
Aster subulatus	1	Beckmannia syzigachne	1
Bidens frondosa	1	Bromus japonicus	1
Erigeron annuus	1	Cynodon dactylon	2
Gnaphalium affine	1	Phalaris arundinacea	1
Hemistepta lyrata	1	Phragmites australis	1
Kalimeris incisa	1	Poa annua	1
Bothriospermum kusnezowii	1	Polypogon fugax	1
Lobelia chinensis	1	Roegneria kamoji	2
Sagina japonica	1	Zizania latifolia	1
Stellaria media	2	Polygonum lapathifolium	4
Calystegia hederacea	1	Polygonum orientale	1
Cardamine lyrata	1	Polygonum perfoliatum	1
Carex heterolepis	3	Polygonum persicaria	1
Carex capricornis	1	Rumex trisetiferus	3
Carex paxii	1	Potamogeton crispus	1
Carex remotiuscula	1	Potamogeton maackianus	1
Fimbristylis dichotoma	1	Potamogeton malaianus	1
Eleocharis migoana	1	Potamogeton natans	1
Scirpus karuizawensis	1	Potamogeton pectinatus	1
Nymphoides peltatum	1	Clematis florida	1
Myriophyllum spicatum	1	Ranunculus chinensis	2
Hydrilla verticillata	1	Ranunculus sceleratus	2
Hydrocharis dubia	1	Potentilla freyniana	2
Vallisneria spiralis	1	Gratiola japonica	1
Vallisneria spinulosa	1	Mazus miquelii	2
Juncus effusus	1	Veronica undulata	1
Juncus gracillimus	1	Trapa bispinosa	1
Leonurus japonicus	1	Trapa maximowiczii	1
Salvia plebeia	1	Trapa pseudoincisa	1
Glycine soja	1	Trapa quadrispinosa	1
Vicia sativa	1	Hydrocotyle sibthorpioides	1
Euryale ferox	1	Torilis japonica	2

DOI: 10.7717/peerj.2345/table-2

We calculated the inter-specific divergence within genera and families with at least two species to compare the discriminating power of the two genes. Maximal, minimal and mean inter-specific distances were calculated for seven dominant genera and six dominant families (Table 3). Neither gene could differentiate species of Vallisneria (mean = 0.000 ± 0.000%) or Artemisia (mean = 0.000 ± 0.000%). But trnL showed a larger divergence range for the other six genera and five families. Hence, we chose trnL as the barcoding gene for reference library construction and high-throughput sequencing in our study. The discriminating power of trnL was strong for most species (Table 4). However, some species could only be identified at genus or family level with trnL. For instance, the five species of Potamogetonaceae shared the same sequence and could therefore be identified only at genus level. Species could generally be identified to at least genus level, except for three grasses (Poaceae), Beckmannia syzigachne, Phalaris arundinacea, and Polypogon fugax, which shared identical sequences and could therefore be identified only at family level.
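How shared sequences limit the resolution of a reference library can be sketched in Python as follows. The function name `resolvable_levels` and the toy sequences are hypothetical placeholders (they are not real trnL data); the sketch only groups library entries by identical sequence and reports the narrowest rank that still separates each group from the rest of the library.

```python
from collections import defaultdict


def resolvable_levels(reference):
    """Map each species to the finest rank at which it can be identified.

    reference: dict of (family, genus, species) -> barcode sequence.
    """
    groups = defaultdict(list)
    for lineage, seq in reference.items():
        groups[seq.upper()].append(lineage)

    levels = {}
    for lineages in groups.values():
        families = {fam for fam, _, _ in lineages}
        genera = {gen for _, gen, _ in lineages}
        if len(lineages) == 1:
            level = "species"      # sequence is unique in the library
        elif len(genera) == 1:
            level = "genus"        # shared only among congeners
        elif len(families) == 1:
            level = "family"       # shared across genera of one family
        else:
            level = "unresolved"   # shared across families
        for _, _, species in lineages:
            levels[species] = level
    return levels


# Toy library with made-up sequences, mirroring the cases in the text.
toy_library = {
    ("Poaceae", "Beckmannia", "Beckmannia syzigachne"): "AAAC",
    ("Poaceae", "Phalaris", "Phalaris arundinacea"): "AAAC",
    ("Poaceae", "Poa", "Poa annua"): "AAGC",
    ("Potamogetonaceae", "Potamogeton", "Potamogeton crispus"): "TTGA",
    ("Potamogetonaceae", "Potamogeton", "Potamogeton natans"): "TTGA",
}
levels = resolvable_levels(toy_library)
```

Counting the distinct sequences per family in the same way gives a species-versus-sequences summary of the kind shown in Table 4.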

Table 3:

Inter-specific divergences within dominant genera and families for the rbcL and trnL genes, based on the Kimura 2-parameter model.

Underscores indicate the most common food composition based on earlier microhistologic analysis (Zhao et al., 2012; Zhao et al., 2015).

Inter-specific divergence	Taxa	rbcL			trnL
		Maximal	Minimal	Mean	Maximal	Minimal	Mean
Within genera	Artemisia	0.000	0.000	0.000 ± 0.000	0.000	0.000	0.000 ± 0.000
	Carex	0.013	0.000	0.008 ± 0.006	0.058	0.000	0.027 ± 0.021
	Polygonum	0.027	0.000	0.010 ± 0.006	0.076	0.000	0.033 ± 0.022
	Potamogeton	0.012	0.000	0.005 ± 0.0034	0.016	0.000	0.005 ± 0.005
	Ranunculus	0.031	0.000	0.020 ± 0.009	0.042	0.021	0.024 ± 0.022
	Trapa	0.000	0.000	0.000 ± 0.000	0.081	0.000	0.049 ± 0.030
	Vallisneria	0.000	0.000	0.000 ± 0.000	0.000	0.000	0.000 ± 0.000
Within families	Cyperaceae	0.043	0.000	0.018 ± 0.010	0.178	0.000	0.084 ± 0.046
	Asteraceae	0.120	0.000	0.049 ± 0.017	0.087	0.000	0.023 ± 0.018
	Poaceae	0.025	0.000	0.016 ± 0.0009	0.166	0.000	0.074 ± 0.039
	Hydrocharitaceae	0.122	0.000	0.078 ± 0.020	0.159	0.000	0.100 ± 0.054
	Polygonaceae	0.043	0.000	0.020 ± 0.009	0.129	0.000	0.031 ± 0.022
	Ranunculaceae	0.033	0.016	0.017 ± 0.015	0.045	0.000	0.018 ± 0.013

DOI: 10.7717/peerj.2345/table-3

Table 4:

Number of species and unique sequences for families with more than one species in Shengjin Lake plant database.

Family	No. of species	No. of sequences
Asteraceae	8	7
Caryophyllaceae	2	2
Cyperaceae	7	5
Fabaceae	2	2
Hydrocharitaceae	4	3
Lamiaceae	2	2
Poaceae	10	8
Polygonaceae	5	5
Potamogetonaceae	5	1
Ranunculaceae	3	3
Scrophulariaceae	3	3
Trapaceae	4	3
Umbelliferae	2	2

DOI: 10.7717/peerj.2345/table-4

Data processing for estimating diet composition

In total, 0.21 and 0.18 million reads were generated for greater white-fronted goose (GWFG) and bean goose (BG), respectively (Table 5). The number of recovered OTUs ranged from 8 to 123 for GWFG and BG samples. We used local BLAST to compare these sequences with the Shengjin Lake reference database. Finally, with DNA metabarcoding, 12 items were discovered in the feces of GWFG, including one at family level, three at genus level and eight at species level (Table 6). Four items were discovered in the feces of BG, including one at genus level and three at species level. In total, this method identified 15 taxa in feces of these geese.

Table 5:

Summary of the process and results of high-throughput sequencing analysis.

Sample	Pair-end sequences	Sequences for which primers and tags were identified and with length >100 bp	Unique sequences	OTUs	Food items
GWFG1	16303	8627	1288	78	8
GWFG2	25482	13449	1091	102	8
GWFG3	19063	10056	1277	48	10
GWFG4	23856	12548	1419	114	8
GWFG5	20955	11249	1720	123	9
GWFG6	11677	7205	973	52	9
GWFG7	13377	6782	1328	59	9
GWFG8	7749	3959	774	89	9
GWFG9	16833	8799	1436	90	6
GWFG10	18474	9819	449	32	9
GWFG11	19648	10458	617	31	6
BG1	20225	10254	784	23	4
BG2	14195	7161	564	16	2
BG3	2229	1149	255	12	4
BG4	517	268	77	8	3
BG5	28152	14033	1000	15	3
BG6	16723	8484	740	17	4
BG7	30166	15403	974	15	4
BG8	30928	15706	1028	15	3
BG9	8382	4489	446	13	4
BG10	10714	5526	537	13	4

DOI: 10.7717/peerj.2345/table-5

Notes:

GWFG: Greater white-fronted goose
BG: Bean goose

Table 6:

List of the lowest taxonomic food items in the diet of geese.

Food items	Level of identification	GWFG			BG
		N reads	F_s (%)	F_m (%)	N reads	F_s (%)	F_m (%)
Poaceae spp. (except Poa annua)	Family	51705	47.98	45.68	0	0.00	0.00
Poa annua	Species	23554	21.86	0.00	167	0.20	0.00
Carex heterolepis	Species	18867	17.51	16.39	81457	99.49	62.85
Carex spp. (except Carex heterolepis)	Genus	9706	9.01	2.31	191	0.23	3.49
Alopecurus aequalis	Species	3458	3.21	0.00	0	0.00	0.00
Potentilla chinensis	Species	184	0.17	1.18	65	0.08	2.06
Cynodon dactylon	Species	155	0.14	0.00	0	0.00	0.00
Polygonum spp.	Genus	56	0.05	0.00	0	0.00	0.00
Stellaria media	Species	26	0.02	0.00	0	0.00	0.00
Ranunculus chinensis	Species	14	0.02	0.00	0	0.00	0.00
Lapsana apogonoides	Species	11	0.02	0.00	0	0.00	0.00
Asteraceae sp.	Genus	16	0.01	2.33	0	0.00	14.55
Alopecurus	Genus	0	0.00	30.93	0	0.00	13.18
Carex thunbergii	Species	0	0.00	0.54	0	0.00	2.79
Fabaceae sp.	Genus	0	0.00	0.64	0	0.00	1.08

DOI: 10.7717/peerj.2345/table-6

Notes:

GWFG: Greater white-fronted goose
BG: Bean goose
F_s: percentage of sequences in DNA metabarcoding
F_m: percentage of epidermis squares in microhistological analysis

However, the sequence percentage of each food item varied greatly (Table 6). For GWFG, the vast majority of sequences (99.57%) came from only five items: Poaceae spp. (47.98%, except Poa annua), Poa annua (21.86%), Carex heterolepis (17.51%), Carex spp. (9.01%, except Carex heterolepis), and Alopecurus aequalis (3.21%). For BG, almost all the sequences belonged to Carex heterolepis (99.49%). Other items accounted for only a small proportion of sequences. In addition, the occurrence of each item across samples was also uneven (Table S2). For example, in GWFG, Carex heterolepis, Carex spp., Poa annua and Potentilla supina were present in almost all the samples, while Stellaria media, Asteraceae sp. and Lapsana apogonoides occurred in only about one third of samples.
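The sequence percentages (F_s) can be recomputed from the read counts in Table 6, as in the short Python sketch below. The read counts are copied from the GWFG column of Table 6; the function names are hypothetical, and small differences from the published percentages in the second decimal place are possible because of rounding.

```python
# GWFG read counts per food item, copied from Table 6.
GWFG_READS = {
    "Poaceae spp. (except Poa annua)": 51705,
    "Poa annua": 23554,
    "Carex heterolepis": 18867,
    "Carex spp. (except Carex heterolepis)": 9706,
    "Alopecurus aequalis": 3458,
    "Potentilla chinensis": 184,
    "Cynodon dactylon": 155,
    "Polygonum spp.": 56,
    "Stellaria media": 26,
    "Ranunculus chinensis": 14,
    "Lapsana apogonoides": 11,
    "Asteraceae sp.": 16,
}


def read_percentages(counts):
    """Percentage of assigned reads belonging to each food item."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads to summarize")
    return {item: 100.0 * n / total for item, n in counts.items()}


def cumulative_share(counts, top_n):
    """Summed percentage of the top_n most abundant items."""
    shares = sorted(read_percentages(counts).values(), reverse=True)
    return sum(shares[:top_n])


gwfg_pct = read_percentages(GWFG_READS)
top_five = cumulative_share(GWFG_READS, 5)
```

As noted in the Introduction, such read proportions are at best semi-quantitative, so they are better read as relative signal strength than as the proportion of biomass eaten.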

When the microhistologic examination was performed using the same samples, eight items were found in the feces of greater white-fronted goose, including one at family-level, four at genus-level and three at species-level (Table 6). Dominant items were Poaceae spp. (45.68%), Alopecurus Linn. (30.93%) and Carex heterolepis (16.39%). Seven items were found in the feces of bean goose, including four at genus-level and three at species-level (Table 6). Dominant items were Carex heterolepis (62.85%), Asteraceae sp. (14.55%), and Alopecurus Linn. (13.18%).

Discussion

Marker selection and reference library construction for diet analysis

With greatly reduced cost and extremely high throughput and information content, metabarcoding has revolutionized dietary analysis based on noninvasive samples containing degraded DNA (Fonseca et al., 2010; Shokralla et al., 2014). Despite its enormous potential to boost data acquisition, successful application of this technology relies greatly on the power and efficiency of genetic markers and their corresponding primers (Bik et al., 2012; Zhan et al., 2014). To select the most appropriate marker gene for our study, we compared the performance of eight commonly used chloroplast genes (rbcL, rpoB, rpoC1, matK, trnL, trnH-psbA, atpF-atpH, and psbK-psbI) and their corresponding primers. Although several studies have reported higher discriminating power for atpF-atpH and psbK-psbI, these two markers are not as commonly used as other barcoding genes (Hollingsworth, Graham & Little, 2011). As one of the most rapidly evolving coding genes of the plastid genome, matK has been considered the closest plant analogue to the animal barcode COI (Hilu & Liang, 1997). However, matK was difficult to amplify using available primer sets, with an amplification success rate of only 43% in this study. Although trnH-psbA achieves higher species discrimination success than rbcL + matK in some groups, duplicated loci, microinversions and premature termination of reads by mononucleotide repeats lead to a considerable proportion of low-quality sequences and to over-estimation of genetic differences when this marker is used (Graham et al., 2000; Whitlock, Hale & Groff, 2010). In contrast, the barcode region of rbcL is easy to amplify, sequence, and align in most plants and has been recommended as the standard barcode for land plants (Chase et al., 2007). However, its relatively modest discriminating power (compared to trnL) precluded its use in our study, which aimed to recover food items at high resolution. Consequently, trnL was selected out of the eight candidate markers, with 100% amplification success, more than 90% high-quality sequences, and relatively large inter-specific divergence.
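The selection criteria named above (amplification success, proportion of high-quality sequences, and inter-specific divergence) can be illustrated with a minimal sketch. The sequences, sample counts and function names below are hypothetical and are not the data or software used in this study. Uncorrected p-distance is used only as a simple stand-in for a divergence measure.

```python
from itertools import combinations


def success_rate(n_success, n_tested):
    """Fraction of tested samples that succeeded (e.g. amplified, or gave a high-quality read)."""
    if n_tested <= 0:
        raise ValueError("n_tested must be positive")
    return n_success / n_tested


def p_distance(seq_a, seq_b):
    """Uncorrected p-distance between two aligned sequences.

    Columns with a gap ('-') or an ambiguous base ('N') in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return mismatches / compared


def mean_interspecific_divergence(aligned):
    """Mean pairwise p-distance over all species pairs in `aligned` (name -> aligned sequence)."""
    pairs = list(combinations(sorted(aligned), 2))
    if not pairs:
        raise ValueError("need at least two species")
    return sum(p_distance(aligned[x], aligned[y]) for x, y in pairs) / len(pairs)


# Hypothetical aligned fragments for three species (illustration only).
toy_alignment = {
    "species_A": "ACGTACGTAC",
    "species_B": "ACGTTCGTAC",
    "species_C": "ACG-ACGAAC",
}

divergence = mean_interspecific_divergence(toy_alignment)
amplified = success_rate(9, 10)  # hypothetical: 9 of 10 samples amplified
```

Under these assumptions, a candidate marker would be favored when both success rates are high and the mean divergence among species is large enough to separate them.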

One of the biggest obstacles in biodiversity assessment and dietary analysis is the lack of a comprehensive reference library, without which it is impossible to accurately interpret and assign sequences generated by high-throughput sequencing (Valentini, Pompanon & Taberlet, 2009; Barco et al., 2015). In this study, we constructed a local reference library by amplifying the trnL gene from the most common plant species (70 morpho-species in total) present during the wintering period. Although not all of them could be identified at species level with trnL due to relatively low inter-specific divergence, many species could be separated by distinctive sequences. Previous studies have recommended group-specific barcodes to differentiate closely related plants at the species level (Li et al., 2015). For instance, matK has been shown to be more efficient for the discrimination of Carex spp. (Starr, Naczi & Chouinard, 2009). However, the matK primer set failed to amplify Carex spp. in our study, suggesting that the universality of selected primer pairs should be tested in each study (Zhan et al., 2014).
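One way to see how many reference species a marker can separate is to group library entries by identical sequence: a species is resolvable to species level only if no other species shares any of its sequences. The sketch below assumes this simple identity-based criterion and uses hypothetical species names and sequences. It does not reproduce the assignment procedure or thresholds used in this study.

```python
from collections import defaultdict


def resolvable_species(library):
    """Return the species whose sequences are shared with no other species.

    `library` maps species name -> list of reference sequences for that species.
    """
    species_by_seq = defaultdict(set)
    for species, seqs in library.items():
        for seq in seqs:
            species_by_seq[seq.upper()].add(species)

    # Any species sharing a sequence with another species cannot be told apart.
    shared = set()
    for owners in species_by_seq.values():
        if len(owners) > 1:
            shared |= owners
    return set(library) - shared


def resolvable_power(library):
    """Percentage of species in the library that are resolvable to species level."""
    if not library:
        raise ValueError("library is empty")
    return 100.0 * len(resolvable_species(library)) / len(library)


# Hypothetical reference library (illustration only).
toy_library = {
    "Carex_sp1": ["ACGTAA"],
    "Carex_sp2": ["ACGTAA"],          # identical to Carex_sp1, so neither is resolvable
    "Poaceae_sp1": ["ACGTTA"],
    "Poaceae_sp2": ["ACCTTA", "ACCTTG"],
}

power = resolvable_power(toy_library)
```

In this toy library the two Carex entries collapse to genus level, which is the situation in which a group-specific marker would be needed.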

Applications of metabarcoding for geese diet analysis

A variety of recent studies have demonstrated the great potential of metabarcoding for dietary analysis, mainly owing to its high throughput, high discriminating power, and ability to process large numbers of samples simultaneously (Creer et al., 2010; Taberlet et al., 2012; Shehzad et al., 2012). In this study, we applied this method to recover the diets of herbivorous geese and provided standard protocols for dietary analysis of these two ecologically important waterbirds. Our results further demonstrated that DNA metabarcoding is more objective, less experience-dependent and more time-efficient. However, not all the species in the reference library could be identified at species level, owing to low inter-specific divergence. We suggest that multiple group-specific markers be incorporated in the future, as in De Barba et al. (2014). Two taxa, Carex thunbergii and Fabaceae sp., were discovered only via microhistologic analysis and not via metabarcoding. This failure might reflect the biased fragment amplification of current technology, in which dominant templates can act as inhibitors of less dominant ones (Piñol et al., 2015). Conversely, three species of Poaceae were discovered only using metabarcoding. In total, more taxa and higher resolution were attained using metabarcoding. Nevertheless, microhistology still proved to be a powerful supplementary method. Previous studies using metabarcoding usually detected dozens of food items, and sometimes more than one hundred species. For instance, 18 prey taxa were identified for leopard cat (Prionailurus bengalensis) (Shehzad et al., 2012); 44 plant taxa were recovered in feces of red-headed wood pigeon (Columba janthina nitens) (Ando et al., 2013); and more than 100 taxa were found in diet studies of brown bear (Ursus arctos) (De Barba et al., 2014). The relatively narrow diet spectrum of herbivorous geese may give the misleading impression that our result is merely an artefact of small sampling effort. However, this result is credible because these two goose species feed only in Carex meadows, where the dominant vegetation is Carex spp., accompanied by other species such as Poaceae and dicots (Zhao et al., 2015). Even though other wetland plants exist, they usually compose only a small proportion of the geese diets.

Quantification of food composition is another key concern in dietary analysis. Although the relative percentage of sequences is not truly a quantitative estimate of diet, the taxa accounting for the majority of sequences in this study were in accord with microhistologic observations, which are considered an efficient way to provide quantitative results (Wang et al., 2013). Discrepancies might come from the semi-quantitative nature of metabarcoding methods (Sun et al., 2015). This semi-quantitative nature likely derives from PCR amplification, which always entails biases caused by universal primer-template mismatches, annealing temperature or number of PCR cycles (Zhan et al., 2014; Piñol et al., 2015). Other methods such as shotgun sequencing or metagenomic sequencing could be incorporated in the future to give information on the abundance of food items (Srivathsan et al., 2015).
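The relative percentage of sequences discussed above is simply each taxon's share of the assigned reads. A minimal sketch, using hypothetical read counts and microhistology percentages chosen only for illustration (they are not results of this study), shows how such proportions could be computed and set against microhistologic estimates:

```python
def relative_read_abundance(read_counts):
    """Convert per-taxon read counts into percentages of all assigned reads."""
    total = sum(read_counts.values())
    if total <= 0:
        raise ValueError("no reads to summarize")
    return {taxon: 100.0 * n / total for taxon, n in read_counts.items()}


def percentage_differences(rra, microhistology):
    """Per-taxon difference (reads minus microhistology), in percentage points.

    Taxa missing from one method are treated as 0% for that method.
    """
    taxa = sorted(set(rra) | set(microhistology))
    return {t: rra.get(t, 0.0) - microhistology.get(t, 0.0) for t in taxa}


# Hypothetical inputs (illustration only).
toy_reads = {"Carex": 600, "Poaceae": 350, "Other": 50}
toy_microhistology = {"Carex": 55.0, "Poaceae": 40.0, "Fabaceae": 5.0}

rra = relative_read_abundance(toy_reads)
diff = percentage_differences(rra, toy_microhistology)
```

Positive values mark taxa that are over-represented in reads relative to microhistology. Given the PCR biases described above, such differences are better read as a consistency check between methods than as differences in true abundance.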

Implications for waterbird conservation and wetland management

For long-distance migratory waterbirds, such as the wild geese in this study, abundance and distribution are greatly influenced by diet availability and habitat use (Wang et al., 2013). For example, waterbirds may be restricted to certain areas because they favor particular foods, or be forced to leave when such foods are lost (Wang et al., 2013), while the recovery of such foods may contribute to the return of bird populations (Noordhuis, Van der Molen & Van den Berg, 2002). Results of both metabarcoding and microhistologic analysis in this study revealed that Carex and Poaceae were the dominant food components, which is in accordance with previous studies. The increasing number of these two goose species wintering at Shengjin Lake may be attributed to the expansion of Carex meadow, which offers access to abundant food resources (Zhao et al., 2015). Considering the long-distance migratory character of these birds, it is important for them to maintain energy balance and good body condition in wintering areas, because this might further influence their departure dates and their reproductive success after arriving at breeding areas (Prop, Black & Shimmings, 2003). Wetland managers therefore need to maintain suitable habitats and food resources for the sustainable conservation of waterbirds, which highlights the significance of diet information. Our study also indicated that both overlap and dissimilarity existed between the diets of these two geese. Animals foraging in the same habitats may compete for limited food resources (Madsen & Mortensen, 1987). The observed discrepancy in food composition may arise from the avoidance of inter-specific competition (Zhao et al., 2015). However, with the increase of these two species at Shengjin Lake, further research is needed to investigate the mechanisms of food resource partitioning and spatial distribution.

Shengjin Lake is one of the most important wintering sites for tens of thousands of migratory waterbirds, while the annual life cycles of these birds depend on the whole migratory route, including breeding sites, stop-over sites and wintering sites (Kear, 2005). Thus, a molecular reference library covering all the potential food items along the whole migratory route will be useful both for understanding wetland connections and for waterbird conservation. In addition, the ability of DNA metabarcoding to process large numbers of samples simultaneously enables rapid analyses and makes this method helpful for waterbird studies.

Supplemental Information

Divergence of dominant genera/families with representative sequences retrieved from GenBank. The resolvable power indicated the percentage of identified species.

DOI: 10.7717/peerj.2345/supp-1


Plant taxa identified in the feces of greater white-fronted goose (GWFG) and bean goose (BG) in Shengjin Lake National Nature Reserve

The resolvable power indicated the percentage of identified species.

DOI: 10.7717/peerj.2345/supp-2


[1] Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. 1990. Basic local alignment search tool. Journal of Molecular Biology 215:403-410

[2] Álvarez I, Wendel JF. 2003. Ribosomal ITS sequences and plant phylogenetic inference. Molecular Phylogenetics and Evolution 29:417-434

[3] Ando H, Setsuko S, Horikoshi K, Suzuki H, Umehara S, Inoue-Murayama M, Isagi Y. 2013. Diet analysis by next-generation sequencing indicates the frequent consumption of introduced plants by the critically endangered red-headed wood pigeon (Columba janthina nitens) in oceanic island habitats. Ecology and Evolution 3:4057-4069

[4] Barco A, Raupach MJ, Laakmann S, Neumann H, Knebelsberger T. 2015. Identification of North Sea molluscs with DNA barcoding. Molecular Ecology Resources 16:288-297

[5] Bik HM, Porazinska DL, Creer S, Caporaso JG, Knight R, Thomas WK. 2012. Sequencing our way towards understanding global eukaryotic biodiversity. Trends in Ecology and Evolution 27:233-243

[6] Bohmann K, Evans A, Gilbert MTP, Carvalho GR, Creer S, Knapp M, Yu DW, De Bruyn M. 2014. Environmental DNA for wildlife biology and biodiversity monitoring. Trends in Ecology and Evolution 29:358-367

[7] Brander LM, Florax RJ, Vermaat JE. 2006. The empirics of wetland valuation: a comprehensive summary and a meta-analysis of the literature. Environmental and Resource Economics 33:223-250

[8] Chase MW, Cowan RS, Hollingsworth PM, Van den Berg C, Madriñán S, Petersen G, Seberg O, Jorgsensen T, Cameron KM, Carine M, Pedersen N, Hedderson TAJ, Conrad F, Salazar GA, Richardson JE, Hollingsworth M, Barraclough TG, Kelly L, Wilkinson M. 2007. A proposal for a standardised protocol to barcode all land plants. Taxon 56:295-299

[9] Cota-Sanchez JH, Remarchuk K, Ubayasena K. 2006. Ready-to-use DNA extracted with a CTAB method adapted for herbarium specimens and mucilaginous plant tissue. Plant Molecular Biology Reporter 24:161-167

[10] Creer S, Fonseca VG, Porazinska DL, Giblin-Davis RM, Sung W, Power DM, Packer M, Carvalho GR, Blaxter ML, Lambshead P, Thomas WK. 2010. Ultrasequencing of the meiofaunal biosphere: practice, pitfalls and promises. Molecular Ecology 19:4-20

[11] De Barba M, Miquel C, Boyer F, Mercier C, Rioux D, Coissac E, Taberlet P. 2014. DNA metabarcoding multiplexing and validation of data accuracy for diet assessment: application to omnivorous diet. Molecular Ecology Resources 14:306-323

[12] Deagle BE, Kirkwood R, Jarman SN. 2009. Analysis of Australian fur seal diet by pyrosequencing prey DNA in faeces. Molecular Ecology 18:2022-2038

[13] Dunning LT, Savolainen V. 2010. Broad-scale amplification of matK for DNA barcoding plants, a technical note. Botanical Journal of the Linnean Society 164:1-9

[14] Edgar RC. 2010. Search and clustering orders of magnitude faster than BLAST. Bioinformatics 26:2460-2461

[15] Edgar RC. 2013. UPARSE: highly accurate OTU sequences from microbial amplicon reads. Nature Methods 10:996-998

[16] Elliott TL, Jonathan Davies T. 2014. Challenges to barcoding an entire flora. Molecular Ecology Resources 14:883-891

[17] Fazekas AJ, Burgess KS, Kesanakurti PR, Graham SW, Newmaster SG, Husband BC, Percy DM, Hajibabaei M, Barrett SC. 2008. Multiple multilocus DNA barcodes from the plastid genome discriminate plant species equally well. PLoS ONE 3:e2802

[18] Fonseca VG, Carvalho GR, Sung W, Johnson HF, Power DM, Neill SP, Packer M, Blaxter ML, Lambshead PJD, Thomas WK, Creer S. 2010. Second-generation environmental sequencing unmasks marine metazoan biodiversity. Nature Communications 1:Article 98

[19] Ford CS, Ayres KL, Haider N, Toomey N, Van-Alpen-Stohl J. 2009. Selection of candidate DNA barcoding regions for use on land plants. Botanical Journal of the Linnean Society 159:1-11

[20] Fox AD, Bergersen E, Tombre IM, Madsen J. 2007. Minimal intra-seasonal dietary overlap of barnacle and pink-footed geese on their breeding grounds in Svalbard. Polar Biology 30:759-768

[21] Fox AD, Cao L, Zhang Y, Barter M, Zhao M, Meng F, Wang S. 2011. Declines in the tuber-feeding waterbird guild at Shengjin Lake National Nature Reserve, China–a barometer of submerged macrophyte collapse. Aquatic Conservation: Marine and Freshwater Ecosystems 21:82-91

[22] Graham SW, Reeves PA, Burns AC, Olmstead RG. 2000. Microstructural changes in noncoding chloroplast DNA: interpretation, evolution, and utility of indels and inversions in basal angiosperm phylogenetic inference. International Journal of Plant Sciences 161:S83-S96

[23] Hasebe M, Omori T, Nakazawa M, Sano T, Kato M, Iwatsuki K. 1994. rbcL gene sequences provide evidence for the evolutionary lineages of leptosporangiate ferns. Proceedings of the National Academy of Sciences of the United States of America 91:5730-5734

[24] Hibert F, Taberlet P, Chave J, Scotti-Saintagne C, Sabatier D, Richard-Hansen C. 2013. Unveiling the diet of elusive rainforest herbivores in next generation sequencing era? The tapir as a case study. PLoS ONE 8:e60799

[25] Hilu KW, Liang H. 1997. The matK gene: sequence variation and application in plant systematics. American Journal of Botany 84:830-839

[26] Hollingsworth PM, Graham SW, Little DP. 2011. Choosing and using a plant DNA barcode. PLoS ONE 6:e19254

[27] James HF, Burney DA. 1997. The diet and ecology of Hawaii’s extinct flightless waterfowl: evidence from coprolites. Biological Journal of the Linnean Society 62:279-297

[28] Kear J. 2005. Ducks, geese and swans. Oxford: Oxford University Press.

[29] Kress WJ, Erickson DL, Jones FA, Swenson NG, Perez R, Sanjur O, Bermingham E. 2009. Plant DNA barcodes and a community phylogeny of a tropical forest dynamics plot in Panama. Proceedings of the National Academy of Sciences of the United States of America 106:18621-18626

[30] Li X, Yang Y, Henry RJ, Rossetto M, Wang Y, Chen S. 2015. Plant DNA barcoding: from gene to genome. Biological Reviews 90:157-166

[31] Ma Z, Cai Y, Li B, Chen J. 2010. Managing wetland habitats for waterbirds: an international perspective. Wetlands 30:15-27

[32] Madsen J, Mortensen CE. 1987. Habitat exploitation and interspecific competition of moulting geese in East Greenland. Ibis 129:25-44

[33] Noordhuis R, Van der Molen DT, Van den Berg MS. 2002. Response of herbivorous water-birds to the return of Chara in Lake Veluwemeer, The Netherlands. Aquatic Botany 72:349-367

[34] Parameswaran P, Jalili R, Tao L, Shokralla S, Gharizadeh B, Ronaghi M, Fire AZ. 2007. A pyrosequencing-tailored nucleotide barcode design unveils opportunities for large-scale sample multiplexing. Nucleic Acids Research 35:e130

[35] Piñol J, Mir G, Gomez-Polo P, Agustí N. 2015. Universal and blocking primer mismatches limit the use of high-throughput DNA sequencing for the quantitative metabarcoding of arthropods. Molecular Ecology Resources 15:819-830

[36] Pompanon F, Deagle BE, Symondson WO, Brown DS, Jarman SN, Taberlet P. 2012. Who is eating what: diet assessment using next generation sequencing. Molecular Ecology 21:1931-1950

[37] Prop J, Black JM, Shimmings P. 2003. Travel schedules to the high arctic: barnacle geese trade-off the timing of migration with accumulation of fat deposits. Oikos 103:403-414

[38] Rayé G, Miquel C, Coissac E, Redjadj C, Loison A, Taberlet P. 2011. New insights on diet variability revealed by DNA barcoding and high-throughput sequencing: chamois diet in autumn as a case study. Ecological Research 26:265-276

[39] Reynolds C, Miranda NA, Cumming GS. 2015. The role of waterbirds in the dispersal of aquatic alien and invasive species. Diversity and Distributions 21:744-754

[40] Samelius G, Alisauskas RT. 1999. Diet and growth of glaucous gulls at a large Arctic goose colony. Canadian Journal of Zoology 77:1327-1331

[41] Sang T, Crawford DJ, Stuessy TF. 1997. Chloroplast DNA phylogeny, reticulate evolution, and biogeography of Paeonia (Paeoniaceae). American Journal of Botany 84:1120-1136

[42] Shehzad W, Riaz T, Nawaz MA, Miquel C, Poillot C, Shah SA, Pompanon F, Coissac E, Taberlet P. 2012. Carnivore diet analysis based on next-generation sequencing: application to the leopard cat (Prionailurus bengalensis) in Pakistan. Molecular Ecology 21:1951-1965

[43] Shokralla S, Gibson JF, Nikbakht H, Janzen DH, Hallwachs W, Hajibabaei M. 2014. Next-generation DNA barcoding: using next-generation sequencing to enhance and accelerate DNA barcode capture from single specimens. Molecular Ecology Resources 14:892-901

[44] Shokralla S, Spall JL, Gibson JF. 2012. Next-generation sequencing technologies for environmental DNA research. Molecular Ecology 21:1794-1805

[45] Srivathsan A, Sha J, Vogler AP, Meier R. 2015. Comparing the effectiveness of metagenomics and metabarcoding for diet analysis of a leaf-feeding monkey (Pygathrix nemaeus). Molecular Ecology Resources 15:250-261

[46] Starr JR, Naczi RFC, Chouinard BN. 2009. Plant DNA barcodes and species resolution in sedge (Carex, Cyperaceae). Molecular Ecology Resources 9:151-163

[47] Sun C, Zhao Y, Li H, Dong Y, MacIsaac HJ, Zhan A. 2015. Unreliable quantification of species abundance based on high-throughput sequencing data of zooplankton communities. Aquatic Biology 24:9-15

[48] Swennen C, Yu YT. 2005. Food and feeding behavior of the black-faced Spoonbill. Waterbirds 28:19-27

[49] Symondson WOC. 2002. Molecular identification of prey in predator diets. Molecular Ecology 11:627-641

[50] Taberlet P, Coissac E, Pompanon F, Brochmann C, Willerslev E. 2012. Towards next-generation biodiversity assessment using DNA metabarcoding. Molecular Ecology 21:2045-2050

[51] Taberlet P, Coissac E, Pompanon F, Gielly L, Miquel C, Valentini A, Vermat T, Corthier G, Brochmann C, Willerslev E. 2007. Power and limitations of the chloroplast trnL (UAA) intron for plant DNA barcoding. Nucleic Acids Research 35:e14

[52] Taberlet P, Gielly L, Pautou G, Bouvet J. 1991. Universal primers for amplification of three non-coding regions of chloroplast DNA. Plant Molecular Biology 17:1105-1109

[53] Tamura K, Stecher G, Peterson D, Filipski A, Kumar S. 2013. MEGA6: molecular evolutionary genetics analysis version 6.0. Molecular Biology and Evolution 30:2725-2729

[54] Tate JA, Simpson BB. 2003. Paraphyly of Tarasa (Malvaceae) and diverse origins of the polyploid species. Systematic Botany 28:723-737

[55] Valentini A, Pompanon F, Taberlet P. 2009. DNA barcoding for ecologists. Trends in Ecology and Evolution 24:110-117

[56] Wang X, Fox AD, Cong P, Barter M, Cao L. 2012. Changes in the distribution and abundance of wintering Lesser White-fronted Geese Anser erythropus in eastern China. Bird Conservation International 22:128-134

[57] Wang X, Fox AD, Cong P, Cao L. 2013. Food constraints explain the restricted distribution of wintering Lesser white-fronted Geese Anser erythropus in China. Ibis 155:576-592

[58] Whitlock BA, Hale AM, Groff PA. 2010. Intraspecific inversions pose a challenge for the trnH-psbA plant DNA barcode. PLoS ONE 5:e11533

[59] Wolfe KH, Li WH, Sharp PM. 1987. Rates of nucleotide substitution vary greatly among plant mitochondrial, chloroplast, and nuclear DNAs. Proceedings of the National Academy of Sciences of the United States of America 84:9054-9058

[60] Xu C, Dong W, Shi S, Cheng T, Li C, Liu Y, Wu P, Wu H, Gao P, Zhou S. 2015. Accelerating plant DNA barcode reference library construction using herbarium specimens: improved experimental techniques. Molecular Ecology Resources 15:1366-1374

[61] Xu L, Xu W, Sun Q, Zhou Z, Shen J, Zhao X. 2008. Flora and vegetation in Shengjin Lake. Journal of Wuhan Botanical Research 27:264-270

[62] Zhan A, Bailey SA, Heath DD, MacIsaac HJ. 2014. Performance comparison of genetic markers for high-throughput sequencing-based biodiversity assessment in complex communities. Molecular Ecology Resources 14:1049-1059

[63] Zhan A, MacIsaac HJ. 2015. Rare biosphere exploration using high-throughput sequencing: research progress and perspectives. Conservation Genetics 16:513-522