
Hiding in plain sight: DNA barcoding suggests cryptic species in all ‘well-known’ Australian flower beetles (Scarabaeidae: Cetoniinae)

Biochemistry, Biophysics and Molecular Biology
  1. July 8, 2020: Minor Correction: The authors reported an error in Table 1 after publication. The fifth line of data in Table 1 should have been removed. The corrected Table 1 is now available here.

Introduction


The cosmopolitan scarab beetle subfamily Cetoniinae, or flower beetles, comprises 4,273 species worldwide in 485 genera (Krajčík, 2012). The common species are well-represented in institutional and private collections in Australia, and one early collector, F.P. Dodd, arranged hundreds of colourful specimens in large display frames for exhibition (Monteith, 2010). Despite their visibility, the taxonomy of Australian cetoniines has been somewhat neglected until recent times, with only 10 works in scientific literature and 16 species described in the 65 years from 1944 to 2009. Previous taxonomic work on Australian fauna is detailed in Moeseneder et al. (2019).

Approximately 75% of the country’s cetoniine species are anthophagous. The adults are pollinators of many tree and shrub species (Williams & Adam, 1998) and feed on nectar, pollen (Moore, 1987), fruit and honey. In the remainder of the species, males are often in flight, females are sedentary, and adults are rarely or never found on flowers (Moeseneder et al., 2019). Australian cetoniines are not known to be harmful to agriculture (Moeseneder et al., 2019). The larvae of most species live in and feed on decaying wood within standing or fallen trees and function as organic recyclers. Much remains to be discovered about the biology of Australian Cetoniinae.

The Australian cetoniine fauna is relatively depauperate, comprising 141 species (Moeseneder et al., 2019), or 3% of the world fauna on 5% of the land surface. Twenty-eight of the 40 Australian genera (70%) and 90% of species are endemic to the continent. Three of the twelve cetoniine tribes are represented in Australia: Schizorhinini, Cetoniini and Valgini. The Schizorhinini evolved in situ (Krikken, 1984) and is the continent’s most speciose tribe, with 111 described species. The majority of Schizorhinini, however, occur from Malaysia eastward to Melanesia (Allard, 1995a; Allard, 1995b; Rigout & Allard, 1997).

While attractive and common species are better known, those with unusual characteristics and those that occur in remote regions were often lumped into unnatural genera, primarily in Schizorhina Kirby, 1825 (e.g., Macleay, 1863; Macleay, 1871), Diaphonia Newman, 1840 (e.g., Janson, 1873; Janson, 1874; Janson, 1889) and Pseudoclithria Van de Poll, 1886 (e.g., Macleay, 1871). Such oddities have been the focus of the authors’ (C.H.M. and P.M.H.) past work (detailed in Moeseneder et al., 2019), describing four new genera and seven new species. Based on the published literature and our own observations, we suspected cryptic species to be present in Diaphonia, Pseudoclithria, Bisallardiana Antoine, 2003, Chondropyga Kraatz, 1880, Chlorobapta Kraatz, 1880 and Glycyphana Burmeister, 1842. This DNA barcoding study is a first step towards resolving these taxonomic issues.

DNA barcoding is a widely used tool in taxonomy, with >6,000 papers published (BOLD, 2020) since its inception 17 years ago (Hebert et al., 2003). It has overcome many of its early controversies as methods have matured and its utility in taxonomy, ecology and conservation has become widely appreciated (DeSalle & Goldstein, 2019). Lepidopterists, in particular, have embraced DNA barcoding, as the very large datasets that have been developed for the faunas of North America (Hebert, DeWaard & Landry, 2010), Europe (Hausmann et al., 2011) and Australia (Hebert et al., 2013) facilitate routine species identification and aid taxonomic research in many families. Much progress is being made with beetles as well; for example, Hendrich et al. (2015) published 16,000 barcodes for 3,500 European species.

The few DNA-based studies to date that have included the Cetoniinae are summarized in Table 1 (refer to Methods for the search methodology used). Most studies were either higher-level phylogenies which used a single sample per species (e.g., Ahrens et al., 2008; Ahrens et al., 2011; Gunter et al., 2016; McKenna et al., 2015; Šípek et al., 2016) or studies of a single genus. A notable exception was the DNA barcoding study of Hendrich et al. (2015) which included 70 samples from 14 European cetoniine species. A search of the Barcode of Life Data System (BOLD; Ratnasingham & Hebert, 2007) Public Data Portal for “Cetoniinae Australia” (on 19 March 2020) revealed only 29 barcodes of Australian cetoniines so far. Fifteen of these were from Gunter et al. (2016) and identified only to genus level. Twelve of the remaining barcodes represent three common species which we sampled in this study. Approximately half of these were identified using the BOLD Identification Engine (IDE). Of the remaining two barcodes, one species had been misidentified at the generic level. None of these 29 sequences were included in our study as they did not add species or DNA diversity to our data set, and we did not have access to the specimens to identify taxa previously identified only to genus level.

Table 1: Previous DNA-based studies of Cetoniinae.

Reference | Number of cetoniine samples | Number of cetoniine species sampled (Australian) | Gene regions sampled | Taxon focus | Study purpose
Ahrens, Monaghan & Vogler (2007) | 11 | 2 (0) | COI, 16S, 28S | Scarabaeidae | Larval-adult association
Ahrens & Vogler (2008) | 8 | 8 (1) | COI, 16S, 28S | Sericini | Higher-level phylogeny
Ahrens, Scott & Vogler (2011) | 7 | 7 (0) | COI, 16S, 28S | Hopliini | Higher-level phylogeny
Ahrens et al. (2013) | 230 | 1 (0) | COI, ITS1 | Cetonia aurata complex | Phylogeography, species-level taxonomy
Audisio (2009) | 26 | 5 (0) | COI | Osmoderma | Species-level taxonomy
Gunter et al. (2016) | 15 | 15 (15) | COI, 16S, 12S, 28S | Scarabaeinae | Higher-level phylogeny
Han et al. (2017) | 16 | 3 (0) | COI | Osmoderma | Species-level taxonomy
Hendrich (2015) | 70 | 14 (0) | COI | European Coleoptera | DNA barcoding
Kim et al. (2014) | 1 | 1 (0) | Mitogenome | Protaetia brevitarsis | Genomics
Landvik, Wahlberg & Roslin (2013) | 7 | 1 (0) | COI | Osmoderma | Species identification
Lee et al. (2015) | 50 | 5 (0) | COI, 16S | Dicronocephalus | Species-level phylogeny
McKenna et al. (2015) | 5 | 5 (1?) | 28S, CAD | Staphyliniformia, Scarabaeiformia | Higher-level phylogeny
Philips et al. (2016) | 12 | 11 (0) | COI, 28S | Trichiotinus | Species-level phylogeny
Seidel (2016) | 29 | 5 (0) | COI | Eudicella | Species-level taxonomy
Šípek et al. (2016) | 130 | 125 (2) | COI, 16S, 28S | Cetoniinae | Higher-level phylogeny
Song & Zhang (2018) | 4 | 4 (0) | 5 mitogenomes | Scarabaeidae | Genomics, higher-level phylogeny
Svensson et al. (2009) | 38 | 5 (0) | COI | Osmoderma | Species identification
Vondracek et al. (2018) | 65 | 15 (0) | COI, CytB | Potosia | Species-level taxonomy
Zauli et al. (2016) | 27 | 1 (0) | COI | Osmoderma | Species-level taxonomy

DOI: 10.7717/peerj.9348/table-1

Before our submissions, a search in GenBank (10 October 2019) for “Cetoniinae” and (“Cytochrome c oxidase subunit I” or “COI” or “CO1” or “COX1”) found 1,260 records, of which 47% are from two genera, Osmoderma Lepeletier & Serville, 1828 and Protaetia Burmeister, 1842. Our submitted records increase this number by 23%.

The goals of this study are to build a foundational DNA barcode library for Australian Cetoniinae with the purpose of aiding the discovery of Australian species, anchoring the process of revising their taxonomy, facilitating identification of larvae and aiding investigations into the little-known biology of Australian cetoniines, particularly through future metabarcoding studies.

Materials & Methods

Insect specimens and taxonomy

Our study covers Australia, including its external territories, although of these, cetoniines are known to be present only on Christmas Island and the Cocos (Keeling) Islands. Collecting permits were provided by the Queensland Department of Environment and Science (permit numbers WITF18701717, WITK15549915, WITK10612112, WITK05498008, TWB/02/2015, TWB/03B/2012, TWB/04A/2010, TWB/27B/2010, TWB/27A/2008 and TWB/26/2008), the NSW National Parks and Wildlife Service (permit number SL100610) and the Western Australia Department of Biodiversity, Conservation and Attractions (permit numbers F025000050, 08-000563-2 and SF008817).

Images of specimens were taken with a Nikon D5100 camera, a Micro Nikkor 105 mm macro lens and four 3-Watt LED lights. The camera was controlled by Nikon Camera Control Pro 2 version 2.28.2 from a laptop computer. Focus-stacking was performed with a unit built by C.H.M. (Moeseneder, 2017).

Male genitalia were removed by separating the abdomen from the thorax by sliding Dumont #5 tweezers in the gap between abdomen and thorax at several points, usually requiring the metatibia to be forced slightly away from the abdomen. The aedeagus was then extracted from the abdomen and the abdomen re-attached with cyanoacrylate glue. The aedeagus was mounted with a micro pin into a small foam piece which was pinned on the same pin as the specimen. A small amount of cyanoacrylate glue was applied where the micro pin pierced the aedeagus to keep it from rotating or being lost. The method allowed rapid extraction without externally visible damage, storage of the aedeagus with the specimen, and three-dimensional inspection of the aedeagus at any time without obscuring any part. For species identification, all collections which are listed in the abbreviations were used.

In all but one case, sampled larvae were the progeny of mated and identified beetles held in closed containers. Tissue samples (one rear leg or all legs on one side) were generally taken at L3 stage. These species were easily identified and mostly common. The single exception, MIC60567-002, was an unidentified, wild-collected larva.

Identifications were based on knowledge of described and undescribed species gained from (1) examination of specimens and images of type material of a large majority of species in the collections listed in Materials & Methods, (2) the original descriptions of all taxa in literature, (3) all published literature on Australian cetoniines, and (4) a database maintained by C.H.M. and P.M.H. which lists current name availability and synonymies.

Collection data and images of each specimen were uploaded to BOLD as public project AUCET, Australian Cetoniinae.

Where potentially undescribed species are mentioned in this work, they are identified by a code in the format sp_xxx_chm where the ‘xxx’ is a unique code. Their taxonomy will be resolved in future studies.

To find previous taxonomic and phylogenetic studies that produced DNA data for Cetoniinae we: (1) searched for the keywords “Cetoniinae and DNA or molecular” in Web of Science, (2) performed a Google search with the same keywords, (3) performed a search of GenBank for Cetoniinae COI sequences, and a subsequent search of Google Scholar for the studies that produced those sequences, and (4) consulted the reference list in each paper that was found.

We use the term “well-known species” for our subjective measure of those Australian cetoniine species (1) with high numbers of specimens in collections, (2) which have more often been used to represent the subfamily, for example in literature and displays, (3) with larger numbers of records in The Atlas of Living Australia, and (4) which are seen by the public in backyards and parks, and hence reported to museums, mentioned in digital media posts (e.g., Flickr) and citizen science projects (e.g., QuestaGame).

DNA barcoding

An initial trial round of sampling from both archival specimens and those collected within approximately the last 10 years produced a high rate of unsuccessful DNA extractions. Thereafter, the standardized sampling procedure described below was implemented, which increased the success rate of DNA sequencing to 98%.

We sampled one to 12 specimens per species (mean = 3.34, median = 3) maximizing geographic coverage where possible. The small sample size for some species is due to their scarcity and the lack of availability of recent material. Live adult specimens were collected directly into laboratory-grade ethanol and samples for DNA extraction were taken immediately after death. Sampling was performed by removing the rear left leg with forceps, which were sterilized between samples by wiping with a clean tissue, dipping in 100% ethanol and flaming. A new, sterile surgical blade was used to cut the femur at both apices to exclude the joints. The central part of the femur was cut into two or more fragments to expose muscle tissue. Approximately equal-sized samples were used across all taxa to obtain comparable DNA concentrations. Samples were transferred to a tissue sample plate with sterilized forceps. During this process, all neighbouring wells were kept covered to reduce the chance of contamination. The sampling plate was stored in a freezer at approximately −12 °C. Exceptions to this sampling protocol were: (1) the 18 specimens collected in flight intercept traps which were killed in a mixture of propylene glycol and water, and transferred to ethanol after approximately 2–4 weeks, and (2) Microvalgus Kraatz, 1883 specimens, where the entire specimen was macerated and used for sequencing. In these cases, the samples were one of a series of specimens collected at the same time, on the same tree and morphologically identical. In each case the series of specimens was kept as reference material.

Since specimen age ranged from 1 to 22 years, we used the PCR primers and amplification strategy developed by Mitchell (2015) for decades-old insect specimens. In summary, an attempt was made to PCR-amplify a 667-bp fragment of COI. If this was unsuccessful, two shorter overlapping PCR fragments, each approximately 300 bp, were amplified and subsequently reamplified using an internally nested primer on one end. When aligned, the two short fragments yielded 559 bp of contiguous COI sequence within the DNA barcode region. PCR fragments were purified and Sanger-sequenced in both directions on an ABI 3730xl sequencer by Macrogen Inc. (Seoul).
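The way two short overlapping fragments combine into one contiguous sequence can be illustrated with a minimal sketch. It assumes an exact-match overlap between the end of one read and the start of the next; the function name, the `min_overlap` parameter and the toy sequences are hypothetical. The assembly in this study was done in Geneious, which aligns reads with quality information rather than by exact string matching.

```python
def merge_overlapping(left: str, right: str, min_overlap: int = 20) -> str:
    """Join two fragments where a suffix of `left` equals a prefix of `right`.

    Illustrative only: exact matching, longest overlap wins, no base qualities.
    """
    longest_possible = min(len(left), len(right))
    for k in range(longest_possible, min_overlap - 1, -1):
        if left[-k:] == right[:k]:
            # Keep the shared region once, then append the rest of `right`.
            return left + right[k:]
    raise ValueError("no overlap of at least min_overlap bases found")


# Toy fragments sharing the 7-base overlap "TTGACCA" (hypothetical data).
contig = merge_overlapping("ACGTACGTTTGACCA", "TTGACCAGGCTA", min_overlap=5)
```

The contig length equals the sum of the two fragment lengths minus the overlap, which is why two fragments of roughly 300 bp each can yield a contiguous sequence shorter than 600 bp.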

Sequence trace files were assembled, PCR primers were trimmed, and consensus sequences aligned using Geneious 9.1.8 (Kearse et al., 2012). Trace files and consensus sequences were uploaded to BOLD and are available as public project Australian Cetoniinae, project code AUCET, and as dataset DS-AUCET. Sequences were also submitted to GenBank as accession numbers MT323780–MT324061. Note that two sequences were <200 bp in length and did not receive GenBank accession numbers, but are on BOLD.

The BOLD platform was used for barcode-specific analyses, including the calculation of intraspecific and within-genus interspecific K2P distances (Kimura, 1980), barcode gap analysis and BIN discordance analysis, i.e., comparison of morphology-based species identifications with Barcode Index Numbers (BINs) which are operational taxonomic units (OTUs) derived using RESL clustering (Ratnasingham & Hebert, 2013). We note, however, that sequences that do not meet all quality criteria, including for length, are not assigned to BINs. Therefore, for a more complete comparison of OTUs based on RESL clustering versus morphospecies, we also performed RESL clustering on all sequences using the “cluster sequences” function on BOLD. Finally, we tested for possible isolation by distance within every species, using the Geographic Distance Correlation tool on the BOLD platform, which calculates a Mantel correlation coefficient for geographic distance between sample localities versus K2P distance, and provides a Mantel test P value.
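For readers unfamiliar with the K2P distance (Kimura, 1980), the following is a minimal sketch of how it is computed for one pair of aligned sequences. The function name is hypothetical, and the choice to skip any alignment column containing a gap or ambiguous base is an assumption made here; BOLD's own implementation may treat such sites differently.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura two-parameter distance between two aligned DNA sequences.

    d = -1/2 * ln((1 - 2P - Q) * sqrt(1 - 2Q)), where P and Q are the
    proportions of transitional and transversional differences.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: ignore gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

Because the formula corrects for multiple substitutions at the same site, the K2P distance is always at least as large as the uncorrected proportion of differing sites; the two are close at the small divergences typical of intraspecific comparisons.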

FaBox v. 1.4.2 (Villesen, 2007) was used to edit sequence names. Phylogenetic analyses were performed on the online science gateway CIPRES v. 3.3 (Miller, Pfeiffer & Schwartz, 2010). PartitionFinder v. 2 (Lanfear et al., 2016) was used to select a partitioning scheme and the most appropriate models, which, in all cases, were a single data partition and the General Time Reversible model with Gamma-distributed rates and Invariable sites (GTR + G + I). Phylogenetic analyses were performed by Bayesian Inference (BI) using MrBayes v. 3.2.6 (Ronquist et al., 2012) and under maximum likelihood using RAxML v. 8.2.10 (Stamatakis, 2014). The MrBayes analysis was set to run for 20 million generations, with a sample frequency of 1,000, using 2 runs, setting the number of chains to 4. The stopping rule was used to end the analysis when the average standard deviation of split frequencies dropped below 0.01, indicating convergence of the chains. The burnin fraction was set to 0.25. RAxML analysis used the hill climbing algorithm with 1,000 rapid non-parametric bootstrap replicates (Felsenstein, 1985). All trees were rooted on Valgini (Microvalgus) since the most comprehensive molecular phylogeny of the subfamily to date (Šípek et al., 2016) placed Valgini, and Trichiini in part, as sister-group to the remaining 10 tribes that they sampled.
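The MrBayes settings described above could be written as a command block along the following lines. This is a sketch reconstructed from the text, not the command file used in the study: the Python helper `mrbayes_block` is hypothetical, and the grouping of options into two `mcmcp` lines and the inclusion of `sumt` are assumptions. The option names themselves (`nst`, `rates`, `ngen`, `samplefreq`, `nruns`, `nchains`, `stoprule`, `stopval`, `relburnin`, `burninfrac`) are standard MrBayes 3.2 parameters.

```python
def mrbayes_block(ngen: int = 20_000_000, samplefreq: int = 1_000,
                  nruns: int = 2, nchains: int = 4,
                  stopval: float = 0.01, burninfrac: float = 0.25) -> str:
    """Return a MrBayes command block for a GTR + G + I analysis."""
    lines = [
        "begin mrbayes;",
        # nst=6 selects GTR; invgamma adds gamma rates plus invariable sites.
        "  lset nst=6 rates=invgamma;",
        f"  mcmcp ngen={ngen} samplefreq={samplefreq} "
        f"nruns={nruns} nchains={nchains};",
        # Stop early once the average standard deviation of split
        # frequencies falls below stopval.
        f"  mcmcp stoprule=yes stopval={stopval} "
        f"relburnin=yes burninfrac={burninfrac};",
        "  mcmc;",
        "  sumt;",
        "end;",
    ]
    return "\n".join(lines)


block = mrbayes_block()
```

With the stopping rule enabled, `ngen` acts as an upper limit rather than a fixed run length, so a run may end before 20 million generations if the convergence threshold is reached.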


Results

We obtained DNA barcode data from 284 specimens, of which 256 were adults (90%) and 28 were larvae. We sampled 68 described species and up to 27 putative undescribed species at an average of 3 specimens per species. Our total of 68 described species includes an unidentified species of Microvalgus which is likely to be a described species. All taxa are represented by at least one adult and none are represented only by larvae.

Two hundred and forty-five sequences (86%) are BARCODE standard compliant, defined as >486 bp in length, with two or fewer ambiguous bases and with at least two high-quality sequence trace files uploaded. Only six sequences were less than 300 bp in length.
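The three compliance criteria stated above can be expressed as a simple check. This is a minimal sketch of the definition as given in the text, not BOLD's validation code: the function and parameter names are hypothetical, alignment gaps are removed before counting, ambiguous bases are counted towards length, and the caller is assumed to supply the number of trace files already judged high-quality.

```python
def is_barcode_compliant(sequence: str, n_high_quality_traces: int,
                         min_length_exclusive: int = 486,
                         max_ambiguous: int = 2,
                         min_traces: int = 2) -> bool:
    """Apply the length, ambiguity and trace-file criteria from the text."""
    seq = sequence.upper().replace("-", "")  # assumption: drop alignment gaps
    ambiguous = sum(1 for base in seq if base not in "ACGT")
    return (len(seq) > min_length_exclusive
            and ambiguous <= max_ambiguous
            and n_high_quality_traces >= min_traces)


# Toy sequences (hypothetical data): 500 bp passes, 484 bp is too short.
long_enough = "ACGT" * 125
too_short = "ACGT" * 121
```

A sequence failing any one criterion is still usable for tree building and clustering, but on BOLD it would not be flagged as BARCODE compliant.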

Mean specimen age at DNA extraction was 4.2 years, although for the first batch of 94 samples the mean age was 7.4 years. The oldest sample to yield barcode-standard compliant data was 22.6 years old.

Bayesian Inference was completed after 18,625,000 generations when the average standard deviation of split frequencies reached 0.009997. The structure of the BI tree is summarized in Fig. 1, with strongly supported branches (posterior probabilities (PP) ≥ 0.99 and bootstrap percentages (BP) from the RAxML ≥ 95%) indicated by asterisks. The complete BI tree is shown in Figs. 2–7 and the complete RAxML tree is provided as Fig. S1.

Figure 1: Bayesian phylogenetic tree for all data.

Branches are collapsed to illustrate genus-level relationships. Microvalgus was treated as the outgroup. Asterisks indicate nodes with strong support from both Bayesian posterior probabilities (PP ≥ 0.99) and maximum likelihood bootstrap percentage (BP ≥ 95). Closed circles indicate nodes with strong support under only one of these methods.

Figure 2: Complete Bayesian tree, part 1 of 6.

Support values (posterior probabilities) are shown at nodes only if ≥0.70. Closed circles indicate taxa whose placement in the tree was unexpected, rendering another genus paraphyletic.

Figure 3: Complete Bayesian tree, part 2 of 6.

Support values (posterior probabilities) are shown at nodes only if ≥0.70. Closed circles indicate taxa whose placement in the tree was unexpected, rendering another genus paraphyletic.

Figure 4: Complete Bayesian tree, part 3 of 6.

Support values (posterior probabilities) are shown at nodes only if ≥0.70. Closed circles indicate taxa whose placement in the tree was unexpected, rendering another genus paraphyletic.

Figure 5: Complete Bayesian tree, part 4 of 6.

Support values (posterior probabilities) are shown at nodes only if ≥0.70. Closed circles indicate taxa whose placement in the tree was unexpected, rendering another genus paraphyletic.

Figure 6: Complete Bayesian tree, part 5 of 6.

Support values (posterior probabilities) are shown at nodes only if ≥0.70. Closed circles indicate taxa whose placement in the tree was unexpected, rendering another genus paraphyletic.

Figure 7: Complete Bayesian tree, part 6 of 6.

Support values (posterior probabilities) are shown at nodes only if ≥0.70. Closed circles indicate taxa whose placement in the tree was unexpected, rendering another genus paraphyletic.

Eleven genera were represented by a single species in our data set, including seven monotypic genera (Phyllopodium Schoch, 1895, Octocollis Moeseneder & Hutchinson, 2012, Lenosoma Kraatz, 1880, Stenopisthes Moser, 1913, Hemipharis Burmeister, 1842, Neoclithria Van de Poll, 1886, Micropoecila Kraatz, 1880) and four additional genera (Mycterophallus Van de Poll, 1886, Poecilopharis Kraatz, 1880, Evanides Thomson, 1880, Storeyus Hasenpusch & Moeseneder, 2009). In all six cases where these species had multiple samples, the species were recovered as monophyletic and distinct from other species.

Of the remaining 22 genera, for which multiple species were sampled, half were recovered as monophyletic. These are listed with the number of species sampled and number of specimens (n) sampled in parentheses: Microvalgus (4 spp., n = 13), Ischiopsopha Gestro, 1874 (2 spp., n = 8), Lomaptera Gory & Percheron, 1833 (2 spp., n = 5), Schizorhina (2 spp., n = 6), Navigator (Moeseneder & Hutchinson, 2016) (2 spp., n = 5), Lyraphora Kraatz, 1880 (3 spp., n = 11), Tapinoschema Thomson, 1880 (3 spp., n = 12), Bisallardiana (10 spp., n = 31), Neorrhina Thomson, 1878 (2 spp., n = 11), Chlorobapta (3 spp., n = 11) and Metallesthes Kraatz, 1880 (4 spp., n = 16).

RESL cluster analysis grouped sequences into 100 OTUs, with 32 of these being singletons. There were 21 singleton species, and the remaining 11 singleton OTUs represented divergent lineages within species. RESL clustering split 13 species, some of them into as many as 4 OTUs, as summarised in Table 2.
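To make the notions of OTU and singleton concrete, the sketch below groups sequences by single-linkage clustering at a fixed distance threshold. It is a deliberately simplified stand-in and is not the RESL algorithm (Ratnasingham & Hebert, 2013), which is more elaborate; the function name, the threshold value and the toy distance matrix are all hypothetical.

```python
def single_linkage_clusters(names, dist, threshold):
    """Group items whose pairwise distance is <= threshold (single linkage).

    `dist` is a symmetric matrix (list of lists) indexed like `names`.
    """
    parent = list(range(len(names)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if dist[i][j] <= threshold:
                parent[find(i)] = find(j)

    clusters = {}
    for i, name in enumerate(names):
        clusters.setdefault(find(i), []).append(name)
    # Largest clusters first, ties broken alphabetically.
    return sorted(clusters.values(), key=lambda c: (-len(c), c))


# Toy data: a, b and c chain together; d is far from everything.
names = ["a", "b", "c", "d"]
dist = [
    [0.000, 0.010, 0.030, 0.100],
    [0.010, 0.000, 0.015, 0.100],
    [0.030, 0.015, 0.000, 0.100],
    [0.100, 0.100, 0.100, 0.000],
]
otus = single_linkage_clusters(names, dist, threshold=0.02)
singletons = [c for c in otus if len(c) == 1]
```

In this toy case, a and c end up in the same cluster despite being 3% apart, because each is within the threshold of b. This chaining behaviour is one reason why cluster counts depend on the method and not only on the threshold.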

Table 2: Results of RESL clustering for species showing >2% maximum uncorrected intraspecific distance.

Species | Number of specimens analysed | Maximum uncorrected intraspecific distance | Number of RESL OTUs | Number of BOLD BINs
Aphanesthes pullata | 3 | 6.20% | 2 | 2
Chondropyga dorsalis | 12 | 6.09% | 4 | 4
Glycyphana (Glycyphaniola) brunnipes | 4 | 5.61% | 2 | 2
Glycyphana (Glycyphaniola) stolata | 8 | 5.52% | 4 | 4
Neorrhina punctatum | 7 | 3.27% | 2 | 2
Micropoecila cincta | 2 | 4.84% | 2 | 2
Dilochrosis brownii | 7 | 4.20% | 2 | 2
Lyraphora obliquata | 4 | 3.76% | 2 | 2
Aphanesthes succinea | 4 | 2.51% | 2 | 2
Chondropyga sp_cmoo_chm | 2 | 2.87% | 2 | 2
Eupoecila australasiae | 7 | 2.71% | 1 | 2
Metallesthes anneliesae | 7 | 2.24% | 2 | 2
Dilochrosis balteata | 3 | 2.15% | 2 | 1

DOI: 10.7717/peerj.9348/table-2
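The "maximum uncorrected intraspecific distance" column can be understood as the largest proportion of differing sites between any two sequences of the same species. The sketch below shows that calculation under stated assumptions: the function names and toy sequences are hypothetical, sequences are assumed to be aligned to equal length, and columns with gaps or ambiguous bases are skipped.

```python
from itertools import combinations


def p_distance(seq1: str, seq2: str) -> float:
    """Uncorrected proportion of differing sites between aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq1.upper(), seq2.upper())
             if a in "ACGT" and b in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def max_intraspecific(seqs_by_species: dict) -> dict:
    """Largest pairwise p-distance per species; species with one sequence are skipped."""
    result = {}
    for species, seqs in seqs_by_species.items():
        if len(seqs) < 2:
            continue
        result[species] = max(p_distance(a, b) for a, b in combinations(seqs, 2))
    return result


# Toy data with hypothetical species labels.
toy = {
    "species_a": ["ACGTACGTAC", "ACGTACGTAT", "ACGTACGTAC"],
    "species_b": ["ACGTACGTAC"],
}
maxima = max_intraspecific(toy)
```

Species represented by a single specimen have no intraspecific distance, which is why the table can only include species with at least two analysed specimens.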

Two BINs contained multiple species. The first contained Hemichnoodes mniszechi (Janson, 1873), H. parryi (Janson, 1873) and Diaphonia sp_dnul_chm (Fig. S1). However, D. luteola, placed in the same cluster in both trees, was not included in the BIN analysis. In the separate RESL clustering analysis, these four species were recovered as separate OTUs. The second contained both Glycyphana (Caloglycyphana) papua (Wallace, 1867) and G. (Caloglycyphana) pulchra (Macleay, 1871) (maximum within-OTU distance = 1.57%).

The Geographic Distance Correlation test was significant (p ≤ 0.05) for only three species (Ischiopsopha wallacei (Thomson, 1857), Metallesthes anneliesae Moeseneder, Hutchinson & Lambkin, 2014, Glycyphana stolata) and highly significant (p ≤ 0.01) for a single species, Chondropyga dorsalis.
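The logic of a Mantel test for isolation by distance can be sketched as follows. This is a generic permutation version written for illustration, not the BOLD Geographic Distance Correlation tool, whose implementation details are not given here. The function names, the one-sided test, the number of permutations and the toy matrices are assumptions.

```python
import random
from math import sqrt


def _upper(matrix, order):
    """Upper-triangle entries of `matrix` after relabelling rows/columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def _pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel(geo, gen, permutations=999, seed=0):
    """Return (r, p): matrix correlation and one-sided permutation p-value."""
    identity = list(range(len(geo)))
    x = _upper(geo, identity)
    observed = _pearson(x, _upper(gen, identity))
    rng = random.Random(seed)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # relabel samples in the genetic matrix only
        if _pearson(x, _upper(gen, order)) >= observed:
            hits += 1
    return observed, hits / (permutations + 1)


# Toy data: five samples along a transect, genetic distance proportional
# to geographic distance (hypothetical values).
geo = [[abs(i - j) * 100.0 for j in range(5)] for i in range(5)]
gen = [[abs(i - j) * 0.001 for j in range(5)] for i in range(5)]
r, p = mantel(geo, gen)
```

Rows and columns are permuted together because the entries of a distance matrix are not independent observations, so an ordinary correlation test on the pairwise values would overstate significance.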

Larval specimens, indicated by “L” after the species name in all Figures, were reared progeny from mated adult specimens, and were placed with the correct species. There was a single wild-collected larva (MIC60567-002) and it was identified by barcoding as Hemichnoodes mniszechi.

Discussion


This preliminary study reports DNA barcode data for 68 described species from 33 genera, representing 48% of currently known Australian species and 83% of the genera (141 described species in 40 genera; Moeseneder et al., 2019; Hutchinson & Moeseneder, 2019). Our goal is a comprehensive DNA barcode dataset, and complementary nuclear gene and morphological data, to address both species-level and higher-level relationships of the Australian cetoniines, facilitating integrative revisionary taxonomy. Here we recognise likely undescribed species and note cases of likely generic misassignment of species but refrain from making taxonomic decisions, as that would require careful and comprehensive generic revisions, which are beyond the scope of the current study.
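The coverage figures quoted above follow directly from the counts given in the text, as this short check shows. The helper name is hypothetical. Half-up rounding is used deliberately, because Python's built-in `round` rounds exact halves to the nearest even integer and would turn 82.5 into 82.

```python
import math


def percent_half_up(part: int, whole: int) -> int:
    """Percentage rounded to the nearest integer, with halves rounded up."""
    return math.floor(100 * part / whole + 0.5)


species_coverage = percent_half_up(68, 141)   # described species sampled
genus_coverage = percent_half_up(33, 40)      # genera sampled
adult_share = percent_half_up(256, 284)       # adults among all specimens
compliant_share = percent_half_up(245, 284)   # BARCODE-compliant sequences
```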

In general, there was concordance between morphology-based identifications and barcode-based clustering. This concordance is not obvious since RESL clustering split many species and produced 100 OTUs. However, our preliminary morphological investigations suggest that in addition to the 68 described species we sampled, the 100 OTUs include up to 27 undescribed species.

Of the 27 possible undescribed species, five were known to us previously and are easily distinguished morphologically, six were suspected but with some uncertainty due to their similarity to described species, and 16 were completely unexpected (potential “cryptic species”) and were only revealed by their DNA barcodes. Their morphological similarity to described species is striking, and further work, including analysis of nuclear genes (e.g., Raupach et al., 2010) and male genitalia from a larger series of specimens, is needed to rigorously assess their taxonomic status. The number of undescribed species hence may represent a potential increase to the size of the Australian fauna of 12–19%.

There was one OTU that contained more than one species: Glycyphana (Caloglycyphana) pulchra plus G. (Caloglycyphana) papua; however, these species had 1.57% distance between them and were clearly separated in the trees.

While Barcode Index Numbers (BINs) are calculated by BOLD using the RESL clustering algorithm, sequences on BOLD must meet criteria such as minimum sequence length and quality to be included in a BIN, thus only 252 sequences were placed into BINs. We therefore also performed a separate RESL clustering analysis on the complete 284-sequence dataset to obtain OTUs. The three differences between these analyses were: (1) both species of Hemichnoodes Kraatz, 1880, plus Diaphonia sp_dnul_chm were assigned to a single BIN (D. luteola (Janson, 1873) was not assigned to any BIN), while the cluster analysis split these four taxa into separate OTUs corresponding to their morphological identification. Relationships among these taxa are discussed below. (2) Dilochrosis balteata was placed in a single BIN but split into two OTUs with 2.15% distance between them. (3) Eupoecila australasiae was divided into two BINs with 2.71% distance between them, but comprised a single OTU.

The significant Geographic Distance Correlation tests, on the whole, reflect sampling of very widely separated populations. For Ischiopsopha wallacei, this reflects the separation of samples from Sabai and Dauan Islands, within 5 km of Papua New Guinea, and samples from approximately 800 km south in Queensland. Glycyphana stolata samples were collected from Dauan Island to the Brisbane region >2,200 km to the south. In Metallesthes anneliesae, the pattern is more subtle as the seven specimens were collected within an 80 km radius of each other, some 200 km west of Brisbane, and the most distinct sequence, a separate BIN, is one on the northwestern perimeter of the samples’ distribution. The only highly significant test result was for Chondropyga dorsalis, where the 12 specimens were collected within only 70 km of each other in Southeast Queensland in varying habitat types. An attempt at finding unique, easily visible characters for each group is ongoing.

While we do not expect a small and rapidly evolving fragment of a single mitochondrial gene to yield a robust phylogeny of the Cetoniinae, phylogenetic analysis of DNA barcodes is likely to give a good indication of relationships among closely related species, to provide a guide to where undescribed taxa should be placed, and suggest where further evidence is needed on supraspecific relationships. The discussion below is meant in that context, acknowledging the limited deeper phylogenetic utility of DNA barcodes.

Microvalgus is a diverse genus (approximately 51 described species worldwide, 16 in Australia) and poorly studied in Australia. We sampled four species, one of which we could not definitively identify and have called Microvalgus sp_mvalg4_chm (Fig. 2). Based on current results, we expect DNA barcoding to be useful for revising this group in the future.

In the most well-known Australian cetoniine species, Eupoecila australasiae, Neorrhina punctatum, Glycyphana stolata, Chondropyga dorsalis and Bisallardiana gymnopleura, we found high levels of DNA diversity. While this is not unusual for DNA barcoding studies, e.g., in Elateridae (Oba et al., 2015) and stemborer moths (Lee et al., 2019), our preliminary morphological examination of the species implies that these high levels of COI diversity are for the most part correlated with morphological diversity. This suggests that many of these OTUs may in fact represent undescribed species. Further cases of discordance between prior expectations based on current taxonomy and DNA barcoding results are detailed below.

Trichaulax (4 spp., n = 11) was rendered paraphyletic by the insertion of Lenosoma fulgens (1 sp., n = 3) (Fig. 3). Chondropyga (4 spp., n = 20) was rendered paraphyletic by the insertion of Pseudoclithria hirticeps (Macleay, 1871) (1 sp., n = 1) (Fig. 3). Pseudoclithria hirticeps, the type species of Pseudoclithria, is placed incorrectly and likely belongs in genus Chondropyga. However, as we sampled only a single specimen of P. hirticeps, this result requires confirmation with data from further specimens and genes.

The lineage containing Dilochrosis (4 spp., n = 27) had Glycyphana pulchra/G. papua (2 spp., n = 4) embedded within it in the Bayesian tree (Fig. 4). The RAxML tree was similar, except that Protaetia (Protaetia) fusca (Herbst, 1790) (n = 6) was also embedded within Dilochrosis, as sister group to the two Glycyphana species (Fig. S1). However, based on morphological evidence, the length of the branch subtending G. pulchra/G. papua and the instability of these nodes when analysed by maximum likelihood methods, it appears unlikely that these placements reflect true phylogenetic affinities, and further evidence is needed to resolve these questions.

Glycyphana was consistently split into two distantly related groups, one containing the closely related G. (Caloglycyphana) pulchra and G. (Caloglycyphana) papua, merged into a single BIN (Fig. 4), and the other containing G. (Glycyphaniola) brunnipes (Kirby, 1818) and G. (Glycyphaniola) stolata (Figs. 5 and 6). If confirmed by future phylogenetic analysis of nuclear genes, this interesting result could require the elevation of one of the subgenera to a separate genus. Glycyphana brunnipes is split into two BINs while G. stolata is split into four BINs. Bacchus (1974) split G. stolata into two forms. Substantial further integrative taxonomic work is required to reassess species boundaries in these species complexes.

Neoclithria (1 sp., n = 3) is embedded within Clithria (3 spp., n = 7) (Fig. 6), and Micropoecila (1 sp., n = 2) is embedded within Eupoecila (3 spp., n = 12) (Fig. 6). Thus, both Neoclithria and Micropoecila may need to be synonymised with the genera they are placed within.
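The check behind statements such as "embedded within" can be sketched as follows: find the smallest clade that contains every sampled member of a genus and list any tips of other genera that fall inside it. This is a minimal illustration, assuming a rooted tree written as nested tuples; the tip labels, the genus mapping and the topology are hypothetical placeholders, not our trees.

```python
def leaves(node):
    """Return the set of tip labels under a node (a tip is a str, a clade a tuple)."""
    if isinstance(node, str):
        return {node}
    tips = set()
    for child in node:
        tips |= leaves(child)
    return tips


def smallest_clade(node, members):
    """Return the tip set of the smallest clade that contains all members."""
    if not isinstance(node, str):
        for child in node:
            if members <= leaves(child):
                return smallest_clade(child, members)
    return leaves(node)


def intruders(tree, genus_of, genus):
    """Tips of other genera nested inside the smallest clade spanning `genus`.
    An empty result means the genus is monophyletic in this tree."""
    members = {tip for tip in leaves(tree) if genus_of[tip] == genus}
    if not members:
        raise ValueError(f"no tips assigned to genus {genus!r}")
    return smallest_clade(tree, members) - members


# Hypothetical topology: B1 is nested among the tips of genus A.
toy_tree = (("A1", ("A2", "B1")), "C1")
toy_genus = {"A1": "A", "A2": "A", "B1": "B", "C1": "C"}
embedded_in_a = intruders(toy_tree, toy_genus, "A")
```

In the toy tree, genus A is rendered non-monophyletic by B1, the same pattern as a single-species genus nested within a larger one. A result of this kind only describes the tree at hand and says nothing about node support.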

Relationships among Diaphonia, Aphanesthes Kraatz, 1880, Hemichnoodes, Pseudoclithria and Metallesthes were complex (Figs. 6 and 7). There was moderate to strong support (Bayesian posterior probability (PP) of 0.99 and maximum likelihood bootstrap percentage (BP) of 65%) for a clade including Aphanesthes succinea (Hope, 1844) (n = 4), Diaphonia (3 spp., n = 6) and Hemichnoodes (2 spp., n = 6). There was weaker support (PP of 0.95, BP < 50%) for the sister group to the above clade, comprising A. pullata (Janson, 1873) (n = 3), A. sp_aisa_chm (a possible undescribed species, n = 1), Pseudoclithria (5 spp., n = 13) excluding P. hirticeps (mentioned above) and Metallesthes.

Aphanesthes sp_aisa_chm appears to share similarities with Aphanesthes pullata and A. trapezifera (the latter species was not DNA barcoded). Hence its placement within Diaphonia, albeit without statistical support, is unexpected, and its affinities might be better resolved by nuclear gene data. In contrast, Aphanesthes succinea has several characters not shared with other described Aphanesthes and we expected it to be separated from its congeners in the trees. However, its placement in our phylogeny with Diaphonia xanthopyga appears questionable and will require a close examination of morphological characters. The grouping of Diaphonia luteola and D. sp_dnul_chm with Hemichnoodes is also surprising because the male genital construction in Hemichnoodes is unique. In the next phase of the molecular project we intend to sample further genes and Australian taxa to assist in resolving these questions.

The remaining seven genera that constitute the Australian cetoniine fauna were not sampled because no recent material was available for DNA sequencing. These are Aurum Hutchinson & Moeseneder, 2019, Axillonia Krikken, 2018, Grandaustralis Hutchinson & Moeseneder, 2013, Macrotina Strand, 1934, Territonia Krikken, 2018, Chalcopharis Heller, 1903 and Charitovalgus Kolbe, 1904. The first four of these genera are monospecific and the last two are represented in Australia by a single species each.

In the absence of either nuclear gene data or robust morphological studies of more specimens, we stopped short of drawing firm conclusions about species boundaries in this study. This is because COI-based barcoding can overestimate the number of species in widely dispersed taxa (Klimov, Skoracki & Bochkov, 2019) owing to incomplete lineage sorting and nuclear insertions of mtDNA fragments (NUMTs). Wolbachia infection can also complicate COI-based species delimitation, by causing cytoplasmic incompatibility or through introgression and selective sweeps (Smith et al., 2012).

Despite the uncertainties mentioned above, the barcode fragment of COI yields an unexpectedly robust tree topology. This suggests that complete mitochondrial genomes would provide useful data for analysing Cetoniinae phylogeny. Given that complete mitochondrial genomes plus complete nuclear ribosomal cistrons can be obtained by genome skimming (Coissac et al., 2016) we suggest that approach would be a profitable strategy for further investigation of both the phylogeny of Cetoniinae and species delimitation.

Once a DNA barcode library has been established for a given taxon, there are many possible applications, including identifying field-collected larvae to uncover species biology, identifying pests, biodiversity assessment and species monitoring, untangling food webs, and so on (Mitchell, 2008). In addition to barcoding representative larvae reared from controlled matings between collected adults, we also applied barcode data to identify an unknown larva, which turned out to be Hemichnoodes mniszechi. This approach to larval identification also makes larval morphological characters now accessible for description and diagnosis.
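The logic of identifying an unknown larva against a barcode library can be sketched as a nearest-match search. This is only an illustration under stated assumptions, not the identification engine we used: the sequences are short made-up strings rather than COI barcodes, the species labels are placeholders, and the `max_distance` cut-off is a hypothetical parameter.

```python
def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences,
    counting only positions where both have an unambiguous base."""
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def identify(query, library, max_distance):
    """Return (name, distance) of the closest reference sequence, or
    (None, distance) if the closest one is farther than max_distance."""
    best_name, best_d = None, float("inf")
    for name, ref in library.items():
        d = p_distance(query, ref)
        if d < best_d:
            best_name, best_d = name, d
    if best_d > max_distance:
        return None, best_d
    return best_name, best_d


# Made-up 20-base sequences standing in for aligned adult barcodes.
toy_library = {
    "species_A": "ACGTACGTACGTACGTACGT",
    "species_B": "TTGTACGAACGTTCGTACCA",
}
larva = "ACGTACGTACGTACGTACGA"  # hypothetical unknown larva
match, distance = identify(larva, toy_library, max_distance=0.1)
```

Returning no name when the closest reference lies beyond the cut-off matters in practice, because a library that covers only part of a fauna will not contain every species a larva could belong to.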

Development and refinement of DNA barcode libraries facilitate ecological study by anchoring environmental DNA datasets and linking them with robust taxonomy. Such metabarcoding studies may soon revolutionize modern biodiversity surveys (Ruppert, Kline & Rahman, 2019), and robust DNA barcode libraries underpin that potential.


We produced a DNA barcode dataset for Australian flower beetles that includes approximately half of the country’s species. We found that DNA barcodes provide species-level resolution in almost all cases. The high levels of DNA diversity found within many species were unexpected, and preliminary morphological investigations suggest that there may be as many as 27 undescribed species in our dataset. Further integrative taxonomic work, incorporating COI-based DNA barcoding, nuclear gene data and detailed morphological investigations, is needed to better understand the diversity of Australian Cetoniinae and to document and describe numerous undescribed species.

Supplemental Information

Complete phylogenetic tree from Maximum Likelihood analysis of 284 sequences

Sequence names include SampleID, species and “L” if the specimen is a larva. BOLD BINs are indicated in blue boxes on the right; dotted lines indicate sequences that were not included in the BIN analysis, but are placed in the same OTU by the separate RESL cluster analysis. Support values (bootstrap percentages) are shown at nodes only if ≥ 50. Closed circles indicate taxa whose placement in the tree was unexpected, rendering another genus paraphyletic.

DOI: 10.7717/peerj.9348/supp-1

Sample accession numbers and collection data

DOI: 10.7717/peerj.9348/supp-2