A DNA barcode library for ground beetles of Germany: the genus Agonum Bonelli, 1810 (Insecta, Coleoptera, Carabidae)

The ground beetle genus Agonum Bonelli, 1810 is a large genus of the tribe Platynini with many species that show high levels of intraspecific variation, making correct identification challenging. As part of the German Barcode of Life initiative, this publication provides a comprehensive DNA barcode library for species of Agonum that are reported for Germany. In total, DNA barcodes from 258 beetles and 23 species were analysed using the Barcode of Life Data System (BOLD) workbench, including sequences from former studies and 68 newly-generated sequences. The neighbour-joining analyses, based on K2P distances, revealed distinct clustering for all studied species, with unique Barcode Index Numbers (BINs) for 15 species (65%). BIN sharing but distinct clustering was found for three species pairs: Agonum micans/Agonum scitulum, Agonum impressum/Agonum sexpunctatum and Agonum duftschmidi/Agonum emarginatum. The given dataset and its analysis represent another important step in generating a comprehensive DNA barcode library for the ground beetles of Germany and Central Europe in terms of modern biodiversity research.


Introduction
Species identification is a pivotal component of biodiversity studies and conservation planning, but it is challenging for many taxa when only morphological traits are used (e.g. the correct identification of juveniles or larval stages). As a consequence of tremendous technological advances in molecular biology during the last 20 years, molecular data have become increasingly popular in species identification. In this context, DNA barcoding represents the central component in the modern diagnostic toolbox of molecular biodiversity assessment studies and taxonomic research (e.g. Hebert and Gregory 2005; Kress et al. 2015; Hajibabaei et al. 2016; Miller et al. 2016). For animals, a 658 base pair (bp) fragment of the mitochondrial cytochrome c oxidase subunit I (COI) gene has been selected as a DNA barcode (Hebert et al. 2003a, b). The utility of DNA barcoding relies on the assumption that genetic variation within a species is much smaller than variation between species. Barcode sequences are typically deposited in the international Barcode of Life Data Systems database (BOLD; http://www.boldsystems.org). This public database acts as the central data interface and repository that allows researchers to collect, organise and analyse DNA barcode data (Ratnasingham and Hebert 2007). In addition to various analytical tools implemented in the BOLD workbench, DNA barcodes can be analysed using the Barcode Index Number (BIN) system, which clusters DNA barcodes to produce operational taxonomic units that closely correspond to species (Ratnasingham and Hebert 2013).
Nevertheless, it should be noted that various effects can seriously limit the use of DNA barcoding, or of mitochondrial markers in general, for molecular species identification (e.g. Will and Rubinoff 2004; Rubinoff et al. 2006; Collins and Cruickshank 2013; Duran et al. 2019). For example, closely related but distinct species, as well as species that hybridise, may share identical haplotypes (e.g. Sota and Vogler 2002; Takami and Suzuki 2005; Andujar et al. 2014). In contrast to this, complex phylogeographic histories (e.g. Schoville et al. 2012; Faille et al. 2015; Weng et al. 2016) and incomplete lineage sorting effects (e.g. Zhang et al. 2005; Weng et al. 2020) can distort the mitochondrial variability of the studied organisms, generating para- and polyphyly in phylogenetic trees (Funk and Omland 2003; Ross 2014; but see Mutanen et al. 2016). Heteroplasmy (Boyce et al. 1989) or the presence of mitochondrial pseudogenes (Hazkani-Covo et al. 2010; Maddison 2012) can affect the successful amplification of the target fragment. Finally, maternally-inherited endosymbionts, such as the alpha-proteobacteria Wolbachia, may cause a linkage disequilibrium within mtDNA in arthropods and result in a homogenisation of mtDNA haplotypes (e.g. Hurst and Jiggins 2005; Kolasa et al. 2018; Kajtoch et al. 2019). However, a vast number of studies across a broad range of taxa demonstrate the efficiency of DNA barcoding as the method of choice in molecular species identification (e.g. Raupach et al. 2014; Schmidt et al. 2015; Morinière et al. 2017; Schmid-Egger et al. 2019).
Not surprisingly, the build-up of comprehensive DNA barcode libraries represents a pivotal task (e.g. Brandon-Mong et al. 2015; Curry et al. 2018; Weigand et al. 2019). In Germany, the German Barcode of Life initiative (GBoL; www.bolgermany.de) aims to assess the genetic diversity of all animals, fungi and plants of Germany. Numerous studies provided first comprehensive DNA barcode libraries for various insect taxa (e.g. Hausmann et al. 2011; Hendrich et al. 2015; Hawlitschek et al. 2017; Havemann et al. 2018). In the case of ground beetles (Carabidae), a number of previous studies started to build up a comprehensive DNA barcode library for these routinely-used biological indicators, including the genera Bembidion Latreille, 1802 (Raupach et al. 2011; Raupach et al. 2016), Amara Bonelli, 1810 and Notiophilus Duméril, 1806 (Raupach et al. 2019). Detailed studies of many other important genera, however, are still missing.
Within the Carabidae, the genus Agonum Bonelli, 1810 is a species-rich taxon of shiny black or metallic ground beetles that imitate large representatives of Bembidion Latreille, 1802 (Lindroth 1986) (Fig. 1). Beetles of this genus are commonly found in forested and open habitats adjacent to freshwater throughout Europe, the eastern Palearctic area and North America (Bousquet 2012). They are generalist predators that feed on a wide range of small arthropods, for example, springtails, aphids or midges (e.g. Griffiths et al. 1985; Bilde and Toft 1994; Hannam et al. 2008).
A first detailed phylogenetic analysis of all species of the genus Agonum, based on numerous morphological characters, recognised four subgeneric entities: Agonum sensu stricto, Europhilus Chaudoir, 1859, Olisares Motschulsky, 1865 and Platynomicrus Casey, 1920 (Liebherr et al. 2005). Whereas some species are distinctive, others are difficult to identify as a consequence of intraspecific variation in setation, pronotal shape, body colouration and size and, therefore, typically require the examination of the genitalia (Hůrka 1996; Luff 2007). In the case of the Agonum species that are documented for Germany, the combination of various characteristic traits, including, amongst others, a well-developed median tooth at the mentum and the shape of the posterior angles of the pronotum, allows a correct identification (Schmidt, pers. communication). In Europe, species related to Agonum viduum are the most challenging in terms of identification, but this group was carefully revised more than 25 years ago (Schmidt 1994). For Germany, 24 species have been recorded so far (Trautner et al. 2014; Schmidt et al. 2016). As a consequence of many Agonum species being found in highly-endangered peat bogs or similar freshwater-associated habitats, numerous species have become very rare or even threatened with extinction, for example Agonum hypocrita (Apfelbeck, 1904) or Agonum munsteri (Hellén, 1935) (Schmidt et al. 2016).
In this study, we present, as part of the on-going GBoL project, another step in generating a comprehensive DNA barcode library for the molecular identification of Central European ground beetle species, here focusing on the genus Agonum. The analysed sequence library included 23 species of Agonum and Oxypselaphus obscurus (Paykull, 1790) as outgroup. We generated 68 new barcodes including some sequences of old pinned museum specimens and analysed a total number of 258 DNA barcodes in detail.

Sampling of specimens
Most of the sampled ground beetles (n = 186, 78%) were collected between 1999 and 2015 using various sampling methods (i.e. hand collecting, pitfall traps). All beetles were stored in ethanol (96%). The analysed specimens were identified using the identification key provided in Schmidt (2006). It was also possible to generate DNA barcodes from decades-old pinned ground beetles of the carabid collection of the Bavarian State Collection of Zoology, namely Agonum antennarium (Duftschmid, 1812) (n = 2) with an age of 87 years and A. impressum (Panzer, 1796) (n = 1, age: 52 years). In total, 68 new barcodes of 17 species were generated, including five species new to BOLD. For our analysis, we also included 175 DNA barcodes of four previous studies (Raupach et al. 2010: 17 specimens, 5 species; Hendrich et al. 2015: 101 specimens, 13 species; Pentinsaari et al. 2014: 52 specimens, 12 species; Rulik et al. 2018: 5 specimens, 1 species), as well as 15 public barcodes deposited in BOLD without corresponding publication (10 species).

DNA barcode amplification, sequencing and data depository
All laboratory operations were carried out, following standardised protocols for COI amplification and sequencing (Ivanova et al. 2006; deWaard et al. 2008), at the Canadian Centre for DNA Barcoding (CCDB), University of Guelph, the molecular labs of the Zoologisches Forschungsmuseum Alexander Koenig (ZFMK) in Bonn and the working group Systematics and Evolutionary Biology at the Carl von Ossietzky University Oldenburg, the latter two being located in Germany. Photos of each studied beetle were taken before molecular work started. One or two legs of one body side were removed for the subsequent DNA extraction, which was performed using the QIAamp Tissue Kit (Qiagen GmbH, Hilden, Germany) or NucleoSpin Tissue Kit (Macherey-Nagel, Düren, Germany), following the extraction protocol.
Detailed information about the primers used, PCR amplification and sequencing protocols can be found in a previous publication (see Raupach et al. 2016). All purified PCR products were cycle-sequenced in both directions by contract sequencing facilities (Macrogen, Seoul, South Korea or GATC, Konstanz, Germany), using the same primers as used in PCR. Double-stranded sequences were assembled and checked for mitochondrial pseudogenes (numts) by analysing the presence of stop codons and frameshifts, as well as double peaks in chromatograms, with Geneious Prime 2020.0.4 (https://www.geneious.com) (Biomatters, Auckland, New Zealand). For verification, BLAST searches (blastn, search set: others, programme selection: megablast) were conducted to confirm the identity of all new sequences based on already published sequences (high identity values, very low E-values).
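The stop-codon part of such a numt check can be illustrated with a minimal sketch. It is not the procedure implemented in Geneious Prime; it assumes the invertebrate mitochondrial genetic code (NCBI translation table 5, in which TAA and TAG are the stop codons), scans only the forward strand, and does not detect frameshifts or double peaks. The function names are hypothetical.

```python
# Stop codons under the invertebrate mitochondrial code (NCBI table 5).
# TGA codes for tryptophan and AGA/AGG for serine in this code.
STOP_CODONS = {"TAA", "TAG"}


def stop_codons_in_frame(seq, frame):
    """Return the start positions of in-frame stop codons for one reading frame."""
    seq = seq.upper()
    positions = []
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            positions.append(i)
    return positions


def open_frames(seq):
    """Return the forward reading frames (0, 1, 2) that contain no stop codon."""
    return [f for f in range(3) if not stop_codons_in_frame(seq, f)]


def looks_like_numt(seq):
    """Flag a sequence when no forward reading frame is free of stop codons.

    This is only a coarse warning sign; a flagged sequence still needs
    manual inspection of the trace files.
    """
    return len(open_frames(seq)) == 0
```

A functional COI fragment is expected to keep at least one reading frame open along its entire length, so a sequence flagged by `looks_like_numt` would be a candidate for closer inspection rather than an automatic rejection.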
Detailed voucher information, taxonomic classifications, photos, DNA barcode sequences, primer pairs used and trace files (including their quality) were uploaded to the public dataset "DS-BAAGO" (Dataset ID: dx.doi.org/10.5883/DS-BAAGO) on the Barcode of Life Data Systems (BOLD; www.boldsystems.org) (Ratnasingham and Hebert 2007). Parallel to this, all new barcodes were deposited in GenBank (accession numbers: MT520822-MT520889).

DNA Barcode analysis
The complete dataset was analysed using various approaches. First, the comprehensive analysis tools of the BOLD workbench were employed to calculate the nucleotide composition of the sequences and distributions of Kimura-2-parameter distances (K2P; Kimura 1980) within and between species (align sequences: BOLD aligner; ambiguous base/gap handling: pairwise deletion). In addition, all barcode sequences were subjected to the Barcode Index Number (BIN) analysis system implemented in BOLD that clusters DNA barcodes in order to produce operational taxonomic units that typically closely correspond to species (Ratnasingham and Hebert 2013). A threshold of 2.2% was applied for a rough differentiation between intraspecific and interspecific distances, based on Ratnasingham and Hebert (2013). These BIN assignments on BOLD are constantly updated as new sequences are added, splitting and/or merging individual BINs in the light of new data.
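For illustration, the K2P distance between two aligned barcodes can be sketched as follows. This is a minimal example, not the BOLD implementation: it assumes that the sequences are already aligned and of equal length, and it approximates pairwise deletion by skipping every site at which either sequence carries a gap or an ambiguous base. With P and Q as the proportions of transitions and transversions amongst the compared sites, the distance is d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)). The function name is hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura-2-parameter distance between two aligned sequences.

    Sites with a gap or an ambiguous base in either sequence are skipped
    (pairwise deletion). The result is a proportion, so 0.022 equals 2.2%.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # pairwise deletion of gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    term = (1 - 2 * p - q) * math.sqrt(1 - 2 * q)
    if term <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term)
```

A pair of sequences whose distance falls below 0.022 would, under the rough 2.2% threshold mentioned above, be treated as a candidate for intraspecific variation or for closely related species.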
Second, statistical parsimony networks were constructed with TCS 1.21 (Clement et al. 2000), as implemented in the software package PopART v.1.7 (Leigh and Bryant 2015) and based on default settings, for species pairs with interspecific distances smaller than 2.2% and sharing identical BINs. Such networks allow a better visualisation of the distances between closely-related species than classical tree topologies.
Finally, all sequences were aligned using MUSCLE (Edgar 2004) and analysed using a neighbour-joining cluster analysis (NJ; Saitou and Nei 1987), based on K2P distances with MEGA 10.0.5 (Kumar et al. 2018). Non-parametric bootstrap support values were obtained by re-sampling and analysing 1,000 pseudoreplicates (Felsenstein 1985). It should be noted that DNA barcodes do not aim to recover phylogenetic relationships (e.g. DeSalle and Goldstein 2019). Instead, the shown topology represents a graphical visualisation of DNA barcode distance divergences and putative species clusters.
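The re-sampling step of the non-parametric bootstrap can be sketched as follows. This is only an illustration of how pseudoreplicates are drawn, not the MEGA implementation: alignment columns are sampled with replacement until a new alignment of the original length is obtained, and the tree-building step that would follow for each pseudoreplicate is omitted. The function names and the fixed seed are assumptions made for the example.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Draw one pseudoreplicate by sampling alignment columns with replacement."""
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")
    # The same sampled columns are used for every sequence, so that
    # homologous sites stay together.
    columns = [rng.randrange(length) for _ in range(length)]
    return ["".join(seq[c] for c in columns) for seq in alignment]


def pseudoreplicates(alignment, n_replicates=1000, seed=0):
    """Generate bootstrap pseudoreplicates of an alignment."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]
```

The bootstrap support of a cluster would then be the percentage of pseudoreplicate trees in which that cluster is recovered.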

Results
Overall, 258 DNA barcode sequences of 23 Agonum species were analysed, representing 96% of all documented species (n = 24) of this genus for Germany; only Agonum nigrum Dejean, 1828 is missing. A full list of the analysed species is presented in the supporting information (Suppl. material 1). Lengths of the analysed DNA barcode fragments ranged from 307 to 658 base pairs (bp). As is typical for arthropods, the DNA barcode region has a high AT content (68%), with mean sequence compositions of adenine (A) = 31%, cytosine (C) = 16%, guanine (G) = 16% and thymine (T) = 37%. Intraspecific K2P distances ranged from 0 to a maximum of 1.3% (Agonum ericeti (Panzer, 1809)), whereas interspecific distances had values between 0.16 and 7.07%. The lowest interspecific distance was found between Agonum micans (Nicolai, 1822) and Agonum scitulum Dejean, 1828, with a value of 0.16% and the same BIN (AAN9978).

Figure 2. Maximum statistical parsimony networks of the species pairs. A. Agonum micans (Nicolai, 1822) (blue) and Agonum scitulum Dejean, 1828 (red); B. Agonum impressum (Panzer, 1796) (yellow) and Agonum sexpunctatum (Linné, 1758) (green) and C. Agonum duftschmidi Schmidt, 1994 (violet) and Agonum emarginatum (Gyllenhal, 1827) (light brown). Used parameters included default settings for connection steps; gaps were treated as fifth state. Each line represents a single mutational change, whereas small black dots indicate missing haplotypes. The numbers of analysed specimens (n) are listed, whereas the diameter of the circles is proportional to the number of haplotypes sampled (see given open circles with numbers). Scale bars: 1 mm. Beetle images were obtained from http://www.eurocarabidae.de.
The NJ analyses, based on K2P distances, revealed non-overlapping clusters for all analysed species (Fig. 3).
In the case of species with more than two analysed specimens (n = 22), 82% (n = 18) of the studied species were characterised by bootstrap support values > 95%. A more detailed topology of all analysed specimens is presented in the supporting information (Suppl. material 2).

Discussion
The results of this study highlight the efficiency of DNA barcodes for the determination of most German species of the genus Agonum, but also indicate a close relationship of a number of species with shared BINs, for example, Agonum scitulum (Dejean, 1828) and Agonum micans (Nicolai, 1822) (Suppl. material 1, Fig. 2). Haplotypes of both species are separated by a K2P distance of 0.16% and only three mutational steps (Fig. 2A). Although Agonum scitulum has often been overlooked and misidentified as Agonum micans as a result of flawed identification keys in the past (e.g. Paill 2010; Schmidt and Benedikt 2010; Brigić et al. 2016), such a close relationship between these two species had not been expected before. Agonum scitulum (Dejean, 1828) is a macropterous Western Palaearctic species with a discontinuous distribution from England to Romania, Croatia and the European part of Russia (Luff 2007; Paill 2010; Schmidt and Benedikt 2010; Brigić et al. 2016). It is a very rare species that is typically found in marshes, fens and reed beds, syntopic with Agonum micans and Agonum fuliginosum (Panzer, 1809) (Luff 2007; Paill 2010; Schmidt and Benedikt 2010). Thanks to a thorough review some years ago, an excellent key for the genus Agonum has been established that allows a reliable identification of both species, based on the absence/presence of hairs on the dorsal side of the 3rd tarsomere of the last walking leg (Schmidt 2006). Low molecular distances between both species give evidence for an apparently recent separation, but the small number of studied specimens, as well as the exclusive use of mitochondrial sequence data, does not allow further conclusions at the moment. Here, additional analyses combining a careful morphological examination with detailed fine-scale nuclear sequence data will provide more information.
In contrast to the previous species pair, a close relationship between Agonum impressum (Panzer, 1796) and Agonum sexpunctatum (Linné, 1758), as well as Agonum duftschmidi Schmidt, 1994 and Agonum emarginatum (Gyllenhal, 1827), has already been suggested in the past. The given barcode data support this hypothesis. Similar to Agonum micans and Agonum scitulum, haplotypes of Agonum impressum (Panzer, 1796) and Agonum sexpunctatum (Linné, 1758) are separated by three mutational steps (K2P distance: 0.36%) (Fig. 2B), whereas the minimum K2P distance of Agonum duftschmidi Schmidt, 1994 and Agonum emarginatum (Gyllenhal, 1827) has a value of 0.77%, resulting in five additional mutational steps (Fig. 2C). Only the analysis of additional specimens from different locations will show if these distances within both species pairs are sound.
At this point, it should be kept in mind that not all beetles were collected in Germany. For instance, all specimens of Agonum antennarium (Duftschmid, 1812) were sampled in Montenegro (see Suppl. material 2), whereas beetles of other species were sampled from different localities in Germany only (e.g. Agonum lugens (Duftschmid, 1812) or Agonum micans (Nicolai, 1822)). As previously mentioned, a number of species are very rare in the target area, but more abundant in adjacent countries. In most cases, however, the new DNA barcodes represent the very first molecular data for these taxa and give important insights into their molecular diversity, even if they were not sampled in Germany. Nevertheless, it is also planned to analyse specimens of such species collected in Germany in the near future.

Conclusion
As a central part of modern biodiversity research, DNA barcoding will become more and more prominent in this research field. Therefore, comprehensive DNA barcode libraries of important bio-indicators will become the backbone of any applied analysis, for example, metabarcoding as well as eDNA studies. The present study demonstrates the successful identification of most Agonum species documented for Germany by using DNA barcodes. The dataset also shows a close relationship between various species and their putative recent origin. Only the analysis of additional species and specimens, however, will reveal if the observed patterns of distinct clusters persist.

Acknowledgements
We would like to thank Christina Blume and Claudia Etzbauer (both ZFMK, Bonn) for their laboratory assistance. Furthermore, we are grateful to Frank Köhler (Bonn) for providing various specimens and to Ortwin Bleich for giving permission to use the excellent photos of ground beetles taken from www.eurocarabidae.de. Daniel Duran and Joachim Schmidt provided helpful comments on the manuscript. This publication was partially financed by the German Federal Ministry for Education and Research (FKZ01LI1101A, FKZ01LI1101B, FKZ03F0664A), the Land Niedersachsen and the German Science Foundation (INST427/1-1), as well as by grants from the Bavarian State Government (Barcoding Fauna Bavarica) and the German Federal Ministry of Education and Research (GBOL1, GBOL2, GBOL3: 01LI1901B). We are grateful to the team of Paul Hebert in Guelph (Ontario, Canada) for their great support and help and, in particular, to Sujeevan Ratnasingham for developing the BOLD database infrastructure and the BIN management tools. Sequencing work was partly supported by funding from the Government of Canada to Genome Canada through the Ontario Genomics Institute, whereas the Ontario Ministry of Research and Innovation and NSERC supported development of the BOLD informatics platform.