Research Article
Corresponding author: Xin-Ran Li (conlinmccat@foxmail.com). Academic editor: Harald Letsch
© 2022 Xin-Ran Li.
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Li X-R (2022) Phylogeny and age of cockroaches: a reanalysis of mitogenomes with selective fossil calibrations. Deutsche Entomologische Zeitschrift 69(1): 1-18. https://doi.org/10.3897/dez.69.68373
In spite of big data and new techniques, the phylogeny and timing of cockroaches remain in dispute. Apart from sequencing more species, an alternative way to improve the phylogenetic inference and time estimation is to improve the quality of data, calibrations and analytical procedure. This study emphasizes the completeness of data, the reliability of genes (judged via alignment ambiguity and substitution saturation), and the justification for fossil calibrations. Based on published mitochondrial genomes, the Bayesian phylogeny of cockroaches and termites is recovered as: Corydiinae + (((Cryptocercidae + Isoptera) + ((Anaplectidae + Lamproblattidae) + (Tryonicidae + Blattidae))) + (Pseudophyllodromiinae + (Ectobiinae + (Blattellinae + Blaberidae)))). With two fossil calibrations, namely, Valditermes brenanae and Piniblattella yixianensis, this study dates crown Dictyoptera to the early Jurassic and crown Blattodea to the middle Jurassic. Using the ambiguous ‘roachoid’ fossils to calibrate Dictyoptera + sister pushes these dates back to the Permian and Triassic. This study also shows that appropriate fossil calibrations are rarer than previous studies have assumed.
Blattaria, Blattodea, Dictyoptera, divergence time, mitochondrial DNA
The family-level relationships of cockroaches have been in dispute for decades (Fig.
Representative phylogenetic inferences of cockroaches based on various data and methods.
Calibration has a major impact on divergence time estimation (
Phylogenetic inference and time estimation can be improved by enlarging the dataset with new loci and new samples, but also by improving the quality of published data, calibrations and analytical procedure. The latter approach is emphasized and presented herein. In the present study, the mitochondrial genome is preferred as the only type of data for the following reasons. First, taxon coverage is comparatively high; second, missing data can be essentially avoided; third, the computation load (i.e. time investment) is acceptable, allowing multiple analyses for comparisons among datasets.
The present study focuses on true cockroaches (Blattaria), the major component of Dictyoptera. Taxa included in my analyses also cover other Dictyoptera, namely, termites (Isoptera) and mantises (Mantodea), and the living sister of Dictyoptera, namely Eukinolabia + Xenonomia, as suggested by transcriptome data (
The initial data pool comprises 169 mitogenomes, including all available cockroaches and selected other insects (Suppl. material
The character set includes 13 protein coding genes while all RNA genes were excluded. Aligning RNA gene sequences is dependent on the prediction of secondary structure (
Sequences of protein coding genes were aligned using MUSCLE in MEGA7 with default settings of codon mode (
ALIGROOVE 1.05 (
ALIGROOVE suggests high ambiguity in ATP8 alignment, followed by ND6 (Suppl. material
According to the 39 saturation plots (Suppl. material
Phylogenetic inferences were performed in MRBAYES 3.2.7 (
The first analysis utilized all 13 genes. The results are only used for comparison with the second step analyses, to observe the influence of ATP8, ND4L and ND6.
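The gene selection can be illustrated with a minimal sketch. It assumes codon-aligned, in-frame sequences held in a hypothetical nested dictionary (gene -> taxon -> sequence). The function names and toy sequences are illustrative only and are not part of the original pipeline. Third codon positions are dropped, as stated in the figure captions, and the three excluded genes can be switched on or off to compare the 13-gene and 10-gene datasets.

```python
from typing import Dict, Iterable

# Genes left out of the second and later analysis steps (from the text).
EXCLUDED_GENES = {"ATP8", "ND4L", "ND6"}


def drop_third_positions(seq: str) -> str:
    """Keep codon positions 1 and 2 of an in-frame, codon-aligned sequence."""
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(seq[i:i + 2] for i in range(0, len(seq), 3))


def concatenate(alignments: Dict[str, Dict[str, str]],
                exclude: Iterable[str] = ()) -> Dict[str, str]:
    """Concatenate per-gene alignments (gene -> taxon -> sequence).

    Genes are joined in alphabetical order; only taxa present in every
    retained gene are kept, so the result has no missing genes.
    """
    skip = set(exclude)
    genes = sorted(g for g in alignments if g not in skip)
    if not genes:
        raise ValueError("no genes left after exclusion")
    taxa = sorted(set.intersection(*(set(alignments[g]) for g in genes)))
    return {
        t: "".join(drop_third_positions(alignments[g][t]) for g in genes)
        for t in taxa
    }


# Toy data (hypothetical sequences, two codons per gene).
toy = {
    "COX1": {"taxonA": "ATGGCA", "taxonB": "ATGGCC"},
    "ATP8": {"taxonA": "ATGAAA", "taxonB": "ATG---"},
    "ND1": {"taxonA": "TTAGGT", "taxonB": "TTGGGT"},
}
with_all_genes = concatenate(toy)
without_excluded = concatenate(toy, EXCLUDED_GENES)
```

Building both matrices from the same per-gene alignments keeps the comparison limited to the effect of the excluded genes.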
The second step compares the trees inferred from three taxon sets. All analyses excluded ATP8, ND4L and ND6. (1) All-species analysis, using the complete taxon set. (2) Good-species analysis, excluding ‘BadSeq’. (3) Short-species analysis, excluding long-branched taxa detected in the all-species analysis. This step aims to detect the impact of incomplete data and long branches.
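The three taxon sets amount to simple set differences, sketched below. The taxon labels and the membership of the ‘BadSeq’ and long-branched groups are placeholders chosen for illustration, not the actual lists used in this study.

```python
from typing import Dict, Set


def build_taxon_sets(all_taxa: Set[str],
                     bad_seq: Set[str],
                     long_branched: Set[str]) -> Dict[str, Set[str]]:
    """Return the taxon sets compared in the second analysis step."""
    return {
        "all": set(all_taxa),              # (1) complete taxon set
        "good": all_taxa - bad_seq,        # (2) without 'BadSeq'
        "short": all_taxa - long_branched, # (3) without long-branched taxa
    }


# Placeholder labels; the real assignments come from the sequence
# correction record and from the all-species tree.
taxa = {"sp1", "sp2", "sp3", "sp4", "sp5", "sp6"}
bad = {"sp2", "sp5"}
long_br = {"sp5", "sp6"}

sets = build_taxon_sets(taxa, bad, long_br)
for name, members in sets.items():
    print(name, sorted(members))
```

A taxon may belong to both exclusion groups, as `sp5` does here, so the sizes of the reduced sets need not sum in a simple way.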
The third step analysis used only the ‘safe’ taxa. In this step, all taxa within ‘BadSeq’ were excluded even if they have virtually no effect on the topology of other species. Experience suggests that more missing or poorly sequenced bases imply more potential errors in the superficially intact data. Potential pitfalls of incompleteness (e.g. erroneous positions of these taxa per se) run against the main idea of this study. Long-branched taxa with low support were also excluded. In the present study, they are Aposthonia borneensis, Aposthonia japonica and Nocticola sp. ‘Safe’ taxa comprise 85 species (Suppl. material
The fourth, also the final, step yields the phylogeny that is regarded as the formal result. Prior to MRBAYES, sequences of ‘safe’ taxa were re-aligned and concatenated. This new, 9912-base-long alignment, as final dataset, was also imported to ALIGROOVE to assess alignment ambiguity. This 85-species dataset is less ambiguous than the original one (Suppl. material
As calibrations, only two fossils fulfill the criteria of
Some studies used the so-called ‘roachoids’ (Eoblattodea, see
I used the MCMCTREE program in PAML 4.9j (
Trees are visualized using FigTree 1.4.3 (Andrew Rambaut, http://tree.bio.ed.ac.uk/software/figtree) and modified using Adobe Illustrator CC 2017.
For the reader’s convenience and to enable a comparison of studies, familial taxonomy of cockroaches in this paper follows recent studies that are compared (e.g.,
The 13-gene tree recovers a sistergroup relationship between Aposthonia (Embioptera) and Nocticola (Blattaria), which is obviously erroneous regardless of posterior probability (Suppl. material
The ‘safe’-taxa analysis yields higher posterior probabilities (Suppl. material
Bayesian phylogeny of Dictyoptera inferred from ten protein-coding genes of 85 mitogenomes, excluding the third base of codon. Posterior probabilities are shown as percentages where they are below 100%; unlabelled nodes have 100% support. Clades of superfamilies or higher rank are numbered, as indicated by black background in the key. Species of major taxonomic identities (all are clades) are coloured, as indicated in the key. Subfamilies of Blattidae and Blaberidae are labeled; asterisked ones are not monophyletic. For comparison with trial analyses, see Suppl. material
The dating result of two-fossil-calibration analysis (without Q. namurensis) is regarded as the formal result of this paper (Fig.
Time trees of Dictyoptera estimated by MCMCTREE. Two-calibration result (middle) is regarded as the formal result of this study. Calibrated nodes are coloured, with vertical bars denoting bounds. Calibrations: Qilianiblatta namurensis (green), Valditermes brenanae (red), Piniblattella yixianensis (blue). Abbreviations: A[naplectidae], Dictyop[tera], L[amproblattidae], T[ryonicidae], Xyloph[agodea]. For detailed phylogenies showing species, see Suppl. material
The relationship of major clades (suborder, superfamily, family, and subfamily) recovered herein is not identical to any previous studies. At the superfamily level, the sistergroup relationship of Corydioidea (only represented by Corydiinae) to the rest of Blattodea is consistent with that in
Corydioidea are always undersampled. Species of Nocticolidae, Latindiinae, and Corydiidae incertae sedis (e.g. Ctenoneura) are lacking. Although one mitogenome of Nocticola is available, it is hardly serviceable unless the long branch is broken up by increased sampling (
In the blattoid complex, only the sistergroup relationship between Cryptocercidae and Isoptera is universally recognized. These taxa constitute Xylophagodea (
The paraphyly of Ectobiidae with respect to Blaberidae is a consensus among studies; the present study is not an exception. However, the relationships among Blaberidae and ectobiid subfamilies are conflicting among studies, especially in the positions of Ectobiinae and Pseudophyllodromiinae. The Ectobiinae contributes a weak point in the new phylogeny (pp = 79%), i.e. the node of Ectobiinae + (Blattellinae + Blaberidae). Regardless of the Nyctiborinae, which is not included in the final phylogeny herein, the sistergroup relationship between Blattellinae and Blaberidae is also supported in
A challenge to all molecular phylogenies is the reconciliation with morphological, behavioral, and other evidence. For example, oothecal property and rotation behavior are various and the taxonomic distribution of them in Blaberoidea is comparatively well known (
The only appropriate fossil calibration in Blattaria in the present study is Piniblattella yixianensis
Criteria 1 and 4. Information about the fossil-bearing stratum and museum collection is provided in
Criterion 2. Regardless of the determination of genus (which is in dispute, see
The reproduction type of P. yixianensis is oviparity B: (1) during reproduction, female cockroaches have a period of carrying the ootheca (if present) outside, but only the oviparity B carries the ootheca externally until hatching; other types only carry shortly before oviposition (oviparity A) or before retraction (ovoviviparity and viviparity) (
However, oviparity B is homoplastic among Blaberoidea.
Accordingly, it appears that Blaberoidea are preadapted to the advanced rotation and oviparity B (consequently ovoviviparity), but as far as known, these two features only co-occur in Blaberidae and Blattellinae. Blaberidae and Blattellinae were recovered as sister groups (
Other characters preserved in the fossils of P. yixianensis are barely discernible except the wing venation. The forewing of P. yixianensis conforms to the general form of Blattellinae (see
So far, the evolution of reproduction type, ootheca handling behaviour and wing venation of cockroaches is not well understood, and might be more complicated than currently known. In view of this, the phylogenetic position of P. yixianensis is not securely settled. Nevertheless, P. yixianensis can be tentatively considered as a member of Blattellinae, and thus calibrates the node of Blattellinae + sister (Blaberidae herein). In summary, P. yixianensis as a calibration should be used with caution, and comparative analyses with/without this fossil should be performed to accommodate its uncertainty.
Criterion 3. Reconciliation between molecular and morphological phylogenies is partially achieved. As mentioned above, regardless of the Nyctiborinae that is not included in the final data, the sistergroup relationship between Blattellinae and Blaberidae is supported herein and in recent big data analyses (
Criterion 5. Piniblattella yixianensis is from Huangbanjigou, Beipiao, Liaoning, northeastern China (
The other fossil for calibration, V. brenanae, has been frequently used for Xylophagodea (e.g.,
Fossil calibrations contribute considerably to the discrepancy in the age estimation among studies. The ‘roachoid’ fossils, remarkably, were frequently assigned as “stem Dictyoptera” (e.g.,
Comparison among the ages estimated in various studies. The fossils are: Valditermes brenanae Jarzembowski, 1981; Piniblattella yixianensis
Unfortunately, many of the fossil calibrations other than ‘roachoids’ are also unjustified. Consequently, comparisons among the age estimates from various studies could be pointless. For example, the “stem Mantodea” Homocladus grandis (
A critical review of cockroach fossil calibrations was not achieved until
Cretaholocompsa montsecana was determined as a close relative of extant Holocompsa (Martinez-Delclos 1993), and used as a calibration for corydiid nodes (
“Gyna” obesa was used as a calibration for blaberid nodes (Bourguignon et al. 2017,
Ectobius kohlsi was identified based on a comparison with extant species (
Interestingly, the fossil discarded by
Only one true-cockroach fossil is used as a calibration in the present study, but this does not imply that other fossils are substandard. Every informative fossil (with high phylogenetic resolution and ascertained geological context) has the potential to be a competent calibration, but incorporating such fossils is hampered by the fact that relevant living species are under-sampled or have not yet been sequenced. Noteworthy examples include fossils of extant genera, e.g. Supella (Nemosupella) miocenica Vršanský et al., 2011 (see
Based on published mitochondrial genomes, the present study infers a phylogeny of cockroaches and termites as Corydiinae + (((Cryptocercidae + Isoptera) + ((Anaplectidae + Lamproblattidae) + (Tryonicidae + Blattidae))) + (Pseudophyllodromiinae + (Ectobiinae + (Blattellinae + Blaberidae)))). The sistergroup relationship between (Cryptocercidae + Isoptera) and (Anaplectidae + Lamproblattidae + Tryonicidae + Blattidae) is recovered for the first time. This study suggests that the phylogenetic reconstruction of cockroaches is in urgent need of data on Corydioidea (particularly the Nocticolidae), whose phylogenetic relationships are poorly known. This study dates crown Dictyoptera to the early Jurassic and crown Blattodea to the middle Jurassic. Using the ambiguous ‘roachoid’ fossils to calibrate Dictyoptera + sister pushes these dates back to the Permian and Triassic. Given currently available data and fossils, few nodes within true cockroaches can be calibrated. This can be overcome by discovering more fossils, or by sampling fossil-related species to allow the incorporation of well-justified fossils. In view of the scarcity of suitable fossils for calibration, the latter approach may be more promising.
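For readers who wish to load the recovered topology into tree software, the ‘+’ notation used in this paper can be converted to Newick format. The following is a minimal sketch, not part of the original analyses; the function name `to_newick` is hypothetical, and the parser assumes that clade names contain no spaces, parentheses or plus signs.

```python
import re


def to_newick(notation: str) -> str:
    """Convert 'A + (B + C)' style notation into a Newick string."""
    tokens = re.findall(r"[()+]|[^()+\s]+", notation)
    pos = 0

    def parse_expr() -> str:
        # expr := term ('+' term)*
        nonlocal pos
        parts = [parse_term()]
        while pos < len(tokens) and tokens[pos] == "+":
            pos += 1
            parts.append(parse_term())
        return parts[0] if len(parts) == 1 else "(" + ",".join(parts) + ")"

    def parse_term() -> str:
        # term := '(' expr ')' | name
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            inner = parse_expr()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return inner
        if tok in ("+", ")"):
            raise ValueError(f"unexpected token {tok!r}")
        return tok

    tree = parse_expr()
    if pos != len(tokens):
        raise ValueError("trailing tokens after a complete tree")
    return tree + ";"


topology = (
    "Corydiinae + (((Cryptocercidae + Isoptera) + "
    "((Anaplectidae + Lamproblattidae) + (Tryonicidae + Blattidae))) + "
    "(Pseudophyllodromiinae + (Ectobiinae + (Blattellinae + Blaberidae))))"
)
newick = to_newick(topology)
print(newick)
```

The output carries topology only; branch lengths and support values are not encoded in the ‘+’ notation and must be taken from the tree files.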
I deeply thank Dr Klaus Klass and Dr Dominic Evangelista for constructive comments and critiques, and thank the authors of mitogenome data which are fundamental to the present study. This study was motivated by my interest and not funded.
Initial pool of 169 mitochondrial genomes
Data type: document/list (pdf. file)
Explanation note: Initial pool of 169 mitochondrial genomes found in GenBank.
Table S1
Data type: Table (pdf. file)
Explanation note: Metadata of the 95 selected mitochondrial genomes.
Table S2
Data type: Table (pdf. file)
Explanation note: Sequence correction record.
Figure S1
Data type: statistic plot (jpeg. image)
Explanation note: Pairwise similarity scores calculated by ALIGROOVE.
Figure S2
Data type: statistic plot (tiff. image)
Explanation note: Substitution saturation plots per codon position per gene.
Figure S3
Data type: phylogram (tiff. image)
Explanation note: Bayesian phylogeny of 13 protein-coding genes of 95 species, excluding the third base of codon.
Figure S4
Data type: phylogram (tiff. image)
Explanation note: Bayesian phylogeny of 10 protein-coding genes of 95 species, excluding the third base of codon.
Figure S5
Data type: phylogram (tiff. image)
Explanation note: Bayesian phylogeny of 10 protein-coding genes of 87 species (Good-species analysis), excluding the third base of codon.
Figure S6
Data type: phylogram (tiff. image)
Explanation note: Bayesian phylogeny of 10 protein-coding genes of 92 species (Short-species analysis), excluding the third base of codon.
Figure S7
Data type: phylogram (tiff. image)
Explanation note: Bayesian phylogeny of 10 protein-coding genes of 85 species (Safe-species analysis), excluding the third base of codon.
Figure S8
Data type: time tree (tiff. image)
Explanation note: Time tree estimated by MCMCTREE with three fossil calibrations (incl. Qilianiblatta namurensis).
Figure S9
Data type: time tree (tiff. image)
Explanation note: Time tree estimated by MCMCTREE with two fossil calibrations (excl. Qilianiblatta namurensis).
Figure S10
Data type: time tree (tiff. image)
Explanation note: Time tree estimated by MCMCTREE with only one fossil calibration (Valditermes brenanae).