Background
Barley yellow dwarf virus (BYDV) causes a viral disease that seriously affects grain crops worldwide; infection may reduce wheat yields by an average of 11–33% and sometimes by up to 80% [1, 2]. In the 1890s, BYDV was widespread in the American Midwest [3]. Initially, the disease was restricted solely to oat crops in the midwestern and eastern United States, but it subsequently became widely distributed in the United States and was discovered to affect wheat, rye, barley, and other cereal species [4–6].
In a seminal paper on the ecology of barley yellow dwarf (BYD), Oswald and Houston identified BYDV, a new positive-sense ssRNA virus that is transmitted by aphids in a persistent, circulative manner, as the pathogenic agent of BYD [5, 7]. BYDV is a member of the genus Luteovirus in the family Luteoviridae [8, 9]; current data do not show that it can be transmitted mechanically or via seeds [6]. Although at least 25 aphid species have been reported as BYDV carriers [10–12], each virus displays a high degree of vector specificity among different aphid species. According to the International Committee on Taxonomy of Viruses (ICTV), BYDVs are divided into seven species (BYDV-PAV, BYDV-MAV, BYDV-PAS, BYDV-KerII, BYDV-KerIII, BYDV-SGV and BYDV-GPV) that are assigned to different genera, or are not assigned to a genus, in the family Luteoviridae [13]. Five of these (BYDV-PAV, BYDV-MAV, BYDV-PAS, BYDV-KerII and BYDV-KerIII) have been classified as strains of BYDV in the genus Luteovirus on the basis of genetic structure and serological and evolutionary relationships [12, 14, 15], but some reports regard them as five distinct species rather than subspecies of BYDV.
Rhopalosiphum padi and Sitobion avenae efficiently transmit the most common virus, BYDV-PAV, whereas BYDV-MAV, BYDV-SGV, and BYDV-Ker (KerII, KerIII) were found to be transmitted most efficiently by Si. avenae, Schizaphis graminum, and R. padi, respectively [15, 16]. BYDV-GAV can be effectively spread by Si. avenae and Sc. graminum and is considered a subspecies of BYDV-MAV [17]. BYDV-GPV is a unique and widespread strain in China that shows no serological relationship with American strains. It is transmitted by R. padi and Sc. graminum [12].
The genome of BYDV is approximately 5700 nt, and genome size differs among strains [12, 18]. The genome harbours six open reading frames (ORFs). ORF2 is expressed only as a fusion with ORF1, via low-frequency −1 ribosomal frameshifting in the overlapping region, to encode the RNA-dependent RNA polymerase (RdRp) [19]. ORF3 and ORF4 encode the virion assembly protein (coat protein, CP) and the cell-to-cell movement protein (movement protein, MP), respectively. ORF5 is expressed as a readthrough domain (RTD) fused to the CP, which is necessary for transmission via aphids. ORF6, located near the 3′ end, may encode a viral suppressor of RNA silencing [9, 20].
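The −1 frameshift that fuses ORF2 to ORF1 can be illustrated with a toy example. The Python sketch below is not taken from the study: the sequence TOY_SEQ, the slip position and the function names are hypothetical, chosen only so that the zero-frame reading stops at a stop codon while the −1 frame continues. It assumes the standard genetic code and ignores the low frequency of the real event.

```python
from itertools import product

# Standard genetic code, built in TCAG order ("*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(seq, start=0):
    """Translate seq from index `start` until a stop codon or the sequence end."""
    protein = []
    for i in range(start, len(seq) - 2, 3):
        amino_acid = CODON_TABLE[seq[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def translate_with_minus1_shift(seq, slip_end):
    """Read in frame 0 up to index `slip_end` (exclusive, a multiple of 3),
    then step back one nucleotide and continue reading in the -1 frame."""
    upstream = translate(seq[:slip_end])
    downstream = translate(seq, start=slip_end - 1)
    return upstream + downstream


# Hypothetical toy sequence (not a BYDV sequence).
TOY_SEQ = "ATGGCTAAATGACCGTTTAA"

print(translate(TOY_SEQ))                       # frame 0 only, ends at TGA
print(translate_with_minus1_shift(TOY_SEQ, 9))  # fusion product after the slip
```

Without the slip, translation ends at the first in-frame stop codon; with the slip, the last nucleotide before the slip site is read twice and the downstream frame is appended to the upstream product, which is the sense in which ORF2 is "expressed fused to ORF1".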
As BYDV has been studied more intensively, a deeper understanding of the evolutionary pattern and genetic characteristics of this virus has been obtained. BYDV-PAV is the most influential member of the BYDV group, and the descriptions of several species or subspecies within BYDV-PAV, including BYDV-PAV-I, BYDV-PAV-II (formerly BYDV-PAS) and PAV-IIIa/IIIb, differ as a result of widespread recombination events [21]. More importantly, the results of Bayesian evolutionary analysis show that variation in BYDV-PAV may arise from geographic, vector and host adaptation and that the evolutionary rate of BYDV-PAV under purifying selection is similar to that of other RNA viruses [22, 23]. These reports show that the complicated evolutionary mechanism of BYDV has important implications for controlling the effects of the virus in agricultural production.
Unexpectedly, we did not find the BYDV-GAV strain in the BYDV classification of the ICTV. In addition, one study has surprisingly shown that the BYDV population responsible for the epidemic on the Kerguelen Islands, in the absence of carrier aphids, includes BYDV-KerII and BYDV-KerIII strains [6]. An inherent characteristic of aphid-transmitted viruses is that they have difficulty spreading across geographic barriers. Nevertheless, an increasing number of reports have confirmed that the dispersal patterns of viruses may be associated with multiple human-mediated factors [14, 24, 25].
RNA viruses exhibit high mutation rates, rapid replication dynamics, and large population sizes; at the same time, under the influence of genetic drift, gene flow and natural selection, the evolutionary characteristics and population genetic structure of viruses tend to become more complicated [26]. Moreover, BYDV is restricted by geographical barriers. It is therefore necessary to study the geographic range, epidemiological routes and possible evolutionary mechanisms of BYDV. However, knowledge of the evolutionary biology of BYDV, particularly at a transnational scale, is relatively limited compared to that of other important plant viruses, such as potato virus Y (PVY) and turnip mosaic potyvirus (TuMV) [24, 27, 28]. Therefore, we examined the evolutionary and genetic characteristics of the BYDV strains and their population histories beyond those of BYDV-PAV alone. Additionally, we put forward some suggestions regarding the classification status of BYDV-GAV.
Discussion
We used the data obtained after screening all complete BYDV CP and MP sequences retrieved from GenBank, together with our new sequences, to conduct a large-scale phylodynamic analysis of the global population of BYDV. At present, BYDV-GAV, BYDV-GPV and BYDV-SGV are not classified under BYDV according to searches of the ICTV database, and opinions differ regarding the division of BYDV strains [12, 16, 47, 48]. We also obtained results regarding the taxonomic placement of BYDV-GAV and BYDV-GPV. For both the cp and mp genes, the phylogenetic trees reconstructed by Bayesian and maximum likelihood methods showed that BYDV-GPV was located on one branch, completely separated from BYDV-PAV, BYDV-MAV, BYDV-PAS, BYDV-GAV and BYDV-SGV (Supplementary Fig. 1). In contrast, BYDV-PAV, BYDV-PAS, BYDV-MAV and BYDV-GAV clustered into one branch separate from BYDV-SGV, BYDV-GPV and the outgroups (Supplementary Fig. 1), and were further clustered into two different branches (Fig. 1, Supplementary Fig. 1). This seems to suggest that BYDV-PAV, BYDV-PAS, BYDV-MAV, and BYDV-GAV should be one species. A previous report showed that the lowest identity of the BYDV-PAV cp gene with BYDV-MAV was 73%, and the highest identity with BYDV-GPV was 61% [49]. Another report suggested that the four strains BYDV-PAV, -PAS, -GAV, and -MAV shared high sequence identity with one another but low identity with BYDV-RMV [14]. Furthermore, some reports have shown that BYDV-PAV, BYDV-PAS, BYDV-MAV, and BYDV-GAV group in the same evolutionary branch, apart from the outgroups [14, 16], which is consistent with our results. Based on our results and previous reports [12, 14–18, 49], we suggest that BYDV-GAV, BYDV-PAV, BYDV-PAS and BYDV-MAV should be grouped as four strains of BYDV.
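The identity percentages quoted above are pairwise comparisons of aligned sequences. As a minimal sketch, and not the method used in the cited reports, the following Python function computes percent identity between two pre-aligned sequences. The function name, the toy sequences and the convention of skipping gapped columns are assumptions made for illustration; published identity values depend on the alignment and on how gaps are counted.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical positions between two aligned sequences.

    Columns where either sequence has a gap are skipped (one common
    convention; other tools count gaps differently).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment (not real cp gene sequences).
seq_a = "ATGAATTCAG"
seq_b = "ATGCATTC-G"
print(f"{percent_identity(seq_a, seq_b):.1f}% identity")
```

In this toy alignment one column is gapped and one of the remaining nine columns differs, so the function reports eight matches out of nine compared positions.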
Genetic drift, gene flow and natural selection affect the mutation of viruses and determine the genetic structure of species populations [50]. Genetic recombination is also an important evolutionary phenomenon in plant viruses [51]. Previous studies have shown that recombination and natural selection are important driving forces for the evolution and differentiation of BYDV [22, 52, 53]. Our maximum clade credibility trees show that BYDV initially diverged into two different evolutionary lineages (Fig. 1), and this divergence seems to be caused by the different aphids that spread the virus. BYDV is transmitted by aphids in a circulative and persistent manner, as geminiviruses are by aphids, whiteflies and other vectors. The geminivirus coat protein protects the genome in the vector's alimentary and circulatory systems [54]. Existing reports suggest that the CP and CP-RTP proteins confer highly specific aphid transmission properties [55]. Therefore, we deduce that during BYDV evolution the virus adapted closely to its vector insects to better expand its population, and thus evolved into two different strains: BYDV-PAV and -MAV. Biotic and abiotic factors associated with pathogens can individually and interactively affect the extent of genetic drift, gene flow, and selection, thereby influencing the generation and maintenance of spatial population structure [56]. Directional selection imposed by pathogen biology, physical environments, and human intervention during and after agricultural production can drive the rapid accumulation of adaptive genetic differentiation in plant pathogen populations [57].
The genetic variability of many viruses is related to the geographical origin of the viral isolates [27, 58, 59]. As expected, in our study the evolution of BYDV was found to be strongly related to geography (Table 1, Supplementary Fig. 4, Supplementary Fig. 5). Since different regions have different external environments, including varying altitudes and meteorological conditions, this relationship may be a result of evolutionary adaptation of the virus being driven by geographic location. This environmental factor may have led to the specific evolution of the BYDV-MAV strain into the BYDV-GAV strain in China and led it to remain there for a long time. Furthermore, the transmission of this virus by aphids is restricted by the geographical barriers that exist between different regions. In China, the dominant species of wheat aphid mainly include R. padi, Sc. graminum and Si. avenae (Ministry of Agriculture and Rural Affairs of China), which can effectively transmit BYDV-PAV and -GAV. However, no BYDV-GAV strain has yet been found in Sichuan Province. The dominant species of wheat aphid in Sichuan Province is R. padi (Sichuan Academy of Agricultural Sciences), which can effectively transmit BYDV-PAV but not BYDV-GAV. Another important point is that the major wheat-producing areas in Sichuan Province are located in the Chengdu Plain, which is surrounded by a natural barrier to aphid migration. Host-driven adaptation could affect the diversification of viral isolates [27, 60]. It has been reported that the diversity of the BYDV-PAV population may be related to geographic adaptation as well as to host-driven adaptation [52]. As expected, our results also suggest that genetic diversity in BYDV is associated with the host, but that different associations are expressed by different genes (Table 1, Supplementary Fig. 4B, Supplementary Fig. 5B). This result may be related to the function of proteins translated by different genes. Coat protein is one of the important components of virions, and it also determines the high specificity of aphid transmission [55]. Movement proteins contribute to disease symptoms and facilitate intra- and intercellular movement [61]. Although ORF4 (mp gene) overlaps in its entirety with ORF3 (cp gene), this difference in the function of the encoded proteins is likely to have led to a differentiated evolutionary outcome in the long-term interaction between the virus and the host plant.
A high mutation rate is one of the characteristics of RNA viruses [62], and the evolutionary rate of plant RNA viruses is generally on the order of 10⁻⁴ subs/site/year [23, 27, 63]. Previous reports indicate that the evolutionary rate of the cp of the Luteoviridae family is 4.3 × 10⁻⁴ subs/site/year, the evolutionary rate of the cp of barley yellow dwarf virus is 1.5 × 10⁻³ subs/site/year [64], and the average evolutionary rate of the BYDV-PAV genome is 3.158 × 10⁻⁴ subs/site/year [52]. The results obtained in our study by using the Bayesian method after molecular clock calibration with recombination-free sequences are quite different from the above results. However, it has been reported that different genes from the same virus may evolve at different rates [27, 46]. Our results show that the evolutionary rates of the cp and mp genes of BYDV are basically similar, at 8.327 × 10⁻⁴ subs/site/year (95% credibility interval, 4.700 × 10⁻⁴–1.228 × 10⁻³) and 8.671 × 10⁻⁴ (95% credibility interval, 6.143 × 10⁻⁴–1.130 × 10⁻³), respectively. This result is not surprising as ORF4 (mp gene) overlaps in its entirety with ORF3 (cp gene).
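The comparison of rates in this paragraph can be checked with simple arithmetic. The Python sketch below uses only the point estimates and 95% credibility intervals quoted above; the variable and function names are hypothetical, and the earlier estimates are treated as bare point values because their own intervals are not given here. It is an illustration of the comparison, not a re-analysis.

```python
# Rates in substitutions/site/year, copied from the text above.
CP_RATE, CP_INTERVAL = 8.327e-4, (4.700e-4, 1.228e-3)
MP_RATE, MP_INTERVAL = 8.671e-4, (6.143e-4, 1.130e-3)

# Earlier point estimates quoted in the text (their intervals are not given here).
PREVIOUS = {
    "Luteoviridae cp [64]": 4.3e-4,
    "BYDV cp [64]": 1.5e-3,
    "BYDV-PAV genome [52]": 3.158e-4,
}


def inside(value, interval):
    """True if value lies within the closed interval (low, high)."""
    low, high = interval
    return low <= value <= high


def intervals_overlap(a, b):
    """True if two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def expected_subs_per_site(rate, years):
    """Expected substitutions per site accumulated over `years` at a constant rate."""
    return rate * years


print("cp and mp intervals overlap:", intervals_overlap(CP_INTERVAL, MP_INTERVAL))
for label, rate in PREVIOUS.items():
    print(label, "inside cp interval:", inside(rate, CP_INTERVAL))
print("cp subs/site over 100 years:", expected_subs_per_site(CP_RATE, 100))
```

Under these numbers the cp and mp intervals overlap and each point estimate lies within the other gene's interval, whereas all three earlier point estimates fall outside the cp interval, which is the sense in which the new estimates differ from the earlier ones.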
One of the difficulties in estimating the dates of MRCAs is that plant RNA viruses include many recombinants, so a larger sequence sample is required to reliably estimate these dates [46]. Different datasets will lead to different MRCA results, and different genes will produce different results. A previous report estimated that the MRCA of BYDV-PAV existed between 268 and 4680 years ago [52], and another report showed that the earliest common ancestor of BYDV was in the range of 13–2009 years ago (with RdRp giving more recent dates and RTD producing earlier dates) [64]. Notably, Wu et al. (2011) did not test the temporal signal of the dataset or filter the best fit before analysing the BYDV MRCA data, which may be the cause of the discrepancy. In particular, misspecification of the tree prior could result in incorrect substitution rates and inaccurate MRCA data [38]. We performed molecular clock calibration on our dataset in a more rigorous way and molecularly dated different datasets with recombination-free cp and mp genes using Bayesian and maximum likelihood methods (Supplementary Table 3). Our results show that the MRCA of BYDV calculated using the cp and mp genes is 1444 CE (95% credibility interval: 1040–1766 CE) and 1742 CE (95% credibility interval: 1577–1883 CE), respectively.
Our results suggest that there is a high probability that BYDV originated in the United States: the root posterior probabilities for the USA are much higher than those for other regions, whether the six datasets of the CP or of the MP genes are used (Fig. 1, Supplementary Table 4). This is not surprising, as BYDV was first discovered and reported in the United States, and samples from infected hosts of BYDV were observed in the United States over 100 years ago [3, 7]. An increasing number of reports have shown that multiple human-associated factors can spread the virus across geographic barriers [14, 25, 28, 46, 65]. We reconstructed the migration pathways of BYDV on a global scale and identified multiple migration pathways of BYDV from the USA and China to other regions, indicating that the USA and China have been important hubs for the global spread of this pathogen (Fig. 2, Supplementary Table 5). BYDV has reportedly spread over long distances through maritime trade between Australia and the United States [65]. A recent report also supports our result of a migration pathway from Northwest China to Estonia [17]. On the basis of the inferred geographical origin of the virus and its global migration paths, we are more convinced that BYDV originated in the United States. After spreading from the United States to South America, Asia, Australia and Europe, it further spread from China to South Korea, Estonia and Germany. In this way, barley yellow dwarf virus spread around the world, and the population size expanded dramatically. Under different selective pressures from different management patterns, environments, resistant varieties, and farm chemicals, the evolutionary pattern of the barley yellow dwarf virus became specialized in relation to region and host.
With ongoing research on the virus, a better understanding of how to prevent it from causing crop diseases has been obtained [6]. We performed a molecular evolutionary analysis of BYDV isolates based on two genes, and our findings provide new insights into the evolutionary history of BYDV. We suggest that BYDV-PAV, BYDV-PAS, BYDV-MAV and BYDV-GAV should be classified as one species named BYDV. Whether BYDV-KerII, BYDV-KerIII and BYDV-SGV belong to BYDV as subspecies has not yet been determined: BYDV-KerII and BYDV-KerIII isolates lacked nonrecombinant complete cp and mp gene sequences, and our phylogenetic analysis suggests that BYDV-SGV may belong to a new species distinct from BYDV. The evolution of BYDV is related to geography and to its aphid transmission vectors, and it may also be related to adaptation to its hosts. Through phylogeographic analysis, we found that BYDV probably originated in the United States and spread to other regions, and that China was the main export region for BYDV. Surprisingly, the population size of this virus expanded dramatically across the globe less than 8 years into the 21st century, followed by a sharp decline less than 15 years later, which is largely related to in-depth scientific research and is consistent with our field investigations (unpublished). When a new disease causes large-scale losses in agricultural production, effective chemical pesticides, rational agricultural management measures and resistant varieties are usually used to control the impact of the pathogen. Although we performed a more comprehensive analysis of the population history and evolutionary characteristics of BYDV using numerous isolates of the cp and mp genes, variant information on ORF1 and ORF2 may be needed for further evaluation of BYDV evolutionary characteristics and population history.