Recent Spread of a Y-Chromosomal Lineage in Northern China and Mongolia
Yali Xue, Tatiana Zerjal, Weidong Bao, Suling Zhu, Si-Keun Lim, Qunfang Shu, Jiujin Xu, Ruofu Du, Songbin Fu, Pu Li, Huanming Yang, Chris Tyler-Smith
The American Journal of Human Genetics
December 2005, Vol.77(6):1112–1116
We have identified a Y-chromosomal lineage that is unusually frequent in northeastern China and Mongolia, in which a haplotype cluster defined by 15 Y short tandem repeats was carried by ∼3.3% of the males sampled from East Asia. The most recent common ancestor of this lineage lived 590 ± 340 years ago (mean ± SD), and it was detected in Mongolians and six Chinese minority populations. We suggest that the lineage was spread by Qing Dynasty (1644–1912) nobility, who were a privileged elite sharing patrilineal descent from Giocangga (died 1582), the grandfather of Manchu leader Nurhaci, and whose documented members formed ∼0.4% of the minority population by the end of the dynasty.
We then wished to know whether the haplotype sharing between populations and the rapid expansion were characteristics of the populations as a whole, resulting from the general demography of the area, or were specific to the Manchu cluster. The proportion of 15-STR haplotypes (excluding the Manchu cluster) shared between populations and the ρ distance (the number of steps between a 15-element haplotype in one population and the closest haplotype in a second population, averaged over all haplotypes [Helgason et al. 2000]) were therefore calculated for Mongolian and northeastern Chinese populations (table 1). These measures were not significantly different between populations that carried the Manchu cluster and those that lacked it, showing that the Manchu cluster is exceptional, in that its sharing between populations is a lineage-specific phenomenon rather than a general feature of all lineages in these populations.
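The two between-population measures described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes that a "step" is a single-repeat difference summed over loci, that the average runs over every individual haplotype in the first population, and that sharing is counted over distinct haplotypes. The function names and the toy 3-locus haplotypes are hypothetical.

```python
from statistics import mean

def step_distance(h1, h2):
    """Total number of single-repeat steps separating two STR haplotypes."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

def rho_distance(pop_a, pop_b):
    """Average, over haplotypes in pop_a, of the step distance to the
    closest haplotype in pop_b (one direction only)."""
    return mean(min(step_distance(h, g) for g in pop_b) for h in pop_a)

def shared_fraction(pop_a, pop_b):
    """Proportion of distinct haplotypes in pop_a that also occur in pop_b."""
    distinct_a = set(pop_a)
    return len(distinct_a & set(pop_b)) / len(distinct_a)

# Toy 3-locus haplotypes (hypothetical repeat counts, not data from the study).
pop_a = [(14, 10, 23), (14, 11, 23), (15, 10, 22)]
pop_b = [(14, 10, 23), (16, 12, 22)]

print(rho_distance(pop_a, pop_b))     # closest-haplotype distances are 0, 1, and 2
print(shared_fraction(pop_a, pop_b))  # one of three distinct haplotypes is shared
```

In the analysis described in the text, Manchu cluster haplotypes would be removed from each population before these measures are computed.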
On a broad scale, patterns of human genetic variation reflect global events, such as the expansion out of Africa ∼50,000–70,000 years ago and Neolithic transitions within the last ∼10,000 years, but sometimes they show substantial departures from these general trends because of local factors, such as drift in small populations or selection (Jobling et al. 2004). The Y chromosome is particularly sensitive to local events because of the reduced number of Y chromosomes in the population, compared with autosomes or X chromosomes, and because of the high variance in male reproductive success. Studies of Y-chromosomal variation are, therefore, well suited to detecting such effects and, indeed, reveal the effects of drift, both in populations known to have experienced it (e.g., Hedman et al. 2004) and in others (e.g., Zerjal et al. 2002). Y-chromosomal studies have also revealed evidence of social selection, with one lineage now making up an estimated 0.5% of the world’s Y chromosomes, derived from a common ancestor ∼1,000 years ago and with a spread ascribed to the influence of Genghis Khan (Zerjal et al. 2003; Katoh et al. 2005). The importance of such social selection, however, remains to be established, and we have therefore examined Y-chromosomal data from East Asia for signs of further unusual lineages.
Previously, 1,003 males representing 28 populations from China, Mongolia, Korea, and Japan were typed with 16 Y STRs (Zerjal et al. 2003). The frequency distribution of 15-element haplotypes, constructed without DYS19 since duplications of this locus are common in this data set, shows that most haplotypes (621 [84%] of 736) are present in a single individual and that haplotypes shared by two or more individuals become progressively rarer, so that no haplotype is shared by nine individuals (fig. 1). Five haplotypes, however, fell outside this distribution pattern and were unexpectedly frequent in East Asian populations, having been found in 11, 13, 15, 23, and 27 individuals. The first three were found predominantly in a single population or two nearby populations and, thus, could have been spread by local drift in populations of small size. The other two, however, were more widespread. The haplotype present in 23 individuals was found in eight populations (and in more that are outside the area considered here) and is the “star cluster” described elsewhere (Zerjal et al. 2003). The most frequent haplotype in this sample, designated the “Manchu haplotype” for reasons described below, was present in 27 individuals belonging to seven populations. These two high-frequency haplotypes differ by only four steps, which raises the question of whether their prevalence might have a common cause. We show below, however, that they belong to different haplogroups and, thus, that their recent expansions must be independent. Here, we examine the distribution and likely origin of the second unusual haplotype.
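The frequency distribution described above is a tally of how many distinct haplotypes are carried by exactly one individual, by two, and so on. A minimal Python sketch, assuming each haplotype is represented by a hashable label, is given below; the function names and the toy sample are hypothetical, and only the counts 621 and 736 come from the text.

```python
from collections import Counter

def frequency_spectrum(haplotypes):
    """Map 'number of carriers' to 'number of distinct haplotypes
    with that many carriers'."""
    carriers = Counter(haplotypes)
    return Counter(carriers.values())

def frequent_haplotypes(haplotypes, threshold):
    """Haplotypes carried by at least `threshold` individuals."""
    carriers = Counter(haplotypes)
    return {h: n for h, n in carriers.items() if n >= threshold}

# Toy sample (hypothetical labels): one haplotype with 5 carriers,
# one with 2 carriers, and four singletons.
sample = ["A"] * 5 + ["B"] * 2 + ["C", "D", "E", "F"]

spectrum = frequency_spectrum(sample)
print(sorted(spectrum.items()))        # [(1, 4), (2, 1), (5, 1)]
print(frequent_haplotypes(sample, 5))  # {'A': 5}

# Singleton share reported in the text: 621 of 736 distinct haplotypes.
print(round(100 * 621 / 736))          # 84
```

Applied to the real data set, the five haplotypes found in 11 or more individuals would be the ones returned by a threshold of 11.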
The 1,003 males were tested with 45 binary markers representing the known Y-chromosomal SNP variation in this part of the world (Zerjal et al. 2003; results not shown), and all Manchu haplotype chromosomes were found to carry the derived allele of M48 and, thus, fall into haplogroup C3c. A median-joining network (Bandelt et al. 1999) of this haplogroup showed that closely related chromosomes that might share a common origin were also present, forming a “Manchu cluster” (fig. 2a; data in a tab-delimited ASCII file [online only] that can be imported into a spreadsheet). However, it was not obvious where the boundaries of this cluster lay: chromosomes zero, one, two, three, four, and five steps away from the center were present 27, 7, 3, 5, 3, and 3 times, respectively, showing a decrease in frequency but not a discontinuity. We wished to define the cluster so that we could then map its geographical distribution and estimate its time to the most recent common ancestor (TMRCA), so we needed a definition independent of geography and assumed diversity. We therefore typed the C3c chromosomes with 46 new simple Y STRs (Kayser et al. 2004). The resulting 61-STR network shows, as was expected because of the larger number of markers, considerable resolution within the Manchu cluster and population-specific substructure (fig. 2b). We defined the Manchu cluster as the set of chromosomes linked to the modal haplotype (arrow in fig. 2b), allowing a run of up to three empty nodes. According to this definition, the cluster contains 33 chromosomes: 27/27 at the center of the 15-element network, 5/7 one step away, 1/3 two steps away, and none of the more distant chromosomes (fig. 2a). With this definition, we could examine the distribution and time depth of the cluster.
Manchu cluster chromosomes were present in seven populations: Xibe, Outer Mongolians, Inner Mongolians, Ewenki, Oroqen, Manchu, and Hezhe (fig. 3). With the exception of the Xibe, these populations are all located in northeastern China or Mongolia, and the Xibe migrated to their present location in western China from northeastern China in 1764 (Du and Yip 1993). These findings, therefore, suggest that the cluster originated and spread locally in northeastern China/Mongolia and that this happened before the time of the Xibe migration. A more precise estimate of TMRCA (mean ± SD) of the Manchu cluster, from the program Network (Bandelt et al. 1999), was 590 ± 340 years or 220 ± 130 years, depending on whether the inferred (Zhivotovsky et al. 2004) or observed (Kayser et al. 2000; Dupuy et al. 2004) mutation rates were used. The former rate is calibrated for time periods of ∼1,000 years, and the latter is a direct measurement over single-generation times, so the most likely TMRCA is intermediate between these values and will be referred to as ∼500 years.
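To make the dependence of the TMRCA on the mutation rate concrete, the sketch below uses a ρ-type estimator: the average number of mutational steps from the founder haplotype, divided by the total mutation rate across loci. This is an assumption about the general approach, not a description of the exact calculation performed by Network, and it is not expected to reproduce the reported values. The step counts are the 15-STR cluster composition given in the text; the two rates and the 25-year generation time are illustrative placeholders, not figures taken from this paper.

```python
def rho_statistic(step_counts):
    """Average number of steps from the founder haplotype.
    step_counts maps 'steps from founder' to 'number of chromosomes'."""
    total = sum(step_counts.values())
    return sum(steps * n for steps, n in step_counts.items()) / total

def tmrca_years(rho, n_loci, rate_per_locus_per_generation, years_per_generation):
    """Convert rho into years, assuming all loci mutate at the same rate."""
    generations = rho / (n_loci * rate_per_locus_per_generation)
    return generations * years_per_generation

# Cluster composition in the 15-STR network, as stated in the text:
# 27 chromosomes at the center, 5 one step away, 1 two steps away.
cluster = {0: 27, 1: 5, 2: 1}
rho = rho_statistic(cluster)

# Placeholder rates (assumptions for illustration only).
SLOW_RATE = 6.9e-4   # per locus per generation, "inferred"-style rate
FAST_RATE = 2.8e-3   # per locus per generation, "observed"-style rate
YEARS_PER_GENERATION = 25

slow_estimate = tmrca_years(rho, 15, SLOW_RATE, YEARS_PER_GENERATION)
fast_estimate = tmrca_years(rho, 15, FAST_RATE, YEARS_PER_GENERATION)
print(f"rho = {rho:.3f}")
print(f"slower rate: {slow_estimate:.0f} years; faster rate: {fast_estimate:.0f} years")
```

The slower rate always yields the older date, which is why the text reports two estimates and takes an intermediate value.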
We reasoned that the events leading to the spread of this lineage might have been recorded in the historical record, as well as in the genetic record. The spread must have occurred after the cluster’s TMRCA (∼500 years ago, corresponding to about a.d. 1500) and, most likely, before the Xibe migration in 1764. Notable features are the occurrence of the lineage in seven different populations but its apparent absence from the most populous Chinese ethnic group, the Han. A major historical event took place in this part of the world during this period—namely, the Manchu conquest of China and the establishment of the Qing dynasty, which ruled China from 1644 to 1912. This dynasty was founded by Nurhaci (1559–1626) and was dominated by the Qing imperial nobility, a hereditary class consisting of male-line descendants of Nurhaci’s paternal grandfather, Giocangga (died 1582), with >80,000 official members by the end of the dynasty (Elliott 2001). The nobility were highly privileged; for example, a ninth-rank noble annually received ∼11 kg of silver and 22,000 liters of rice and maintained many concubines. A central part of the Qing social system was the army, the Eight Banners, which was made up of separate Manchu, Mongolian, and Chinese (Han) Eight Banners. The nobility occupied high ranks in the Manchu Eight Banners but not in the Mongolian or Chinese Eight Banners; the Manchu Eight Banners were recruited from the Manchu, Mongolian, Daur, Oroqen, Ewenki, Xibe, and a few other populations. A social mechanism was thus established that would have led to the increase of the specific Y lineage carried by Giocangga and Nurhaci and to its spread into a limited number of populations. We suggest that this lineage was the Manchu lineage.
Our hypothesis could be tested by examining the descendants of the Qing nobility (Li 1997). Unfortunately, extensive warfare during the 20th century and the Cultural Revolution (1966–1976) led to enormous social upheaval, during which descent from the nobility was usually hidden and relevant documents were destroyed. As a result, very few well-attested descendants are known, and they were not available for testing. Thus, our hypothetical explanation remains unproven. Nevertheless, it has strong circumstantial support. The 80,000 nobility at the end of the Qing Dynasty would have represented ∼0.4% of the non-Han population of China at that time and, at this frequency, should contribute ∼5 chromosomes to our sample. The number of males carrying this chromosome would have greatly exceeded the officially recognized number of nobility, so the number 5 should be multiplied several times, and the nobility’s chromosomes should be distributed among the Manchu Eight Banner populations. The lineage should, therefore, be detectable in our data set as a widespread, high-frequency northern haplotype. The Manchu cluster is the only lineage that meets these requirements.
It is notable that the two Chinese dynasties that were established by non-Han populations, the Yuan (1279–1368, by Mongolians) and the Qing, both apparently left detectable genetic imprints on the modern Chinese Y-chromosomal pool: the star cluster and the Manchu cluster. No other Y lineages have spread on this scale in this part of the world (fig. 1), but it is likely that major expansions elsewhere, and more local events here, remain to be elucidated.