Ancient human genome sequence of an extinct Palaeo-Eskimo
Nature 463, 757-762 (11 February 2010) | doi:10.1038/nature08835; Received 30 November 2009; Accepted 18 January 2010
We report here the genome sequence of an ancient human. Obtained from ~4,000-year-old permafrost-preserved hair, the genome represents a male individual from the first known culture to settle in Greenland. Sequenced to an average depth of 20×, we recover 79% of the diploid genome, an amount close to the practical limit of current sequencing technologies. We identify 353,151 high-confidence single-nucleotide polymorphisms (SNPs), of which 6.8% have not been reported previously. We estimate raw read contamination to be no higher than 0.8%. We use functional SNP assessment to assign possible phenotypic characteristics of the individual that belonged to a culture whose location has yielded only trace human remains. We compare the high-confidence SNPs to those of contemporary populations to find the populations most closely related to the individual. This provides evidence for a migration from Siberia into the New World some 5,500 years ago, independent of that giving rise to the modern Native Americans and Inuit.
In 2008 we used permafrost-preserved hair from one of the earliest individuals that settled in the New World Arctic (northern Alaska, Canada and Greenland), belonging to the Saqqaq Culture (a component of the Arctic Small Tool tradition; approximately 4,750–2,500 ¹⁴C years before present (yr bp); refs 14, 15), to generate the first complete ancient human mitochondrial DNA (mtDNA) genome (ref. 16). A total of 80% of the recovered DNA was human, with no evidence of modern human contaminant DNA. Thus, the specimen is an excellent candidate upon which to sequence the first ancient human nuclear genome. Although cultural artefacts from the Arctic Small Tool tradition are found in many places in the New World Arctic, few human remains have been recovered. Thus, the sequencing project described here is a direct test of the extent to which ancient genomics can contribute knowledge about now-extinct cultures whose phenotypic traits, genetic origin and biological relationship to present-day populations are little known.
Population genetics context of the Saqqaq individual
The origin of the Saqqaq and other Palaeo-Eskimo cultures, and their relationship to present-day populations, have been debated since they were first discovered in the 1950s (ref. 34). Competing theories have attributed their origins to offshoots of the populations that gave rise to Native American populations such as the Na-Dene of North America, to the same source as the Inuit currently inhabiting the New World Arctic, or to still other sources entering the New World even later than both the Native American and Inuit ancestors (for summary see ref. 35)…
Figure 3: Population genetics and phylogenetics.
a, Locations of the studied populations are shown with the most relevant populations indicated by name (numbers in circles correspond to the nr column in Supplementary Table 12). b, PCA plot (PC1 versus PC2) of the studied populations and the Saqqaq genome. c, Ancestry proportions of the studied 492 individuals from 35 extant American and Eurasian populations and the Saqqaq individual as revealed by the ADMIXTURE program (ref. 39) with K = 5. Each individual is represented by a stacked column of the five proportions, with fractions indicated on the y axis. The analysis assumes no grouping information. The samples are sorted by region/population only after the analysis. For better readability the Saqqaq individual is shown in three columns. Populations added to the published collection (ref. 36) are shown in bold. Red dots in the expanded plot indicate four individuals whose ancestry proportion pattern showed the highest correlation (Kendall τ > 0.95; P < 0.05) with that of the Saqqaq individual. d, The phylogenetic tree of Y chromosome haplogroup Q. The position of the Saqqaq individual is ascertained by markers shown on the tree. Information for markers shown in parentheses is missing and their status is therefore inferred. Haplogroup names are according to ref. 38; hash symbol indicates error in reference (Supplementary Information).
Principal component analysis (PCA) was used to capture genetic variation. PC1 distinguishes west Eurasians from east Asians and Native Americans, whereas PC2 captures differentiation between native Asians and Americans (Fig. 3b). Importantly, the PC1 versus PC2 plot shows that the Saqqaq individual falls in the vicinity of three Old World Arctic populations (Nganasans, Koryaks and Chukchis), while being more distantly related to the New World groups (Amerinds, Na-Dene and Greenland Inuit). Koryaks and Chukchis inhabit Chukotka and northern Kamchatka in the Siberian far east. Ethnography describes these groups as having a diverse subsistence economy based on terrestrial and marine hunting as well as reindeer herding. The Nganasans inhabit the Taimyr Peninsula, some 2,000 km from the Bering Strait, and are the northernmost living Old World population. Although historically Nganasans have been terrestrial rather than marine hunters, Zhokhov, the oldest archaeological Arctic hunting site with a significant marine component (polar bear) on the New Siberian Islands (dating back some 7,000–8,000 yr bp; ref. 37), is found just east of the Nganasans’ current occupation area. In addition, our analysis of more than two hundred Y chromosome SNPs (Supplementary Information) allowed us to assign the Saqqaq individual to Y chromosome haplogroup Q1a (Fig. 3d), commonly found among Siberian and Native American populations (ref. 38). The mtDNA genome shows close relatedness to Aleuts of the Commander Islands (situated in the Bering Sea) and Siberian Sireniki Yuits (Asian Eskimos), as previously described (ref. 16).
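As a rough illustration of the PCA step described above, the following Python sketch computes first-principal-component scores for a toy genotype matrix. Everything in it is an assumption made for illustration: the 0/1/2 genotype coding, the six made-up individuals, the names `center_columns` and `first_pc_scores`, and the use of power iteration rather than whatever software was used for the published analysis. It shows only how individuals from two differentiated groups end up on opposite sides of PC1.

```python
import math

# Hypothetical toy data: rows are individuals, columns are SNPs,
# values are copies (0, 1 or 2) of an arbitrarily chosen allele.
GENOTYPES = [
    [2, 2, 2, 0, 0, 0],  # group A
    [2, 2, 1, 0, 0, 1],  # group A
    [2, 1, 2, 0, 1, 0],  # group A
    [0, 0, 0, 2, 2, 2],  # group B
    [0, 0, 1, 2, 2, 1],  # group B
    [0, 1, 0, 2, 1, 2],  # group B
]


def center_columns(matrix):
    """Subtract each column (SNP) mean so that every SNP has mean zero."""
    n = len(matrix)
    means = [sum(column) / n for column in zip(*matrix)]
    return [[value - mean for value, mean in zip(row, means)] for row in matrix]


def first_pc_scores(genotypes, iterations=200):
    """Return unit-length scores of each individual on the first principal component.

    Power iteration is applied to the individual-by-individual Gram matrix
    of the column-centred genotypes; its leading eigenvector gives the scores.
    """
    x = center_columns(genotypes)
    n = len(x)
    gram = [[sum(a * b for a, b in zip(x[i], x[j])) for j in range(n)] for i in range(n)]
    vec = [float(i + 1) for i in range(n)]  # arbitrary non-constant starting vector
    for _ in range(iterations):
        new = [sum(gram[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in new))
        vec = [v / norm for v in new]
    return vec


scores = first_pc_scores(GENOTYPES)
for label, score in zip("AAABBB", scores):
    print(label, round(score, 3))
```

The sign of a principal component is arbitrary, so only the relative positions matter: the three group A individuals share one sign and the three group B individuals the other. Real analyses involve far more SNPs and commonly also scale each SNP by a function of its allele frequency, which this sketch omits.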
We explored the data using the ADMIXTURE algorithm (ref. 39), which assumes a specified number of hypothetical populations (K) and provides a maximum likelihood estimate of the allele frequencies for each population and of the admixture proportions for each individual. We investigated values of K from K = 2 to K = 10, repeating the computation 100 times for each value of K to monitor convergence (Supplementary Information). Figure 3c shows the pattern of distinct colour-coded components at K = 5. The analysis suggests that there is a significant amount of west Eurasian admixture in most of the Siberian, Greenland and North American populations. As with the other analyses, this analysis was unable to detect any west Eurasian admixture in the Saqqaq individual, in agreement with a very low level of contamination in our assembled genome. The Saqqaq individual is also practically devoid of the component distinctive to South and Central American populations (dark brown in Fig. 3c).
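To make the quantity being maximised more concrete, the sketch below evaluates a binomial admixture log-likelihood for given ancestry proportions and population allele frequencies. The text here gives no formula, so this form (each genotype treated as two draws from an individual-specific allele frequency that is a mixture over K populations) is an assumption about the standard model, and the function name, argument names and toy numbers are all hypothetical. It evaluates the likelihood only; it does not perform the optimisation.

```python
import math


def admixture_log_likelihood(genotypes, q, f):
    """Binomial admixture log-likelihood, up to an additive constant.

    genotypes[i][j]: copies (0, 1 or 2) of the counted allele at SNP j in individual i.
    q[i][k]: ancestry proportion of individual i from population k (each row sums to 1).
    f[k][j]: frequency of the counted allele at SNP j in population k (strictly
             between 0 and 1, so that the logarithms are defined).
    """
    total = 0.0
    for g_row, q_row in zip(genotypes, q):
        for j, g in enumerate(g_row):
            # Individual-specific allele frequency: mixture over the K populations.
            p = sum(q_ik * f_k[j] for q_ik, f_k in zip(q_row, f))
            total += g * math.log(p) + (2 - g) * math.log(1.0 - p)
    return total


# Hypothetical toy example: one individual, four SNPs, K = 2 populations.
GENOTYPES = [[2, 2, 0, 0]]
F = [
    [0.9, 0.9, 0.1, 0.1],  # population 1
    [0.1, 0.1, 0.9, 0.9],  # population 2
]

mostly_pop1 = admixture_log_likelihood(GENOTYPES, [[0.9, 0.1]], F)
mostly_pop2 = admixture_log_likelihood(GENOTYPES, [[0.1, 0.9]], F)
print(mostly_pop1 > mostly_pop2)
```

In this toy case the genotypes match population 1, so proportions weighted towards population 1 give the higher log-likelihood. An absent component, such as the west Eurasian one in the Saqqaq individual, corresponds to an estimated proportion near zero.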
Thus, at K = 5, the Saqqaq genome is composed of three ancestry components, specifically those characteristic of native populations in East Asia, in Siberia in particular, and in the Arctic on both sides of the Bering Strait (Fig. 3c). In this respect the populations closest to the Saqqaq are Koryaks and Chukchis. Importantly, in contrast to Saqqaq and Koryaks, modern Greenlanders carry clear evidence of admixture or shared ancestry with Amerindians. Moreover, at K = 5, the Inuit do not display genetic components of Siberians other than the ‘Beringian’ component seen in Chukchis and Koryaks. The admixture results are in agreement with the PCA plots and suggest shared common ancestry of Saqqaq and modern Inuit before the movement of the former to the New World.
We additionally used a population genetic model to obtain maximum likelihood estimates of the divergence times between the Saqqaq individual and the reference populations (Supplementary Information). The population with the shortest divergence time was Chukchis, with an estimated divergence time of approximately 0.043 (±0.008) Ne generations, where Ne is the effective population size. In contrast, the estimated divergence times to the other closely related populations (Na-Dene, Koryaks and Nganasans) were 0.093, 0.11 and 0.089, respectively. The estimated divergence time to the Han Chinese, a more distantly related population, was 0.20. These estimates can be converted to estimates of years or generations by making assumptions regarding the effective population sizes of the reference populations. The effective population sizes are in general unknown, but can be estimated from DNA sequence data, and are generally much smaller than the census sizes (Supplementary Information). We found no evidence in favour of changes in population size. Even when accounting for the uncertainty in the estimate of the mtDNA mutation rate, and possible biases related to the genotyping data, it is still unlikely that Ne > 5,000, providing a maximal divergence time between Chukchis and Saqqaq of 175–255 generations (0.043 ± 0.008 multiplied by Ne = 5,000), or between 4,400 and 6,400 years. The oldest archaeological evidence of the Arctic Small Tool tradition in the New World is from Kuzitrin Lake, Alaska, dating back ~5,500 cal. yr bp (ref. 14), indicating that the ancestral Saqqaq separated from their Old World relatives almost immediately before their migration into the New World.
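The conversion from scaled divergence time to generations and years can be checked with a few lines of Python. This is a minimal sketch under stated assumptions: the quoted range of 175–255 generations is reproduced when the uncertainty on the 0.043 estimate is taken as 0.008 and Ne is set to its upper bound of 5,000, and the 25-year generation time is not stated in the text but is inferred here from the quoted range of 4,400–6,400 years. The function name `divergence_time` is hypothetical.

```python
def divergence_time(t_scaled, effective_size, years_per_generation=25.0):
    """Convert a divergence time in units of Ne generations to (generations, years).

    years_per_generation = 25 is an assumption inferred from the quoted year range.
    """
    generations = t_scaled * effective_size
    return generations, generations * years_per_generation


ESTIMATE = 0.043     # Chukchi-Saqqaq divergence in units of Ne generations (from the text)
UNCERTAINTY = 0.008  # assumption: the value that reproduces the quoted 175-255 range
NE_UPPER = 5000      # upper bound on the effective population size (from the text)

for t in (ESTIMATE - UNCERTAINTY, ESTIMATE + UNCERTAINTY):
    generations, years = divergence_time(t, NE_UPPER)
    print(f"{t:.3f} Ne generations -> {generations:.0f} generations, {years:.0f} years")
```

The two bounds come out at 175 and 255 generations, or 4,375 and 6,375 years, which round to the quoted 4,400 and 6,400 years. Because 5,000 is an upper bound on Ne, these are maximal divergence times; a smaller Ne scales them down proportionally.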
We report the successful genome sequencing of a ~4,000-year-old human. Data authenticity is supported by four observations: (1) the private SNP analyses indicate contamination levels in the raw sequence data to be ≤0.8%; (2) the mtDNA and Y-chromosome DNA haplotypes fit within haplogroups typical of north-east Asia; (3) population admixture analyses do not record any European component in the Saqqaq genome; and (4) the PCA plots clearly reveal close affiliation of the Saqqaq genome to those of contemporary north-east Siberian populations. These observations, coupled with evidence of excellent DNA preservation, and with sample handling being restricted to northern Europeans before incorporation of sequence indexing, indicate that contamination in the Saqqaq genome is not of concern. Our study thus demonstrates that it is possible to sequence the genome of an ancient human to a level that allows SNP and population analyses to take place. It also reveals that such genomic data can be used to identify important phenotypic traits of an individual from an extinct culture that left only minor morphological information behind. Additionally, the ancient genomic data prove important in addressing past demographic history by unambiguously showing a close relationship between Saqqaq and Old World Arctic populations (Nganasans, Koryaks and Chukchis). A single individual may, or may not, be representative of the extinct culture that inhabited Greenland some 4,000 yr bp. Nevertheless, we may conclude that he, and perhaps the group that once crossed the Bering Strait, did so independently of the ancestors of present-day Native Americans and Inuit, and that he shares ancestry with Arctic north-east Asians, genetic structure components of which can be identified in many of the present-day people on both sides of the Bering Sea.