ENTER THE GENOME: Genome ABCs
February 10, 2001
Last June, world leaders and scientists celebrated the sequencing of the human genome, the book of operating instructions for who we are.
Since then, researchers from around the world have actually begun to read this mysterious text. Their findings, published this week in a special edition of the journal Nature and also in Science, reveal some major surprises about the complexity of our genome, the number of genes in it, and the role of junk DNA.
Deciphering the text
It was heady stuff when scientists announced last year that they had sequenced the three billion letters that make up the human genome. But even though they generally knew what was in the so-called "book of life," they hadn't yet deciphered it. Now, by scouring the entire genome for patterns, they're finding that it's not exactly what they expected.
"The genome is a text, it's basically just a long novel, written in four letters instead of 26 English letters," explains Eric Lander, director of the Whitehead Center for Genome Research. "What we've been doing over the past six months is, many different experts have been reading the text with their own particular interest in mind and we've been collecting all these glossaries, all these commentaries on this common text, and it's really a remarkable and exciting picture."
All living things, from bacteria to plants to animals, share the same four-letter alphabet, but it was always thought that humans must use it to make many more genes. It turns out, however, that instead of having from 80,000 to 150,000 genes, as some experts predicted, there are only about 31,000 genes in the human genome. "You could say we thought we were War and Peace, but we turned out to be a much less complicated book after all," says Francis Collins, director of the National Human Genome Research Institute.
This is particularly surprising when you look at other genomes. Worms have 18,000 genes, flies have 13,000 and even weeds have 26,000 genes. "Life seems to have used the same complement of parts to make different things," says Carina Dennis, senior editor (genetics) of Nature. "And you see that right down to the protein level. Humans are using the same building blocks as plants. They are just rearranging the parts or increasing the number of parts."
Yet with roughly the same number of genes as other animals and plants, we are able to perform vastly different functions. "The complexity that we have doesn't come down to our gene number, it comes down to the intricacies and complexities of the kinds of biological processes going on in our cells," says Dennis.
Compared with the genes of simpler organisms, ours are better at multi-tasking: a single gene can be spliced together in different ways to do different things. Dissecting the genome over the last few months has shown researchers just how versatile our genes are. "It looks as if on average, a human gene can code for about three different proteins," says Collins. "A worm doesn't have nearly that capacity for using one gene for several purposes."
The ultimate map
The relatively small number of human genes is also surprising because they're spread over such a large genome. "One of the surprising things about the human genome is that it's hard to find the genes," says Lander. "The reason is the genes actually make up a remarkably small fraction of our genome. In bacteria, half or more of the genome is composed of genes. In humans, it's one or two percent that's made up of genes. The actual information that codes for all the proteins in the body is scattered about amongst vast tracts of other stuff. The information is in little snippets of a few hundred letters spaced apart by tens of thousands of bases of other stuff. It's in effect a problem of signal to noise."
As they map the genome, researchers have been tracking markers on select pieces of the sequence to help them navigate along the way. "The map provides a series of landmarks across the genome where you can literally walk through it and visit a marker that's very well spaced from the next marker," says Dennis. "It's the difference between trying to find something in North America with a detailed map down to the level of a few feet and trying to find something in North America landing on the coast and having to search up and down the continent," says Lander.
Although researchers once thought the wasteland of bases between genes, known as junk DNA, served no purpose, they now think differently. At least some of it may have a function, particularly some pieces of junk located near genes. "It turns out over the course of hundreds of millions of years these sequences have selectively appeared in the places where the genes are most densely packed and that only makes sense if they are serving some useful function," says Collins. "That is a radical concept that most people could not imagine being correct, but it seems inescapable that's what it's telling us."
"It's like saying that suddenly we realize the junk is maybe the engine of the car," explains Dennis. "Before we thought it was just the doors and the seats that are important but then someone lifted up the hood and saw that actually the engine is pretty important as well. And maybe that's what the junk DNA is equivalent to."
Different ways to reach the same conclusion
Like the draft sequence of the genome announced in June, the research published in the two journals represents both a public and a private effort to map the genome.
The consortium of 20 institutions participating in the International Human Genome Project, of which the Whitehead Institute is a member, used a method known as hierarchical shotgun sequencing to map the genome. With this method, several copies of the genome are broken down into pieces that are sequenced and then put back together.
"One way to think about the strategy is if you had a copy of the New York Times and you were asked to shred it into little pieces and assemble it again," explains Lauren Linton, associate director of the Center for Genome Research. "The best way to do that, first of all, would be to start with 10 copies that you would shred up separately, the idea being that you're going to need overlapping fragments so that you could piece it back together again." But putting together phrases without knowing their context is extremely difficult. If you shred the sections separately, however, the problem becomes more manageable. If you take it further still and tear up one page at a time or even one fragment of a page, it becomes even easier, Linton says.
The other approach, led by Celera Genomics, used the whole genome shotgun approach, which Linton says is more akin to shredding up the entire New York Times in one fell swoop and putting it back together again.
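For readers who like to see the shredding analogy in action, here is a minimal Python sketch. It is a toy, not the software either group used: the 24-letter "genome", the piece length, and the function names `shred`, `overlap` and `assemble` are all made up for illustration. Two copies of the text are cut at different points, so the pieces overlap, and the pieces are then glued back together by repeatedly merging the pair with the longest matching overlap. Because it shreds the whole text at once, it is closer in spirit to the whole genome shotgun approach than to the section-by-section hierarchical one.

```python
def shred(text, piece_len, offset=0):
    """Cut one copy of the text into consecutive pieces; the first cut falls at `offset`."""
    cuts = sorted({0, len(text), *range(offset, len(text), piece_len)})
    return [text[a:b] for a, b in zip(cuts, cuts[1:])]


def overlap(a, b):
    """Length of the longest ending of `a` that is also a beginning of `b`."""
    for n in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def assemble(fragments):
    """Greedy assembly: keep merging the two pieces that overlap the most."""
    pieces = list(dict.fromkeys(fragments))  # drop exact duplicates
    # Drop pieces that sit entirely inside another piece.
    pieces = [p for p in pieces if not any(p != q and p in q for q in pieces)]
    while len(pieces) > 1:
        n, i, j = max(
            ((overlap(a, b), i, j)
             for i, a in enumerate(pieces)
             for j, b in enumerate(pieces) if i != j),
            key=lambda t: t[0],
        )
        if n == 0:  # nothing overlaps any more: leave the gaps as they are
            break
        merged = pieces[i] + pieces[j][n:]
        pieces = [p for k, p in enumerate(pieces) if k not in (i, j)] + [merged]
    return pieces


# A made-up 24-letter "genome" (illustrative only, not real sequence data).
genome = "ACGTTGCAAGCTAGGATCCATGAC"

# Shred two copies at different cut points, then scramble the order.
fragments = sorted(shred(genome, 8) + shred(genome, 8, offset=4))

contigs = assemble(fragments)
print(contigs)  # ['ACGTTGCAAGCTAGGATCCATGAC']
```

The toy works cleanly because no short run of letters repeats in the made-up text. Real genomes are full of repeats, which is one reason breaking the problem into smaller sections first, as Linton describes, can make reassembly more manageable.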
Despite these different approaches, the two groups came to very similar conclusions. "I saw it as two alternative scientific approaches being pursued, interestingly one in academia and one in the commercial sector," says Linton. "I actually as a scientist thought it was a rich opportunity to test two hypotheses in parallel."
The International Human Genome Project continually made its data publicly available over the Internet for anyone to use, including Celera.
Besides agreeing on the approximate number of human genes, both groups concluded that humans are remarkably similar to each other genetically, sharing 99.9 percent of the same genetic code. "Any two people on this planet are much more closely related, five times more closely related at the DNA level, than any two chimpanzees on this planet," says Lander. "We're actually a very little species and I think it's probably a good thing for us to realize just how close we are as cousins."
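To put the 99.9 percent figure in perspective, the short Python calculation below combines it with the three billion letters mentioned earlier in this article. It is back-of-the-envelope arithmetic on those two rounded numbers only, and the variable names are illustrative.

```python
GENOME_LETTERS = 3_000_000_000   # "three billion letters", as quoted in this article
SHARED_PER_THOUSAND = 999        # 99.9 percent shared between any two people

# Integer arithmetic keeps the result exact.
differing_letters = GENOME_LETTERS * (1000 - SHARED_PER_THOUSAND) // 1000
print(f"{differing_letters:,}")  # 3,000,000
```

In other words, two people differ at roughly three million positions out of three billion, or about one letter in every thousand.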
Even though the two groups have made great strides in understanding the genome, there is much work to be done. "Every time we picked up a rock and turned it over and looked under it there was more under it than we ever imagined," says Lander.
The sequencing paper published in Nature by Lander and his many colleagues ends with the sentence, "it has not escaped our notice that the more we learn about the human genome, the more there is to explore." This is a reference to a sentence that appeared in Watson and Crick's 1953 paper revealing the structure of DNA for the first time. "It was intended to be an invitation to everybody to come study," says Lander, "and therefore the key point is the more we learn, the more there is to explore."