Showdown in the DNA Corral

The completion of the human genome sequence, announced last week, promises to usher in a new age of biology. In these FAQs, we discuss how the genome promises drugs targeted to individual DNA; early warnings of diseases our genes put us at risk for; a deeper understanding of human evolution...

You mean the public consortium, financed by the U.S. government and Britain's Wellcome Trust, or Celera, which announced in 1998 that it would reach the finish line first? But both sides denied it was a race.

Well, the public project had sequenced less than 5 percent of the genome when Celera leaped in. Celera's J. Craig Venter said he'd finish in less than three years, which lit a fire under his rivals. But, most important, Venter also said he would do it through a faster, cheaper method called whole-genome shotgun sequencing. His claim last week to have done so "is completely false," says John McPherson of Washington University, a leader of the public project. It looks like Celera could not assemble the full genome sequence through shotgunning alone. Instead, it had to use the public consortium's "maps," which Celera had derided as wasteful, expensive and slow. Says one scientist in the public project, "The shotgun did not work." And since Celera used public-project data, to critics it seems as if the company crossed the finish line only because it attached a tow rope to the race car in front of it.

What a sense of humor you have. Venter says he "shouldn't even dignify [the criticism] with a response." But he does. The shotgun method "works spectacularly," he says, "increasing sequencing speeds by an order of magnitude." He concedes, though, that "we used all the information we could find"--meaning data from the public project--"to validate our construction and align the pieces [of genome sequence] on the right chromosomes."

Because, innocent friend, the public project's scientists, led by Francis Collins, feel they "had to put up with three years of this crap, being told that Celera was going to do it faster and better," as one said. Celera, on the other hand, feels "our critics will never be happy as long as anything we're doing is successful," Venter says.

It's not that complicated. Celera shreds the entire genome--all 3.2 billion chemical letters. Think of it as shredding a 23-volume encyclopedia. Sequencing machines determine the order of the chemicals (denoted A, T, C or G) in each fragment, but the machine can't sequence anything much longer than 500 letters. So you sequence one fragment of 500 or so, then another and another. The result is what Eric Lander of the Whitehead Institute calls "tossed genome salad: you don't know what order [each batch of 500 letters] goes in." It's like knowing the letters in the words in isolated sentences of your encyclopedia, but having no idea what order the pages go in. Venter thought his computers and algorithms would assemble the fragments correctly. But Celera had to resort to the public project's maps for that.
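For readers who like to see the salad being tossed, here is a toy sketch of the shredding step. It is not Celera's software; the function name `shotgun_fragments`, the made-up 5,000-letter "genome" and the fixed 500-letter read length are all assumptions chosen for illustration.

```python
import random

def shotgun_fragments(genome, n_fragments, max_len=500, seed=0):
    """Return n_fragments substrings cut from random positions in genome.

    Hypothetical helper: each fragment is max_len letters long (or the
    whole genome, if the genome is shorter), mimicking a sequencing
    machine that cannot read much past 500 letters.
    """
    rng = random.Random(seed)
    length = min(max_len, len(genome))
    fragments = []
    for _ in range(n_fragments):
        start = rng.randrange(len(genome) - length + 1)
        fragments.append(genome[start:start + length])
    return fragments

# A made-up 5,000-letter genome, for illustration only.
letters = random.Random(1)
genome = "".join(letters.choice("ATCG") for _ in range(5000))

reads = shotgun_fragments(genome, n_fragments=40)
# Each read is a known string of letters, but nothing in the list
# records where in the genome it came from.
```

The list `reads` is the "tossed genome salad": every fragment has been read correctly, yet the order is lost.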

It's sort of a way to keep track of which page of the encyclopedia your letters came from. Instead of having one pile of millions of shredded fragments, you have many piles, with fewer fragments. That makes assembling it all in the correct order easier. So, to oversimplify, if you see a fragment that ends with "manylettersATTGCTTTGG," and another that begins with "ATTGCTTTGGmoreletters," they probably overlap. The assembled stretch is "manylettersATTGCTTTGGmoreletters."
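The overlap trick in that example can be sketched in a few lines. This is an oversimplified illustration, not either project's assembly software; the names `overlap_length` and `merge`, and the five-letter minimum overlap, are assumptions.

```python
def overlap_length(left, right, min_overlap=5):
    """Length of the longest ending of `left` that equals a beginning of `right`.

    Returns 0 if the best overlap is shorter than min_overlap.
    """
    longest_possible = min(len(left), len(right))
    for k in range(longest_possible, min_overlap - 1, -1):
        if left.endswith(right[:k]):
            return k
    return 0

def merge(left, right, min_overlap=5):
    """Join two fragments on their overlap, or return None if they do not overlap."""
    k = overlap_length(left, right, min_overlap)
    if k == 0:
        return None
    return left + right[k:]

first = "manylettersATTGCTTTGG"
second = "ATTGCTTTGGmoreletters"
assembled = merge(first, second)
print(assembled)  # manylettersATTGCTTTGGmoreletters
```

The minimum overlap is there because a match of only a letter or two happens by chance all the time and proves nothing.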

So why doesn't that work with the zillions of fragments that the shotgun gives you?

Because the human genome is about 50 percent repetitive. If you have hundreds of those ATTGCTTTGGs, it's tough to figure what overlaps with what. "So many parts of the genome look exactly alike, there's a real risk of sticking wrong parts together," says McPherson. Celera may have the last laugh, though: in just three days, more than 1 million users accessed its genome assembly. Although some looks are free, other uses cost upwards of $15,000 a year.
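To see the repeat problem McPherson describes, consider a toy case in which the same ten letters turn up at the start of several fragments. The fragments and the helper name `candidate_successors` below are invented for illustration.

```python
REPEAT = "ATTGCTTTGG"

# One fragment ends in the repeat...
left = "CCAGT" + REPEAT

# ...and several fragments from different parts of the genome begin with it.
pile = [
    REPEAT + "AAACC",
    REPEAT + "GGTTA",
    REPEAT + "TCTCT",
    "GGGGGCCCCCAAAAA",  # unrelated fragment, no overlap
]

def candidate_successors(left, fragments, k):
    """Return every fragment whose first k letters match the last k letters of `left`."""
    return [f for f in fragments if f[:k] == left[-k:]]

matches = candidate_successors(left, pile, len(REPEAT))
print(len(matches))  # 3
```

Three fragments fit equally well, and the overlap alone cannot say which one really comes next. Sorting fragments into smaller piles first, as the maps do, shrinks the number of rival candidates.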