Sunday, June 21, 2009

Cancer Genome Sequencing--A (Pessimistic) Interim Analysis

The current issue of Cancer Research carries a very brief (3 pages, with one page mostly tables & figures) review of the first pulse of cancer genome sequencing papers (sub required to read article). While sub-titled 'An Interim Analysis', perhaps a better subtitle would be 'A Uniformly Negative Analysis'.

A full-press cancer genomics project has been a controversial drive, with many bemoaning the huge amount of resources devoted to it and believing other avenues would be better suited for enhancing our ability to help cancer patients. But it has gone forward, and a spate of papers over the last year has reported the early results.

The initial papers have covered 4 of the big cancers in terms of incidence and mortality (lung, breast, colorectal and pancreatic) as well as glioblastoma and leukemia. Different studies have taken different tacks. In leukemia, we have the first parallel complete sequencing of a patient and their tumor. Papers in breast, colorectal (together covered in two papers here and here), pancreatic and glioblastoma looked at huge numbers of coding exons in small numbers of patients (11 patients x 18.2K genes for breast and colorectal; 21 patients x 20.6K genes for glioblastoma; 24 patients x 20.6K genes for pancreatic). A lung paper and the other glioblastoma paper looked at ~600 genes, but in larger numbers of patients (188 in lung and 91 in glioblastoma).

Personally, I would take a more nuanced view of the results. I think it is hard to argue that these papers have had a shortage of fireworks; there have been some important observations made, which curiously the Cancer Research review ignores completely. In the lung study (which I have studied the closest) these include important exclusion and cooperativity relationships between mutations and a number of novel, druggable candidate driver genes (protein kinases) not previously suspected in lung cancer. In the many-genes, few-patients glioblastoma study, it was the identification of a mutational hotspot in isocitrate dehydrogenase 1 (mutations were later found, though less frequently, in isocitrate dehydrogenase 2 as well).

Of course, one thing which is changing rapidly is the cost of doing these studies. Most of these papers used conventional PCR amplification and Sanger sequencing, which I would lowball at $1/well (very lowball, but Sandra Porter caught some serious flak suggesting [as I have] a number much higher than this for the sequencing part, and I don't have the accounting experience to argue -- but I do know people who calculated it at Codon and this would be a very low estimate) -- so those studies looking at nearly every coding exon cost at least a quarter million dollars per patient (those 20+K genes explode out to about a quarter million exons). Clearly this isn't how things will tend to be done going forward; Illumina will now blow away genomes for $48K each and other companies are now quoting even lower. This is still well in excess of the per-patient estimate for the very focused studies, and I believe these (particularly the lung study) demonstrate the value of lots of patients, since this started to give the numbers required to look at interactions between mutations.
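For anyone who wants to poke at the arithmetic, here is a rough back-of-envelope sketch in Python using only the round numbers above. It rests on two assumptions of mine: one Sanger well per exon (surely an undercount, since long exons need several amplicons and reads usually go in both directions), and the same average number of exons per gene in the focused ~600-gene panels as across the whole gene set. The function and variable names are just illustrative.

```python
# Back-of-envelope Sanger cost per patient, using the post's round numbers.
COST_PER_WELL_USD = 1.0        # the "lowball" $1/well figure
EXONS_ALL_GENES = 250_000      # "about a quarter million exons"
GENES_ALL = 20_600             # gene count in the glioblastoma/pancreatic studies
GENES_FOCUSED = 600            # "~600 genes" in the focused studies
ILLUMINA_GENOME_USD = 48_000   # quoted price for a whole genome


def sanger_cost(exons, cost_per_well=COST_PER_WELL_USD, wells_per_exon=1):
    """Cost in dollars; ASSUMES one well per exon unless told otherwise."""
    return exons * wells_per_exon * cost_per_well


# ASSUMPTION: focused panels have the same average exons per gene (~12).
exons_per_gene = EXONS_ALL_GENES / GENES_ALL

all_exon_cost = sanger_cost(EXONS_ALL_GENES)
focused_cost = sanger_cost(GENES_FOCUSED * exons_per_gene)

print(f"Average exons per gene:        {exons_per_gene:.1f}")
print(f"All coding exons, per patient: ${all_exon_cost:,.0f}")
print(f"~600-gene panel, per patient:  ${focused_cost:,.0f}")
print(f"Illumina whole genome:         ${ILLUMINA_GENOME_USD:,.0f}")
```

Under those assumptions the focused panels come out at several thousand dollars per patient, well under the $48K genome, which in turn is well under the all-exon Sanger figure. Raising `wells_per_exon` or `cost_per_well` to something more realistic scales both Sanger numbers up proportionally.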

One of the reasons the Cancer Research authors aren't terribly pleased with the progress is clear: they feel the experiments aren't the correct ones. But whereas some of the flak I had seen directed at the cancer genome sequencing concept was instead promoting more functional approaches (such as RNAi library screening), what these authors want (or at least set as the minimum bar for interesting) is cancer genome screening on an almost monomaniacal scale: thousands if not millions of individual cells from the same tumor! Clearly this would be fascinating, as there is plenty of evidence that tumors are a motley collection of genetically variant cells (but clonal -- all the tumor cells have the same ancestor, but they also are all sloppy DNA copyists). And, as they note, no DNA sequencing technology here now or on the immediate horizon has any shot at a project of this scale.

While I do believe this would be interesting, I'm not as certain it would be informative for patient care. Since many of these mutations are under very little selection, the spectrum of observed mutations is likely to be enormous. And there is already a horrendous backlog of characterizing mutations seen in the studies to date (though there has been a paper already functionally characterizing the isocitrate dehydrogenase mutations).

What is particularly strange about this view is that a more reasonable intermediate step would be to look at those cells that do escape the primary tumor (most of the cancer genome papers so far have focused on primary tumors, though the IDH mutations are primarily found in secondary glioblastomas) -- that is, to sequence the metastases. Ideally, this would mean finding multiple patients willing to consent to their genome, their primary's genome, and multiple metastases' genomes being sequenced -- the latter quite likely coming from autopsies (otherwise it is a lot of painful biopsying without much hope of helping the patient, an ethically questionable activity). Or, in leukemias one could more easily resequence after each relapse. Such studies would be technically doable and not ridiculously expensive (though clearly not chump change either).

There's also the open question as to whether the real fireworks will come from sequencing less-studied cancers, such as the recent success in using transcriptome sequencing to identify the probable causative mutation in a rare type of ovarian cancer (see also the News and Views piece). Perhaps we've mined the rich ore out of some of these veins, and it is the less-worked seams which will yield fine genomic insights. Time will tell.
