not invented here.

Notes on bioinformatics methods, software and discussions.

9.19.2011 Informatics of Cancer Genomes

Lincoln Stein - Ontario Institute for Cancer Research, Canada

Cancer as a genomic disease, coupled with the underlying stochastic processes: every cancer genome is different, which is less than ideal for a one-size-fits-all treatment regimen. Few notable targeted therapies reported so far (Herceptin for HER2 amplifications in breast cancer, Erlotinib for EGFR mutations).

International Cancer Genome Consortium (ICGC)

Huge scope; no country can do all the required data analysis alone. Allows coverage even of less frequent cancers, but requires standardization of QC, formats and dissemination. Data is distributed but accessible through a common portal, with interpreted data served through a federated database system. Only really tractable at the level of interpreted data rather than raw reads.

50 different cancer types, 500 donors per type; sample tumor and normal tissue (blood or adjacent tissue) from each donor and test for differences. The equivalent of 50,000 human genome projects.
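
The 50,000 figure follows from the study design if each donor contributes two genomes. A quick check of the arithmetic, assuming exactly one tumor and one matched normal genome per donor (that pairing is my reading of the notes, not a number stated in the talk):

```python
# Figures taken from the notes above.
CANCER_TYPES = 50
DONORS_PER_TYPE = 500
# Assumption: one tumor genome plus one matched normal genome per donor.
GENOMES_PER_DONOR = 2


def total_genomes(types=CANCER_TYPES, donors=DONORS_PER_TYPE, per_donor=GENOMES_PER_DONOR):
    """Number of whole genomes implied by the study design."""
    return types * donors * per_donor


print(total_genomes())
```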

Network analysis of genomes

Discovering patterns and mechanisms of altered genes in cancer. Difficult due to a ‘long tail’ of cancer genes: very few genes are consistently involved. Mutations affect pathways, which can be knocked out in multiple different ways. Using the Functional Interaction Network (part of Reactome) to analyze these dependencies. Gene lists are interpreted using a simple protocol:

  • start with list of mutated genes (array, sequencing project, etc.)
  • create disease subnetworks by adding linker genes
  • apply community clustering
  • annotate resulting disease modules and use for sample classification, disease gene prediction, hypothesis generation including genes not in the original list
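
A minimal sketch of the first three steps of the protocol above, on a toy network. Everything here is an assumption made for illustration: the gene names are placeholders rather than real FI data, a "linker" is taken to be any non-mutated gene that interacts with at least two mutated genes, and connected components stand in for a proper community clustering algorithm (which the talk did not specify in these notes).

```python
from collections import defaultdict

# Toy functional-interaction edges; all gene names are placeholders, not real FI data.
TOY_EDGES = [
    ("A", "L1"), ("L1", "B"), ("B", "C"),
    ("D", "L2"), ("L2", "E"),
    ("A", "X"),  # X touches only one mutated gene, so it is not picked up as a linker
]


def build_adjacency(edges):
    """Undirected adjacency map: gene -> set of interacting genes."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def add_linkers(adj, mutated):
    """Return (subnetwork nodes, linkers).

    Assumed linker rule: a non-mutated gene adjacent to >= 2 mutated genes.
    """
    mutated = set(mutated)
    linkers = {
        gene for gene, nbrs in adj.items()
        if gene not in mutated and len(nbrs & mutated) >= 2
    }
    return mutated | linkers, linkers


def modules(adj, nodes):
    """Connected components of the induced subnetwork.

    Crude stand-in for community clustering; a real analysis would use a
    dedicated clustering method on the FI network.
    """
    nodes = set(nodes)
    seen, result = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            gene = stack.pop()
            if gene in comp:
                continue
            comp.add(gene)
            stack.extend((adj[gene] & nodes) - comp)
        seen |= comp
        result.append(sorted(comp))
    return result


adj = build_adjacency(TOY_EDGES)
subnet, linkers = add_linkers(adj, ["A", "B", "C", "D", "E"])
disease_modules = modules(adj, subnet)
print(sorted(linkers), disease_modules)
```

The linker genes are exactly the ones that end up in a module without being in the original list, which is where the hypothesis generation in the last step comes from.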

[Part of our standard gene list annotation system by now; highly recommend the FI modules in addition to a standard pathway or GSEA analysis!]

Example of 310 genes from just 5 ovarian cancer patients with non-synonymous mutations in the cancer specimens. Emerging patterns provide links to KRAS, Wnt/Cadherin, MAPK, Hedgehog and other known cancer-related pathways. Can be enriched [and potentially stabilized] as sample size increases.

Can also be used to derive prognostic signatures by running PCA on the gene expression profiles of the relevant modules only, rather than at the gene level. Example uses a re-analysis of survival curves in public breast cancer GEO data sets (within the same study). Transfers well to four independent breast cancer series. One of the key benefits: the signature has a meaning (here: kinetochore and Aurora B signalling, hallmarks of rapidly dividing cells).
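
To make the module-level idea concrete: a rough sketch that collapses the expression of one module's genes into a single score per sample. It assumes the first principal component is used as the score (the notes only say "PCA"), uses made-up expression values, and computes PC1 by power iteration in pure Python so it runs without extra libraries. The survival analysis itself is left out.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (loadings, scores) for PC1 of a samples x genes matrix.

    Power iteration on the sample covariance matrix; meant for small toy
    inputs. The all-positive start vector assumes the module's genes are
    mostly positively correlated.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
        for i in range(p)
    ]
    vec = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0:
            break  # no variance in the data, keep the start vector
        vec = [x / norm for x in nxt]
    scores = [sum(c[j] * vec[j] for j in range(p)) for c in centered]
    return vec, scores


# Made-up expression values: 4 samples (rows) x 3 genes of one module (columns).
module_expression = [
    [1.0, 2.0, 1.5],
    [2.0, 4.1, 3.0],
    [3.0, 6.0, 4.4],
    [4.0, 7.9, 6.1],
]

loadings, module_scores = first_principal_component(module_expression)
print(module_scores)
```

Each sample gets one number per module, so downstream survival models work with a handful of interpretable module scores instead of thousands of individual genes.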

Targeted and genome-wide sequencing programs (TGWS)

Apply genomics in a clinical setting at OICR. Clinical trials of targeted agents have been inefficient so far; multiple genotyping procedures are needed to identify suitable trials. A more rational process checks (all) mutations of patients who failed standard chemotherapy prior to enrollment in relevant clinical trials. Currently limited to targeted sequencing of ~20 genes with existing trials or prognostic value / clinical implications.

Demonstrate feasibility and optimize the process to get usable results back to patients and clinic quickly (three weeks or less). Using single molecule sequencing (PacBio) due to the short turnaround time. Created a workflow to turn raw gene sequence data into a usable clinical report, using a knowledgebase of common mutations and their clinical consequences. Includes information for ~200 genes obtained from local oncologists, ~800 from MSKCC, COSMIC and other sources. The workflow generates a preliminary report for an expert panel that meets weekly, reviews the report and requests changes or makes suggestions. Reports go back into the knowledgebase.
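
A toy sketch of that loop (look up variants, draft a report, feed panel decisions back), just to show the shape of it. This is not the OICR system: the `Variant` type, the function names and the knowledgebase entries are all hypothetical, and the entries are deliberately placeholders rather than real clinical annotations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str
    change: str


# Hypothetical knowledgebase; keys and notes are placeholders, not clinical content.
KNOWLEDGEBASE = {
    ("GENE_A", "p.X10Y"): "placeholder: linked to an open trial",
    ("GENE_B", "p.Z20W"): "placeholder: prognostic marker",
}


def draft_report(patient_id, variants, kb):
    """Build a preliminary report and list variants the knowledgebase cannot annotate."""
    known, unknown = [], []
    for v in variants:
        note = kb.get((v.gene, v.change))
        if note is None:
            unknown.append(v)
        else:
            known.append((v, note))
    lines = [f"Draft report for {patient_id} (pending expert panel review)"]
    lines += [f"  {v.gene} {v.change}: {note}" for v, note in known]
    lines += [f"  {v.gene} {v.change}: not in knowledgebase, flag for panel" for v in unknown]
    return lines, unknown


def record_panel_decision(kb, variant, note):
    """Feed a panel-reviewed annotation back into the knowledgebase."""
    kb[(variant.gene, variant.change)] = note


kb = dict(KNOWLEDGEBASE)  # work on a copy so the seed entries stay untouched
calls = [Variant("GENE_A", "p.X10Y"), Variant("GENE_C", "p.Q5R")]
report, needs_review = draft_report("P001", calls, kb)
print("\n".join(report))
```

Once the panel has reviewed a flagged variant, `record_panel_decision` stores its note, so the next patient with the same variant gets an annotated line instead of a flag.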

Draft physician report generated by the workflow system, combined with an online tracking system that handles patient consent, samples, genomic files and interfaces with a standard clinical trial informatics system. Eventually open source so the framework can be re-used at other sites.

Conference ✳ beyond the genome ✳ Cancer