Expanding the landscape of breast cancer drivers

In comparison with a previous study (Stephens et al., 2012, shown in gray), a new computational approach that focuses on somatic copy number mutations increased the number of known driver mutations in breast tumors to a median of five for each tumor. The findings could raise the likelihood of finding actionable targets in individual patients with breast cancer.

For many years, researchers have known that somatic copy number alterations (SCNAs) play important roles in causing many types of cancer. SCNAs are insertions, deletions, duplications, and transpositions of sections of DNA that are not inherited but occur after birth. Indeed, most recurrent drivers of epithelial tumors are copy number alterations, with some found in up to 40% of patients with specific tumor types. However, because SCNAs occur when entire sections of chromosomes become damaged, biologists have had difficulty developing effective methods for distinguishing genes within SCNAs that actually drive cancer from those genes that might lie near a driver but do not themselves cause disease.

Helios nearly doubled the number of high-confidence predictions of breast cancer drivers.

In a new paper published in Cell, researchers in the laboratories of Dana Pe’er (Columbia University Departments of Systems Biology and Biological Sciences) and Jose Silva (Icahn School of Medicine at Mount Sinai) report on a new computational algorithm that promises to dramatically improve researchers’ ability to identify cancer-driving genes within potentially large SCNAs. Using the algorithm, called Helios, the team analyzed a combination of genomic data and information generated by functional RNAi screens, enabling them to predict several dozen new SCNA drivers of breast cancer. In follow-up in vitro experimental studies, they tested 12 of these predictions, 10 of which were validated in the laboratory. Their findings nearly double the number of known breast cancer drivers, providing many new opportunities for developing personalized treatments for breast cancer. Their methodology is general and could also be used to locate disease-causing SCNAs in other cancer types.

Leading this effort was Felix Sanchez-Garcia, a recent PhD graduate from the Pe’er Lab and a first author on the paper. The story of how this breakthrough came about illuminates how the interdisciplinary research and education that take place at the Department of Systems Biology can address important challenges facing biological and biomedical research.

DIGGIT identifies mutations upstream of master regulators.

A new algorithm called DIGGIT identifies mutations that lie upstream of crucial bottlenecks within regulatory networks. These bottlenecks, called master regulators, integrate these mutations and become essential functional drivers of diseases such as cancer.

Although genome-wide association studies have made it possible to identify mutations that are linked to diseases such as cancer, determining which mutations actually drive disease and the mechanics of how they do so has been an ongoing challenge. In a paper just published in Cell, researchers in the lab of Andrea Califano describe a new computational approach that may help address this problem.

Ashkenazi Population Bottleneck Model
The consortium’s model of Ashkenazi Jewish ancestry suggests that the population’s history was shaped by three critical bottleneck events. The ancestors of both populations underwent a bottleneck sometime between 85,000 and 91,000 years ago, which was likely coincident with an Out-of-Africa event. The founding European population underwent a bottleneck at approximately 21,000 years ago, beginning a period of interbreeding between individuals of European and Middle Eastern ancestry. A severe bottleneck occurred in the Middle Ages, reducing the population to under 350 individuals. The modern-day Ashkenazi community emerged from this group.

An international research consortium led by Associate Professor Itsik Pe’er has produced a new panel of reference genomes that will significantly improve the study of genetic variation in Ashkenazi Jews. Using deep sequencing to analyze the genomes of 128 healthy individuals of Ashkenazi Jewish origin, The Ashkenazi Genome Consortium (TAGC) has just published a resource that will be much more effective than previously available European reference genomes for identifying disease-causing mutations within this historically isolated population. Their study also provides novel insights into the historical origins and ancestry of the Ashkenazi community. A paper describing their study has just been published online in Nature Communications.

The dataset produced by the consortium provides a high-resolution baseline genomic profile of the Ashkenazi Jewish population, which they revealed to be significantly different from that found in non-Jewish Europeans. In the past, clinicians’ only option for identifying disease-causing mutations in Ashkenazi individuals was to compare their genomes to more heterogeneous European reference sets. This new resource accounts for the historical isolation of this population, and so will make genetic screening much more accurate in identifying disease-causing mutations.

In an article that appears on the website of Columbia University’s Fu Foundation School of Engineering and Computer Science, Dr. Pe’er explains:

“Our study is the first full DNA sequence dataset available for Ashkenazi Jewish genomes... With this comprehensive catalog of mutations present in the Ashkenazi Jewish population, we will be able to more effectively map disease genes onto the genome and thus gain a better understanding of common disorders. We see this study serving as a vehicle for personalized medicine and a model for researchers working with other populations.”

In addition to offering an important resource for such future translational and clinical research, the paper’s findings also provide new insights that have implications for the much debated question of how European and Ashkenazi Jewish populations emerged historically.

Comparing human and mouse prostate cancer networks

Computational synergy analysis depicting FOXM1 and CENPF regulons from the human (left) and mouse (right) interactomes showing shared and nonshared targets. Red corresponds to overexpressed targets and blue to underexpressed targets.

Two genes work together to drive the most lethal forms of prostate cancer, according to new research by investigators in the Columbia University Department of Systems Biology. These findings could lead to a diagnostic test for identifying those tumors likely to become aggressive and to the development of a novel combination therapy for the disease.

The two genes—FOXM1 and CENPF—had been previously implicated in cancer, but none of the prior studies suggested that they might work synergistically to cause the most aggressive form of prostate cancer. The study was published today in the online issue of Cancer Cell.

“Individually, neither gene is significant in terms of its contribution to prostate cancer,” said co-senior author Andrea Califano, the Clyde and Helen Wu Professor of Chemical Biology in Biomedical Informatics and Chair of the Department of Systems Biology. “But when both genes are turned on, they work together synergistically to activate pathways associated with the most aggressive form of the disease.”

Co-principal investigator Andrea Califano discusses the new study.

“Ultimately, we expect this finding to allow doctors to identify patients with the most aggressive prostate cancer so that they can get the most effective treatments,” said co-senior author Cory Abate-Shen, the Michael and Stella Chernow Professor of Urologic Sciences and also a member of the Department of Systems Biology. “Having biomarkers that predict which patients will respond to specific drugs will hopefully provide a more personalized way to treat cancer.”

Molly Przeworski has joined Columbia University as Professor in the Department of Systems Biology and Department of Biological Sciences. The Przeworski lab investigates how natural selection, genetic drift, mutation, and recombination shape the heritable differences seen among individuals and species. To this end, they develop models of the evolutionary process, create statistical tools, and analyze large-scale variation data sets. Among the goals of their research are to understand how natural selection has shaped patterns of genetic variation, and to identify the causes and consequences of variation in recombination and mutation rates, in humans and other organisms.

Tuuli Lappalainen has joined Columbia University as an assistant professor in the Department of Systems Biology. Dr. Lappalainen is a specialist in the analysis of RNA sequencing data, with research interests including functional variation in the human genome, population genetic background of variation in the human genome, and interpretation of genome function.

Dr. Lappalainen joins the Department of Systems Biology in co-appointment with the New York Genome Center (NYGC), where she will also serve as a Junior Investigator and Core Member. Based in lower Manhattan, NYGC is a consortium made up primarily of New York-area institutions that is designed to translate promising genomics-based research into new strategies for treating, preventing, and managing disease. This co-appointment with Columbia University — an institutional founding member of the NYGC — will enhance collaboration between the two institutions. (Read an interview with Dr. Lappalainen at the New York Genome Center website.)

Dr. Lappalainen earned her PhD in genetics at the University of Helsinki, Finland, and held appointments as a postdoctoral researcher at the University of Geneva Medical School, Switzerland, and at the Stanford University School of Medicine. She is the chair of the analysis group for the Genetic European Variation in Health and Disease (Geuvadis) Consortium’s RNA sequencing project, a member of the analysis group for the National Institutes of Health’s Genotype-Tissue Expression (GTEx) project, and a member of the analysis and functional interpretation groups for the 1000 Genomes Project.

Reversing glucocorticoid resistance

A representative example of tumor load analysis using bioluminescence imaging in mice following xenograft with T-ALL. Treatment with either MK2206 or dexamethasone showed limited efficacy, while combination treatment saw near complete elimination of tumor cells.

In a paper published in Cancer Cell, a team of researchers led by Adolfo Ferrando and Andrea Califano at Columbia University has identified the protein kinase AKT as a target for reversing resistance to glucocorticoid therapy in patients with acute lymphoblastic leukemia (ALL).  

Researchers in the Columbia University Department of Systems Biology and Herbert Irving Comprehensive Cancer Center have determined that measuring the expression levels of three genes associated with aging can be used to predict the aggressiveness of seemingly low-risk prostate cancer. Use of this three-gene biomarker, in conjunction with existing cancer-staging tests, could help physicians better determine which men with early prostate cancer can be safely followed with “active surveillance” and spared the risks of prostate removal or other invasive treatment. The findings were published today in the online edition of Science Translational Medicine.

More than 200,000 new cases of prostate cancer are diagnosed each year in the U.S. “Most of these cancers are slow growing and will remain so, and thus they do not require treatment,” said study leader Cory Abate-Shen, Michael and Stella Chernow Professor of Urological Oncology at Columbia University Medical Center (CUMC). “The problem is that, with existing tests, we cannot identify the small percentage of slow-growing tumors that will eventually become aggressive and spread beyond the prostate. The three-gene biomarker could take much of the guesswork out of the diagnostic process and ensure that patients are neither overtreated nor undertreated.”

Rabadan, Nature Genetics

An analysis of all gene mutations in nearly 140 brain tumors has uncovered most of the genes responsible for driving glioblastoma. The analysis found 18 new driver genes (labeled red) never before implicated in glioblastoma and correctly identified the 15 previously known driver genes (labeled blue). The graphs show mutated genes that are commonly found in varying numbers in glioblastoma (left), that frequently contain insertions (middle), and that frequently contain deletions (right). Genes represented by blue dots in the graphs were statistically most likely to be driver genes.

A team of Columbia University Medical Center researchers has identified 18 new genes responsible for driving glioblastoma multiforme, the most common—and most aggressive—form of brain cancer in adults. The study was published August 5, 2013, in the journal Nature Genetics.

The Columbia team used a combination of high-throughput DNA sequencing and a new method of statistical analysis developed by co-author Raul Rabadan, an assistant professor in the Department of Systems Biology, to generate a short list of candidate gene mutations that were highly likely to drive cancer, as opposed to mutations that have no effect.

Considering these results along with a previous study this group conducted, Rabadan and collaborators Antonio Iavarone and Anna Lasorella point out that approximately 15% of glioblastomas could now be targeted with drugs that have already been approved by the FDA. As Lasorella remarks in an article for the CUMC Newsroom, “There is no reason why these patients couldn’t receive these drugs now in clinical trials.”

Attractor Metagenes - DREAM7

Team Attractor Metagenes receives its award at the DREAM7 Conference. Gustavo Stolovitzky (IBM Research), Adam Margolin (Sage Bionetworks), Dimitris Anastassiou, Tai-Hsien Ou Yang, Wei-Yi Cheng, Stephen Friend (Sage Bionetworks), Erhan Bilal (IBM Research).

The team of Professor Dimitris Anastassiou and graduate students Wei-Yi Cheng and Tai-Hsien Ou Yang has been recognized as the best performer in the Sage Bionetworks – DREAM Breast Cancer Prognosis Challenge. This challenge, one of four organized as part of the seventh Dialogue for Reverse Engineering Assessments and Methods (DREAM7), was designed to assess the ability of participants’ computational models to predict breast cancer survival using patient clinical information and molecular profiling data. As a reward for this accomplishment, the journal Science Translational Medicine has just published a paper from the Anastassiou lab describing their model. It is also the journal’s cover theme for this issue, which includes a second article describing the Challenge.

The Columbia University researchers based their DREAM entry on previous work to identify what they call “attractor metagenes,” sets of strongly co-expressed genes that they have found to be present with very little variation in many cancer types. Moreover, these metagenes appear to be associated with specific attributes of cancer including chromosomal instability, epithelial-mesenchymal transition, and a lymphocyte-specific immune response. As Wei-Yi Cheng comments in Sage Synapse, “We like to think of these three main attractor metagenes as representing three key ‘bioinformatic hallmarks of cancer,’ reflecting the ability of cancer cells to divide uncontrollably and invade surrounding tissues, and the ability of the organism to recruit a particular type of immune response to fight the disease.”

Genes forming cluster I in the context of cellular signaling pathways. Proteins encoded by cluster genes are shown in yellow, and those corresponding to other relevant genes that were present in the input data but not selected by the NETBAG+ algorithm are shown in cyan. 

In a new paper published in the journal Nature Neuroscience, Columbia University researchers report that many of the genes that are mutated in schizophrenia are organized into two main networks. Surprisingly, the study also found that a genetic network that leads to schizophrenia is very similar to a network that has been linked to autism. 

Using a computational approach called NETBAG+, Dennis Vitkup and colleagues performed network-based analyses of rare de novo mutations to map the gene networks that lead to schizophrenia. When they compared one schizophrenia network to an autism network described in a study Vitkup published last year, they discovered that different copy number variants in the same genes can lead to either schizophrenia or autism. The overlapping genes are important for axon guidance, synapse function, and cell migration, processes within the brain that have been shown to play a role in the development of these two diseases. These gene networks are particularly active during prenatal development, suggesting that the foundations for schizophrenia and autism are laid very early in life.

Itsik Pe’er, an Associate Professor in the Department of Computer Science and member of the Columbia Initiative in Systems Biology, is using mathematics and computer analytics to identify the genetic makeup of the founding Ashkenazi Jews. By analyzing the full DNA sequences of hundreds of their descendants in the New York City area and comparing them to reference sets of non-Ashkenazi DNA, his goal is to identify Ashkenazi-specific genetic mutations associated with diseases such as Tay-Sachs, Crohn’s, and Parkinson’s disease. As a new article in Columbia News explains:

“By examining similarities in DNA segments shared by large numbers of related individuals, his lab developed statistical models that allow him to make generalizations about entire populations. The mix of genes that every child inherits from each parent travels in long sequences of code that remain together and are remarkably consistent from one generation to the next.

“The size of the gene chunks gets smaller with each generation, but they diminish at a consistent and predictable rate. As a result, Pe’er can use his models to determine distant relationships shared by two individuals by measuring the length of their common DNA segments.”

Read the complete article here.
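
The relationship between segment length and relatedness described in the quoted passage can be illustrated with a standard simplified model from population genetics. This is a sketch under stated assumptions, not the statistical model used in the Pe’er lab: if two people share a common ancestor g generations back, the segment they both inherit has passed through 2g meioses, so its length is treated as approximately exponentially distributed with a mean of 100 / (2g) centimorgans (cM). All function names below are hypothetical.

```python
import random


def expected_segment_cm(generations):
    """Mean shared-segment length in cM under the simplified model: 100 / (2g)."""
    if generations < 1:
        raise ValueError("generations must be at least 1")
    return 100.0 / (2 * generations)


def estimate_generations(segment_lengths_cm):
    """Invert the mean of the model: g is roughly 50 / (mean segment length)."""
    if not segment_lengths_cm:
        raise ValueError("need at least one segment length")
    mean_length = sum(segment_lengths_cm) / len(segment_lengths_cm)
    return 50.0 / mean_length


def simulate_segments(generations, count, seed=0):
    """Draw segment lengths (cM) from the assumed exponential distribution."""
    rng = random.Random(seed)
    rate_per_cm = 1.0 / expected_segment_cm(generations)
    return [rng.expovariate(rate_per_cm) for _ in range(count)]


if __name__ == "__main__":
    # Segments shrink predictably as the common ancestor becomes more distant.
    for g in (1, 5, 10, 30):
        print(g, "generations:", expected_segment_cm(g), "cM on average")

    # Recover the number of generations from simulated segment lengths.
    simulated = simulate_segments(generations=10, count=20000)
    print("estimated generations:", round(estimate_generations(simulated), 1))
```

In real data, very short segments are hard to detect reliably, so a naive mean-based estimate like this one would tend to understate the number of generations. Practical methods account for that detection threshold.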

Transforming activity of FGFR-TACC fusion proteins

Representative microphotographs of hematoxylin and eosin staining of advanced FGFR3-TACC3-shp53–generated tumors show histological features of high-grade glioma.

A new paper published by Columbia University Medical Center researchers in the journal Science has determined that some cases of glioblastoma, the most aggressive form of primary brain cancer, result from the fusion of the genes FGFR and TACC. Raul Rabadan, a co-senior author on the study, led efforts to identify these genes by using quantitative methods to analyze glioblastoma genomes from nine patients and then compare the results with more than 300 genomes from the Cancer Genome Atlas project.

The collaboration with cancer genomics expert Antonio Iavarone and co-senior author Anna Lasorella found that the protein produced by the FGFR-TACC fusion disrupts the mitotic spindle (the cellular structure that guides mitosis) and causes aneuploidy, an uneven distribution of chromosomes that causes tumorigenesis. The researchers also found that drugs that target this aberration can dramatically slow the growth of tumors in mice, suggesting a potential therapeutic target.

Gene clusters found using NETBAG analysis of de novo CNV regions observed in autistic individuals. A) The highest scoring cluster obtained using the search procedure with up to one gene per each CNV region. B) The cluster obtained using the search with up to two genes per region.

Identification of complex molecular networks underlying common human phenotypes is a major challenge of modern genetics. A new network-based method developed at the lab of Dennis Vitkup was used to identify a large biological network of genes affected by rare de novo copy number variations (CNVs) in autism. The genes forming the network are primarily related to synapse development, axon targeting, and neuron motility. The identified network is strongly related to genes previously implicated in autism and intellectual disability phenotypes.

These findings are consistent with the hypothesis that significantly stronger functional perturbations are required to trigger the autistic phenotype in females compared to males. Overall, the analysis of de novo variants supports the hypothesis that perturbed synaptogenesis is at the heart of autism.

Systematic characterization of cancer genomes has revealed a staggering number of diverse alterations that differ among individuals, and their functional importance and physiological impact remain poorly defined. To identify which genetic alterations are functional, the lab of Dr. Dana Pe’er has developed a novel Bayesian probabilistic algorithm, CONEXIC, that integrates copy number and gene expression data to identify tumor-specific “driver” aberrations, as well as the cellular processes they affect.

In work published in the journal Cell, the new method was applied to data from melanoma patients, identifying a list of 64 putative “drivers” and the core processes affected by them. This list includes many known driver genes (e.g., MITF), which CONEXIC correctly identified and paired with their known targets. It also includes novel “driver” candidates such as Rab27a and TBC1D16, both involved in protein trafficking. shRNA-mediated silencing of these genes in short-term tumor-derived cultures determined that they are tumor dependencies and validated their computationally predicted role in melanoma (including target identification), suggesting that protein trafficking may play an important role in this malignancy.