Genome Sequencing and Analysis
The Columbia Genome Center, located at Columbia University Medical Center in New York City, provides high-quality and cost-effective next-generation genome, exome, and RNA sequencing services. Our clients and collaborators include researchers in the Columbia University community, as well as investigators at other universities and in the biotechnology and pharmaceutical industries.
Next-generation DNA and RNA sequencing have made it possible not only to look at individual genomes, but also to rapidly compare genetic sequences among multiple genomes. These approaches can be used, for example, to determine differences in genomes and gene transcripts from person to person, between populations, and between normal and pathologic cells in cancer. As next-generation sequencing technologies have made it possible to generate high-resolution genomic data much more efficiently, researchers in many fields have turned to next-generation sequencing to identify particular features of the genome that contribute to specific phenotypes.
Whole genome and whole exome sequencing
Next-generation DNA sequencing makes it possible to rapidly compare the genetic content among samples and identify germline and somatic variants of interest, such as single nucleotide polymorphisms (SNPs), insertions and deletions (indels), copy number variants (CNVs), and other structural variations.
In next-generation DNA sequencing, the DNA is first broken into a library of small fragments. These fragments are then attached to oligonucleotide adapters that facilitate the biochemistry necessary for the sequencing reaction. After being placed on a slide or in a flow cell, the strings of nucleotide bases that make up the fragments are then sequenced in hundreds of millions of parallel reactions. In re-sequencing studies, where a high-quality reference genome exists, the reads from the machine are mapped to the reference genome based on sequence alignment, which in turn is used for calling variants. In de novo genome sequencing studies, the reads are assembled to form a draft genome.
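The re-sequencing workflow described above (map reads to a reference, then call variants from the alignments) can be illustrated with a deliberately tiny sketch. Everything here is an assumption made for illustration: the function names `map_read` and `call_variants`, the toy reference and reads, and the depth and allele-fraction thresholds are hypothetical. Production pipelines use indexed aligners and statistical variant callers rather than the brute-force scan shown here, and this is not a description of the Genome Center's own pipeline.

```python
from collections import Counter, defaultdict


def map_read(reference, read, max_mismatches=1):
    """Return the 0-based reference position where the read fits best.

    Brute-force scan of every ungapped placement (substitutions only).
    Returns None if no placement has at most `max_mismatches` mismatches.
    """
    best_pos, best_mm = None, max_mismatches + 1
    for pos in range(len(reference) - len(read) + 1):
        window = reference[pos:pos + len(read)]
        mm = sum(1 for ref_base, read_base in zip(window, read)
                 if ref_base != read_base)
        if mm < best_mm:
            best_pos, best_mm = pos, mm
    return best_pos


def call_variants(reference, reads, min_depth=2, min_fraction=0.8):
    """Pile up mapped reads and report positions whose majority base
    differs from the reference (hypothetical thresholds)."""
    pileup = defaultdict(Counter)
    for read in reads:
        pos = map_read(reference, read)
        if pos is None:
            continue  # unmapped read, ignored
        for offset, base in enumerate(read):
            pileup[pos + offset][base] += 1

    variants = []
    for pos in sorted(pileup):
        counts = pileup[pos]
        depth = sum(counts.values())
        base, support = counts.most_common(1)[0]
        if (base != reference[pos] and depth >= min_depth
                and support / depth >= min_fraction):
            variants.append((pos, reference[pos], base))
    return variants


# Toy data: the sample carries a C>T substitution at position 8.
reference = "ACGTTAGCCGATACGGATTC"
reads = ["GTTAGCTGAT", "TAGCTGATAC", "CTGATACGGA"]
variants = call_variants(reference, reads)
print(variants)
```

Each reported tuple is (0-based position, reference base, observed base). Requiring a minimum depth and allele fraction is one simple way to keep a single sequencing error from being reported as a variant.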
Next-generation technologies can quickly sequence a whole genome, or can be targeted to specific regions using an approach called exome sequencing. Exome sequencing generates reads specifically from known coding regions. Compared with whole genome sequencing, it is a cost-effective way to detect single nucleotide and short indel variants in coding regions, and it provides sufficient information for many research needs.
RNA-seq is a next-generation sequencing technique that measures the abundance of RNA transcripts in a sample. It is a powerful tool for understanding the dynamics of the transcriptome, including differences in gene expression levels between physiologic conditions and changes that occur during development or over the course of disease progression. Specifically, this application can be used to study phenomena such as:
- gene expression changes
- alternative splicing events
- allele-specific gene expression
- chimeric transcripts, including gene fusion events
- novel transcripts
- RNA editing
In a standard RNA-seq procedure, total RNA first goes through a poly-A pull-down for mRNA purification, and is then reverse transcribed to generate cDNA. The cDNA is broken into a library of small fragments, attached to oligonucleotide adapters that facilitate the sequencing reaction, and then sequenced as either single-end or paired-end reads. Finally, the reads are aligned to a reference genome to estimate the expression levels of known or novel transcripts. The results can indicate differences in transcript structure and/or in the expression levels of specific genes. To study noncoding RNAs that lack poly-A tails, the poly-A pull-down step is replaced with a ribosomal RNA reduction step.
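To make the expression-estimation step concrete, here is a minimal sketch of one widely used normalization, transcripts per million (TPM), which corrects raw read counts for transcript length and sequencing depth. The source does not say which expression measure the Genome Center reports, so the choice of TPM is an assumption, and the gene names, counts, and lengths below are invented for illustration.

```python
def tpm(read_counts, lengths_bp):
    """Convert per-transcript read counts to transcripts per million.

    read_counts: dict of transcript name -> number of aligned reads
    lengths_bp:  dict of transcript name -> transcript length in bases
    """
    # Step 1: reads per kilobase, so long transcripts are not over-counted.
    rates = {name: read_counts[name] / (lengths_bp[name] / 1000)
             for name in read_counts}
    total = sum(rates.values())
    if total == 0:
        raise ValueError("no reads aligned to any transcript")
    # Step 2: rescale so the values across all transcripts sum to one million.
    return {name: rate / total * 1_000_000 for name, rate in rates.items()}


# Hypothetical counts and lengths, for illustration only.
counts = {"geneA": 500, "geneB": 1000, "geneC": 250}
lengths = {"geneA": 1000, "geneB": 4000, "geneC": 500}
expression = tpm(counts, lengths)

for name, value in expression.items():
    print(f"{name}: {value:.0f} TPM")
```

In this toy example geneB has the most reads but the lowest TPM, because its reads are spread over a transcript four times longer than geneA's.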
Our sequencing platforms
We use a variety of successful and widely adopted tools for conducting next-generation sequencing. Our technical infrastructure includes:
Illumina HiSeq 2500
An industry-standard platform for next-generation sequencing, the HiSeq is designed for large-scale high-throughput experiments. Using Illumina v4 chemistry, it can generate up to 1 Tb of data in 6 days.
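As a rough guide to what that output means in practice, the sketch below converts a run's yield into mean sequencing depth and into a number of samples per run. Only the 1 Tb figure comes from the text above, and it is read here as 10^12 bases. The human genome size of about 3.1 billion bases and the 30x target depth are assumptions chosen for illustration, and the function names are hypothetical. Real yields vary with read length, library quality, and the fraction of usable reads.

```python
def mean_coverage(total_bases, target_size_bp):
    """Average number of times each target base is sequenced."""
    return total_bases / target_size_bp


def samples_per_run(total_bases, target_size_bp, desired_depth):
    """Whole samples that fit in one run at the desired mean depth."""
    return total_bases // (target_size_bp * desired_depth)


RUN_OUTPUT_BASES = 10**12        # "up to 1 Tb", read as 10^12 bases
HUMAN_GENOME_BP = 3_100_000_000  # assumption: approximate human genome size
TARGET_DEPTH = 30                # assumption: a commonly quoted depth

depth_single_genome = mean_coverage(RUN_OUTPUT_BASES, HUMAN_GENOME_BP)
genomes = samples_per_run(RUN_OUTPUT_BASES, HUMAN_GENOME_BP, TARGET_DEPTH)

print(f"One genome per run: about {depth_single_genome:.0f}x mean depth")
print(f"Genomes per run at {TARGET_DEPTH}x: {genomes}")
```

The same arithmetic applies to exome projects by substituting the size of the targeted coding regions for the genome size, which is why targeted sequencing allows many more samples per run.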
Illumina NextSeq 500
The NextSeq 500 is a flexible and efficient desktop sequencer that offers powerful high-throughput sequencing capabilities. Designed with the individual laboratory in mind, it is the first high-throughput desktop sequencer to offer exome, transcriptome, and whole genome sequencing in a compact package. It offers a variety of flow cell configurations including up to 800 million paired-end reads and up to 400 million single reads. The Columbia Genome Center offers self-service access to the NextSeq 500 for experienced Columbia University researchers.
No minimum sample size
We welcome next-generation sequencing projects of all sizes, including both large and small runs. Unlike larger industrial sequencing facilities, the Columbia Genome Center has no minimum sample size. This makes it cost-effective to perform advanced, higher-risk applications that are often more expensive at other centers.
We also encourage pilot projects: You can test an experiment and our facilities by sequencing just a small number of samples for the same price per sample as a more extensive experiment. This flexibility is often particularly useful for researchers who are new to the technology.
Next-generation sequencing services at the Columbia Genome Center are for research only. All documents, emails, sample names, and data that you send through our submission form or by email must be HIPAA compliant. All sample names must be fully anonymized (no patient names).
We perform extensive quality assurance on our equipment and quality control of samples throughout the sequencing process. Our goal is always to deliver useful data that can aid you in your research.
At the same time, next-generation sequencing by its very nature does not always deliver equally robust results. A variety of factors, including the variability inherent in the library preparation process, the biochemistry that takes place during sequencing, and the quality of the DNA and RNA samples that we receive can affect the outcomes of a given experiment.
We will make every effort to deliver results that meet agreed-upon targets, although it is not possible to guarantee that those targets will always be achieved.
The Genome Center receives NCI funding through the Genomics Shared Resource of the NCI-designated Herbert Irving Comprehensive Cancer Center.
We are an Illumina CSPro Certified Service Provider, dedicated to ensuring delivery of the highest-quality data available for genetic analysis applications.