Genome Sequencing and Analysis
Frequently Asked Questions
How do I choose an RNA-seq package?
What is the difference between RNA-seq options #121-124 and #133-134, and which should I choose?
#121-124 use poly-A pull-down for mRNA enrichment, while #133-134 use rRNA depletion to remove ribosomal RNA.
Both workflows retain strand information of the transcripts.
If your experiment's goal is coding gene expression profiling and you have good quality total RNA samples (RIN>8 by Agilent Bioanalyzer), you most likely only need packages #121-124.
Degraded RNA samples lose their poly-A tails, so poly-A pull-down is not reliable for them. If your samples are degraded, choose #133-134 regardless of your experimental goal.
If your experimental goal includes lncRNAs as well as coding gene expression, choose #133-134 regardless of your RNA quality.
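The selection rules above can be sketched as a small decision function. This is only an illustration, not a tool offered by the Genome Center: the function name and parameters are hypothetical, and treating any RIN of 8 or below as "degraded" is an assumption drawn from the RIN>8 guideline in the text.

```python
def choose_library_prep(rin: float, wants_lncrna: bool) -> str:
    """Return the package range suggested by the rules above (illustrative only)."""
    # Assumption: RIN <= 8 is treated as degraded RNA, which has lost
    # poly-A tails, so rRNA depletion is suggested instead of poly-A pull-down.
    if rin <= 8:
        return "#133-134"
    # lncRNAs call for rRNA depletion regardless of RNA quality.
    if wants_lncrna:
        return "#133-134"
    # Good-quality RNA and coding gene expression only: poly-A pull-down.
    return "#121-124"


if __name__ == "__main__":
    print(choose_library_prep(rin=9.1, wants_lncrna=False))
    print(choose_library_prep(rin=6.5, wants_lncrna=False))
```

The function returns only a package range; read type and depth are separate choices covered in the questions that follow.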
Should I choose paired end (2 x 100 bp) or single read (100 bp) sequencing?
If your experimental goal is gene expression profiling, you most likely only need single read packages.
If you are also interested in detecting structural variations, alternative splicing patterns, or even point mutations, you should choose paired end packages.
What depth do I need?
For coding gene expression profiling of human/mouse/rat, the 30M single read option (#121) provides sufficient coverage.
Deeper sequencing is recommended if your genes of interest are expressed in lower quantities.
Because the rRNA depletion protocol retains many non-coding RNA species along with mRNA, it requires deeper sequencing than the poly-A protocol.
Why do some packages require pre-approval?
Some packages are designed for users with significant informatics experience. We can only offer them to users known to us who are able to handle the increased informatics workload without our help.
The packages that REQUIRE pre-approval are #120, #131-132, #401-402, #421, and #431. Please do not choose these without speaking with a member of the Genome Center first.
I don’t have 10 ng input RNA. Do you offer RNA amplification services?
If you have less than 50 ng of input RNA, we offer an option for RNA amplification using Clontech’s SMART-seq v4 Ultra Low Input RNA Kit to create amplified cDNA, followed by library preparation using Illumina’s Nextera XT kit. The minimum input required for this kit is 10 pg, but >200 pg is preferred. Please choose package #141.
If you have more than 50 ng input RNA but less than 100 ng (and RIN >8), we have had success with option #121 but cannot guarantee it will yield a sequenceable library with sufficient complexity. We can try this option and save 10% of your starting material to do option #141 in the event #121 doesn’t work.
If all of your samples are very low input, we recommend choosing option #141 for all of them. If only one or two samples in a larger batch are low input and the rest have enough material, it is preferable to try #121 for all samples, saving input material to do #141 if needed.
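The input thresholds in this answer can be summarized in a short sketch. The function name, its return strings, and the handling of cases the answer does not cover (exactly 50 ng, 100 ng or more, or 50-100 ng with RIN of 8 or below) are assumptions made for illustration.

```python
MIN_AMPLIFICATION_NG = 0.01        # 10 pg, minimum input for the amplification kit
PREFERRED_AMPLIFICATION_NG = 0.2   # 200 pg, preferred input for the kit


def suggest_input_option(input_ng: float, rin: float) -> str:
    """Map input RNA amount (ng) and RIN to the option described above."""
    if input_ng < MIN_AMPLIFICATION_NG:
        return "below the 10 pg minimum for #141"
    if input_ng < 50:
        if input_ng > PREFERRED_AMPLIFICATION_NG:
            return "#141"
        return "#141 (below the preferred 200 pg)"
    if input_ng < 100:
        # Assumption: exactly 50 ng is grouped with the 50-100 ng range.
        if rin > 8:
            return "#121, reserving 10% of the material for #141"
        return "not covered here; contact the Genome Center"
    # Assumption: 100 ng or more needs no amplification.
    return "standard input; no amplification needed"


if __name__ == "__main__":
    for amount in (0.005, 0.1, 1.0, 75.0, 500.0):
        print(amount, "->", suggest_input_option(amount, rin=9.0))
```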
Do you accept FFPE RNA for RNA-seq?
For degraded and FFPE RNA, we offer the TruSeq Stranded Total RNA with Ribo-Zero Gold prep (packages #133-134). This uses Ribo-Zero technology to deplete the rRNA (both cytoplasmic and mitochondrial), as the normal poly-A protocol will not work due to loss of the poly-A tails. We still require a Bioanalyzer report. We accept RIN values below 8 for this protocol, but cannot guarantee quality, because FFPE quality is so variable. Deeper sequencing is required because of the remaining non-coding RNA; for this reason, we recommend starting at 60M paired-end reads.
How should I design my RNA-seq experiment?
Is it OK to collect samples in batches?
It is best if you can collect all samples at once, but it is common practice for researchers to collect samples and extract RNA at different times using the same protocol.
Can I compare data across batches?
To compare data across submissions, please choose the same package each time. In addition, please let us know at the time of submission if you’d like to compare the current project with a previous project, so that we can use the same tools for data analysis.
We also include an RNA spike-in control in each sample, which can be used for normalization.
Do I need biological or technical replicates? If so, how many?
Yes, biological replicates are strongly recommended. We suggest at least 3 biological replicates for each treatment/group. We do not make technical replicate libraries from the same sample, as next-generation sequencing technology itself is highly reproducible.
Submitting your samples
What are the submission requirements for my samples?
Please see Submitting Your Samples for information.
Do I need to do a Bioanalyzer? Will a TapeStation report be sufficient?
Yes (for RNA samples), a Bioanalyzer report is preferred. For RNA samples we require that the Bioanalyzer report not be older than a week at the time of submission. If samples have undergone a freeze/thaw cycle or have been shipped since then we strongly recommend rerunning the sample. We cannot provide a guarantee for data quality if the above conditions are not met.
A TapeStation report will suffice. However, we cannot provide a guarantee for data quality.
A Bioanalyzer report is not required for gDNA samples, but is required for cDNA samples.
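As a quick self-check before submitting RNA samples, the one-week rule for Bioanalyzer reports can be written as a date comparison. This is a minimal sketch; the function name and the choice to count whole calendar days are assumptions.

```python
from datetime import date


def report_is_current(report_date: date, submission_date: date,
                      max_age_days: int = 7) -> bool:
    """True if the report is no more than max_age_days old at submission."""
    age_days = (submission_date - report_date).days
    # A report dated after the submission date is treated as invalid.
    return 0 <= age_days <= max_age_days


if __name__ == "__main__":
    print(report_is_current(date(2024, 3, 1), date(2024, 3, 8)))
    print(report_is_current(date(2024, 3, 1), date(2024, 3, 9)))
```

Passing this check does not replace rerunning the sample after a freeze/thaw cycle or shipment.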
How do I Bioanalyze my samples?
There are two options for Columbia users.
Researchers can send samples to the Molecular Pathology Core. If you dilute your samples, please indicate this in the sample name.
If you would like to purchase your own Bioanalyzer reagents, you may use our Bioanalyzer as a self-service instrument. Please contact firstname.lastname@example.org to set up an appointment. This is recommended for advanced users only.
External users can submit a request for the Genome Center to perform the Bioanalyzer for them. This service is $30 per sample.
Custom projects/self-prepared libraries
What should I do if my request is not RNA-seq/exome sequencing/whole genome sequencing?
If you are interested in ChIP-seq, panel sequencing, or other applications for which we don’t have a full service, we recommend preparing your own libraries with Illumina platform-compatible protocols, and we will be happy to sequence them for you. Please see below for self-prepared library requirements.
What should I do with self-prepared libraries?
You will need to quantify self-prepared libraries correctly, as this is critical for the success of a sequencing run. We recommend using qPCR, or calculating the molarity from the average fragment length reported by the Bioanalyzer combined with a fluorometric concentration reading (Qubit or PicoGreen).
We require at least a 10 nM pooled library in 10 μl water. We will load the samples based on your quantification. We require all index information at the time of submission (name of index and sequence) and a Bioanalyzer report of your pooled library. Please complete the Self-Prepared Library Sample Form and send it to email@example.com.
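The molarity calculation from a fluorometric reading and an average fragment length can be sketched as follows. It uses the common approximation of 660 g/mol per base pair of double-stranded DNA. The function names are hypothetical, and this is not necessarily the exact calculation used by the Genome Center.

```python
AVG_MASS_PER_BP = 660.0   # g/mol per base pair of dsDNA (standard approximation)
MIN_POOL_NM = 10.0        # minimum pooled library concentration stated above


def library_molarity_nm(conc_ng_per_ul: float, avg_fragment_bp: float) -> float:
    """Convert a mass concentration (ng/ul) to molarity (nM) for dsDNA."""
    if conc_ng_per_ul < 0 or avg_fragment_bp <= 0:
        raise ValueError("concentration must be >= 0 and fragment length > 0")
    # nM = (ng/ul * 1e6) / (660 g/mol/bp * fragment length in bp)
    return conc_ng_per_ul * 1e6 / (AVG_MASS_PER_BP * avg_fragment_bp)


def meets_minimum(conc_ng_per_ul: float, avg_fragment_bp: float) -> bool:
    """Check the pooled library against the 10 nM minimum."""
    return library_molarity_nm(conc_ng_per_ul, avg_fragment_bp) >= MIN_POOL_NM


if __name__ == "__main__":
    # Example: 3.3 ng/ul (Qubit) with a 400 bp average fragment (Bioanalyzer).
    print(round(library_molarity_nm(3.3, 400), 2), "nM")
    print(meets_minimum(3.3, 400))
```

With these example values the pool works out to roughly 12.5 nM, which is above the 10 nM minimum. A pool at 1.32 ng/μl with the same fragment length would be about 5 nM and would need to be concentrated before submission.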
How can I make use of your NextSeq service?
You have two options for accessing the Columbia Genome Center's NextSeq capabilities.
1. Have Columbia Genome Center staff run your sequencing.
We will charge you for what we pay for the kit plus labor (see Pricing).
We require at least a 10 nM pooled library in 10 μl water. We will load the samples based on your quantification. We require all index information at the time of submission (name of index and sequence) and a Bioanalyzer report of your pooled library.
You can create a run on BaseSpace and share the login and password information with us, which allows you to download your data directly from BaseSpace. We can also run the data through our pipeline and provide basic data analysis.
2. Become a self-user of NextSeq instruments.
To become a NextSeq user, please:
- Review Illumina’s online tutorial videos and user guides.
- Create or become familiar with your lab’s BaseSpace account.
- Observe another self-user setting up a run. (If you do not have a lab member to observe, please contact us.)
- Let us know when your first run is. If you are unassisted, we will help you set up the instrument. Please contact Illumina for BaseSpace set-up and support (firstname.lastname@example.org or 1-800-809-4566).
Once you have successfully completed your first run, we will add you to our user list and establish key-card access to the sequencing room.
What is your turnaround time?
Over the past year, our average turnaround time for standard RNA-seq has been 2-3 weeks from the time we receive your samples to the time we deliver the data.
If you need a project delivered by a certain date, please contact us and we can try to accommodate you to the best of our abilities. Please let us know this at the time of submission so we can plan properly.
What are your deliverables?
RNA-seq
FASTQ, BAM, counts table, and (if sample group information is given) a list of differentially expressed genes
Whole exome sequencing (human)
FASTQ, BAM, and VCF (germline variants)
Whole genome sequencing (human)
FASTQ and BAM
How do you release the data? How long will you store our data? Can we get our samples back?
We will give you a web link to download the data. You are also welcome to bring over a hard drive and we can transfer the data to the drive.
Data will be stored on our server for 1 month before it is deleted permanently.
Remaining samples will be stored for 1 week before being discarded. Your data release email will include a note indicating that you can pick up any remaining sample material. We will store libraries for 6 months, after which time they will be discarded without warning.