Genome Sequencing: Defining Your Experiment
Although next-generation sequencing follows certain standard protocols, it can also be customized to address individual research needs, including factors such as the specific questions you are trying to answer, the depth of the sequencing that you require to achieve your research goals, and your budget. When you contact us to receive a price quote, we will ask what read length will be required, whether you will need single-end or paired-end reads, and the necessary depth of coverage.
If you are new to sequencing and have questions about the best way to structure your experiment, please contact us. We are conveniently located in the Ross Berrie Medical Science Pavilion and would be glad to arrange a meeting, discuss your project with you, and recommend the best approach.
Read length
You can specify the read length, that is, the number of base pairs read from each fragment in a single pass. For example, one read might consist of 50 base pairs, 100 base pairs, or more. Longer reads provide more reliable information about where a sequence belongs in the genome. (This helps to address a common challenge in sequencing: the same short sequence can appear in multiple places within a genome.) However, longer reads are usually more expensive to generate.
Single-end vs. paired-end reading
In single-end reading, the sequencer reads each fragment from one end only, generating the sequence of base pairs up to the specified read length. In paired-end reading, it starts at one end, reads in that direction up to the specified read length, and then starts another round of reading from the opposite end of the fragment. Paired-end reading improves the ability to identify the relative positions of various reads in the genome, making it much more effective than single-end reading in resolving structural rearrangements such as gene insertions, deletions, or inversions. It can also improve the assembly of repetitive regions. This degree of accuracy may not be required for all experiments, however, and paired-end reads are more expensive and time-consuming to perform than single-end reads.
Depth of coverage
The depth of coverage is the number of times that a specific genomic site is sequenced during a sequencing run. In exome sequencing, for example, the target might be 60X coverage, meaning that, on average, each targeted base is sequenced 60 times. This does not mean that every targeted base is sequenced exactly 60 times; some segments may be read 100 or more times, while others might be read only once or twice, or not at all. In exome sequencing, our target is that 85% of targeted bases are covered at least 15 times, and 90% of targeted bases are covered at least 10 times. In general, the more times a base is sequenced, the more reliable the data at that position.
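To make the difference between average depth and the share of bases covered at a given depth concrete, here is a minimal Python sketch. The function names, the run size, and the toy depth values are illustrative assumptions, not figures from our pipeline. The estimate is also idealized: it assumes every sequenced base lands on the target and ignores duplicates and unmapped reads, so real coverage will be lower.

```python
def mean_coverage(num_fragments, read_length, target_size, paired_end=False):
    """Idealized average depth: total sequenced bases divided by target size."""
    # A paired-end run produces two reads per fragment, one from each end.
    reads_per_fragment = 2 if paired_end else 1
    total_bases = num_fragments * reads_per_fragment * read_length
    return total_bases / target_size


def fraction_covered(depths, min_depth):
    """Fraction of targeted bases sequenced at least min_depth times."""
    if not depths:
        raise ValueError("depths must not be empty")
    return sum(1 for d in depths if d >= min_depth) / len(depths)


# Hypothetical run: 30 million fragments, 100 bp paired-end reads,
# 50 Mb target region (illustrative numbers only).
print(mean_coverage(30_000_000, 100, 50_000_000, paired_end=True))

# Toy per-base depths for ten targeted bases (made-up values).
toy_depths = [0, 2, 12, 18, 25, 40, 60, 75, 100, 120]
print(fraction_covered(toy_depths, 15))  # share of bases covered at least 15 times
print(fraction_covered(toy_depths, 10))  # share of bases covered at least 10 times
```

In practice, the per-base depths would come from an alignment tool's depth report rather than a hand-written list.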
For RNA-seq, we generally recommend a minimum of 20 million reads per sample. For sequencing projects that require higher accuracy, such as studies of alternative splicing, 40 million to 60 million paired-end reads will provide better results. For more detailed analyses to determine, for example, allele-specific expression or expression of low-abundance transcripts, 60 million to 100 million reads may be required.
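For planning purposes, the RNA-seq guidance above can be written down as a small lookup table. This is only a sketch: the dictionary keys and the function name are hypothetical labels chosen for illustration, and the numbers are the read counts quoted above, in millions of reads per sample.

```python
# Read-count guidance from the text, in millions of reads per sample.
# An upper bound of None means only a minimum is stated.
RNA_SEQ_READS_MILLIONS = {
    "standard": (20, None),
    "alternative_splicing": (40, 60),  # paired-end reads
    "allele_specific_or_low_abundance": (60, 100),
}


def meets_recommendation(goal, reads_millions):
    """Return True if the planned read count reaches the stated minimum."""
    minimum, _upper = RNA_SEQ_READS_MILLIONS[goal]
    return reads_millions >= minimum


print(meets_recommendation("standard", 25))
print(meets_recommendation("alternative_splicing", 30))
```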