Sequencing Data

Each RNA application is offered with a standard sequencing length and depth optimized for that application. The read length and number of reads are chosen to maximize the quality of the resulting data analysis, allow detection of lowly expressed transcripts, and give a complete and accurate view of the state of RNA in a given sample. Applications that produce more complex libraries require additional sequencing. For example, a poly(A)-selected RNA-Seq experiment will generate good coverage of mainly mature mRNAs with only 25M PE reads. An rRNA reduction experiment, however, captures non-coding RNAs in addition to poly(A) mRNAs and therefore requires greater sequencing depth to provide adequate coverage of all the RNA species represented in the library. Please refer to the table below for Discovery-recommended sequencing conditions.

Sequencing Application	Standard Sequencing Condition	Standard Number of Reads
RNA-Seq (standard or directional) with poly(A) selection	50bp PE or 100bp PE	25M PE
RNA-Seq (standard or directional) with rRNA reduction	50bp PE or 100bp PE	50M PE
Low-input RNA-Seq	50bp PE or 100bp PE	50M PE
Small RNA-Seq	50bp SE	15M SE

Discovery RNA-Seq Data

All demultiplexing (i.e. the sorting of indexed reads) is included in the cost of basic RNA-Seq, with users receiving fastq files for each sample. Additional fees are charged for analysis, including alignment. See Discovery Data Analysis Services, below.

Discovery miRNA-Seq Data

All demultiplexing (i.e. the sorting of indexed reads) is included in the cost of basic RNA-Seq, with users receiving .fastq files for each sample. Small RNA samples are run under single-end, 50bp conditions. Additional fees are charged for analysis, including trimming, alignment, and interpretation of results.

MicroRNA sequencing data require special handling. The fragments being sequenced are 15-25bp long, but each read in the fastq file is 50bp. Each read therefore contains adapter sequence after the miRNA sequence, because the sequencer reads through the fragment and into the adapter at the opposite end. The adapter used in library construction is as follows:

Illumina miRNA 3'RC
gtgactggagttcagacgtgtgctcttccgatct

It will be read off the sequencer as the reverse complement, which is:

agatcggaagagcacacgtctgaactccagtcac
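
The relationship between the two sequences above can be checked with a few lines of Python. This is only a minimal sketch: the function and variable names are illustrative and are not part of any Discovery pipeline.

```python
# Translation table mapping each DNA base to its complement (both cases).
_COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


# Adapter as listed under "Illumina miRNA 3'RC".
adapter_3p_rc = "gtgactggagttcagacgtgtgctcttccgatct"

# Sequence as it appears at the 3' end of the reads.
adapter_as_read = reverse_complement(adapter_3p_rc)
print(adapter_as_read)  # agatcggaagagcacacgtctgaactccagtcac
```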

The reverse-complement adapter sequence should be trimmed from the reads in the fastq file to clean the data. A list of trimming software can be found here. The first 4-5 bp of the adapter sequence may vary slightly in the reads. They should always be AGATC, but sequencing errors are especially frequent in cycles where a large majority of the clusters show the same base, as might happen if most of the fragments were 15bp long. It is therefore also appropriate to locate the GGAAG that follows these bases and trim from 5bp before the GGAAG through the end of the read.
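
The GGAAG-anchored approach can be sketched in Python as follows. This is an illustration under stated assumptions, not a replacement for dedicated trimming software: the function names are hypothetical, and the search starts only where the anchor could sit behind an insert of at least 15bp (the shortest fragment length mentioned above), so that a GGAAG inside the miRNA itself is less likely to be mistaken for the adapter.

```python
from typing import Optional, Tuple

ANCHOR = "GGAAG"  # adapter bases 6-10, directly after the error-prone AGATC
LEAD = 5          # adapter bases (nominally AGATC) preceding the anchor
MIN_INSERT = 15   # assumption: shortest expected miRNA fragment, in bp


def adapter_start(seq: str, min_insert: int = MIN_INSERT) -> Optional[int]:
    """Return the index where the adapter begins, or None if no anchor is found."""
    # The anchor cannot start before min_insert + LEAD if the insert is long enough.
    pos = seq.upper().find(ANCHOR, min_insert + LEAD)
    if pos == -1:
        return None
    # Cut 5 bp before GGAAG so the (possibly miscalled) AGATC is removed too.
    return pos - LEAD


def trim_record(seq: str, qual: str) -> Tuple[str, str]:
    """Trim a read and its quality string at the adapter start."""
    cut = adapter_start(seq)
    if cut is None:
        return seq, qual  # no adapter found; leave the record unchanged
    return seq[:cut], qual[:cut]


# Illustrative 22 nt insert followed by the adapter, truncated to a 50 bp read.
insert = "TGAGGTAGTAGGTTGTATAGTT"
adapter = "AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC"
read = (insert + adapter)[:50]
trimmed_seq, trimmed_qual = trim_record(read, "I" * len(read))
print(trimmed_seq)
```

Because the cut position is defined relative to GGAAG, a read whose first adapter bases were miscalled (for example AGTTC instead of AGATC) is trimmed at the same place. Inserts shorter than the assumed minimum, or reads with errors inside GGAAG itself, would be missed by this sketch.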

Discovery Data Analysis Services

Discovery Genomics offers its customers both basic and advanced data analysis for RNA-Seq, miRNA-Seq and ChIP-Seq experiments. Commercial software packages are used to intensively analyze NGS data, and results are provided in customer-determined formats and figures. All analysis begins with a one-on-one discussion between our analysis team and the customer to better understand the experimental set-up, objectives and requirements. The basic analysis is available as fee-for-service, with customers paying for staff time as well as computational time. This includes importing data, quality control estimation, mapping and alignment, filtering, grouping samples (if applicable) and differential expression analysis. Any further advanced analysis requires much more intensive data mining and thus becomes collaborative, with customers acknowledging staff members' contributions and expertise with appropriate authorship on manuscripts. Fees for basic analysis will still be applied for all collaborative projects.