Ingenio NGS Run Analysis Report
Section 1: Workflow Summary
Samples were processed according to the workflow below. Poly(A) selection was used to isolate mRNA from the extracted RNA for reverse transcription to cDNA. Samples were ligated with Illumina TruSeq RNA CD barcode adapters and assessed for quality on an Agilent TapeStation and a Thermo Fisher Qubit prior to sequencing.
Figure 1.1: RNA Sequencing Process
Section 2: Sequencing Overview
Following the wet-lab workflow, samples were sequenced on an Illumina NextSeq 550 instrument and demultiplexed into FASTQ files using Local Run Manager bcl2fastq v1.8.4. FASTQ files were assessed for read type, quantity, and quality before proceeding to downstream analysis.
Sample ID | Total Read Pairs | Total Bases | Unique Exonic Rate | Bases >Q30 Rate
Sample1A  | 32441484         | 7438531874  | 68.1%              | 93.4%
Sample1B  | 32010165         | 7342693975  | 61.1%              | 93.8%
Sample2A  | 28250897         | 6738027868  | 75.4%              | 94%
Sample2B  | 28718004         | 6375599965  | 42.6%              | 92.8%
Sample3A  | 26598050         | 6347449662  | 74%                | 93.9%
Sample3B  | 27951890         | 6622742671  | 71.3%              | 94%
Sample4A  | 27237497         | 6357419924  | 67.3%              | 94%
Sample4B  | 30260137         | 6956166950  | 67.2%              | 94.4%
Table 2.1: Sample Sequencing Statistics.
Figure 2.1: Breakdown of RNA types, as well as a measure of contamination rate (rRNA).
Section 3: Sample Quality Review
FastQC v0.11.5 was run to assess sample quality and produce reviewable run metrics. The charts below summarize the Phred quality score along the length of the reads, the distribution of Phred quality scores per sample, and the distribution of GC content (%) per sample.
Figure 3.1: Per Base Sequence Quality.
Figure 3.2: Per Sequence Quality Scores.
Figure 3.3: Per Sequence GC Content.
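The metrics charted above can be illustrated with a minimal Python sketch. It is not the FastQC implementation: it assumes Phred+33 quality encoding (the standard for current Illumina FASTQ output), treats a base as passing when its score is at or above Q30 (an assumption about how the "Bases >Q30 Rate" column is defined), and uses hypothetical function names.

```python
def phred_scores(quality_line, offset=33):
    """Decode a FASTQ quality string into integer Phred scores.

    Assumes Phred+33 encoding; the offset is a parameter in case it differs.
    """
    return [ord(ch) - offset for ch in quality_line]


def q30_rate(quality_lines, threshold=30):
    """Fraction of all bases whose Phred score is >= threshold."""
    total = 0
    passing = 0
    for line in quality_lines:
        scores = phred_scores(line)
        total += len(scores)
        passing += sum(1 for s in scores if s >= threshold)
    return passing / total if total else 0.0


def gc_fraction(sequence):
    """Fraction of called bases (A, C, G, T) that are G or C; N is ignored."""
    seq = sequence.upper()
    called = sum(1 for base in seq if base in "ACGT")
    gc = sum(1 for base in seq if base in "GC")
    return gc / called if called else 0.0


if __name__ == "__main__":
    # Toy read, not taken from the sequencing run.
    print(phred_scores("II5?"))
    print(q30_rate(["II5?"]))
    print(gc_fraction("ACGTN"))
```

A Phred score Q corresponds to an estimated base-call error probability of 10^(-Q/10), so Q30 means roughly one expected error per 1,000 bases.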
Section 4: Alignment Overview
Using Cutadapt v3.2, low-quality reads and reads that were too short were removed, and the remaining reads were trimmed for Illumina read-end issues. Trimmed reads were aligned to the Mus musculus GRCm39 reference genome with STAR v2.7.8a. The resulting BAM (Binary Alignment Map) files were assessed for alignment metrics.
Sample ID | Total Mapped Pairs | MEND Pairs | Duplicate Mapping Rate | Alternative Alignments | Genes Detected
Sample1A  | 24920790           | 11013369   | 35.1%                  | 5327953                | 19610
Sample1B  | 24593638           | 10362724   | 31.1%                  | 6061740                | 17850
Sample2A  | 22551624           | 12235349   | 28%                    | 4645845                | 17408
Sample2B  | 21359721           | 6712908    | 26.2%                  | 7065037                | 23006
Sample3A  | 21250632           | 11475450   | 27%                    | 4649572                | 17220
Sample3B  | 22168051           | 11377332   | 28%                    | 4593455                | 19875
Sample4A  | 21299899           | 10252892   | 28.5%                  | 4575012                | 20058
Sample4B  | 23292745           | 10572972   | 32.4%                  | 6327077                | 17435
Table 4.1: Alignment Statistics.
Figure 4.1: Read QC per sample.
Figure 4.2: Mapping rate per sample.
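One way to relate Table 4.1 to Table 2.1 is to divide Total Mapped Pairs by Total Read Pairs for each sample. The Python sketch below does this with the values copied from the two tables. It assumes this ratio is a reasonable proxy for a per-sample mapping rate; the report does not state how the rate shown in Figure 4.2 was calculated, and because reads were filtered and trimmed before alignment, the denominator used there may differ.

```python
# Total Read Pairs, copied from Table 2.1.
total_read_pairs = {
    "Sample1A": 32441484,
    "Sample1B": 32010165,
    "Sample2A": 28250897,
    "Sample2B": 28718004,
    "Sample3A": 26598050,
    "Sample3B": 27951890,
    "Sample4A": 27237497,
    "Sample4B": 30260137,
}

# Total Mapped Pairs, copied from Table 4.1.
total_mapped_pairs = {
    "Sample1A": 24920790,
    "Sample1B": 24593638,
    "Sample2A": 22551624,
    "Sample2B": 21359721,
    "Sample3A": 21250632,
    "Sample3B": 22168051,
    "Sample4A": 21299899,
    "Sample4B": 23292745,
}


def mapping_rate(mapped_pairs, read_pairs):
    """Mapped pairs as a fraction of sequenced pairs (assumed definition)."""
    return mapped_pairs / read_pairs


# Per-sample rates, keyed by sample ID.
mapping_rates = {
    sample: mapping_rate(total_mapped_pairs[sample], pairs)
    for sample, pairs in total_read_pairs.items()
}

if __name__ == "__main__":
    for sample, rate in mapping_rates.items():
        print(f"{sample}: {rate:.1%}")
```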
Section 5: Differential Expression Overview
Following alignment review, gene-level read counts were generated with featureCounts v1.6. Samples were grouped according to the provided specifications and compared for differences in expression with DESeq2 v1.2.4.0. Only genes with an adjusted p-value (p.adj) < 0.05 and |log2FoldChange| > 1 were considered significant. Additional enrichment and pathway analyses were performed to assess larger-scale functional changes.
Comparison           | Significant Genes | Significant Ontologies | Significant KEGG Pathways
Treatment vs Control | 22551             | 673                    | 17
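The significance rule stated in this section (p.adj < 0.05 and |log2FoldChange| > 1) can be sketched in Python as follows. This is an illustration of the filter only, not the pipeline's code: the `GeneResult` class, the field names, and the gene entries are hypothetical, and a missing adjusted p-value (which DESeq2 can report as NA) is treated here as not significant.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GeneResult:
    gene: str                  # hypothetical gene identifier
    log2_fold_change: float
    padj: Optional[float]      # None stands in for an NA adjusted p-value


def is_significant(result, padj_cutoff=0.05, lfc_cutoff=1.0):
    """Apply the report's thresholds: padj < 0.05 and |log2FoldChange| > 1."""
    if result.padj is None:
        return False
    return result.padj < padj_cutoff and abs(result.log2_fold_change) > lfc_cutoff


def significant_genes(results: List[GeneResult]) -> List[str]:
    """Return the identifiers of genes passing both thresholds."""
    return [r.gene for r in results if is_significant(r)]


if __name__ == "__main__":
    # Made-up example rows, not results from this run.
    example = [
        GeneResult("geneA", 2.3, 0.001),    # passes both thresholds
        GeneResult("geneB", -1.8, 0.020),   # passes; down-regulated
        GeneResult("geneC", 0.4, 0.0001),   # fold change too small
        GeneResult("geneD", 3.1, 0.200),    # padj too high
        GeneResult("geneE", 2.0, None),     # padj missing
    ]
    print(significant_genes(example))
```

Both comparisons are strict, so a gene with |log2FoldChange| exactly equal to 1 (a two-fold change) is excluded under this reading of the thresholds.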