Leading Edge Genomic Services & Solutions

Whole Exome Sequencing


Exome Sequencing Service

Exome sequencing provides a cost-effective alternative to whole genome sequencing because it targets only the protein-coding regions of the human genome, which harbor the majority of known disease-related variants. Whether you are studying rare Mendelian disorders, complex diseases, cancer, or human populations, Novogene’s comprehensive human whole exome sequencing service provides a high-quality, affordable, and convenient solution.

Novogene’s bioinformatics analysis includes data QC, mapping to the reference genome, SNP/InDel calling, somatic SNP/InDel calling, statistics, and annotation. The analysis relies on internationally recognized bioinformatics software such as BWA, SAMtools, and GATK.
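The steps above can be sketched as a generic align, sort, index, and call sequence. This is only an illustration of how the named tools are commonly chained, not Novogene's actual pipeline: the function name `build_pipeline`, the file names, the thread count, and the read-group fields are all assumptions. The sketch only builds the argument lists and does not run anything, and it presumes the reference FASTA has already been indexed for each tool.

```python
def build_pipeline(sample, ref_fasta, fastq_r1, fastq_r2, threads=4):
    """Return argument lists for a generic align-sort-index-call workflow.

    Hypothetical helper: it builds commands but does not execute them.
    """
    # A read group is attached at alignment time; GATK tools expect one.
    read_group = f"@RG\\tID:{sample}\\tSM:{sample}\\tPL:ILLUMINA"
    bam = f"{sample}.sorted.bam"
    vcf = f"{sample}.vcf.gz"
    return [
        # 1. Align paired-end reads; BWA writes SAM to standard output.
        ["bwa", "mem", "-t", str(threads), "-R", read_group,
         ref_fasta, fastq_r1, fastq_r2],
        # 2. Sort alignments read from standard input ("-") into a BAM file.
        ["samtools", "sort", "-o", bam, "-"],
        # 3. Index the sorted BAM so it can be queried by region.
        ["samtools", "index", bam],
        # 4. Call germline SNPs and InDels.
        ["gatk", "HaplotypeCaller", "-R", ref_fasta, "-I", bam, "-O", vcf],
    ]


if __name__ == "__main__":
    for step in build_pipeline("sample1", "hg19.fa", "r1.fq.gz", "r2.fq.gz"):
        print(" ".join(step))
```

In practice the output of step 1 would be piped into step 2, and somatic calling on tumor-normal pairs would need a different caller than the germline one shown here.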

In particular, the Novogene bioinformatics pipeline includes annotation with the Exome Aggregation Consortium (ExAC) dataset, which spans 60,706 unrelated individuals sequenced as part of various disease-specific and population genetic studies. This population-scale database greatly facilitates research into disease pathogenesis.

The Novogene Advantage

  • Unsurpassed data quality: We guarantee a Q30 score ≥80%, exceeding Illumina’s official guarantee of ≥75%. See our data example.
  • State-of-the-art exome capture: Agilent SureSelect Human All Exon V6 (58 Mb) is used.
  • Accurate variant calling with longer read length up to 150 bp.
  • Extraordinary informatics expertise: Novogene uses its cutting-edge bioinformatics pipeline and internationally recognized best-in-class software to provide customers with “publication-ready data”.

Project Workflow


Exome Capture

  • Agilent SureSelect Human All Exon V6 Kit

Sequencing Strategy

  • 180~280 bp insert DNA library
  • HiSeq platform, paired-end 150 bp

Data Quality Guarantee

  • We guarantee that ≥ 80% of bases have a sequencing quality score ≥ Q30, which exceeds Illumina’s official guarantee of ≥ 75%.
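To make the guarantee concrete, here is a minimal sketch of how a Q30 percentage can be computed from FASTQ quality strings. It assumes the standard Phred+33 encoding; the function name `q30_fraction` and the example reads are hypothetical, not part of Novogene's QC software.

```python
def q30_fraction(quality_strings, threshold=30, offset=33):
    """Return the fraction of bases whose Phred score is >= threshold.

    Assumes Phred+33 encoding: score = ord(character) - 33.
    """
    total = 0
    passing = 0
    for qual in quality_strings:
        for ch in qual:
            total += 1
            if ord(ch) - offset >= threshold:
                passing += 1
    return passing / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical quality strings: 'I' is Q40, '?' is Q30, '5' is Q20.
    reads = ["IIII?", "II55I"]
    fraction = q30_fraction(reads)
    print(f"Bases at or above Q30: {fraction:.0%}")
    print("Meets the 80% guarantee:", fraction >= 0.80)
```

A Phred score of Q30 corresponds to an expected error probability of 1 in 1,000 for that base call.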

Sample Requirements

  • Input DNA:
    • For fresh samples: ≥ 1.0 μg
    • For FFPE samples: ≥ 1.5 μg
  • DNA concentration: ≥ 20 ng/μl
  • DNA volume: ≥ 20 μl
  • Purity: no degradation or RNA contamination; fragments should be longer than 1,500 bp for FFPE samples
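The quantitative requirements above interact: total input DNA is concentration multiplied by volume, so a sample at exactly the minimum concentration and minimum volume holds only 20 ng/μl × 20 μl = 0.4 μg, which is below the 1.0 μg input requirement. The sketch below checks all three limits together. The function name `check_sample` is hypothetical, and purity and fragment length are not checked.

```python
def check_sample(conc_ng_per_ul, volume_ul, ffpe=False):
    """Return a list of unmet requirements (an empty list means all are met)."""
    min_mass_ug = 1.5 if ffpe else 1.0
    # Total DNA in micrograms = ng/ul * ul / 1000.
    mass_ug = conc_ng_per_ul * volume_ul / 1000.0
    problems = []
    if conc_ng_per_ul < 20:
        problems.append("concentration below 20 ng/ul")
    if volume_ul < 20:
        problems.append("volume below 20 ul")
    if mass_ug < min_mass_ug:
        problems.append(f"total DNA {mass_ug:.2f} ug below {min_mass_ug} ug")
    return problems


if __name__ == "__main__":
    print(check_sample(20, 20))             # minimum concentration and volume
    print(check_sample(50, 30))             # fresh sample
    print(check_sample(50, 25, ffpe=True))  # FFPE sample
```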

Turnaround Time

  • Within 22 working days after verification of sample quality (without data analysis)
  • Additional 5 working days for data analysis

Recommended Sequencing Depth

  • For Mendelian disorders/rare diseases: effective sequencing depth above 50×
  • For tumor samples: effective sequencing depth above 100×
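As a rough planning aid, the recommended depths can be converted into an approximate amount of raw data. The sketch below interprets effective depth as mean depth over the capture target, uses the 58 Mb target size quoted for the capture kit, and assumes that 60% of sequenced bases land on target. That on-target fraction is an illustrative assumption, not a Novogene figure, and losses from duplicates and read filtering are ignored.

```python
def required_raw_gb(target_depth, target_size_mb=58.0, on_target_fraction=0.6):
    """Estimate raw gigabases needed to reach a mean on-target depth.

    on_target_fraction is a hypothetical capture efficiency.
    """
    if not 0 < on_target_fraction <= 1:
        raise ValueError("on_target_fraction must be in (0, 1]")
    on_target_bases = target_depth * target_size_mb * 1e6
    return on_target_bases / on_target_fraction / 1e9


if __name__ == "__main__":
    for label, depth in [("Mendelian/rare disease", 50), ("Tumor", 100)]:
        print(f"{label}: about {required_raw_gb(depth):.1f} Gb for {depth}x")
```

Doubling the target depth doubles the estimate, so tumor samples at 100× need roughly twice the raw data of germline samples at 50× under the same assumptions.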

Analysis Pipeline


Advanced Analysis

Monogenic disorders

1. Variant filtering
2. Analysis under dominant/recessive model (Pedigree information is needed)
   2.1 Analysis under dominant model
   2.2 Analysis under recessive model
3. Functional annotation of candidate genes
4. Pathway enrichment analysis of candidate genes
5. Linkage analysis
6. Regions of homozygosity (ROH) analysis
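Step 2 can be illustrated with a simplified segregation test. The sketch below assumes full penetrance, represents each genotype as an alternate-allele count (0, 1, or 2), and treats the recessive model as homozygous only, so compound heterozygosity is not handled. The function name `fits_model` and the sample names are hypothetical.

```python
def fits_model(genotypes, affected, model):
    """Check whether one variant segregates with disease in a pedigree.

    genotypes: dict mapping sample name to alternate-allele count (0, 1, 2).
    affected: set of affected sample names.
    model: "dominant" or "recessive" (homozygous recessive only).
    """
    if model not in ("dominant", "recessive"):
        raise ValueError(f"unknown model: {model}")
    for sample, alt_count in genotypes.items():
        if model == "dominant":
            predicted_affected = alt_count >= 1
        else:
            predicted_affected = alt_count == 2
        # Under full penetrance, genotype must match disease status exactly.
        if predicted_affected != (sample in affected):
            return False
    return True


if __name__ == "__main__":
    trio = {"father": 1, "mother": 1, "child": 2}
    print(fits_model(trio, {"child"}, "recessive"))
    print(fits_model(trio, {"child"}, "dominant"))
```

This also shows why pedigree information is needed: without knowing who is affected and how samples are related, neither model can be tested.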

Complex/multifactorial disorders

1. Variant filtering
2. Analysis under dominant/recessive model (Pedigree information is needed)
   2.1 Analysis under dominant model
   2.2 Analysis under recessive model
3. Functional annotation of candidate genes
4. Pathway enrichment analysis of candidate genes
5. De novo mutation analysis (Trio/Quartet)
   5.1 De novo SNP/InDel detection
   5.2 Calculation of de novo mutation rates
6. Protein-protein interaction (PPI) analysis
7. Association analysis of candidate genes (at least 20 trios or case/control pairs)
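Step 5 can be sketched for a single trio as follows. A candidate de novo variant is one where the child carries the alternate allele and neither parent does, and a per-base rate divides the count by the number of haploid positions that could be assessed. The data layout, the function names, and the example numbers are hypothetical, and a real analysis would also filter on depth and genotype quality in all three samples.

```python
def find_de_novo(variants):
    """Return IDs of variants present in the child but absent from both parents.

    variants: iterable of dicts with keys 'id', 'child', 'father', 'mother',
    where the last three hold alternate-allele counts (0, 1, or 2).
    """
    return [
        v["id"]
        for v in variants
        if v["child"] > 0 and v["father"] == 0 and v["mother"] == 0
    ]


def de_novo_rate(n_de_novo, callable_bases, ploidy=2):
    """Mutations per base per generation over the callable (assessable) bases."""
    if callable_bases <= 0:
        raise ValueError("callable_bases must be positive")
    return n_de_novo / (ploidy * callable_bases)


if __name__ == "__main__":
    calls = [
        {"id": "var1", "child": 1, "father": 1, "mother": 0},  # inherited
        {"id": "var2", "child": 1, "father": 0, "mother": 0},  # candidate de novo
        {"id": "var3", "child": 0, "father": 0, "mother": 1},  # not transmitted
    ]
    hits = find_de_novo(calls)
    print(hits)
    print(de_novo_rate(len(hits), callable_bases=58_000_000))
```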

Cancer (for tumor-normal pair samples)

1. Screening for predisposing genes
2. Mutation spectrum & mutation signature analyses
3. Screening for known driver genes
4. Analyses of tumor significantly mutated genes
5. Analysis of copy number variations (CNV)
   5.1 Analysis of CNV distribution
   5.2 Analysis of CNV recurrence
6. Fusion gene detection (for WGS projects only)
7. Purity & ploidy analyses of tumor samples
8. Tumor heterogeneity analyses
9. Tumor evolution analysis
10. Display of genomic variants with Circos


Novogene provides the highest quality NGS services. We guarantee that over 80% of bases will have a sequencing quality score ≥ Q30. In standard practice, Novogene achieves an average Q30 of 91.07%, exceeding Illumina’s official guarantee of 75%.

Additionally, an average of 98.3% of our raw sequencing data passes the quality control standards for effective clean data.

The following table includes data from our whole exome sequencing service projects, and demonstrates the quality of our sequencing. Alignment of the results to the reference genome (UCSC hg19) showed an average mapping ratio of 99.71%.

Table. Representative human whole exome sequencing data from Novogene

1 Original sequencing data (in gigabases).
2 Percentage of clean reads from all raw reads.
3 Average error rate of all bases in read1 and read2.
4 Percentage of reads with an average quality greater than Q20.
5 Percentage of reads with an average quality greater than Q30.
6 Percentage of G and C bases from total bases.
7 Percentage of total reads that mapped to the reference genome.
8 Average sequencing depth (times coverage) per reference genome target region.
9 Percentage of target region covered by sequencing.
10 Percentage of bases in target region with a sequencing depth ≥ 4x.
11 Percentage of bases in target region with a sequencing depth ≥ 10x.
12 Percentage of bases in target region with a sequencing depth ≥ 20x.
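The depth and coverage metrics defined in notes 8 to 12 can be computed from a list of per-base depths over the target region. The sketch below assumes that a base counts as covered when its depth is at least 1; the function name `coverage_summary` and the example depths are hypothetical.

```python
def coverage_summary(depths, thresholds=(4, 10, 20)):
    """Summarize per-base sequencing depth over a target region.

    depths: sequence of non-negative integers, one per target base.
    """
    n = len(depths)
    if n == 0:
        raise ValueError("depths must not be empty")
    summary = {
        "average_depth": sum(depths) / n,
        # Assumption: a base is covered if at least one read overlaps it.
        "coverage_pct": 100 * sum(d > 0 for d in depths) / n,
    }
    for t in thresholds:
        summary[f"pct_ge_{t}x"] = 100 * sum(d >= t for d in depths) / n
    return summary


if __name__ == "__main__":
    example = [0, 3, 4, 12, 25, 30, 40, 10, 20, 56]
    for name, value in coverage_summary(example).items():
        print(f"{name}: {value:.1f}")
```

Per-base depths of this kind can be produced by standard tools from a sorted BAM file restricted to the capture target intervals.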

Project Example

The following studies utilized Novogene's expert exome sequencing service.

Single-cell exome sequencing identifies mutations in KCP, LOC440040, and LOC440563 as drivers in renal cell carcinoma stem cells
Cell Research 1-4 (2016)

Renal cell carcinoma (RCC), the most common form of adult kidney cancer, has a low mutation rate. In this study, three novel renal cancer stem cell driver mutations were discovered using Novogene’s advanced single-cell exome sequencing technology. With over 140X coverage, 297 somatic SNVs were found, with 141 of these located in coding regions. Three missense mutations in the loci KCP, LOC440563, and LOC440040 were unique to CD133+ RCC cells and have not been reported in RCC before. This study suggests that these three novel mutations could play significant roles in RCC diagnostics and therapeutic treatment.

 


Figure. Identification of driver genes in renal cell carcinoma stem cells via single-cell exome sequencing.


Simultaneous evolutionary expansion and constraint of genomic heterogeneity in multifocal lung cancer
Nature Communications 8:823 (2017)

Tumors are genetically unstable, which gives them the potential to accumulate novel mutations as they expand; however, their heterogeneity can also be constrained by function as they adapt to environmental pressures. This study explored the simultaneous evolutionary expansion and constraint of tumor genomic heterogeneity in a cohort of multiple synchronous lung cancers (MSLCs). By comparing tumors with matched adjacent normal lung DNA, Novogene’s whole exome sequencing (WES) revealed independent clonality and profound genomic heterogeneity at each multicentric primary tumor, as well as novel mutations with therapeutic potential. Independent validation of oncogenic pathway convergence with whole genome sequencing (WGS) also indicated that selection for functional convergence plays a significant role in constraining new mutations and genomic heterogeneity during oncogenesis. The paper provides insights into using WGS and WES to understand tumor evolution and to find novel, therapeutically targetable mutations for further study.


Figure. Mutational landscape of all 16 sequenced tumor regions. Putative driver genes with somatic mutations were classified according to the functional categories.

Examples of Publications Using Novogene’s Services

Journal | Title
Molecular Neurobiology, 53:5097-5102 (2015) | Identification of a novel mutation in the titin gene in a Chinese family with limb-girdle muscular dystrophy 2J.
Human Molecular Genetics, 25:1875-1884 (2016) | Whole exome sequencing identifies lncRNA GAS8-AS1 and LPAR4 as novel papillary thyroid carcinoma driver alternations.
The Journal of Pathology, 239:72-83 (2016) | Clonality analysis of multifocal papillary thyroid carcinoma by using genetic profiles.
Cell Research, 1-4 (2016) | Single-cell exome sequencing identifies mutations in KCP, LOC440040, and LOC440563 as drivers in renal cell carcinoma stem cells.
Gastroenterology, 153(1):166-177 (2017) | Genetic alterations in esophageal tissues from squamous dysplasia to carcinoma.
Nature Communications, 8:823 (2017) | Simultaneous evolutionary expansion and constraint of genomic heterogeneity in multifocal lung cancer.