Whole Genome Sequencing Service

Human whole genome sequencing enables researchers to catalog the genetic constitution of individuals and capture all the variants present in a single assay. It is applied to the study of cancer and a variety of other diseases, as well as to human population evolution studies and pharmacogenomics.

Equipped with 30 Illumina HiSeq X systems, Novogene can sequence up to 54,000 human genomes per year at the lowest cost per genome. As one of the first companies to adopt the HiSeq X Ten, in early 2014, we have extensive experience providing whole genome sequencing service on this system and have successfully sequenced tens of thousands of genomes with high-quality results. The throughput and capacity of the HiSeq X Ten, our deep experience with the system, and our advanced bioinformatics capabilities allow Novogene to execute large projects with timely turnaround and the highest-quality results.

“We did whole genome sequencing plus advanced bioinformatics analysis using Novogene earlier this year. I am very satisfied with their professional, skillful and high-quality services. In particular, the bioinformatics knowledge and expertise in the project team were very impressive. They provided great support, rapid turn-around, and pricing that enables us to do more science with our limited budget. I really appreciate their willingness and ability to explore more advanced bioinformatics analysis to meet the specific requirements of my projects. With these in mind, I have initiated two additional RNA-Seq projects with them and recommended two collaborators to use Novogene’s sequencing service.”

Wenhui Hu, M.D., Ph.D.
Associate Professor, Department of Neuroscience, Temple University School of Medicine, Philadelphia, USA

“I am extremely satisfied with the quality and turn-around of the WGS results Novogene delivered. They have outstanding informatics/analysis, highly responsive and effective support, advanced Illumina technology (such as the X Ten), all at highly competitive prices.”

Justin Loe
CEO, Full Genomes Corporation, Maryland, USA

The Novogene Advantage

  • State-of-the-art NGS technologies: Novogene is a world leader in sequencing capacity using state-of-the-art technology, including 2 sets of the latest generation Illumina HiSeq X Ten systems.
  • Highest data quality: We guarantee a Q30 score ≥ 80%, exceeding Illumina’s official guarantee of ≥75%. See our data example.
  • Extraordinary informatics expertise: Novogene uses its cutting-edge bioinformatics pipeline and internationally recognized best-in-class software to provide customers highly reliable “publication-ready data”.

Project Workflow

Figure. Human whole genome sequencing service workflow

Sequencing Strategy

  • 350 bp insert DNA library
  • HiSeq X platform, paired-end 150 bp

Data Quality Guarantee

  • We guarantee that ≥ 80% of bases have a sequencing quality score ≥ Q30, which exceeds Illumina’s official guarantee of ≥ 75%.
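
As a rough illustration of what the Q30 guarantee measures: a Phred quality score Q corresponds to a base-call error probability of 10^(-Q/10), so Q30 means an estimated error rate of 1 in 1,000. The sketch below computes the fraction of bases at or above Q30 from FASTQ quality strings. It assumes Phred+33 encoding, and the function names are hypothetical; it is not Novogene's QC pipeline.

```python
def phred_to_error_prob(q):
    """Convert a Phred quality score to an estimated base-call error probability."""
    return 10 ** (-q / 10)


def fraction_at_or_above(quality_strings, threshold=30, offset=33):
    """Fraction of bases whose Phred score is >= threshold.

    quality_strings: iterable of FASTQ quality lines.
    offset: ASCII offset of the encoding (assumption: Phred+33).
    """
    total = 0
    passing = 0
    for qual in quality_strings:
        for ch in qual:
            total += 1
            if ord(ch) - offset >= threshold:
                passing += 1
    return passing / total if total else 0.0


def meets_q30_guarantee(quality_strings, min_fraction=0.80):
    """True if at least min_fraction of bases reach Q30 (the 80% threshold stated above)."""
    return fraction_at_or_above(quality_strings) >= min_fraction
```

In practice the quality strings would be read from every fourth line of a FASTQ file; here they are passed in directly to keep the example self-contained.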

Sample Requirements

  • Input DNA:
    • For fresh sample: ≥ 1.0 μg (a minimum of 200 ng can be accepted with risk)
    • For FFPE sample: ≥ 1.5 μg
  • DNA concentration: ≥ 20 ng/μl
  • DNA volume: ≥ 10 μl
  • Purity: OD260/280 = 1.8 - 2.0 without degradation or RNA contamination
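
The requirements above can be expressed as a simple pre-submission check. This is only a sketch: the function name and message wording are hypothetical, and degradation or RNA contamination cannot be judged from these numbers alone.

```python
def check_sample(total_ug, conc_ng_per_ul, volume_ul, od_260_280, ffpe=False):
    """Return a list of issues; an empty list means the stated thresholds are met.

    Thresholds are taken from the sample requirements listed above.
    """
    issues = []
    if ffpe:
        if total_ug < 1.5:
            issues.append("FFPE input DNA below 1.5 ug")
    elif total_ug < 0.2:
        issues.append("fresh input DNA below the 200 ng minimum")
    elif total_ug < 1.0:
        issues.append("fresh input DNA below 1.0 ug: accepted with risk")
    if conc_ng_per_ul < 20:
        issues.append("concentration below 20 ng/ul")
    if volume_ul < 10:
        issues.append("volume below 10 ul")
    if not 1.8 <= od_260_280 <= 2.0:
        issues.append("OD260/280 outside 1.8-2.0")
    return issues
```

For example, check_sample(0.5, 25, 20, 1.9) reports only the "accepted with risk" note, since 0.5 μg lies between the 200 ng minimum and the 1.0 μg recommendation.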

Turnaround Time

  • 15 working days after verification of sample quality (without data analysis)
  • Additional 8 working days for data analysis

Recommended Sequencing Depth

  • For tumor tissues: 50×; for adjacent normal tissues and blood: 30×
  • For rare diseases: 30-50×
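
To get a feel for what these depths mean in raw data, average depth is roughly total sequenced bases divided by genome size. The sketch below estimates the data volume and the number of 150 bp read pairs for a target depth. The genome size of 3.1 Gb is an approximation assumed for illustration, and the function names are hypothetical; real yields also depend on duplicates and on reads lost in QC.

```python
GENOME_SIZE_BP = 3_100_000_000  # assumption: approximate human genome size
READ_LENGTH_BP = 150            # paired-end 150 bp, as in the sequencing strategy


def required_gigabases(depth, genome_size=GENOME_SIZE_BP):
    """Raw bases (in Gb) needed for a given average depth."""
    return depth * genome_size / 1e9


def required_read_pairs(depth, read_length=READ_LENGTH_BP, genome_size=GENOME_SIZE_BP):
    """Number of read pairs needed, rounded up; each pair contributes 2 * read_length bases."""
    bases_needed = depth * genome_size
    bases_per_pair = 2 * read_length
    return -(-bases_needed // bases_per_pair)  # integer ceiling division
```

Under these assumptions, 30× corresponds to about 93 Gb of raw data and 50× to about 155 Gb.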

Analysis Pipeline

Figure. Human whole genome sequencing analysis pipeline

Bioinformatics Analysis includes:

  • Data quality control: filtering out reads containing adapters or with low quality
  • Alignment with reference genome, statistics of sequencing depth and coverage
  • SNP/InDel/SV/CNV calling, annotation and statistics
  • Somatic SNP/InDel/SV/CNV calling, annotation and statistics (paired tumor samples)

Advanced Analysis

Monogenic disorders

1. Variant filtering
2. Analysis under dominant/recessive model (Pedigree information is needed)
2.1 Analysis under dominant model
2.2 Analysis under recessive model
3. Functional annotation of candidate genes
4. Pathway enrichment analysis of candidate genes
5. Linkage analysis
6. Regions of homozygosity (ROH) analysis

Complex/multifactorial disorders

1. Variant filtering
2. Analysis under dominant/recessive model (Pedigree information is needed)
2.1 Analysis under dominant model
2.2 Analysis under recessive model
3. Functional annotation of candidate genes
4. Pathway enrichment analysis of candidate genes
5. De novo mutation analysis (Trio/Quartet)
5.1 De novo SNP/InDel detection
5.2 Calculation of de novo mutation rates
5.3 De novo CNV/SV detection
6. Protein-protein interaction (PPI) analysis
7. Association analysis of candidate genes (at least 20 trios or case/control pairs)
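
As a simplified illustration of the trio-based de novo analysis in step 5, the sketch below flags sites where the child carries an allele seen in neither parent, and converts a count of such events into a per-base, per-generation rate. The data layout and function names are hypothetical, and the rate formula (count divided by twice the number of callable bases, for a diploid genome) is a common convention assumed here rather than Novogene's documented method. Production pipelines also apply genotype-quality and depth filters that are omitted.

```python
def is_de_novo(child, mother, father):
    """True if the child has an allele that neither parent carries.

    Each genotype is a tuple of two alleles, e.g. ("A", "G").
    """
    parental_alleles = set(mother) | set(father)
    return any(allele not in parental_alleles for allele in child)


def find_de_novo(sites):
    """Return the names of sites whose trio genotypes look de novo.

    sites: list of dicts with keys "site", "child", "mother", "father".
    """
    return [
        s["site"]
        for s in sites
        if is_de_novo(s["child"], s["mother"], s["father"])
    ]


def de_novo_rate(n_de_novo, callable_bases):
    """Mutations per base per generation, assuming a diploid genome."""
    return n_de_novo / (2 * callable_bases)


# Toy example with made-up genotypes.
trio_sites = [
    {"site": "chr1:1000", "child": ("A", "G"), "mother": ("A", "A"), "father": ("A", "A")},
    {"site": "chr1:2000", "child": ("C", "T"), "mother": ("C", "T"), "father": ("C", "C")},
]
candidates = find_de_novo(trio_sites)
```

Only the first toy site is flagged, because the child's G allele is absent from both parents; at the second site the T allele is inherited from the mother.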

Cancer (for tumor-normal pair samples)

1. Screening for predisposing genes
2. Mutation spectrum & mutation signature analyses
3. Screening for known driver genes
4. Analyses of tumor significantly mutated genes
5. Analysis of copy number variations (CNV)
5.1 Analysis of CNV distribution
5.2 Analysis of CNV recurrence
6. Fusion gene detection (for WGS project only)
7. Purity & ploidy analyses of tumor samples
8. Tumor heterogeneity analyses
9. Tumor evolution analysis
10. Display of genomic variants with Circos

Novogene provides the highest quality NGS services. We guarantee that over 80% of bases will have a sequencing quality score ≥Q30. In standard practice, Novogene achieves an average Q30 of 87.89% for WGS, exceeding Illumina’s official guarantee of 75%. Additionally, an average of 98.4% of our raw sequencing data passes the quality control standards for effective clean data.

The following table includes data from our whole genome sequencing service projects, and demonstrates the quality of our sequencing. Alignment of the results to the reference genome (UCSC hg19) showed an average mapping ratio of 99.44%.

Table. Representative data of Novogene's human whole genome sequencing service.

 1  Original sequencing data (in gigabases).
 2  Percentage of clean reads from all raw reads.
 3  Average error rate of all bases in read1 and read2.
 4  Percentage of reads with an average quality greater than Q20.
 5  Percentage of reads with an average quality greater than Q30.
 6  Percentage of G and C bases from total bases.
 7  Percentage of total reads that mapped to the reference genome (UCSC hg19).
 8  Average sequencing depth.
 9  Percentage of genome covered by sequencing.
10  Percentage of bases in genome with a sequencing depth ≥ 4x.
11  Percentage of bases in genome with a sequencing depth ≥ 10x.
12  Percentage of bases in genome with a sequencing depth ≥ 20x.
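
The depth and coverage metrics defined in notes 8 to 12 can be computed from per-base depths as in the following sketch. The function name and the input format (a plain list of depths, one per reference position) are assumptions for illustration; real pipelines typically derive these values from alignment files with dedicated tools.

```python
def depth_summary(depths, thresholds=(4, 10, 20)):
    """Summarize per-base sequencing depth.

    depths: list of integer depths, one per reference position.
    Returns average depth, percentage of positions covered at least once,
    and the percentage of positions at or above each threshold.
    """
    n = len(depths)
    if n == 0:
        raise ValueError("depths must not be empty")
    summary = {
        "average_depth": sum(depths) / n,
        "coverage_pct": 100 * sum(1 for d in depths if d > 0) / n,
    }
    for t in thresholds:
        summary[f"pct_ge_{t}x"] = 100 * sum(1 for d in depths if d >= t) / n
    return summary


# Toy example over ten positions.
example = depth_summary([0, 4, 10, 20, 30, 5, 12, 25, 0, 14])
```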

Project Example

The following studies utilized Novogene's expertise in whole genome sequencing on HiSeq X Ten.

Loss of BRCA1 or BRCA2 markedly increases the rate of base substitution mutagenesis and has distinct effects on genomic deletions
Oncogene, 36: 746-755 (2017)

Genome instability, caused by failure of DNA repair and damage to the DNA response system, is a hallmark of cancer. BRCA1 and BRCA2 play important roles in homologous recombination repair. In this study, whole genome sequencing of chicken DT40 cell clones on Novogene’s HiSeq X Ten was used to compare the consequences of loss of BRCA1/2 on genomic mutagenesis, under normal cell culture or under an MMS treatment designed to accelerate one class of endogenous mutagenic processes. Loss of BRCA1 or BRCA2 raised the spontaneous base mutation rate seven- to eightfold, and this increase correlated strongly with a BRCA1/2 mutant signature. Loss of BRCA1 or BRCA2 also induced more insertion/deletion mutations and large rearrangements under the endogenous damage induction condition. The high rate of base substitution mutagenesis demonstrated in this study indicates a significant oncogenic effect of BRCA1/2 inactivation, with distinct roles for BRCA1 and BRCA2 in the processing of DNA lesions.

Figure. Number and spectrum of SNVs

Elimination of HIV-1 genomes from human T-lymphoid cells by CRISPR/Cas9 gene editing
Scientific Reports, 6:22555 (2016); DOI: 10.1038/srep22555.

The power of whole genome sequencing (WGS) as a companion technique to gene editing was demonstrated in this innovative study in which HIV-1 proviral DNA was excised from latently infected human T-cells with the CRISPR/Cas9 system. Using the state-of-the-art HiSeq X Ten platform, scientists at Novogene provided WGS and bioinformatics analysis at key stages in the study. WGS confirmed a known integration site of the HIV-1 genome in an infected T-cell line and revealed a second, previously unknown integration site. WGS also confirmed the complete removal of the HIV-1 genome from T-cells treated with a CRISPR/Cas9 system specifically targeted to viral DNA. Comparison of the genomes of treated and control cells revealed significant natural heterogeneity within both cell populations, but importantly, demonstrated that CRISPR/Cas9 did not generate off-target mutations, supporting the therapeutic potential of gene editing for treating latent HIV-1 infections.


Figure. Whole-genome sequencing shows excision of the entire provirus of two copies of HIV-1 by Cas9/gRNAs and gRNAs A and B in human T cells. An integrative genomics view of the reads mapped by BWA against the HIV-1 genome (KM390026.1) revealed the presence of the HIV-1 proviral DNA sequence in control cells expressing Cas9 but not gRNAs (Panel A) and its complete absence in T cells after expression of both Cas9 and gRNAs A and B (Panel B).

