Overview

Human whole genome sequencing (hWGS) enables researchers to catalog the genetic constitution of individuals and capture all variants present, including single-nucleotide variations (SNVs), insertions and deletions (InDels), copy number variations (CNVs), and large structural variants (SVs), in a single assay. Equipped with the Illumina NovaSeq 6000 system, Novogene is capable of sequencing up to 280,000 human genomes per year at the lowest cost per genome. With the addition of Oxford Nanopore PromethION and PacBio Sequel systems, Novogene also provides long-read hWGS services that characterize the human genome more completely and accurately and recover reads that short-read sequencing misses, especially in highly polymorphic and highly repetitive regions. With extensive experience in whole genome sequencing and advanced bioinformatics capabilities, Novogene can deliver large projects with quick turnaround times and the highest quality results.
Service Specifications
Applications
- Genetic disease study
- Cancer research
- Human population evolution
- DNA biomarkers
- Pharmacogenomics
Advantages
- State-of-the-art NGS technologies: Novogene is a world leader in sequencing capacity using state-of-the-art technology, including Illumina HiSeq and NovaSeq 6000 Systems.
- Highest data quality: We guarantee a Q30 score ≥ 80%, exceeding Illumina’s official guarantee of ≥ 75%. See our data example.
- Extraordinary informatics expertise: Novogene uses its cutting-edge bioinformatics pipeline and internationally recognized, best-in-class software to provide customers with highly reliable, publication-ready data.
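The Q30 guarantee above refers to the share of sequenced bases with a Phred quality score of at least 30. As a minimal sketch of how that share could be computed, assuming FASTQ quality strings in the standard Phred+33 encoding (the function name and toy reads are hypothetical, not part of Novogene's pipeline):

```python
def q30_fraction(quality_strings, phred_offset=33, threshold=30):
    """Return the fraction of bases whose Phred quality is >= threshold."""
    total = 0
    passing = 0
    for qual in quality_strings:
        for ch in qual:
            total += 1
            # In Phred+33 encoding, quality = ASCII code - 33.
            if ord(ch) - phred_offset >= threshold:
                passing += 1
    return passing / total if total else 0.0


# Toy quality strings (hypothetical): 'I' encodes Q40 and '5' encodes Q20.
reads = ["IIII5", "III55"]
fraction = q30_fraction(reads)

# Compare against the 80% guarantee stated above.
meets_guarantee = fraction >= 0.80
print(f"Q30 fraction: {fraction:.2f}, meets guarantee: {meets_guarantee}")
```

In practice this figure is normally taken from the sequencer's or a QC tool's report rather than recomputed by hand; the sketch only illustrates what the percentage measures.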
Sample Requirements
Platform Type | Sample Type | Amount (Qubit®) | Purity |
Illumina NovaSeq 6000 | Genomic DNA | ≥ 200 ng | OD260/280 = 1.8-2.0 |
Illumina NovaSeq 6000 | Genomic DNA (PCR-free) | ≥ 1.5 μg | OD260/280 = 1.8-2.0 |
Illumina NovaSeq 6000 | Genomic DNA from FFPE | ≥ 0.8 μg | OD260/280 = 1.8-2.0 |
PacBio Sequel I/II | HMW Genomic DNA | ≥ 10 μg (Sequel I); ≥ 30 μg (Sequel II) | OD260/280 = 1.8-2.0; OD260/230 = 2.0-2.2; fragments ≥ 30 Kb (Sequel I) or ≥ 60 Kb (Sequel II) |
Nanopore PromethION | HMW Genomic DNA | ≥ 10 μg | OD260/280 = 1.8-2.0; OD260/230 = 2.0-2.2; fragments ≥ 30 Kb |
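The requirements above can be checked before shipping a sample. The following is a rough sketch only: the minimum amounts are transcribed from the table, the function and dictionary names are hypothetical, and it covers only the amount and OD260/280 criteria (the OD260/230 and fragment-length criteria for long-read platforms are left out for brevity).

```python
# Minimum DNA amounts in micrograms, transcribed from the table above.
MIN_AMOUNT_UG = {
    ("Illumina NovaSeq 6000", "Genomic DNA"): 0.2,  # 200 ng
    ("Illumina NovaSeq 6000", "Genomic DNA (PCR free)"): 1.5,
    ("Illumina NovaSeq 6000", "Genomic DNA from FFPE"): 0.8,
    ("PacBio Sequel I", "HMW Genomic DNA"): 10.0,
    ("PacBio Sequel II", "HMW Genomic DNA"): 30.0,
    ("Nanopore PromethION", "HMW Genomic DNA"): 10.0,
}


def check_sample(platform, sample_type, amount_ug, od260_280):
    """Return a list of problems; an empty list means both checks pass."""
    problems = []
    minimum = MIN_AMOUNT_UG[(platform, sample_type)]
    if amount_ug < minimum:
        problems.append(f"amount {amount_ug} ug is below the {minimum} ug minimum")
    if not 1.8 <= od260_280 <= 2.0:
        problems.append(f"OD260/280 of {od260_280} is outside 1.8-2.0")
    return problems


# A hypothetical sample that fails both checks for PacBio Sequel II.
print(check_sample("PacBio Sequel II", "HMW Genomic DNA", 12.0, 1.7))
```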
Sequencing Parameters and Analysis Contents
Platform Type | Illumina NovaSeq 6000 | PacBio Sequel I/II | Nanopore PromethION |
Read Length | Paired-end 150 bp | Average > 10 Kb (Sequel I); average > 15 Kb (Sequel II) | Average > 17 Kb |
Recommended Sequencing Depth | Rare diseases: 30-50×; tumor tissues: 50×; adjacent normal tissues and blood: 30× | Genetic diseases: 10-20×; tumor tissues: ≥ 20× | Genetic diseases: 10-20×; tumor tissues: ≥ 20× |
Standard Data Analysis | Data quality control; alignment with reference genome; SNP/InDel/SV/CNV detection; somatic SNP/InDel/SV/CNV detection (tumor-normal paired samples) | Data quality control; sequence alignment; structural variant (SV) detection; variation annotation | Data quality control; sequence alignment; structural variant (SV) detection; variation annotation |
Note: For detailed information, please refer to the Service Specifications and contact us for customized requests.
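To give a sense of what the recommended depths mean in raw data, the sketch below estimates how many paired-end 150 bp read pairs a target depth requires, using depth = total sequenced bases / genome size. The genome size of 3.1 Gb is an assumption for illustration, and the function name is hypothetical; real projects also lose some reads to duplicates and filtering, which this ignores.

```python
import math


def read_pairs_for_depth(depth, genome_size_bp=3_100_000_000, read_length_bp=150):
    """Estimate the read pairs needed to reach an average depth.

    Each pair contributes 2 * read_length_bp bases.
    genome_size_bp defaults to an assumed 3.1 Gb human genome.
    """
    bases_needed = depth * genome_size_bp
    return math.ceil(bases_needed / (2 * read_length_bp))


# Depths taken from the Illumina column of the table above.
for depth in (30, 50):
    pairs = read_pairs_for_depth(depth)
    gigabases = depth * 3_100_000_000 / 1e9
    print(f"{depth}x: about {pairs:,} read pairs ({gigabases:.0f} Gb of raw data)")
```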
Project Workflow


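Genomic sequencing identifies WNK2 as a driver in hepatocellular carcinoma and a risk factor for early recurrence (Zhou et al., 2019)
Sampling & Sequencing Strategy: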
Sampling:
• 182 Chinese primary HCC samples
Sequencing Strategy:
• Human whole genome sequencing (49 cases), whole exome sequencing (18 cases), and targeted region sequencing (115 cases) on Illumina platforms (PE150)
Results & Conclusion
Using WGS, this study described the genomic landscape, including somatic SNVs/InDels, CNVs, and SVs, and identified five prominent mutational signatures in 49 Chinese patients with HCC (Figure 3). WGS, WES, and targeted sequencing of 182 primary HCC samples suggested that WNK2, RUNX1T1, CTNNB1, TSC1, and TP53 may play roles in HCC invasion and metastasis, with WNK2 showing the most significant difference in mutation frequency (Figure 4). Functional investigations revealed a tumor-suppressor role for WNK2: its inactivation led to ERK1/2 signaling activation in HCC cells, tumor-associated macrophage infiltration, and tumor growth and metastasis. The study describes the genomic events that characterize Chinese HCCs and identifies WNK2 as a driver of HCC associated with early tumor recurrence after curative resection.
Reference: Zhou SL, Zhou ZJ, Hu ZQ, et al. Genomic Sequencing Identifies WNK2 as a Driver in Hepatocellular Carcinoma and a Risk Factor for Early Recurrence[J]. Journal of Hepatology 2019, doi: 10.1016/j.jhep.2019.07.014.
Characteristics of genomic alterations of lung adenocarcinoma in young never-smokers (Luo et al., 2018)
Background:
Non-small-cell lung cancer (NSCLC) has been recognized as a highly heterogeneous disease with phenotypic and genotypic diversity in each subgroup. While never-smoker patients with NSCLC have been well studied through next-generation sequencing, the potentially unique molecular features of young never-smoker patients with NSCLC remain largely unknown.
Sampling & Sequencing Strategy:
Sampling:
• 36 never-smoker patients with lung adenocarcinoma (LUAD)
Sequencing Strategy:
• Human whole genome sequencing on Illumina platform (PE150)
Results & Conclusion
Besides the well-known gene mutations, the study identified several potential lung cancer-associated gene mutations that have rarely been reported (e.g., HOXA4 and MST1). Lung cancer-related copy number variations (e.g., EGFR and CDKN2A) were enriched, and lung cancer-related structural variations (e.g., EML4-ALK and KIF5B-RET) were commonly observed. Notably, new fusion partners of ALK (SMG6-ALK) and RET (JMJD1C-RET) were found. Furthermore, a high prevalence of potentially targetable genomic alterations was observed in the cohort. Finally, germline mutations in BPIFB1, CHD4, PARP1, NUDT1, RAD52, and MFI2 were significantly enriched in the young never-smoker patients with LUAD compared with the in-house noncancer database (p<0.05). This study provides a detailed mutational portrait of LUAD occurring in young never-smokers and gives insights into the molecular pathogenesis of this distinct subgroup of NSCLC.
Reference: Luo WX, Tian PW, Wang Y, et al. Characteristics of genomic alterations of lung adenocarcinoma in young never-smokers[J]. International Journal of Cancer, 2018, 143, 1696‒1705.
Genetic alterations in esophageal tissues from squamous dysplasia to carcinoma (Liu et al., 2017)
Background:
Esophageal squamous cell carcinoma (ESCC) is the most common subtype of esophageal cancer. Little is known about the genetic changes that occur in esophageal cells during the development of ESCC. This study performed next-generation sequencing analyses of esophageal nontumor, intraepithelial neoplasia (IEN), and ESCC tissues from the same patients to track genetic changes during tumor development.
Sampling & Sequencing Strategy:
Sampling:
• 227 esophageal tissue samples from 70 patients with ESCC undergoing resection
Sequencing Strategy:
• Human whole genome sequencing (7 cases), whole exome sequencing (18 cases), and targeted region sequencing (45 cases) on Illumina platforms (PE150)
Results & Conclusion
The study revealed significant similarities in the types and frequencies of mutations between IEN and ESCC (Figure 1), including a similar DNA damage mutation signature. Phylogenetic and clonal analyses also identified mutations in the CCND1, CDKN2A, and FGFR1 genes as early driver events. However, the number of non-overlapping SNVs in tissues taken from the same individuals indicated that the various lesions formed independently and that mutations underwent independent clonal expansion. As this study shows, combining multiple NGS applications provides novel approaches for exploring early diagnostics and treatments for cancer.
Reference: Liu X, Zhang M, Ying SM, et al. Genetic alterations in esophageal tissues from squamous dysplasia to carcinoma[J]. Gastroenterology, 2017, 153: 166‒177.
Error Rate Distribution of hWGS Sequencing Results
GC Content Distribution
Sequencing Depth & Coverage Distribution
SNP Detection
Sample | Sample_1 | Sample_2 | Sample_3 | Sample_4 | Sample_5 | Sample_6 |
CDS | 22318 | 22343 | 22271 | 22702 | 22654 | 22418 |
Synonymous SNP | 11342 | 11375 | 11329 | 11439 | 11387 | 11376 |
missense SNP | 10335 | 10340 | 10334 | 10643 | 10649 | 10400 |
stopgain | 77 | 81 | 72 | 87 | 87 | 8.30 |
stoploss | 14 | 13 | 11 | 12 | 12 | 10 |
unknown | 558 | 541 | 536 | 531 | 528 | 501 |
intronic | 1263778 | 1261992 | 1262435 | 1259099 | 1262095 | 1271575 |
UTR3 | 25167 | 25134 | 25496 | 25396 | 25462 | 25510 |
UTR5 | 5568 | 5562 | 5644 | 5767 | 5829 | 5702 |
splicing | 84 | 85 | 84 | 86 | 90 | 96 |
ncRNA exonic | 11867 | 11818 | 11734 | 11628 | 11697 | 11760 |
ncRNA intronic | 205360 | 205028 | 200363 | 199813 | 200397 | 205018 |
ncRNA splicing | 66 | 66 | 58 | 61 | 64 | 60 |
upstream | 22383 | 22339 | 22230 | 22648 | 22744 | 22708 |
downstream | 23565 | 23544 | 23515 | 23221 | 23235 | 23557 |
intergenic | 2119447 | 2115048 | 2110391 | 2091107 | 2098406 | 2138433 |
Total | 3700477 | 3693838 | 3685038 | 3662384 | 3673519 | 3727684 |
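One way to read the table is to ask what share of each sample's SNPs falls in coding sequence. The sketch below does this from the CDS and Total rows, transcribed from the table above; the variable and function names are hypothetical. For these six samples the share comes out at roughly 0.6%.

```python
# CDS and Total rows transcribed from the table above.
cds_snps = {
    "Sample_1": 22318, "Sample_2": 22343, "Sample_3": 22271,
    "Sample_4": 22702, "Sample_5": 22654, "Sample_6": 22418,
}
total_snps = {
    "Sample_1": 3700477, "Sample_2": 3693838, "Sample_3": 3685038,
    "Sample_4": 3662384, "Sample_5": 3673519, "Sample_6": 3727684,
}


def coding_share(cds, total):
    """Return, per sample, the fraction of all SNPs located in coding sequence."""
    return {sample: cds[sample] / total[sample] for sample in cds}


shares = coding_share(cds_snps, total_snps)
for sample, share in shares.items():
    print(f"{sample}: {share:.2%} of SNPs fall in coding sequence")
```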
Circos Diagram
Novogene provides a Circos diagram only when CNV analysis has been carried out. The figure consists of seven rings, from outer to inner:
(1) The outer ring (the first ring) shows chromosome information.
(2) The second ring shows read coverage as a histogram. Each bar is the average coverage of a 0.5 Mbp region.
(3) The third ring shows InDel density as a scatter plot. Each black dot is the number of InDels in a 1 Mbp region.
(4) The fourth ring shows SNP density as a scatter plot. Each green dot is the number of SNPs in a 1 Mbp region.
(5) The fifth ring shows the proportion of homozygous SNPs (orange) and heterozygous SNPs (grey) as a histogram. Each bar is calculated from a 1 Mbp region.
(6) The sixth ring shows the CNV inference. Red means gain, and green means loss.
(7) The innermost ring shows the SV inference in exonic and splicing regions: TRA (orange), INS (green), DEL (grey), DUP (pink), and INV (blue).
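The density and proportion rings are built by binning variants into fixed-size windows along each chromosome. A minimal sketch of that binning for a single chromosome follows, assuming 1-based positions; the function names and toy data are hypothetical and do not come from Novogene's pipeline.

```python
from collections import Counter

WINDOW_BP = 1_000_000  # 1 Mbp windows, as used for the SNP and InDel rings


def density_per_window(positions, window_bp=WINDOW_BP):
    """Count variants per window; keys are 0-based window indices."""
    return Counter((pos - 1) // window_bp for pos in positions)


def homozygous_fraction_per_window(snps, window_bp=WINDOW_BP):
    """snps is an iterable of (position, is_homozygous) pairs.

    Returns the fraction of homozygous SNPs in each window that has SNPs.
    """
    totals = Counter()
    homozygous = Counter()
    for pos, is_hom in snps:
        window = (pos - 1) // window_bp
        totals[window] += 1
        if is_hom:
            homozygous[window] += 1
    return {window: homozygous[window] / totals[window] for window in totals}


# Toy SNP calls on one chromosome (hypothetical positions and genotypes).
toy_snps = [(10, True), (500_000, False), (1_000_000, False),
            (1_000_001, True), (2_500_000, True)]

density = density_per_window(pos for pos, _ in toy_snps)
hom_fraction = homozygous_fraction_per_window(toy_snps)
print(dict(density))
print(hom_fraction)
```

The coverage ring would use the same idea with a 0.5 Mbp window and a mean of per-base depth instead of a count.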
Heatmap of Significantly Mutated Genes
Linkage Analysis
