Animal and Plant De novo Sequencing

Introduction to Animal and Plant De novo Sequencing

De novo sequencing generates an initial genomic sequence of a particular organism without a reference sequence. Through de novo sequencing, complex genomic variations such as insertions and deletions (indels), copy number variations (CNVs), and structural variations (SVs) can be readily identified. It is also valuable for studying evolutionary and demographic history, for agricultural breeding, and for genetic variant calling.

With extensive experience in experimental operations and bioinformatics analyses, Novogene offers an accurate, rapid, and comprehensive characterization of species and generates reliable results. Furthermore, Novogene’s end-to-end services guarantee you ultra-fast turnaround time.

Applications of Animal and Plant De novo Sequencing

For individual research:

Guides animal health and genetic breeding
Provides a theoretical basis for drug screening
Explores medicinal resources and innovates varieties

For population research:

Explores species origin and evolution
Provides new insights into patterns of genome divergence

Benefits of Animal and Plant De novo Sequencing

Highly experienced: Novogene’s highly qualified researchers have completed major de novo genome sequencing projects and published the results in top-tier journals.
Bioinformatics expertise: Best-in-class, widely recognized software, such as Falcon and Canu, is used for comprehensive plant and animal bioinformatic analyses.
Diverse strategies: By combining sequencing results from various platforms, including Illumina NovaSeq, PacBio Revio/Sequel II/Sequel IIe, and Oxford Nanopore PromethION, we offer an assembly solution tailored to each genome.
Unsurpassed data quality: We guarantee that at least 85% of bases reach Q30 or higher, exceeding Illumina’s official guarantee of ≥ 75%.
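
As an illustration of what the Q30 figure measures, the share of bases with a Phred quality score of 30 or higher can be computed from FASTQ quality strings. The sketch below assumes the common Phred+33 encoding; the function name `q30_fraction` and the example quality strings are hypothetical and are not part of Novogene's pipeline.

```python
def q30_fraction(quality_strings, offset=33):
    """Return the fraction of bases whose Phred score is >= 30.

    Assumes Phred+33 encoded quality strings (one string per read).
    """
    total = 0
    high = 0
    for qual in quality_strings:
        for ch in qual:
            total += 1
            # Phred score = ASCII code minus the encoding offset
            if ord(ch) - offset >= 30:
                high += 1
    return high / total if total else 0.0


# Hypothetical quality lines: 'I' is Q40, '?' is Q30, '#' is Q2
example_quals = ["IIIIIIIIII", "IIIII?????", "IIIIIIII##"]
fraction = q30_fraction(example_quals)
print(f"Q30 = {fraction:.1%}")
```

In this toy input, 28 of the 30 bases score Q30 or higher, so the data set would clear an 85% threshold.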

Animal and Plant De novo Seq Specifications:
DNA Sample Requirements

Platform Type	Sample Type	Amount (Qubit®)	Purity
Illumina NovaSeq X Plus / NovaSeq 6000	Genomic DNA	≥ 200 ng	OD260/280=1.8-2.0; no degradation, no contamination
	Genomic DNA (PCR free non-350bp)	≥ 3 μg
	Genomic DNA (PCR free 350bp)	≥ 1.1 μg
PacBio Sequel II DNA CLR library	HMW Genomic DNA	≥ 5 μg	OD260/280=1.75-2.0; OD260/230=1.5-2.6; NC/QC*=0.95-3.00; fragments should be ≥ 30 kb
PacBio Revio/Sequel II/Sequel IIe DNA HiFi library	HMW Genomic DNA	≥ 5 μg	OD260/280=1.75-2.0; OD260/230=1.5-2.6; NC/QC*=1.00-2.20; fragments should be ≥ 30 kb
Nanopore PromethION	HMW Genomic DNA	≥ 8 μg	OD260/280=1.75-2.0; OD260/230=1.4-2.6; NC/QC*=0.95-3.00; fragments should be ≥ 30 kb

*NC/QC: NanoDrop concentration/Qubit concentration
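
A sample can be screened against these requirements before shipping. The following is a minimal sketch that uses the thresholds from the PacBio HiFi row of the table above; the names `HIFI_LIMITS` and `check_sample` are hypothetical, and the fragment-size requirement (≥ 30 kb) is not checked here because it needs separate electrophoresis data.

```python
# Thresholds copied from the PacBio HiFi row of the sample requirements table.
HIFI_LIMITS = {
    "min_amount_ug": 5.0,
    "od260_280": (1.75, 2.0),
    "od260_230": (1.5, 2.6),
    "nc_qc": (1.00, 2.20),
}


def in_range(value, bounds):
    """True if value lies within the inclusive (low, high) bounds."""
    low, high = bounds
    return low <= value <= high


def check_sample(amount_ug, od260_280, od260_230,
                 nanodrop_conc, qubit_conc, limits=HIFI_LIMITS):
    """Return the list of failed criteria; an empty list means the sample passes."""
    # NC/QC is the NanoDrop concentration divided by the Qubit concentration
    nc_qc = nanodrop_conc / qubit_conc
    failures = []
    if amount_ug < limits["min_amount_ug"]:
        failures.append("amount")
    if not in_range(od260_280, limits["od260_280"]):
        failures.append("OD260/280")
    if not in_range(od260_230, limits["od260_230"]):
        failures.append("OD260/230")
    if not in_range(nc_qc, limits["nc_qc"]):
        failures.append("NC/QC")
    return failures


# Hypothetical measurements (concentrations in ng/μL)
good = check_sample(6.0, 1.85, 2.1, 120.0, 100.0)
bad = check_sample(4.0, 1.85, 2.1, 250.0, 100.0)
print(good, bad)
```

Passing a different dictionary as `limits` would adapt the same check to the CLR or Nanopore rows.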

Animal and Plant De novo Seq Specifications:
Sequencing and Analysis

Sequencing Parameters	Illumina NovaSeq 6000	PacBio Revio/Sequel II/Sequel IIe	Nanopore PromethION
Read Length	Paired-end 150 bp	N50 > 15 kb; long reads up to 25 kb (CCS)	Average > 17 kb
Recommended Sequencing Depth	For genome survey or assembly polishing: ≥ 50×	For genome assembly: ≥ 50×
Standard Analysis	K-mer analysis; GC content analysis; repeat content rate evaluation; heterozygous rate evaluation; genome size evaluation	Long-read assembly; assembly statistics; gene completeness evaluation
Genome Annotation	-	Repeat prediction; structure prediction; function prediction; noncoding RNA prediction
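
The recommended depth translates into a raw data target by multiplying the genome size by the depth. The sketch below works through that arithmetic; the function name `required_yield_gb` and the 500 Mb genome size are hypothetical values chosen for illustration.

```python
def required_yield_gb(genome_size_mb, depth):
    """Raw data needed (in Gb) to cover a genome of the given size (Mb) at the given depth."""
    return genome_size_mb * depth / 1000


# Hypothetical 500 Mb genome sequenced at the recommended minimum of 50x
yield_gb = required_yield_gb(500, 50)
print(f"{yield_gb:.1f} Gb of raw data")
```

For this example the target is 25 Gb; a larger genome or a higher depth scales the requirement linearly.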

Novogene Workflow of Animal and Plant De novo Service

From sample preparation and library preparation through short- and long-read sequencing and data quality control to bioinformatics analysis, Novogene provides high-quality products and professional services. Each step follows rigorous scientific standards and a meticulous design to ensure high-quality research results.

Featured Publications of Animal and Plant De novo Sequencing

Telomere-to-telomere pear (Pyrus pyrifolia) reference genome reveals segmental and whole genome duplication driving genome evolution

Horticulture Research | Date: November 2023 | IF: 8.7 | DOI: 10.1093/hr/uhad201
- Reference information

  Sun, M., Yao, C., Shu, Q., He, Y., Chen, G., Yang, G., Xu, S., Liu, Y., Xue, Z., & Wu, J. (2023). Telomere-to-telomere pear (Pyrus pyrifolia) reference genome reveals segmental and whole genome duplication driving genome evolution. Horticulture Research, 10(11), uhad201. https://doi.org/10.1093/hr/uhad201

Pangenomic analysis identifies structural variation associated with heat tolerance in pearl millet

Nature Genetics | Date: March 2023 | IF: 30.8 | DOI: 10.1038/s41588-023-01302-4
- Reference information

  Yan, H., Sun, M., Zhang, Z., Jin, Y., Zhang, A., Lin, C., Wu, B., He, M., Xu, B., Wang, J., Qin, P., Mendieta, J. P., Nie, G., Wang, J., Jones, C. S., Feng, G., Srivastava, R. K., Zhang, X., Bombarely, A., Luo, D., Jin, L., Peng, Y., Wang, X., Ji, Y., Tian, S., & Huang, L. (2023). Pangenomic analysis identifies structural variation associated with heat tolerance in pearl millet. Nature Genetics, 55(3), 507-518. https://doi.org/10.1038/s41588-023-01302-4

Evolution-guided multiomics provide insights into the strengthening of bioactive flavone biosynthesis in medicinal pummelo

Plant Biotechnology Journal | Date: April 2023 | IF: 13.8 | DOI: 10.1111/pbi.14058
- Reference information

  Zheng, W., Zhang, W., Liu, D., Yin, M., Wang, X., Wang, S., Shen, S., Liu, S., Huang, Y., Li, X., Zhao, Q., Yan, L., Xu, Y., Yu, S., Hu, B., Yuan, T., Mei, Z., Guo, L., Luo, J., Deng, X., Xu, Q., Huang, L., & Ma, Z. (2023). Evolution-guided multiomics provide insights into the strengthening of bioactive flavone biosynthesis in medicinal pummelo. Plant Biotechnol J, 21(8), 1577-1589. https://doi.org/10.1111/pbi.14058

The genome of jojoba (Simmondsia chinensis): A taxonomically isolated species that direct wax ester accumulation in its seeds

Science Advances | Date: April 2020 | IF: 12.804 | DOI: https://www.science.org/doi/10.1126/sciadv.aay3240
- Reference information

  Sturtevant D, Lu S, Zhou ZW, Shen Y, Wang S, Song JM, Zhong J, Burks DJ, Yang ZQ, Yang QY, Cannon AE, Herrfurth C, Feussner I, Borisjuk L, Munz E, Verbeck GF, Wang X, Azad RK, Singleton B, Dyer JM, Chen LL, Chapman KD, Guo L. The genome of jojoba (Simmondsia chinensis): A taxonomically isolated species that direct wax ester accumulation in its seeds. Sci Adv. 2020 Mar 11;6(11):eaay3240. doi: 10.1126/sciadv.aay3240. PMID: 32195345; PMCID: PMC7065883.

Deciphering the High-Quality Genome Sequence of Coriander that Causes Controversial Feelings

Plant Biotechnology Journal | Issue Date: 2019 | IF: 6.84 | DOI: https://onlinelibrary.wiley.com/doi/10.1111/pbi.13310
- Reference information

  Song X, Wang J, Li N, Yu J, Meng F, Wei C, Liu C, Chen W, Nie F, Zhang Z, Gong K, Li X, Hu J, Yang Q, Li Y, Li C, Feng S, Guo H, Yuan J, Pei Q, Yu T, Kang X, Zhao W, Lei T, Sun P, Wang L, Ge W, Guo D, Duan X, Shen S, Cui C, Yu Y, Xie Y, Zhang J, Hou Y, Wang J, Wang J, Li XQ, Paterson AH, Wang X. Deciphering the high-quality genome sequence of coriander that causes controversial feelings. Plant Biotechnol J. 2020 Jun;18(6):1444-1456. doi: 10.1111/pbi.13310. Epub 2020 Feb 5. PMID: 31799788; PMCID: PMC7206992.

The Reference Genome Sequence of Scutellaria baicalensis Provides Insights into the Evolution of Wogonin Biosynthesis

Molecular Plant | Date: 2019 | IF: 9.326 | DOI: https://www.cell.com/molecular-plant/fulltext/S1674-2052(19)30131-5
- Reference information

  Zhao Q, Yang J, Cui MY, Liu J, Fang Y, Yan M, Qiu W, Shang H, Xu Z, Yidiresi R, Weng JK, Pluskal T, Vigouroux M, Steuernagel B, Wei Y, Yang L, Hu Y, Chen XY, Martin C. The Reference Genome Sequence of Scutellaria baicalensis Provides Insights into the Evolution of Wogonin Biosynthesis. Mol Plant. 2019 Jul 1;12(7):935-950. doi: 10.1016/j.molp.2019.04.002. Epub 2019 Apr 15. PMID: 30999079.

Chromosome-level genome assembly of the razor clam Sinonovacula constricta (Lamarck, 1818)

Molecular Ecology Resources | Issue Date: July 2019 | IF: 7.049 | DOI: https://onlinelibrary.wiley.com/doi/10.1111/1755-0998.13086
- Reference information

  Ran Z, Li Z, Yan X, Liao K, Kong F, Zhang L, Cao J, Zhou C, Zhu P, He S, Huang W, Xu J. Chromosome-level genome assembly of the razor clam Sinonovacula constricta (Lamarck, 1818). Mol Ecol Resour. 2019 Nov;19(6):1647-1658. doi: 10.1111/1755-0998.13086. PMID: 31483923.

Genome assembly provides insights into the genome evolution and flowering regulation of orchardgrass

Plant Biotechnology Journal | Issue Date: 2019 | IF: 6.84 | DOI: https://onlinelibrary.wiley.com/doi/10.1111/pbi.13205
- Reference information
  
  Huang L, Feng G, Yan H, Zhang Z, Bushman BS, Wang J, Bombarely A, Li M, Yang Z, Nie G, Xie W, Xu L, Chen P, Zhao X, Jiang W, Zhang X. Genome assembly provides insights into the genome evolution and flowering regulation of orchardgrass. Plant Biotechnol J. 2020 Feb;18(2):373-388. doi: 10.1111/pbi.13205. Epub 2019 Jul 30. PMID: 31276273; PMCID: PMC6953241.

Long Read Sequencing

Assembly statistics

Grain Aphid genome A/T/G/C content statistics
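
Assembly statistics of this kind typically include contig N50 and base composition. The sketch below shows one way both values could be computed from a list of contig sequences; it is an illustration rather than the software used to produce the report, and the function names and toy contigs are hypothetical.

```python
from collections import Counter


def n50(lengths):
    """Length of the contig at which the cumulative sum, taken from the
    longest contig downward, first reaches half of the total assembly length."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0


def base_composition(sequences):
    """Fraction of A, T, G, C, and N across all sequences."""
    counts = Counter()
    for seq in sequences:
        counts.update(seq.upper())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {base: counts[base] / total for base in "ATGCN"}


# Hypothetical toy contigs
contigs = ["ATGCATGCAT", "GGCC", "AT"]
contig_n50 = n50([len(c) for c in contigs])
composition = base_composition(contigs)
gc_content = composition["G"] + composition["C"]
print(contig_n50, composition, gc_content)
```

For the toy contigs, the longest contig alone covers more than half of the 16 bases, so the N50 is 10, and each of A, T, G, and C accounts for a quarter of the bases.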

Assembly evaluation-BUSCO assessment

BUSCO assessment results

Note: C: Complete BUSCOs; S: Complete and single-copy BUSCOs; D: Complete and duplicated BUSCOs; F: Fragmented BUSCOs; M: Missing BUSCOs; n: Total BUSCO groups searched
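
These categories usually appear together in BUSCO's one-line summary notation. The sketch below parses such a line into numbers, assuming the commonly printed format `C:..%[S:..%,D:..%],F:..%,M:..%,n:..`; the function name `parse_busco_summary` and the example line are hypothetical and do not come from the demo report.

```python
import re

# Pattern for the one-line BUSCO summary notation (assumed format)
BUSCO_PATTERN = re.compile(
    r"C:(?P<C>[\d.]+)%\[S:(?P<S>[\d.]+)%,D:(?P<D>[\d.]+)%\],"
    r"F:(?P<F>[\d.]+)%,M:(?P<M>[\d.]+)%,n:(?P<n>\d+)"
)


def parse_busco_summary(line):
    """Parse a BUSCO summary line into a dict of percentages plus the group count n."""
    match = BUSCO_PATTERN.search(line)
    if match is None:
        raise ValueError(f"not a BUSCO summary line: {line!r}")
    result = {key: float(value)
              for key, value in match.groupdict().items() if key != "n"}
    result["n"] = int(match.group("n"))
    return result


# Hypothetical summary line
summary = parse_busco_summary("C:95.0%[S:93.0%,D:2.0%],F:3.0%,M:2.0%,n:1000")
print(summary)
```

Because complete BUSCOs are either single-copy or duplicated, S and D should add up to C, and C, F, and M together should account for all groups searched.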

Assembly evaluation-CEGMA assessment

Sequencing depth distribution

Note:
X-axis: sequencing depth (×); Y-axis: proportion of bases in the genome.

GC content and depth distribution

Note:
X-axis: GC content; Y-axis: sequencing depth.
Upper: GC content distribution. Lower right: sequencing depth distribution.

Genome Annotation

Structure prediction

Augustus, GlimmerHMM, SNAP, Geneid, and Genscan are used for de novo gene structure prediction.

Venn diagram of gene set evidence support

Function prediction

Protein sequences derived from the predicted gene structures are aligned against known protein databases. In this example, a function could be predicted for 95.8% of the genes.

Venn diagram of gene function annotation

Short Read Sequencing

K-mer Analysis

K-mer = 17 analysis and genome size evaluation

Kmer	Depth	n_kmer	Genome_size(M)	Revised Genome_size(M)	Heterozygous_rate(%)	Repeat_rate(%)
17	67	203,660,880,738	3,039.71	3,020.12	0.46	60.41

Note:
(1) Kmer: Selected K-mer length.
(2) Depth: Expected value of K-mer depth.
(3) n_kmer: Total number of K-mers from SOAPdenovo.
(4) Genome_size(M): Genome size in Mb, estimated by the formula Genome size = K-mer_num / Peak_depth, where K-mer_num is n_kmer and Peak_depth is Depth.
(5) Revised Genome_size(M): Genome size revised after removing erroneous K-mers.
(6) Heterozygous_rate(%): Percentage of heterozygous positions.
(7) Repeat_rate(%): Percentage of the total K-mer number contributed by K-mers with a depth greater than 1.8 times the main peak depth.
Note: "Repeat" here refers to mathematically repeated sequence, not to repeat elements with specific biological functions.
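
The genome size formula in note (4) can be checked directly against the table. The sketch below reproduces the unrevised estimate from the reported n_kmer and Depth values; the function name `kmer_genome_size_mb` is hypothetical, and the revised size is not reproduced because the error-correction step is not specified here.

```python
def kmer_genome_size_mb(n_kmer, peak_depth):
    """Genome size in Mb, estimated as total K-mer number / peak K-mer depth."""
    return n_kmer / peak_depth / 1e6


# Values taken from the K-mer = 17 row of the table
size_mb = kmer_genome_size_mb(203_660_880_738, 67)
print(f"{size_mb:,.2f} Mb")  # prints 3,039.71 Mb
```

The result matches the Genome_size(M) column, which confirms that Depth plays the role of Peak_depth in the formula.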

Distribution of K-mer number/type frequency and depth

Note:
The X-axis is K-mer depth; the Y-axis is the frequency of each K-mer depth.
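
A depth distribution of this kind is built by counting how often each K-mer occurs in the reads and then tallying how many distinct K-mers share each depth. The sketch below is a simplified illustration: it counts K-mers on the forward strand only, whereas production counters usually merge each K-mer with its reverse complement, and the function name and toy reads are hypothetical.

```python
from collections import Counter


def kmer_depth_histogram(reads, k=17):
    """Map each depth to the number of distinct K-mers observed at that depth."""
    kmer_counts = Counter()
    for read in reads:
        read = read.upper()
        # Slide a window of length k along the read
        for i in range(len(read) - k + 1):
            kmer_counts[read[i:i + k]] += 1
    # Tally how many distinct K-mers occur once, twice, and so on
    return Counter(kmer_counts.values())


# Toy reads with k=3 so the result is easy to verify by hand
histogram = kmer_depth_histogram(["ATGATG", "ATGC"], k=3)
for depth in sorted(histogram):
    print(depth, histogram[depth])
```

In the toy input, ATG occurs three times while TGA, GAT, and TGC each occur once, so the histogram has three K-mers at depth 1 and one K-mer at depth 3.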

*Please contact us to get the full demo report.