Data set: The structure of the tetraploid sour cherry 'Schattenmorelle' (Prunus cerasus L.) genome reveals insights into its segmental allopolyploid nature
Sour cherry (Prunus cerasus) is an important fruit crop that is widely cultivated within the northern hemisphere. Its fruits are rich in nutrients and bioactive compounds that have been linked to numerous health benefits, including anti-inflammatory, antioxidant, and anti-cancer properties. The demand for sour cherry has been increasing due to its use in the food, pharmaceutical, and cosmetic industries. Breeding of sour cherry has been a major focus of research and development in recent years, aiming to improve yield, fruit quality, and disease resistance. One of the main challenges in sour cherry breeding is the genetic complexity of the species, which makes it difficult to identify and select desirable traits.
However, advances in molecular biology and genomics have provided new tools and resources for sour cherry breeders, enabling them to identify and characterize important genes and markers associated with desirable traits. Rapid progress in third-generation sequencing technologies enables breeders to explore and exploit the genomic information underlying economically important traits. We used the Oxford Nanopore Technologies PromethION platform with the R9.4.1 pore type and the Illumina NovaSeq™ platform to sequence the genome of tetraploid sour cherry (Prunus cerasus). The genome and annotation visualization is permanently hosted as an assembly hub at http://bioinf.uni-greifswald.de/private-hubs/pcer/hub.txt.
The data on this platform comprise 28 files supporting the assembly and annotation procedures, together with additional data used to characterize the segmental allopolyploid structure of the genome.
The resulting genome sequence is provided in 20_WGS_PCE_2_0.fa.
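A quick sanity check after downloading the genome FASTA is to list the sequence lengths and compute an N50. The following is a minimal sketch, not part of the published workflow; the function names `fasta_lengths` and `n50` are hypothetical helpers, and the code assumes only the standard FASTA layout (header lines starting with `>`, followed by sequence lines).

```python
from typing import Dict, Iterable, List


def fasta_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Return {sequence_id: length} for FASTA-formatted lines."""
    lengths: Dict[str, int] = {}
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the ID.
            parts = line[1:].split()
            current = parts[0] if parts else ""
            lengths[current] = 0
        elif current is not None:
            lengths[current] += len(line)
    return lengths


def n50(lengths: List[int]) -> int:
    """Length L such that sequences of length >= L cover at least half the total."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0


# Small in-memory example; for the real file, pass an open file handle instead,
# e.g. fasta_lengths(open("20_WGS_PCE_2_0.fa")).
example = [">chr_demo first record", "ACGTAC", "GT", ">scaffold_demo", "GGCC"]
demo_lengths = fasta_lengths(example)
print(demo_lengths, n50(list(demo_lengths.values())))
```

Because the function iterates line by line, it can be applied to a large genome file without loading the whole sequence into memory.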
The final genome sequence was annotated using Braker 1, Braker 2, and the GeMoMa pipeline, together with 14 reference datasets from other Prunus species (Prunus_cerasus_families_fa.gz, genome_chr_fa_masked.gz, rm1_genome_chr_fa_out.gz, rm1_genome_chr_fa_tbl.gz, rm2_genome_chr_fa_out.gz, rm2_genome_chr_fa_tbl.gz, tsebra_braker1_2_gmst_v4_aa.gz, tsebra_braker1_2_gmst_v4_codingseq.gz, tsebra_braker1_2_gmst_v4-gtf.gz, final_annotation_1.gff).
The final structural annotation and protein predictions are provided for future studies (final_annotation_proteins.fasta).
The functional prediction was performed with InterProScan (Results IPS_PCE_A.xlsx, Results IPS_PCE_F.xlsx).
In addition, the sequences and annotation of the chloroplast (PCE_chloroplast_1_0.fa, GeSeqJob_20220901_192228_PCE_chloroplast_GFF3.gff3) and mitochondrion (PCE_mitochondria_1_0.fa, GeSeqJob_20230103_105327_PCE_mitochondria_GFF3.gff3) are presented.
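To get a first overview of an annotation file such as the organelle GFF3 files, one can tally the feature types it contains. The sketch below is illustrative only; `count_feature_types` is a hypothetical helper, and it assumes the standard nine-column, tab-separated GFF3 layout in which the third column holds the feature type.

```python
from collections import Counter
from typing import Iterable


def count_feature_types(lines: Iterable[str]) -> Counter:
    """Count GFF3 feature types (column 3), skipping comments and directives."""
    counts: Counter = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            # Anything after this directive is embedded sequence, not annotation.
            break
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 9:
            # Skip malformed rows rather than guessing their meaning.
            continue
        counts[fields[2]] += 1
    return counts


# In-memory example with made-up coordinates; for a real file, pass an open
# handle, e.g. count_feature_types(open("PCE_chloroplast_annotation.gff3"))
# (the path here is a placeholder).
example_gff = [
    "##gff-version 3",
    "demo\tsrc\tgene\t1\t100\t.\t+\t.\tID=g1",
    "demo\tsrc\tCDS\t1\t100\t.\t+\t0\tID=c1;Parent=g1",
    "demo\tsrc\tgene\t200\t300\t.\t-\t.\tID=g2",
]
demo_counts = count_feature_types(example_gff)
print(dict(demo_counts))
```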
Moreover, we used the percentage of base coverage from RNA-seq reads (RNAseq_avium.tsv, RNAseq_fruticosa.tsv) and genomic reads (percentages_125.tsv), as well as % IAA (best_iAA_all_transcripts_PA_PF_only.xlsx, suppl_data_TabS5.csv), to evaluate the genome sequence for the occurrence of homoeologous exchanges.
We also determined the position of molecular markers (suppl_data_FigS5.xlsx) and LTRs (From_PCE_Prunus_avium_fna_mod_pass.list, From_PCE_Prunus_fruticosa_fna_mod_pass.list) on both subgenomes. Furthermore, the position of LTRs was determined on the genomes of P. avium (Prunus_avium_NCBI_fna_mod_pass.list) and P. fruticosa (Prunus_fruticosa_NCBI_fna_mod_pass.list).

Rights
Use and reproduction:
PDDL - Public Domain Dedication and License