What is YASS?
YASS is a genomic similarity search tool for nucleic (DNA/RNA) sequences in FASTA or plain text format; it produces local pairwise alignments. Like most heuristic pairwise local alignment tools for DNA sequences (FASTA, BLAST, PATTERNHUNTER, BLASTZ/LASTZ, LAST ...), YASS uses seeds to detect potential similarity regions, and then tries to extend them into local alignments. It uses multiple transition-constrained spaced seeds, which make it possible to detect fuzzier similarities such as those found in non-coding DNA/RNA. Another simple but interesting feature is that you can specify the seed pattern used in the search step (as provided, for example, by iedera).
- multiple, possibly overlapping seeds and a new hit criterion to ensure a good sensitivity/selectivity trade-off
- transition-constrained spaced seeds to improve sensitivity (transition mutations are purine to purine [A↔G] or pyrimidine to pyrimidine [C↔T])
- different scoring schemes, with bit-score and E-value evaluated according to the sequence background frequencies
- parameterizable output filter for low complexity repeats
- reporting of various alignment statistical parameters (mutation bias along triplets, transition/transversion)
- post-processing step to group gapped alignments
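To make the idea of a transition-constrained spaced seed concrete, here is a minimal Python sketch. It is not YASS's implementation: the helper names `seed_hits` and `find_hits` are hypothetical, and the three-symbol seed alphabet is an assumption made for illustration (`#` requires a match, `@` accepts a match or a transition, `-` accepts anything). Real tools index the sequences instead of comparing every pair of windows as this naive version does.

```python
# Transition mutations: purine <-> purine (A, G) or pyrimidine <-> pyrimidine (C, T).
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def seed_hits(seed: str, a: str, b: str) -> bool:
    """Return True if the two equal-length windows a and b satisfy the seed.

    Assumed seed alphabet (illustrative):
      '#' : the two nucleotides must be identical
      '@' : the two nucleotides must be identical or differ by a transition
      '-' : any pair of nucleotides is accepted
    """
    if not (len(seed) == len(a) == len(b)):
        raise ValueError("seed and both windows must have the same length")
    for symbol, x, y in zip(seed, a.upper(), b.upper()):
        if symbol == "#":
            if x != y:
                return False
        elif symbol == "@":
            if x != y and (x, y) not in TRANSITIONS:
                return False
        elif symbol != "-":
            raise ValueError(f"unknown seed symbol: {symbol!r}")
    return True


def find_hits(seed: str, s: str, t: str) -> list[tuple[int, int]]:
    """Naively list every (i, j) such that the seed hits s[i:i+k] against t[j:j+k]."""
    k = len(seed)
    return [
        (i, j)
        for i in range(len(s) - k + 1)
        for j in range(len(t) - k + 1)
        if seed_hits(seed, s[i:i + k], t[j:j + k])
    ]


if __name__ == "__main__":
    # C/T at the '@' position is a transition, so the seed still hits.
    print(seed_hits("#@-#", "ACGT", "ATTT"))
    # C/G at the '@' position is a transversion, so the seed does not hit.
    print(seed_hits("#@-#", "ACGT", "AGGT"))
```

Compared with a contiguous seed of the same span, the `@` and `-` positions tolerate some substitutions, which is what lets such seeds detect more divergent similarities.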
YASS can produce the common BLAST tabular output ("-d 3" option). Moreover, a small Perl script, yass2blast.pl, is provided to transform the original extended output ("-d 1" option) into BLAST output with alignments (which can be directly parsed with BioPerl), axt files, or FASTA alignments.
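As an illustration of how the tabular output might be consumed, the following Python sketch parses tab-separated hit lines. It assumes the standard 12-column BLAST tabular layout (query, subject, % identity, alignment length, mismatches, gap openings, query start/end, subject start/end, E-value, bit score); whether YASS's "-d 3" output matches this layout exactly should be checked against a real output file. The names `Hit` and `parse_blast_tabular` are hypothetical.

```python
from typing import Iterable, Iterator, NamedTuple


class Hit(NamedTuple):
    """One alignment row, assuming the standard 12-column BLAST tabular layout."""
    query: str
    subject: str
    identity: float
    length: int
    mismatches: int
    gap_opens: int
    q_start: int
    q_end: int
    s_start: int
    s_end: int
    evalue: float
    bit_score: float


def parse_blast_tabular(lines: Iterable[str]) -> Iterator[Hit]:
    """Yield one Hit per data line, skipping blank lines and '#' comment lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        f = line.split("\t")
        if len(f) < 12:
            raise ValueError(f"expected at least 12 tab-separated fields, got {len(f)}")
        yield Hit(
            f[0], f[1], float(f[2]),
            int(f[3]), int(f[4]), int(f[5]),
            int(f[6]), int(f[7]), int(f[8]), int(f[9]),
            float(f[10]), float(f[11]),
        )


if __name__ == "__main__":
    # Made-up example row, for illustration only (not real YASS output).
    sample = ["# comment line", "q1\ts1\t95.00\t100\t5\t0\t1\t100\t201\t300\t1e-40\t150.0"]
    for hit in parse_blast_tabular(sample):
        print(hit.query, hit.subject, hit.identity, hit.evalue)
```

In practice the iterable would be an open file handle, for example `parse_blast_tabular(open("result.tab"))`, where `result.tab` is a placeholder file name.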
If you use this program in your work, please cite:
[1] L. Noe, G. Kucherov YASS: enhancing the sensitivity of DNA similarity search, 2005, Nucleic Acids Research, 33(2):W540-W543.
Of course, your feedback is welcome and always appreciated!
Query YASS Online
You can query YASS from this server: yass.php.
The web interface enables you to run the YASS pairwise alignment tool online, on your DNA sequences, and visualise the pairwise local alignments produced online:
- paste or upload your genomic sequences, or use the proposed ones,
- control several of the YASS parameters, from selecting the E-value or refining the scoring system, to selecting the (transition) spaced seed model,
- produce a dot-plot view of the alignments / a tabular view of the complete output,
- download the result as a yass/blast/axt/fasta output file,
- run an annotation Blast, a multiple alignment with ClustalW or Muscle, or Mfold, with a simple click.
Publications
Below is a non-exhaustive list of papers that use YASS. I also try to maintain a list, as exhaustive as possible, of "spaced seeds" related papers for those interested in this subject. Please let me know if I have forgotten someone.
- Greedily assemble tandem repeats for next generation sequences - YASS was used to realign tandem repeats of the human chromosome 14 from the tandem repeat database (TRDB) to the human reference.
- Complete genome and bimodal genomic structure of the amoebal symbiont Neochlamydia strain S13 revealed by ultra-long reads obtained from MinION - selected reads were aligned with YASS.
- A Schwann cell–enriched circular RNA circ-Ankib1 regulates Schwann cell proliferation following peripheral nerve injury - YASS was used to analyse the sequences of intron 1 and intron 5 of gene Ankib1 (RGSC 5.0/rn5).
- Rapid evolution of α-gliadin gene family revealed by analyzing Gli-2 locus regions of wild emmer wheat - dotplots of regions of the wild emmer and hexaploid wheat Chinese Spring were performed using YASS.
- A key metabolic gene for recurrent freshwater colonization and radiation in fishes - Local genomic similarity of the Fads2 regions between LG12 and LG19 was analyzed using YASS.
- Loss of DPP6 in neurodegenerative dementia: a genetic player in the dysfunction of neuronal excitability - DNA local alignment analysis of the NCBI hg19 reference sequence of chromosome 7 was performed using YASS.
- Polyunsaturated fatty acid production by Yarrowia lipolytica employing designed myxobacterial PUFA synthases - Polyadenylation signals were avoided as well as long sequence repeats with YASS.
- Reprogramming of Retrotransposon Activity during Speciation of the Genus Citrus - dot plot was performed using YASS.
- The first next-generation sequencing approach to the mitochondrial phylogeny of African monogenean parasites (Platyhelminthes: Gyrodactylidae and Dactylogyridae) - repeat regions were checked with Tandem Repeats Finder and YASS.
- Expansion and Functional Diversification of SKP1-Like Genes in Wheat (Triticum aestivum L.) - YASS was used to produce dot-plot of Figure 3.
- The conservation landscape of the human ribosomal RNA gene repeats - Other sequence elements in the IGS were identified using YASS and BLAST.
- npInv: accurate detection and genotyping of inversions using long read sub-alignment - YASS dotplot was used to detect NAHR inversion.
- De novo assembly of a young Drosophila Y chromosome using single-molecule sequencing and chromatin conformation capture - dot plots were produced using mummerplot, symap42, or YASS.
- Natural History of a Satellite DNA Family: From the Ancestral Genome Component to Species-Specific Sequences, Concerted and Non-Concerted Evolution - YASS was used to produce dot-plot of Figure 2.
- Expanding an expanded genome: long-read sequencing of Trypanosoma cruzi - YASS dotplot was used to compare contigs.
- Can Interspecies Hybrid Zygosaccharomyces rouxii Produce an Allohaploid Gamete? - YASS dotplot was used to examine the genome-wide synteny and identity between C. versatilis and Z. rouxii.
- Mapping of Hieracium (Asteraceae) chromosomes with genus-specific satDNA elements derived from next-generation sequencing data - YASS dotplot was used to search for more fuzzy repeats for potential tandem organization.
- The comparison of four mitochondrial genomes reveals cytoplasmic male sterility candidate genes in cotton - YASS was used to analyse mitogenomes.
- Anisogamy evolved with a reduced sex-determining region in volvocine green algae - YASS dotplot was used to examine scaffolds between haplotypes to detect the rearranged genomic regions of MT.
- Clustering of circular consensus sequences: accurate error correction and assembly of single molecule real-time reads from multiplexed amplicon libraries - YASS was used to generate dot plots for alignments between the consensus sequences formed by LAA and their expected amplicon sequence.
- Identification of novel polymorphisms and two distinct haplotype structures in dog leukocyte antigen class I genes: DLA-88, DLA-12 and DLA-64 - YASS dotplot was used to align DLA-class 1 segments.
- Characterization of the ICCE Repeat in Mammals Reveals an Evolutionary Relationship with the DXZ4 Macrosatellite through Conserved CTCF Binding Motifs - YASS was used to generate pairwise alignments of the ICCE Repeat.
- Assembly of Schizosaccharomyces cryophilus chromosomes and their comparative genomic analyses revealed principles of genome evolution of the haploid fission yeasts - YASS and Mauve were used to identify chromosomal breakpoints.
- Reassessment of the evolution of wheat chromosomes 4A, 5A, and 7B - YASS was used to search for satellite DNA in breakpoint regions.
- The Sorghum bicolor reference genome: improved assembly, gene annotations, a transcriptome atlas, and signatures of genome organization - YASS was used to identify internal tandem direct repeats.
- The first next-generation sequencing approach to the mitochondrial phylogeny of African monogenean parasites - YASS was used to search for repeat regions.
- Analysis of human ES cell differentiation establishes that the dominant isoforms of the lncRNAs RMST and FIRRE are circular - YASS was used to perform Dot matrix analyses.
- Draft chloroplast genome of Larix gmelinii var. japonica: insight into intraspecific divergence - YASS dotplot was used to validate the structure of L. gmelinii and L. decidua cp genomes.
- Whole-Genome Sequence of the Fruiting Myxobacterium Cystobacter fuscus DSM 52655 - YASS dotplot was used to inspect the final contig.
- Whole-genome assembly of Babesia ovata and comparative genomics between closely related pathogens - YASS dotplot was used to align the reordered contigs and the B. bigemina genome.
- Structure and diversity of the rhesus macaque immunoglobulin loci through multiple de novo genome assemblies - YASS dotplot was used to analyse diversity.
- Progress in identifying epigenetic mechanisms of xenobiotic-induced non-genotoxic carcinogenesis - YASS dotplot was used to align local genomic regions between orthologous mouse and human.
- A Pectin Methylesterase ZmPme3 Is Expressed in Gametophyte factor1-s (Ga1-s) Silks and Maps to that Locus in Maize (Zea mays L.) - YASS was used to analyse the structure of the W22 PME repeat region.
- Approaches for in silico finishing of microbial genome sequences - YASS was cited in Gap closing and Assembly evaluation.
- The plastid genome in Cladophorales green algae is encoded by hairpin chromosomes - YASS was used to generate dotplots for all contigs.
- Complete Genome Sequence of the Fruiting Myxobacterium Melittangium boletus DSM 14713 - YASS was used to inspect contigs.
- Evolution of sequence-specific anti-silencing systems in Arabidopsis - YASS dotplot was used to plot VANC genes.
- The unusual S locus of Leavenworthia is composed of two sets of paralogous loci - YASS was used to identify distant similarities with short sequences.
- Reactivation of hyperglycemia-induced hypocretin (HCRT) gene silencing by N-acetyl-d-mannosamine in the orexin neurons derived from human iPS cells - dot-plots for the HCRT gene locus were generated with YASS.
- Divergence of annual and perennial species in the Brassicaceae and the contribution of cis-acting variation at FLC orthologues - YASS was used to align AmFLC genomic region and AmFLC cDNA.
- Concerted evolution rapidly eliminates sequence variation in rDNA coding regions but not in intergenic spacers in Nicotiana tabacum allotetraploid - YASS (dotplot) was used to provide “self-to-self” comparisons of IGS consensus sequences.
- Wild tobacco genomes reveal the evolution of nicotine biosynthesis - YASS was used to search for Transposable Elements.
- Identification of novel polymorphisms and two distinct haplotype structures in dog leukocyte antigen class I genes: DLA-88, DLA-12 and DLA-64 - Dot-plots of DLA sequences were done with YASS.
- Higher-order organisation of extremely amplified, potentially functional and massively methylated 5S rDNA in European pikes (Esox sp.) - YASS was used to align reads to the 5S rDNA unit NGS consensus.
- Progress in Identifying Epigenetic Mechanisms of Xenobiotic-Induced Non-Genotoxic Carcinogenesis - YASS dotplot was used to align orthologous regions of human and mouse genomes.
- Draft chloroplast genome of Larix gmelinii var. japonica: insight into intraspecific divergence - YASS dotplot was used to validate the structure of L. gmelinii and L. decidua genomes.
- DHX9 suppresses RNA processing defects originating from the Alu invasion of the human genome - YASS was used to look for Alu elements that can base pair with each other.
- TRA: tandem repeat assembler for next generation sequences - YASS was used to align tandem repeats from TRDB to the human genome.
- Ablation of RNA interference and retrotransposons accompany acquisition and evolution of transposases to heterochromatin protein CENPB - YASS was used to analyse codon usage bias.
- Establishment of DNA methylation patterns of the Fibrillin1 (FBN1) gene in porcine embryos and tissues - YASS was used to align FBN1 CpG islands among pig, human, mouse.
- Gene conversion events and variable degree of homogenization of rDNA loci in cultivars of Brassica napus - YASS was used to analyse subrepeats.
- Short tandem repeats, segmental duplications, gene deletion, and genomic instability in a rapidly diversified immune gene family - YASS was used to compare five genes plus the intergenic sequences and flanking regions.
- A portrait of ribosomal DNA contacts with Hi-C reveals 5S and 45S rDNA anchoring points in the folded human genome - YASS was used to detect structural similarities among the contigs.
- Genetic basis for high population diversity in Protea-associated Knoxdaviesia - YASS dot-plot was used to compare idiomorphs (dissimilar alleles).
- Exaptation of Bornavirus-Like Nucleoprotein Elements in Afrotherians - YASS dot-plot was used to perform analysis of the genes of interest.
- Gene conversion events and variable degree of homogenization of rDNA loci in cultivars of Brassica napus - Subrepeats were analysed using YASS and Tandem Repeats Finder.
- IncA/C Conjugative Plasmids Mobilize a New Family of Multidrug Resistance Islands in Clinical Vibrio cholerae Non-O1/Non-O139 Isolates from Haiti - YASS was used to analyse MGIVchHai6, a mobilizable genomic island (MGI).
- Enrichment by hybridisation of long DNA fragments for Nanopore sequencing - YASS was used to align HCMV nanopore reads to the HCMV HHV-5 reference.
- Mitochondrial Genome Analysis of Wild Rice (Oryza minuta) and Its Comparison with Other Related Species - YASS dot-plot was used to analyse similarities.
- Evidence for the sexual origin of heterokaryosis in arbuscular mycorrhizal fungi - YASS was used to align scaffolds.
- The mitochondrial genome of the egg-laying flatworm Aglaiogyrodactylus forficulatus (Platyhelminthes: Monogenoidea) - YASS dot-plot was used to detect repeat regions.
- Discovery and profiling of small RNAs responsive to stress conditions in the plant pathogen Pectobacterium atrosepticum - YASS was used to align sRNA candidate signatures.
- Decoding the oak genome: public release of sequence data, assembly, annotation and publication strategies - YASS was used to detect overlapping fragments.
- Identification of genetic and environmental factors stimulating excision from Streptomyces scabiei chromosome of the toxicogenic region responsible for pathogenicity - YASS was used to find candidate attachment sites by searching for short repeated sequences.
- Draft genome of a commonly misdiagnosed multidrug resistant pathogen Candida auris - YASS was used to generate dot-plots.
- An epigenetic regulatory element of the Nodal gene in the mouse and human genomes - YASS was used to generate dot-plots.
- YOC, A new strategy for pairwise alignment of collinear genomes - YASS was used in "Phase 1: Similarity detection".
- GMcloser: closing gaps in assemblies accurately with a likelihood-based selection of contig or long-read alignments - YASS was used to generate pairwise alignment between the subcontigs of a gap.
- Phylogenetic analysis of HSP70 gene of Aspergillus fumigatus reveals conservation intra-species and divergence inter-species - YASS was used to generate dot-plots and alignments.
- Dominance hierarchy arising from the evolution of a complex small RNA regulatory network - Some precursor motifs within S-alleles were searched with YASS.
- The Rsm regulon of plant growth-promoting Pseudomonas fluorescens SS101: role of small RNAs in regulation of lipopeptide biosynthesis - sRNA searches were performed by Blast and YASS against the Rfam database (RNAspace package).
- An epigenetic regulatory element of the Nodal gene in the mouse and human genomes - Dot plots were generated with YASS.
- Expression of a transferred nuclear gene in a mitochondrial genome - YASS was used to align TAIR sequences against A. thaliana mitochondrial genome.
- Head formation: OTX2 regulates Dkk1 and Lhx1 activity in the anterior mesendoderm - "Mouse and human genomic sequence were analysed with the University of California, Santa Cruz (UCSC) genomic browser and YASS".
- Genome of brown tide virus (AaV), the little giant of the Megaviridae, elucidates NCLDV genome expansion and host–virus coevolution - "Whole genome dot plot was constructed in YASS (Yet Another Similarity Searcher) server".
- Subcellular localization of minicircle DNA in the dinoflagellate Amphidinium massartii - YASS was used to align the non-coding region of the Dinoflagellate minicircle DNA.
- A region of euchromatin coincides with an extensive tandem repeat on the mouse (Mus musculus) inactive X chromosome - Dot plots of an extensive tandem repeat (on mouse chromosome X) were generated with YASS.
- A gene expression restriction network mediated by sense and antisense Alu sequences located on protein-coding messenger RNAs - the command-line version of YASS was used to search Alu elements in PCM1 against RefSeq.
- Dancing together and separate again: gymnosperms exhibit frequent changes of fundamental 5S and 35S rRNA gene (rDNA) organisation - YASS was used to identify rRNA coding sequences using dotplots.
- Bioinformatics Tools for the Multilocus Phylogenetic Analysis of Fungi - YASS is listed as a tool used for multilocus phylogenetics analysis in fungi.
- Arabidopsis thaliana Resistance to Fusarium Oxysporum 2 Implicates Tyrosine-Sulfated Peptide Signaling in Susceptibility and Resistance to Root Infection - Initial alignments of intergenic regions in Col-0 and Ty-0 sequences were generated with YASS.
- RNA at 92 °C: The non-coding transcriptome of the hyperthermophilic archaeon Pyrococcus abyssi - YASS was used in one filter for potential RNA targets.
- The tRNAarg Gene and engA Are Essential Genes on the 1.7-Mb pSymB Megaplasmid of Sinorhizobium meliloti and Were Translocated Together from the Chromosome in an Ancestral Strain - Dot plots were generated with YASS.
- Evolution of the Chloroplast Genome in Photosynthetic Euglenoids: A Comparison of Eutreptia viridis and Euglena gracilis (Euglenophyta) - YASS was used to produce the dot plot between the two chloroplast genomes.
- The macrosatellite DXZ4 mediates CTCF-dependent long-range intrachromosomal interactions on the human inactive X chromosome - Pairwise regions of Dxz4 were aligned using YASS.
- The mouse DXZ4 homolog retains Ctcf binding and proximity to Pls3 despite substantial organizational differences compared to the primate macrosatellite - Alignments and Dot plots were generated with YASS.
- Ortholog Alleles at Xa3/Xa26 Locus Confer Conserved Race-Specific Resistance against Xanthomonas oryzae in Rice - YASS was used to produce the Pair-wise sequence comparisons of Xa3/Xa26.
- RNAspace.org: An integrated environment for the prediction, annotation, and analysis of ncRNA - YASS is used to produce alignments of RNAspace.
- Uncovering the Prevalence and Diversity of Integrating Conjugative Elements in Actinobacteria - YASS was used to identify the direct repeats of the attL and attR attachment sites.
- Detection of small RNAs in Bordetella pertussis and identification of a novel repeated genetic element - YASS was used to identify Bordetella pertussis sRNAs.
- The human gut virome: Inter-individual variation and dynamic response to diet - YASS was used to assess similarity between VLP contigs.
- Homoplastic microinversions and the avian tree of life - YASS was used together with Blast to detect microinversions.
- An alternative approach to multiple genome comparison - YASS is used (together with other tools) to compute initial pairwise alignments.
- Structural and content diversity of mitochondrial genome in beet: a comparative genomic analysis - YASS was used for various analyses (chimeric ORFs detection, duplication detection, ...).
- Comprehensive prediction of chromosome dimer resolution sites in bacterial genomes - YASS was used together with Blast to predict dif/XerCD systems.
- Cis-regulatory characterization of sequence conservation surrounding the Hox4 genes - YASS was used to determine CNEs alignments in zebrafish.
- A scenario of mitochondrial genome evolution in maize based on rearrangement events - YASS was used to compute markers.
- Sequence analysis of two alleles reveals that intra- and intergenic recombination played a role in the evolution of the radish fertility restorer (Rfo) - YASS was used to align the two alleles sequences.
- dif/Xer Recombination Systems in Proteobacteria - YASS was used to identify dif-related sequences in proteobacteria.
- Evolutionary dynamics of a locus of fertility restoration in plants - YASS was used to analyse the similarity between alleles and inside the studied allele.
- Mapping of 5q35 chromosomal rearrangements within a genomically unstable region - YASS was used to search for similarities, and YASS dotplot provided positional information.
- The genome sequence of the model ascomycete fungus Podospora anserina - YASS was used together with Erpin and Blast to detect non-coding RNAs.
- Diversity and structure of PIF/Harbinger-like elements in the genome of Medicago truncatula - YASS was used to align Transposable elements.
- Mammalian Small Nucleolar RNAs Are Mobile Genetic Elements - YASS was used to align genomic segments.
- Genomic architecture of human 17q21 linked to frontotemporal dementia uncovers a highly homologous family of low-copy repeats in the tau region - YASS was used to analyse contig sequences for LCR regions.
- sno-RNA database - YASS was used to produce human and yeast rDNAs alignments.
- A fast and flexible approach to oligonucleotide probe design for genomes and gene families - Transition constrained seeds were used to filter potential oligonucleotides.
- SinicView : multiple alignment comparisons - famous YASS matrix used. ;-)
- Seeds for effective oligonucleotide design - Iedera was used to design spaced seeds.
- Adaptive seeds tame genomic sequence comparison - LAST software and adaptive seeds.
- Spaced seeds for cross-species cDNA-to-genome sequence alignment - reviews spaced seed design and focuses on cross-species sequence alignment.
- A Genomic Distance Based on MUM Indicates Discontinuity between Most Bacterial Species and Genera - maximal unique matches (MUM) based distance for full genomes.
- A survey of seeding for sequence alignment - a survey of the different approaches proposed to index genomic or amino-acid sequences.
- Run Probability of High-Order Seed Patterns and its Applications to Finding Good Transition Seeds - a new heuristic is proposed to quickly select good (or even optimal) transition seeds.
- Good spaced seeds for homology search - this paper provides a list of single spaced seeds. Note that you can also design multiple spaced seeds/transition-constrained seeds with our Iedera tool to fit more specific repeats if needed.
- Superiority of Spaced Seeds for Homology Search - 2 very interesting results about the number of non overlapping hits, and the asymptotic hit probability.
- Quick, Practical Selection of Effective Seeds for Homology Search - leakage model for effective seed design.
- Hardness of Optimal Spaced Seed Design - estimating spaced seed sensitivity (lossy filtering) or guaranteeing 100% sensitivity (lossless filtering) is hard, even for a single seed.
- Fast Computation of Good Multiple Spaced Seeds - how to speed up the seed design.
- Indel seeds for homology search - a new seed model and its waiting time formula.
- All hits all the time: parameter free calculation of seed sensitivity - or how to compute (with a single polynomial) seed sensitivity for any similarity parameter.
- BDD-Based Analysis of Gapped q-Gram Filters - lossless filter and analysis of such seeds.
- Ribosomal RNA as molecular barcodes: a simple correlation analysis without sequence alignment - Composition Vector Analysis (method without alignments) to construct phylogeny trees.
- Métagénomique Bactérienne et Virale - on the promising field of metagenomics (in French).
- A Minimum Cost Process in Searching for a Set of Similar DNA Sequences - KMP filtering method applied to DNA search.
- Wikipedia sequence alignment software - nice resource for tools.
- Wabim list of software - an impressive (and up to date) list of biological tools.
- MIGale Bioinformatics platform - by the INRA MIG team.