What is YASS
YASS is a genomic similarity search tool for nucleic (DNA/RNA) sequences in fasta or plain text format; it produces local pairwise alignments. The tool is currently developed and maintained by L. Noe and G. Kucherov of the LIFL / INRIA Lille-Nord Europe Sequoia group. Like most heuristic DNA local alignment software (BLAST, FASTA, PATTERNHUNTER, BLASTZ ...), YASS uses seeds to detect potential similarity regions, and then tries to extend them into actual alignments. It uses multiple transition-constrained spaced seeds, which make it possible to find fuzzier repeats such as those in non-coding DNA/RNA. Another simple but interesting feature is that you can specify the seed pattern used in the search step (for example, one provided by hedera or iedera).
- multiple, possibly overlapping seeds and a new hit criterion to ensure a good sensitivity/selectivity trade-off
- transition-constrained spaced seeds to improve sensitivity (transition mutations are purine to purine [A<->G] or pyrimidine to pyrimidine [C<->T])
- using different scoring schemes with bit-score and E-value evaluated according to the sequence background frequencies
- parameterizable output filter for low complexity repeats
- reporting of various alignment statistical parameters (mutation bias along triplets, transition/transversion)
- post-processing step to group gapped alignments
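The transition-constrained spaced seeds mentioned in the list above can be illustrated with a short Python sketch. It is only an illustration, not the YASS implementation: the function names are hypothetical, and the pattern alphabet is an assumption in which '#' requires a match, '@' accepts a match or a transition (A<->G, C<->T), and '-' accepts anything.

```python
# Illustrative sketch only; not the YASS source code.
# Assumed pattern alphabet: '#' = match, '@' = match or transition,
# '-' = don't care. Function names are hypothetical.

TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def seed_hits(pattern, a, b, i, j):
    """Return True if `pattern` hits when a[i:] is laid over b[j:]."""
    span = len(pattern)
    if i < 0 or j < 0 or i + span > len(a) or j + span > len(b):
        return False
    for k, symbol in enumerate(pattern):
        x, y = a[i + k], b[j + k]
        if symbol == "#":
            # Exact match required at this position.
            if x != y:
                return False
        elif symbol == "@":
            # Match, or a purine/purine or pyrimidine/pyrimidine change.
            if x != y and (x, y) not in TRANSITIONS:
                return False
        elif symbol != "-":
            raise ValueError(f"unknown seed symbol: {symbol!r}")
    return True


def find_hits(pattern, a, b):
    """Naive quadratic scan listing every (i, j) position pair that hits."""
    span = len(pattern)
    return [
        (i, j)
        for i in range(len(a) - span + 1)
        for j in range(len(b) - span + 1)
        if seed_hits(pattern, a, b, i, j)
    ]
```

For instance, `seed_hits("#@-#", "ACGT", "ATCT", 0, 0)` accepts the C/T transition at the '@' position and ignores the '-' position, whereas a contiguous seed `"####"` would reject the same pair. A real tool would index the seed keys rather than scan all position pairs; the quadratic scan is kept here only for readability.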
YASS produces the common blast tabular output files ("-d 3" option). Moreover, a small perl script, yass2blast.pl, is provided to transform the original extended output ("-d 1" option) into blast output with alignments (which can be directly parsed with Bioperl), axt files, or fasta alignments.
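As a rough illustration of how the tabular output could be consumed, here is a minimal Python sketch. It assumes the usual 12 tab-separated columns of the blast tabular format (query, subject, % identity, alignment length, mismatches, gap opens, query start/end, subject start/end, E-value, bit score) and that comment lines start with '#'; the exact columns written by "-d 3" should be checked against real output. The `Hit` type and `parse_tabular` function are hypothetical names.

```python
from typing import Iterable, Iterator, NamedTuple


class Hit(NamedTuple):
    """One line of blast-style tabular output (assumed 12-column layout)."""
    query: str
    subject: str
    identity: float
    length: int
    mismatches: int
    gap_opens: int
    q_start: int
    q_end: int
    s_start: int
    s_end: int
    evalue: float
    bit_score: float


def parse_tabular(lines: Iterable[str]) -> Iterator[Hit]:
    """Yield a Hit per data line, skipping blank lines and '#' comments."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        f = line.split("\t")
        if len(f) < 12:
            raise ValueError(f"expected 12 tab-separated fields, got {len(f)}")
        yield Hit(
            f[0], f[1], float(f[2]),
            int(f[3]), int(f[4]), int(f[5]),
            int(f[6]), int(f[7]), int(f[8]), int(f[9]),
            float(f[10]), float(f[11]),
        )
```

A typical use would be `hits = list(parse_tabular(open("result.tab")))`, where `result.tab` is a placeholder file name, followed by filtering on `hit.evalue` or sorting by `hit.bit_score`.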
If you use this program in your work, please cite:
- L. Noe, G. Kucherov, YASS: enhancing the sensitivity of DNA similarity search, 2005, Nucleic Acids Research, 33(2):W540-W543.
Of course, your feedback is welcome and always appreciated!
Query YASS Online
You can query YASS from this server: yass.php.
- paste or upload your sequences, or use the proposed ones,
- control several of the YASS parameters,
- produce a dot-plot view of the alignments / a tabular view of the complete output,
- download the result as a yass/blast/axt/fasta output file,
- run Carnac/Protea, Magnolia, an annotation with Blast, a multiple alignment with Clustalw or Muscle, or Mfold, with a single click.
Papers/Links that use/cite YASS
A non-exhaustive list of papers ... I also try to maintain a list, as exhaustive as possible, of "spaced seeds" related papers for those interested in this subject. Please let me know if I have forgotten someone ...
- The tRNAarg Gene and engA Are Essential Genes on the 1.7-Mb pSymB Megaplasmid of Sinorhizobium meliloti and Were Translocated Together from the Chromosome in an Ancestral Strain - Dot plots were generated with YASS.
- The macrosatellite DXZ4 mediates CTCF-dependent long-range intrachromosomal interactions on the human inactive X chromosome - Pairwise regions of Dxz4 were aligned using YASS.
- The mouse DXZ4 homolog retains Ctcf binding and proximity to Pls3 despite substantial organizational differences compared to the primate macrosatellite - Pairwise regions of Dxz4 were aligned using YASS.
- Ortholog Alleles at Xa3/Xa26 Locus Confer Conserved Race-Specific Resistance against Xanthomonas oryzae in Rice - YASS was used to produce the pairwise sequence comparisons of Xa3/Xa26.
- RNAspace.org: An integrated environment for the prediction, annotation, and analysis of ncRNA - YASS is used to produce alignments of RNAspace.
- Uncovering the Prevalence and Diversity of Integrating Conjugative Elements in Actinobacteria - YASS was used to identify the direct repeats of the attL and attR attachment sites.
- Detection of small RNAs in Bordetella pertussis and identification of a novel repeated genetic element - YASS was used to identify Bordetella pertussis sRNAs.
- The human gut virome: Inter-individual variation and dynamic response to diet - YASS was used to assess similarity between VLP contigs.
- Homoplastic microinversions and the avian tree of life - YASS was used together with Blast to detect microinversions.
- An alternative approach to multiple genome comparison - YASS is used (together with other tools) to compute initial pairwise alignments.
- Structural and content diversity of mitochondrial genome in beet: a comparative genomic analysis - YASS was used for various analyses (chimeric ORFs detection, duplication detection, ...).
- Comprehensive prediction of chromosome dimer resolution sites in bacterial genomes - YASS was used together with Blast to predict dif/XerCD systems.
- Cis-regulatory characterization of sequence conservation surrounding the Hox4 genes - YASS was used to determine CNE alignments in zebrafish.
- A scenario of mitochondrial genome evolution in maize based on rearrangement events - YASS was used to compute markers.
- Sequence analysis of two alleles reveals that intra-and intergenic recombination played a role in the evolution of the radish fertility restorer (Rfo) - YASS was used to align the two allele sequences.
- dif/Xer Recombination Systems in Proteobacteria - YASS was used to identify dif-related sequences in proteobacteria.
- Evolutionary dynamics of a locus of fertility restoration in plants - YASS was used to analyse the similarity between alleles and inside the studied allele.
- Mapping of 5q35 chromosomal rearrangements within a genomically unstable region - YASS was used to search for similarities, and YASS dotplot provided positional information.
- The genome sequence of the model ascomycete fungus Podospora anserina - YASS was used together with Erpin and Blast to detect non-coding RNAs.
- Diversity and structure of PIF/Harbinger-like elements in the genome of Medicago truncatula - YASS was used to align Transposable elements.
- Mammalian Small Nucleolar RNAs Are Mobile Genetic Elements - YASS was used to align genomic segments.
- Genomic architecture of human 17q21 linked to frontotemporal dementia uncovers a highly homologous family of low-copy repeats in the tau region - YASS was used to analyse contig sequences for LCR regions.
- sno-RNA database - YASS was used to produce human and yeast rDNAs alignments.
- A fast and flexible approach to oligonucleotide probe design for genomes and gene families - Transition constrained seeds were used to filter potential oligonucleotides.
- SinicView : multiple alignment comparisons - famous YASS matrix used. ;-)
- Seeds for effective oligonucleotide design - Iedera was used to design spaced seeds.
- Adaptive seeds tame genomic sequence comparison - LAST software and adaptive seeds.
- Spaced seeds for cross-species CDNA-to-genome sequence alignment - review spaced seed design and focus on cross-species sequence alignment.
- A Genomic Distance Based on MUM Indicates Discontinuity between Most Bacterial Species and Genera - maximal unique matches (MUM) based distance for full genomes.
- A survey of seeding for sequence alignment - a survey of the different approaches proposed to index genomic or amino-acid sequences.
- Run Probability of High-Order Seed Patterns and its Applications to Finding Good Transition Seeds - a new heuristic is proposed to quickly select good (or even optimal) transition seeds.
- Good spaced seeds for homology search - this paper provides a list of single spaced seeds. Note that you can also conceive multiple spaced seeds/transition constrained seeds with our Hedera tool to fit more specific repeats if needed.
- Superiority of Spaced Seeds for Homology Search - 2 very interesting results about the number of non overlapping hits, and the asymptotic hit probability.
- Quick, Practical Selection of Effective Seeds for Homology Search - leakage model for effective seed design.
- Hardness of Optimal Spaced Seed Design - estimating spaced seed sensitivity (lossy filtering) or guaranteeing 100% sensitivity (lossless filtering) is hard, even for a single seed.
- Fast Computation of Good Multiple Spaced Seeds - how to speed up the seed design.
- Indel seeds for homology search - a new seed model and its waiting time formula.
- All hits all the time: parameter free calculation of seed sensitivity - or how to compute (with a single polynomial) seed sensitivity for any similarity parameter.
- BDD-Based Analysis of Gapped q-Gram Filters - lossless filter and analysis of such seeds.
- Ribosomal RNA as molecular barcodes: a simple correlation analysis without sequence alignment - Composition Vector Analysis (method without alignments) to construct phylogeny trees.
- Métagénomique Bactérienne et Virale - on the promising field of metagenomics.
- A Minimum Cost Process in Searching for a Set of Similar DNA Sequences - KMP filtering method applied to DNA search.
- Wikipedia sequence alignment software - nice resource for tools.
- Wabim list of software - an impressive (and up to date) list of biological tools.
- MIGale Bioinformatics platform - by the INRA MIG team.
- MolecularStation Bioinformatics and Bioinformatic Tools.