EMBOSS: C2 Applications
Required by Sanger Centre
The following applications are specified in the original EMBOSS grant
application. They replace obsolete EGCG applications in routine use
at the Sanger Centre.
- EST clustering (findoverlap, checkoverlap, clusteroverlap,
alignoverlap), originally written for CpG island clustering, now
replaced by other utilities and by:
- seqmatchall
Does an all-against-all comparison of a set of sequences
- cluster
DNA sequence clustering
- Rapid database searching with sequence patterns (stssearch)
- stssearch
Searches a DNA database for matches with a set of STS primers
- Rapid database searching for sequence overlaps (equickindex, equicksearch, quickmatch)
- to be replaced by blastn with post-processing.
- Repeat identification (etandem, equicktandem, einverted)
- Nucleotide pattern analysis (cpgspans, cpgplot)
- Codon usage analysis for small genomes (codfish)
- chips
Codon usage statistics
- Gene identification tools (est_genome)
- Rapid identification of sequence patterns in large sequence sets
(wordup, wordcount, chaos, polydot)
- wordcount
Counts words of a specified size in a DNA sequence
- chaos
Create a chaos game representation plot for a sequence
- polydot
Displays all-against-all dotplots of a set of sequences
- Protein motif identification and domain analysis (pepcoil,
antigenic, sigcleave)
- pepcoil
Predicts coiled coil regions
- antigenic
Finds antigenic sites in proteins
- sigcleave
Reports protein signal cleavage sites
- Presentation tools for publication (prettyplot)
- prettyplot
Displays aligned sequences, with colouring and boxing
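As an illustration of the kind of analysis that wordcount is described as performing above (counting words of a specified size in a DNA sequence), here is a minimal Python sketch. It is not EMBOSS code: the function name count_words is hypothetical, and the choices to count overlapping words and to fold the sequence to upper case are assumptions made for the example.

```python
from collections import Counter


def count_words(sequence: str, word_size: int) -> Counter:
    """Count every overlapping word of length word_size in sequence.

    Hypothetical helper for illustration; not part of EMBOSS.
    """
    if word_size < 1:
        raise ValueError("word_size must be at least 1")
    seq = sequence.upper()  # assumption: case is not significant
    # Slide a window of word_size across the sequence, one base at a time.
    return Counter(seq[i:i + word_size] for i in range(len(seq) - word_size + 1))


if __name__ == "__main__":
    for word, count in sorted(count_words("ACGTACGT", 4).items()):
        print(word, count)
```

A sequence shorter than the word size simply yields an empty count table in this sketch.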
First Applications
In addition to many new applications, some useful utilities have
been written to test the library code and will be of general use.
See the EMBOSS Applications web pages for a complete list of
applications in the current release.
- Sequence format conversion
- seqret
Reads and writes (returns) a sequence
- seqretset
Reads and writes (returns) a set of sequences all at once
- seqretall
Reads and writes (returns) a set of sequences one at a time
- seqretfeat
Reads and writes (returns) a sequence with a feature table
- seqretallfeat
Reads and writes (returns) a set of sequences one at a time, with feature tables
- Local alignment
- water
Smith-Waterman local alignment
- matcher
Finds the best local alignments between two sequences
- simplesw
Simple Smith-Waterman alignment
- Global alignment
- needle
Needleman-Wunsch global alignment
- stretcher
Finds the best global alignment between two sequences
- Multiple alignment
- emma
Multiple alignment program - interface to ClustalW program
- Sequence comparison
- Codon usage
- codcmp
Codon usage table comparison
- syco
Synonymous codon usage Gribskov statistic plot
- cusp
Create a codon usage table
- Pattern Matching
- patmatdb
Search a protein sequence database with a motif
- patmatmotifs
Search a motif database with a protein sequence
- profit
Scan a sequence or database with a matrix or profile
- prophecy
Creates matrices/profiles from multiple alignments
- prosextract
Builds the motif database for patmatmotifs to search
- tfscan
Scans DNA sequences for transcription factors
- redata
Search REBASE for enzyme name, references, suppliers etc
- restrict
Finds restriction enzyme cleavage sites
- digest
Protein proteolytic enzyme or reagent cleavage digest
- palindrome
Looks for inverted repeats in a nucleotide sequence
- fuzznuc
Nucleic acid pattern search
- fuzzpro
Protein pattern search
- Sequence reports
- prettyseq
Output sequence with translated ranges
- showorf
Pretty output of DNA translations
- compseq
Counts the composition of dimer/trimer/etc words in a sequence
- helixturnhelix
Report nucleic acid binding motifs
- garnier
Predicts protein secondary structure
- complex
Find the linguistic complexity in nucleotide sequences
- Sequence graphs
- Mutation
- msbar
Mutate sequence beyond all recognition
- shuffleseq
Shuffles a set of sequences maintaining composition
- Translation
- backtranseq
Back translate a protein sequence
- getorf
Finds and extracts open reading frames (ORFs)
- transeq
Translate nucleic acid sequences
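To make the translation entries above concrete, the following Python sketch shows what translating a nucleic acid sequence involves in the simplest case. It is not how transeq is implemented: the names CODON_TABLE and translate are hypothetical, only the standard genetic code and the three forward frames are handled, and codons containing ambiguous bases are reported as X by assumption.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str, frame: int = 1) -> str:
    """Translate dna in forward frame 1, 2 or 3 (hypothetical helper)."""
    if frame not in (1, 2, 3):
        raise ValueError("frame must be 1, 2 or 3")
    seq = dna.upper().replace("U", "T")  # accept RNA input as well
    # Step through complete codons only; a trailing partial codon is ignored.
    codons = (seq[i:i + 3] for i in range(frame - 1, len(seq) - 2, 3))
    # Unknown or ambiguous codons become X (an assumption of this sketch).
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


if __name__ == "__main__":
    print(translate("ATGGCCTAA"))
```

Stop codons are written as * in the output; a reverse-frame translation would first need the reverse complement of the sequence, which this sketch leaves out.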
See also the EMBOSS Applications web pages and the New Applications
topic.