eomega
Clustal-Omega (clustalo) is a general-purpose multiple sequence alignment (MSA) program for proteins. It produces high-quality MSAs and can handle datasets of hundreds of thousands of sequences in reasonable time.
In its current form, Clustal-Omega can align only protein sequences, not DNA/RNA sequences. DNA/RNA support is envisioned for a future version.
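Because only protein input is supported, it can be useful to screen a sequence set before alignment. The following is a minimal sketch of such a check, not part of eomega itself: the helper names and the `ACGTUN` letter set are assumptions chosen for illustration, and the heuristic can misclassify short or low-complexity proteins.

```python
from typing import Dict, List

# Assumption: a sequence made up only of these letters is treated as DNA/RNA.
NUCLEOTIDE_LETTERS = set("ACGTUN")


def looks_like_nucleotide(sequence: str) -> bool:
    """Return True when every alphabetic character is a nucleotide letter."""
    residues = [c for c in sequence.upper() if c.isalpha()]
    return bool(residues) and all(c in NUCLEOTIDE_LETTERS for c in residues)


def nucleotide_like_ids(records: Dict[str, str]) -> List[str]:
    """Return the IDs of records that look like DNA/RNA rather than protein."""
    return [name for name, seq in records.items() if looks_like_nucleotide(seq)]


if __name__ == "__main__":
    demo = {
        "HBB_HUMAN": "VHLTPEEKSAVTALWGKVNV",  # start of a protein sequence
        "toy_dna": "ACGTACGTTGCA",            # made-up nucleotide sequence
    }
    print(nucleotide_like_ids(demo))
```

Any ID reported by this check would need to be reviewed or removed before the set is passed to the aligner.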
% eomega globins.fasta
Multiple sequence alignment (ClustalO wrapper)
(aligned) output sequence set [globins.aln]:
Multiple sequence alignment (ClustalO wrapper)
Version: EMBOSS:6.6.0.0

   Standard (Mandatory) qualifiers:
  [-sequences]         seqset     File containing sequences to align
  [-outseq]            seqoutset  [
Qualifier | Type | Description | Allowed values | Default |
---|---|---|---|---|
Standard (Mandatory) qualifiers | | | | |
[-sequences] (Parameter 1) | seqset | File containing sequences to align | Readable set of sequences | Required |
[-outseq] (Parameter 2) | seqoutset | Sequence set filename and optional format (output USA) | Writeable sequences | <*>.format |
Additional (Optional) qualifiers | | | | |
(none) | | | | |
Advanced (Unprompted) qualifiers | | | | |
-indistfile | infile | Pairwise distance matrix input file (skips distance computation) | Input file | Required |
-inguidefile | infile | Guide tree input file (skips distance computation and guide tree clustering step) | Input file | Required |
-dealign | toggle | Dealign input sequences | Toggle value Yes/No | No |
-cluster | list | Method | | mbed |
-maxiterations | integer | Number of (combined guide tree/HMM) iterations | Integer from 0 to 2000000000 | 0 |
-maxgiterations | integer | Maximum guide tree iterations | Integer from 0 to 2000000000 | 2000000000 |
-maxhiterations | integer | Maximum number of HMM iterations | Integer from 0 to 2000000000 | 2000000000 |
-maxseqs | integer | Maximum number of sequences | Integer from 2 to 2000000000 | 2000000000 |
-maxlenseq | integer | Maximum length of sequence | Integer from 1 to 2000000000 | 2000000000 |
-self | toggle | Set options automatically (might overwrite some options) | Toggle value Yes/No | No |
-outformat | list | Format | | fasta |
-outdistfile | outfile | Pairwise distance matrix output file, only available in cluster mode 'full' | Output file | <*>.eomega |
-outguidefile | outfile | Guide tree output file | Output file | <*>.eomega |
-log | toggle | Log progress to standard output if not used for output | Toggle value Yes/No | No |
Associated qualifiers | | | | |
"-sequences" associated seqset qualifiers | | | | |
-sbegin1 -sbegin_sequences | integer | Start of each sequence to be used | Any integer value | 0 |
-send1 -send_sequences | integer | End of each sequence to be used | Any integer value | 0 |
-sreverse1 -sreverse_sequences | boolean | Reverse (if DNA) | Boolean value Yes/No | N |
-sask1 -sask_sequences | boolean | Ask for begin/end/reverse | Boolean value Yes/No | N |
-snucleotide1 -snucleotide_sequences | boolean | Sequence is nucleotide | Boolean value Yes/No | N |
-sprotein1 -sprotein_sequences | boolean | Sequence is protein | Boolean value Yes/No | N |
-slower1 -slower_sequences | boolean | Make lower case | Boolean value Yes/No | N |
-supper1 -supper_sequences | boolean | Make upper case | Boolean value Yes/No | N |
-scircular1 -scircular_sequences | boolean | Sequence is circular | Boolean value Yes/No | N |
-squick1 -squick_sequences | boolean | Read id and sequence only | Boolean value Yes/No | N |
-sformat1 -sformat_sequences | string | Input sequence format | Any string | |
-iquery1 -iquery_sequences | string | Input query fields or ID list | Any string | |
-ioffset1 -ioffset_sequences | integer | Input start position offset | Any integer value | 0 |
-sdbname1 -sdbname_sequences | string | Database name | Any string | |
-sid1 -sid_sequences | string | Entryname | Any string | |
-ufo1 -ufo_sequences | string | UFO features | Any string | |
-fformat1 -fformat_sequences | string | Features format | Any string | |
-fopenfile1 -fopenfile_sequences | string | Features file name | Any string | |
"-outseq" associated seqoutset qualifiers | | | | |
-osformat2 -osformat_outseq | string | Output seq format | Any string | |
-osextension2 -osextension_outseq | string | File name extension | Any string | |
-osname2 -osname_outseq | string | Base file name | Any string | |
-osdirectory2 -osdirectory_outseq | string | Output directory | Any string | |
-osdbname2 -osdbname_outseq | string | Database name to add | Any string | |
-ossingle2 -ossingle_outseq | boolean | Separate file for each entry | Boolean value Yes/No | N |
-oufo2 -oufo_outseq | string | UFO features | Any string | |
-offormat2 -offormat_outseq | string | Features format | Any string | |
-ofname2 -ofname_outseq | string | Features file name | Any string | |
-ofdirectory2 -ofdirectory_outseq | string | Output directory | Any string | |
"-outdistfile" associated outfile qualifiers | | | | |
-odirectory | string | Output directory | Any string | |
"-outguidefile" associated outfile qualifiers | | | | |
-odirectory | string | Output directory | Any string | |
General qualifiers | | | | |
-auto | boolean | Turn off prompts | Boolean value Yes/No | N |
-stdout | boolean | Write first file to standard output | Boolean value Yes/No | N |
-filter | boolean | Read first file from standard input, write first file to standard output | Boolean value Yes/No | N |
-options | boolean | Prompt for standard and additional values | Boolean value Yes/No | N |
-debug | boolean | Write debug output to program.dbg | Boolean value Yes/No | N |
-verbose | boolean | Report some/full command line options | Boolean value Yes/No | Y |
-help | boolean | Report command line options and exit. More information on associated and general qualifiers can be found with -help -verbose | Boolean value Yes/No | N |
-warning | boolean | Report warnings | Boolean value Yes/No | Y |
-error | boolean | Report errors | Boolean value Yes/No | Y |
-fatal | boolean | Report fatal errors | Boolean value Yes/No | Y |
-die | boolean | Report dying program messages | Boolean value Yes/No | Y |
-version | boolean | Report version number and exit | Boolean value Yes/No | N |
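
When eomega is driven from a script, the qualifiers listed in the table can be assembled into an argument list. The sketch below is one way to do that and is only an illustration: `build_eomega_command` is a hypothetical helper, it covers just a handful of the qualifiers, and it only builds the command without running it. The integer range check mirrors the "Integer from 0 to 2000000000" entry for -maxiterations.

```python
from typing import List, Optional

# Upper bound copied from the "Allowed values" column of the qualifier table.
MAX_INT = 2_000_000_000


def build_eomega_command(
    sequences: str,
    outseq: str,
    maxiterations: int = 0,
    outguidefile: Optional[str] = None,
    dealign: bool = False,
) -> List[str]:
    """Build an eomega argument list (hypothetical helper, nothing is executed)."""
    if not 0 <= maxiterations <= MAX_INT:
        raise ValueError("maxiterations must be between 0 and 2000000000")
    # -auto turns off interactive prompts, as described under General qualifiers.
    cmd = ["eomega", "-sequences", sequences, "-outseq", outseq, "-auto"]
    if maxiterations:
        cmd += ["-maxiterations", str(maxiterations)]
    if outguidefile is not None:
        cmd += ["-outguidefile", outguidefile]
    if dealign:
        cmd.append("-dealign")
    return cmd


if __name__ == "__main__":
    print(" ".join(build_eomega_command("globins.fasta", "globins.aln")))
```

On a system where EMBOSS and Clustal-Omega are installed, the returned list could be passed to `subprocess.run(cmd, check=True)`; that step is left out here so the sketch runs without either program.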
Input file for the example (globins.fasta):

>HBB_HUMAN Sw:Hbb_Human => HBB_HUMAN
VHLTPEEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPKV
KAHGKKVLGAFSDGLAHLDNLKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLAHHFGK
EFTPPVQAAYQKVVAGVANALAHKYH
>HBB_HORSE Sw:Hbb_Horse => HBB_HORSE
VQLSGEEKAAVLALWDKVNEEEVGGEALGRLLVVYPWTQRFFDSFGDLSNPGAVMGNPKV
KAHGKKVLHSFGEGVHHLDNLKGTFAALSELHCDKLHVDPENFRLLGNVLVVVLARHFGK
DFTPELQASYQKVVAGVANALAHKYH
>HBA_HUMAN Sw:Hba_Human => HBA_HUMAN
VLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHFDLSHGSAQVKGHGK
KVADALTNAVAHVDDMPNALSALSDLHAHKLRVDPVNFKLLSHCLLVTLAAHLPAEFTPA
VHASLDKFLASVSTVLTSKYR
>HBA_HORSE Sw:Hba_Horse => HBA_HORSE
VLSAADKTNVKAAWSKVGGHAGEYGAEALERMFLGFPTTKTYFPHFDLSHGSAQVKAHGK
KVGDALTLAVGHLDDLPGALSNLSDLHAHKLRVDPVNFKLLSHCLLSTLAVHLPNDFTPA
VHASLDKFLSSVSTVLTSKYR
>MYG_PHYCA Sw:Myg_Phyca => MYG_PHYCA
VLSEGEWQLVLHVWAKVEADVAGHGQDILIRLFKSHPETLEKFDRFKHLKTEAEMKASED
LKKHGVTVLTALGAILKKKGHHEAELKPLAQSHATKHKIPIKYLEFISEAIIHVLHSRHP
GDFGADAQGAMNKALELFRKDIAAKYKELGYQG
>GLB5_PETMA Sw:Glb5_Petma => GLB5_PETMA
PIVDTGSVAPLSAAEKTKIRSAWAPVYSTYETSGVDILVKFFTSTPAAQEFFPKFKGLTT
ADQLKKSADVRWHAERIINAVNDAVASMDDTEKMSMKLRDLSGKHAKSFQVDPQYFKVLA
AVIADTVAAGDAGFEKLMSMICILLRSAY
>LGB2_LUPLU Sw:Lgb2_Luplu => LGB2_LUPLU
GALTESQAALVKSSWEEFNANIPKHTHRFFILVLEIAPAAKDLFSFLKGTSEVPQNNPEL
QAHAGKVFKLVYEAAIQLQVTGVVVTDATLKNLGSVHVSKGVADAHFPVVKEAILKTIKE
VVGAKWSEELNSAWTIAYDELAIVIKKEMNDAA
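
The input is plain FASTA: a header line starting with `>` followed by one or more lines of residues. As a rough illustration of how such a file can be read into memory, here is a minimal parser sketch. `parse_fasta` is a hypothetical helper written for this page, it keeps only the first word of each header as the ID, and the demo text is a shortened excerpt rather than the full file.

```python
from typing import Dict, Optional


def parse_fasta(text: str) -> Dict[str, str]:
    """Map each record ID (first word after '>') to its concatenated sequence."""
    records: Dict[str, str] = {}
    current: Optional[str] = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            fields = line[1:].split()
            current = fields[0] if fields else ""
            records[current] = ""
        elif current is not None:
            records[current] += line  # sequence lines are joined without breaks
    return records


if __name__ == "__main__":
    # Shortened excerpt of the example input, for illustration only.
    excerpt = (
        ">HBB_HUMAN Sw:Hbb_Human => HBB_HUMAN\n"
        "VHLTPEEKSAVTALWGKVNV\n"
        ">HBA_HUMAN Sw:Hba_Human => HBA_HUMAN\n"
        "VLSPADKTNVKAAWGKVGAH\n"
    )
    for name, seq in parse_fasta(excerpt).items():
        print(name, len(seq))
```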
Output file for the example (globins.aln):

>HBB_HUMAN Sw:Hbb_Human => HBB_HUMAN
--------VHLTPEEKSAVTALWGKVNV--DEVGGEALGRLLVVYPWTQRFFESFGDLST
PDAVMGNPKVKAHGKKVLGAFSDGLAHLDNLK--G---TFATLSELHCDKLHVDPENFRL
LGNVLVCVLAHHFGKEFTPPVQAAYQKVVAGVANALAHKYH------
>HBB_HORSE Sw:Hbb_Horse => HBB_HORSE
--------VQLSGEEKAAVLALWDKVNE--EEVGGEALGRLLVVYPWTQRFFDSFGDLSN
PGAVMGNPKVKAHGKKVLHSFGEGVHHLDNLK--G---TFAALSELHCDKLHVDPENFRL
LGNVLVVVLARHFGKDFTPELQASYQKVVAGVANALAHKYH------
>HBA_HUMAN Sw:Hba_Human => HBA_HUMAN
---------VLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHFDL---
---SHGSAQVKGHGKKVADALTNAVAHVDDMP--N---ALSALSDLHAHKLRVDPVNFKL
LSHCLLVTLAAHLPAEFTPAVHASLDKFLASVSTVLTSKYR------
>HBA_HORSE Sw:Hba_Horse => HBA_HORSE
---------VLSAADKTNVKAAWSKVGGHAGEYGAEALERMFLGFPTTKTYFPHFDL---
---SHGSAQVKAHGKKVGDALTLAVGHLDDLP--G---ALSNLSDLHAHKLRVDPVNFKL
LSHCLLSTLAVHLPNDFTPAVHASLDKFLSSVSTVLTSKYR------
>MYG_PHYCA Sw:Myg_Phyca => MYG_PHYCA
---------VLSEGEWQLVLHVWAKVEADVAGHGQDILIRLFKSHPETLEKFDRFKHLKT
EAEMKASEDLKKHGVTVLTALGAILKKKGHHE--A---ELKPLAQSHATKHKIPIKYLEF
ISEAIIHVLHSRHPGDFGADAQGAMNKALELFRKDIAAKYKELGYQG
>GLB5_PETMA Sw:Glb5_Petma => GLB5_PETMA
PIVDTGSVAPLSAAEKTKIRSAWAPVYSTYETSGVDILVKFFTSTPAAQEFFPKFKGLTT
ADQLKKSADVRWHAERIINAVNDAVASMDDTE--KMSMKLRDLSGKHAKSFQVDPQYFKV
LAAVIADTVAA---------GDAGFEKLMSMICILLRSAY-------
>LGB2_LUPLU Sw:Lgb2_Luplu => LGB2_LUPLU
--------GALTESQAALVKSSWEEFNANIPKHTHRFFILVLEIAPAAKDLFSFLKGTSE
--VPQNNPELQAHAGKVFKLVYEAAIQLQVTGVVVTDATLKNLGSVHVSKGV-ADAHFPV
VKEAILKTIKEVVGAKWSEELNSAWTIAYDELAIVIKKEMNDAA---
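
In the aligned output every row is padded with `-` gap characters to the same width, and stripping the gaps gives back the unaligned residues. A small sanity check along those lines might look like the sketch below. The function names are hypothetical, and the demo rows are only the first twelve columns of three records from the example output.

```python
from typing import Dict

GAP = "-"


def alignment_width(aligned: Dict[str, str]) -> int:
    """Return the common row length, or raise if rows differ (or none exist)."""
    widths = {len(row) for row in aligned.values()}
    if len(widths) != 1:
        raise ValueError(f"expected one row width, found {sorted(widths)}")
    return widths.pop()


def strip_gaps(aligned: Dict[str, str]) -> Dict[str, str]:
    """Remove gap characters so each row holds only its residues."""
    return {name: row.replace(GAP, "") for name, row in aligned.items()}


if __name__ == "__main__":
    # First twelve alignment columns of three records, for illustration only.
    rows = {
        "HBB_HUMAN": "--------VHLT",
        "HBA_HUMAN": "---------VLS",
        "GLB5_PETMA": "PIVDTGSVAPLS",
    }
    print(alignment_width(rows))
    print(strip_gaps(rows))
```

Comparing the gap-stripped rows with the original input sequences is a quick way to confirm that an alignment step changed only the gap placement.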
Program name | Description |
---|---|
edialign | Local multiple alignment of sequences |
emma | Multiple sequence alignment (ClustalW wrapper) |
eomegapp | Profile with profile (ClustalO wrapper) |
eomegaps | Single sequence with profile (ClustalO wrapper) |
eomegash | Sequence with HMM (ClustalO wrapper) |
eomegasp | Sequence with profile (ClustalO wrapper) |
infoalign | Display basic information about a multiple sequence alignment |
mse | Multiple sequence editor |
plotcon | Plot conservation of a sequence alignment |
prettyplot | Draw a sequence alignment with pretty formatting |
showalign | Display a multiple sequence alignment in pretty format |
tranalign | Generate an alignment of nucleic coding regions from aligned proteins |
Please report all bugs to the EMBOSS bug team (emboss-bug © emboss.open-bio.org), not to the original author.