eomega
Clustal-Omega (clustalo) is a general-purpose multiple sequence alignment (MSA) program for proteins. It produces high-quality MSAs and can handle datasets of hundreds of thousands of sequences in reasonable time.
In its current form Clustal-Omega aligns only protein sequences, not DNA/RNA sequences. DNA/RNA support is envisioned for a future version.
    % eomega globins.fasta
    Multiple sequence alignment (ClustalO wrapper)
    (aligned) output sequence set [globins.aln]:
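The same run can be scripted. The following is a minimal sketch in Python, assuming `eomega` is installed and on `PATH`; the helper names `build_eomega_command` and `run_eomega` are hypothetical and not part of EMBOSS. It passes the two mandatory qualifiers by name and adds `-auto` to turn off prompts.

```python
import shutil
import subprocess


def build_eomega_command(sequences, outseq):
    """Return the argument list for a non-interactive eomega run.

    Hypothetical helper: it uses only qualifiers listed on this page.
    """
    return ["eomega", "-sequences", sequences, "-outseq", outseq, "-auto"]


def run_eomega(sequences, outseq):
    """Run eomega; raise if the program is missing or exits with an error."""
    if shutil.which("eomega") is None:
        # Assumption: eomega is expected to be installed and on PATH.
        raise FileNotFoundError("eomega not found on PATH")
    subprocess.run(build_eomega_command(sequences, outseq), check=True)


if __name__ == "__main__":
    # File names taken from the example session above.
    print(" ".join(build_eomega_command("globins.fasta", "globins.aln")))
```

Passing the arguments as a list avoids shell quoting problems when file names contain spaces.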
    Multiple sequence alignment (ClustalO wrapper)
    Version: EMBOSS:6.6.0.0

       Standard (Mandatory) qualifiers:
      [-sequences]         seqset     File containing sequences to align
      [-outseq]            seqoutset  [
| Qualifier | Type | Description | Allowed values | Default |
|---|---|---|---|---|
| Standard (Mandatory) qualifiers | | | | |
| [-sequences] (Parameter 1) | seqset | File containing sequences to align | Readable set of sequences | Required |
| [-outseq] (Parameter 2) | seqoutset | Sequence set filename and optional format (output USA) | Writeable sequences | <*>.format |
| Additional (Optional) qualifiers | | | | |
| (none) | | | | |
| Advanced (Unprompted) qualifiers | | | | |
| -indistfile | infile | Pairwise distance matrix input file (skips distance computation) | Input file | Required |
| -inguidefile | infile | Guide tree input file (skips distance computation and guide tree clustering step) | Input file | Required |
| -dealign | toggle | Dealign input sequences | Toggle value Yes/No | No |
| -cluster | list | Method | | mbed |
| -maxiterations | integer | Number of (combined guide tree/HMM) iterations | Integer from 0 to 2000000000 | 0 |
| -maxgiterations | integer | Maximum guide tree iterations | Integer from 0 to 2000000000 | 2000000000 |
| -maxhiterations | integer | Maximum number of HMM iterations | Integer from 0 to 2000000000 | 2000000000 |
| -maxseqs | integer | Maximum number of sequences | Integer from 2 to 2000000000 | 2000000000 |
| -maxlenseq | integer | Maximum length of sequence | Integer from 1 to 2000000000 | 2000000000 |
| -self | toggle | Set options automatically (might overwrite some options) | Toggle value Yes/No | No |
| -outformat | list | Format | | fasta |
| -outdistfile | outfile | Pairwise distance matrix output file, only available in cluster mode 'full' | Output file | <*>.eomega |
| -outguidefile | outfile | Guide tree output file | Output file | <*>.eomega |
| -log | toggle | Log progress to standard output if not used for output | Toggle value Yes/No | No |
| Associated qualifiers | | | | |
| "-sequences" associated seqset qualifiers | | | | |
| -sbegin1 -sbegin_sequences | integer | Start of each sequence to be used | Any integer value | 0 |
| -send1 -send_sequences | integer | End of each sequence to be used | Any integer value | 0 |
| -sreverse1 -sreverse_sequences | boolean | Reverse (if DNA) | Boolean value Yes/No | N |
| -sask1 -sask_sequences | boolean | Ask for begin/end/reverse | Boolean value Yes/No | N |
| -snucleotide1 -snucleotide_sequences | boolean | Sequence is nucleotide | Boolean value Yes/No | N |
| -sprotein1 -sprotein_sequences | boolean | Sequence is protein | Boolean value Yes/No | N |
| -slower1 -slower_sequences | boolean | Make lower case | Boolean value Yes/No | N |
| -supper1 -supper_sequences | boolean | Make upper case | Boolean value Yes/No | N |
| -scircular1 -scircular_sequences | boolean | Sequence is circular | Boolean value Yes/No | N |
| -squick1 -squick_sequences | boolean | Read id and sequence only | Boolean value Yes/No | N |
| -sformat1 -sformat_sequences | string | Input sequence format | Any string | |
| -iquery1 -iquery_sequences | string | Input query fields or ID list | Any string | |
| -ioffset1 -ioffset_sequences | integer | Input start position offset | Any integer value | 0 |
| -sdbname1 -sdbname_sequences | string | Database name | Any string | |
| -sid1 -sid_sequences | string | Entryname | Any string | |
| -ufo1 -ufo_sequences | string | UFO features | Any string | |
| -fformat1 -fformat_sequences | string | Features format | Any string | |
| -fopenfile1 -fopenfile_sequences | string | Features file name | Any string | |
| "-outseq" associated seqoutset qualifiers | | | | |
| -osformat2 -osformat_outseq | string | Output seq format | Any string | |
| -osextension2 -osextension_outseq | string | File name extension | Any string | |
| -osname2 -osname_outseq | string | Base file name | Any string | |
| -osdirectory2 -osdirectory_outseq | string | Output directory | Any string | |
| -osdbname2 -osdbname_outseq | string | Database name to add | Any string | |
| -ossingle2 -ossingle_outseq | boolean | Separate file for each entry | Boolean value Yes/No | N |
| -oufo2 -oufo_outseq | string | UFO features | Any string | |
| -offormat2 -offormat_outseq | string | Features format | Any string | |
| -ofname2 -ofname_outseq | string | Features file name | Any string | |
| -ofdirectory2 -ofdirectory_outseq | string | Output directory | Any string | |
| "-outdistfile" associated outfile qualifiers | | | | |
| -odirectory | string | Output directory | Any string | |
| "-outguidefile" associated outfile qualifiers | | | | |
| -odirectory | string | Output directory | Any string | |
| General qualifiers | | | | |
| -auto | boolean | Turn off prompts | Boolean value Yes/No | N |
| -stdout | boolean | Write first file to standard output | Boolean value Yes/No | N |
| -filter | boolean | Read first file from standard input, write first file to standard output | Boolean value Yes/No | N |
| -options | boolean | Prompt for standard and additional values | Boolean value Yes/No | N |
| -debug | boolean | Write debug output to program.dbg | Boolean value Yes/No | N |
| -verbose | boolean | Report some/full command line options | Boolean value Yes/No | Y |
| -help | boolean | Report command line options and exit. More information on associated and general qualifiers can be found with -help -verbose | Boolean value Yes/No | N |
| -warning | boolean | Report warnings | Boolean value Yes/No | Y |
| -error | boolean | Report errors | Boolean value Yes/No | Y |
| -fatal | boolean | Report fatal errors | Boolean value Yes/No | Y |
| -die | boolean | Report dying program messages | Boolean value Yes/No | Y |
| -version | boolean | Report version number and exit | Boolean value Yes/No | N |
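The integer qualifiers in the table have documented ranges, so a script can reject bad values before launching the program. The sketch below is one possible way to do this in Python; the function name `integer_qualifiers` and the `INTEGER_RANGES` mapping are hypothetical, and the bounds are copied from the "Allowed values" column above.

```python
MAX_INT = 2_000_000_000  # upper bound given in the qualifier table

# (lower bound, upper bound) for the integer qualifiers listed above.
INTEGER_RANGES = {
    "maxiterations": (0, MAX_INT),
    "maxgiterations": (0, MAX_INT),
    "maxhiterations": (0, MAX_INT),
    "maxseqs": (2, MAX_INT),
    "maxlenseq": (1, MAX_INT),
}


def integer_qualifiers(**values):
    """Turn keyword arguments into eomega integer qualifiers, checking ranges."""
    args = []
    for name, value in values.items():
        if name not in INTEGER_RANGES:
            raise KeyError(f"not an integer qualifier on this page: {name}")
        low, high = INTEGER_RANGES[name]
        if not low <= value <= high:
            raise ValueError(f"-{name} must be between {low} and {high}, got {value}")
        args += [f"-{name}", str(value)]
    return args


if __name__ == "__main__":
    # Extra arguments to append to an eomega command line.
    print(integer_qualifiers(maxiterations=2, maxseqs=1000))
```

The returned list is meant to be appended to an argument list such as `["eomega", "-sequences", "globins.fasta", "-outseq", "globins.aln", "-auto"]`.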
File: globins.fasta

    >HBB_HUMAN Sw:Hbb_Human => HBB_HUMAN
    VHLTPEEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPKV
    KAHGKKVLGAFSDGLAHLDNLKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLAHHFGK
    EFTPPVQAAYQKVVAGVANALAHKYH
    >HBB_HORSE Sw:Hbb_Horse => HBB_HORSE
    VQLSGEEKAAVLALWDKVNEEEVGGEALGRLLVVYPWTQRFFDSFGDLSNPGAVMGNPKV
    KAHGKKVLHSFGEGVHHLDNLKGTFAALSELHCDKLHVDPENFRLLGNVLVVVLARHFGK
    DFTPELQASYQKVVAGVANALAHKYH
    >HBA_HUMAN Sw:Hba_Human => HBA_HUMAN
    VLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHFDLSHGSAQVKGHGK
    KVADALTNAVAHVDDMPNALSALSDLHAHKLRVDPVNFKLLSHCLLVTLAAHLPAEFTPA
    VHASLDKFLASVSTVLTSKYR
    >HBA_HORSE Sw:Hba_Horse => HBA_HORSE
    VLSAADKTNVKAAWSKVGGHAGEYGAEALERMFLGFPTTKTYFPHFDLSHGSAQVKAHGK
    KVGDALTLAVGHLDDLPGALSNLSDLHAHKLRVDPVNFKLLSHCLLSTLAVHLPNDFTPA
    VHASLDKFLSSVSTVLTSKYR
    >MYG_PHYCA Sw:Myg_Phyca => MYG_PHYCA
    VLSEGEWQLVLHVWAKVEADVAGHGQDILIRLFKSHPETLEKFDRFKHLKTEAEMKASED
    LKKHGVTVLTALGAILKKKGHHEAELKPLAQSHATKHKIPIKYLEFISEAIIHVLHSRHP
    GDFGADAQGAMNKALELFRKDIAAKYKELGYQG
    >GLB5_PETMA Sw:Glb5_Petma => GLB5_PETMA
    PIVDTGSVAPLSAAEKTKIRSAWAPVYSTYETSGVDILVKFFTSTPAAQEFFPKFKGLTT
    ADQLKKSADVRWHAERIINAVNDAVASMDDTEKMSMKLRDLSGKHAKSFQVDPQYFKVLA
    AVIADTVAAGDAGFEKLMSMICILLRSAY
    >LGB2_LUPLU Sw:Lgb2_Luplu => LGB2_LUPLU
    GALTESQAALVKSSWEEFNANIPKHTHRFFILVLEIAPAAKDLFSFLKGTSEVPQNNPEL
    QAHAGKVFKLVYEAAIQLQVTGVVVTDATLKNLGSVHVSKGVADAHFPVVKEAILKTIKE
    VVGAKWSEELNSAWTIAYDELAIVIKKEMNDAA
File: globins.aln

    >HBB_HUMAN Sw:Hbb_Human => HBB_HUMAN
    --------VHLTPEEKSAVTALWGKVNV--DEVGGEALGRLLVVYPWTQRFFESFGDLST
    PDAVMGNPKVKAHGKKVLGAFSDGLAHLDNLK--G---TFATLSELHCDKLHVDPENFRL
    LGNVLVCVLAHHFGKEFTPPVQAAYQKVVAGVANALAHKYH------
    >HBB_HORSE Sw:Hbb_Horse => HBB_HORSE
    --------VQLSGEEKAAVLALWDKVNE--EEVGGEALGRLLVVYPWTQRFFDSFGDLSN
    PGAVMGNPKVKAHGKKVLHSFGEGVHHLDNLK--G---TFAALSELHCDKLHVDPENFRL
    LGNVLVVVLARHFGKDFTPELQASYQKVVAGVANALAHKYH------
    >HBA_HUMAN Sw:Hba_Human => HBA_HUMAN
    ---------VLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHFDL---
    ---SHGSAQVKGHGKKVADALTNAVAHVDDMP--N---ALSALSDLHAHKLRVDPVNFKL
    LSHCLLVTLAAHLPAEFTPAVHASLDKFLASVSTVLTSKYR------
    >HBA_HORSE Sw:Hba_Horse => HBA_HORSE
    ---------VLSAADKTNVKAAWSKVGGHAGEYGAEALERMFLGFPTTKTYFPHFDL---
    ---SHGSAQVKAHGKKVGDALTLAVGHLDDLP--G---ALSNLSDLHAHKLRVDPVNFKL
    LSHCLLSTLAVHLPNDFTPAVHASLDKFLSSVSTVLTSKYR------
    >MYG_PHYCA Sw:Myg_Phyca => MYG_PHYCA
    ---------VLSEGEWQLVLHVWAKVEADVAGHGQDILIRLFKSHPETLEKFDRFKHLKT
    EAEMKASEDLKKHGVTVLTALGAILKKKGHHE--A---ELKPLAQSHATKHKIPIKYLEF
    ISEAIIHVLHSRHPGDFGADAQGAMNKALELFRKDIAAKYKELGYQG
    >GLB5_PETMA Sw:Glb5_Petma => GLB5_PETMA
    PIVDTGSVAPLSAAEKTKIRSAWAPVYSTYETSGVDILVKFFTSTPAAQEFFPKFKGLTT
    ADQLKKSADVRWHAERIINAVNDAVASMDDTE--KMSMKLRDLSGKHAKSFQVDPQYFKV
    LAAVIADTVAA---------GDAGFEKLMSMICILLRSAY-------
    >LGB2_LUPLU Sw:Lgb2_Luplu => LGB2_LUPLU
    --------GALTESQAALVKSSWEEFNANIPKHTHRFFILVLEIAPAAKDLFSFLKGTSE
    --VPQNNPELQAHAGKVFKLVYEAAIQLQVTGVVVTDATLKNLGSVHVSKGV-ADAHFPV
    VKEAILKTIKEVVGAKWSEELNSAWTIAYDELAIVIKKEMNDAA---
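An alignment in FASTA format can be sanity-checked against its input: every aligned row should have the same length, and removing the gap character `-` from a row should give back the original sequence. The following Python sketch illustrates the idea; the function names `read_fasta` and `check_alignment` are hypothetical, and the toy sequences in the usage section are made up for illustration rather than taken from the globin example.

```python
def read_fasta(text):
    """Parse FASTA text into a dict mapping sequence ID to residues."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The ID is the first word after '>'; the rest is description.
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def check_alignment(unaligned, aligned):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    lengths = {len(seq) for seq in aligned.values()}
    if len(lengths) > 1:
        problems.append(f"aligned rows differ in length: {sorted(lengths)}")
    for name, seq in unaligned.items():
        if name not in aligned:
            problems.append(f"{name} missing from alignment")
        elif aligned[name].replace("-", "") != seq:
            problems.append(f"{name} changed after removing gaps")
    return problems


if __name__ == "__main__":
    # Made-up toy data, not the globin example.
    before = read_fasta(">a toy\nMKV\n>b toy\nMV\n")
    after = read_fasta(">a toy\nMKV\n>b toy\nM-V\n")
    print(check_alignment(before, after))
```

To check a real run, read the input and output files (for example `globins.fasta` and `globins.aln`) and pass their contents through `read_fasta` first.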
| Program name | Description |
|---|---|
| edialign | Local multiple alignment of sequences |
| emma | Multiple sequence alignment (ClustalW wrapper) |
| eomegapp | Profile with profile (ClustalO wrapper) |
| eomegaps | Single sequence with profile (ClustalO wrapper) |
| eomegash | Sequence with HMM (ClustalO wrapper) |
| eomegasp | Sequence with profile (ClustalO wrapper) |
| infoalign | Display basic information about a multiple sequence alignment |
| mse | Multiple sequence editor |
| plotcon | Plot conservation of a sequence alignment |
| prettyplot | Draw a sequence alignment with pretty formatting |
| showalign | Display a multiple sequence alignment in pretty format |
| tranalign | Generate an alignment of nucleic coding regions from aligned proteins |
Please report all bugs to the EMBOSS bug team (emboss-bug © emboss.open-bio.org), not to the original author.