|
|
esim4 |
Please help by correcting and extending the Wiki pages.
The program is available for download from http://globin.cse.psu.edu/
% esim4 Align an mRNA to a genomic DNA sequence Input nucleotide sequence: tembl:h45989 Genomic sequence: tembl:z69719 Sim4 program output file [h45989.esim4]: |
Go to the input files for this example
Go to the output files for this example
Align an mRNA to a genomic DNA sequence
Version: EMBOSS:6.6.0.0
Standard (Mandatory) qualifiers:
[-asequence] sequence Nucleotide sequence filename and optional
format, or reference (input USA)
[-bsequence] seqall Genomic sequence
[-outfile] outfile [*.esim4] Sim4 program output file
Additional (Optional) qualifiers (* if not always prompted):
-word integer [12] Sets the word size (W) for blast hits
(Any integer value)
-extend integer [12] Sets the word extension termination
limit (X) for the blast-like stage of the
algorithm. (Any integer value)
-cutoff integer [3] Sets the cutoff (E) in range [3,10].
(Integer up to 10)
-useramsp toggle [N] False: esim4 calculates mspA, True:
value from mspA command line argument.
* -amsp integer [16] MSP score threshold (K) for the first
stage of the algorithm. (If this option is
not specified, the threshold is computed
from the lengths of the sequences, using
statistical criteria.) For example, a good
value for genomic sequences in the range of
a few hundred Kb is 16. To avoid spurious
matches, however, a larger value may be
needed for longer sequences. (Any integer
value)
-userbmsp toggle [N] False: esim4 calculates mspB, True:
value from mspB command line argument.
* -bmsp integer [12] Sets the threshold for the MSP scores
(C) when aligning the as-yet-unmatched
fragments, during the second stage of the
algorithm. By default, the smaller of the
constant 12 and a statistics-based threshold
is chosen. (Any integer value)
-weight integer [0] Weight value (H) (undocumented). 0 uses
a default, >0 is a value (Any integer value)
-diagonal integer [10] Bound (K) for the diagonal distance
within consecutive MSPs in an exon. (Any
integer value)
-strand menu [both] This determines the strand of the
genome (R) with which the mRNA will be
aligned. The default value is 'both', in
which case both strands of the genome are
attempted. The other allowed modes are
'forward' and 'reverse'. (Values: both (Both
strands); forward (Forward strand only);
reverse (Reverse strand only))
-format integer [0] Sets the output format (A). Exon
endpoints only (format=0), exon endpoints
and boundaries of the coding region (CDS) in
the genomic sequence, when specified for
the input mRNA (-format=5), alignment text
(-format=1), alignment in lav-block format
(-format=2), or both exon endpoints and
alignment text (-format=3 or -format=4). If
a reverse complement match is found,
-format=0,1,2,3,5 will give its position in
the plus strand of the longer sequence and
the minus strand of the shorter sequence.
-format=4 will give its position in the plus
strand of the first sequence (mRNA) and the
minus strand of the second sequence
(genome), regardless of which sequence is
longer. The -format=5 option can be used
with the S command line option to specify
the endpoints of the CDS in the mRNA, and
produces output in the exons file format
required by PipMaker. (Integer from 0 to 5)
-cliptails boolean [N] Trim poly-A tails (P). Specifies whether
or not the program should report the
fragment of the alignment containing the
poly-A tail (if found). By default
(-nocliptails) the alignment is displayed as
computed. When this feature is enabled
(-cliptails), sim4 will remove the poly-A
tails and all format options will produce
additional lav alignment headers.
-smallexons boolean [N] Requests an additional search for small
marginal exons (N) (N=1) guided by the
splice-site recognition signals. This option
can be used when a high accuracy match is
expected. The default value is N=0,
specifying no additional search.
-[no]ambiguity boolean [Y] Controls the set of characters allowed
in the input sequences (B). By default
(-ambiguity), ambiguity characters
(ABCDGHKMNRSTVWXY) are allowed. By
specifying -noambiguity, the set of
acceptable characters is restricted to
A,C,G,T,N and X only.
-cdsregion string Allows the user to specify the endpoints of
the CDS in the input mRNA (S), with the
syntax: -cdsregion=n1..n2. This option is
only available with the -format=5 flag,
which produces output in the format required
by PipMaker. Alternatively, the CDS
coordinates could appear in a construct
CDS=n1..n2 in the FastA header of the mRNA
sequence. When the second file is an mRNA
database, the command line specification for
the CDS will apply to the first sequence in
the file only. (Any string)
-aoffset integer [0] Undocumented (f) - some sort of offset
in first sequence. (Any integer value)
-boffset integer [0] Undocumented (F) - some sort of offset
in second sequence. (Any integer value)
-toa integer [0] Undocumented (t)- offset end of first
sequence? (Any integer value)
-tob integer [0] Undocumented (T) - offset end of second
sequence? (Any integer value)
Advanced (Unprompted) qualifiers: (none)
Associated qualifiers:
"-asequence" associated qualifiers
-sbegin1 integer Start of the sequence to be used
-send1 integer End of the sequence to be used
-sreverse1 boolean Reverse (if DNA)
-sask1 boolean Ask for begin/end/reverse
-snucleotide1 boolean Sequence is nucleotide
-sprotein1 boolean Sequence is protein
-slower1 boolean Make lower case
-supper1 boolean Make upper case
-scircular1 boolean Sequence is circular
-squick1 boolean Read id and sequence only
-sformat1 string Input sequence format
-iquery1 string Input query fields or ID list
-ioffset1 integer Input start position offset
-sdbname1 string Database name
-sid1 string Entryname
-ufo1 string UFO features
-fformat1 string Features format
-fopenfile1 string Features file name
"-bsequence" associated qualifiers
-sbegin2 integer Start of each sequence to be used
-send2 integer End of each sequence to be used
-sreverse2 boolean Reverse (if DNA)
-sask2 boolean Ask for begin/end/reverse
-snucleotide2 boolean Sequence is nucleotide
-sprotein2 boolean Sequence is protein
-slower2 boolean Make lower case
-supper2 boolean Make upper case
-scircular2 boolean Sequence is circular
-squick2 boolean Read id and sequence only
-sformat2 string Input sequence format
-iquery2 string Input query fields or ID list
-ioffset2 integer Input start position offset
-sdbname2 string Database name
-sid2 string Entryname
-ufo2 string UFO features
-fformat2 string Features format
-fopenfile2 string Features file name
"-outfile" associated qualifiers
-odirectory3 string Output directory
General qualifiers:
-auto boolean Turn off prompts
-stdout boolean Write first file to standard output
-filter boolean Read first file from standard input, write
first file to standard output
-options boolean Prompt for standard and additional values
-debug boolean Write debug output to program.dbg
-verbose boolean Report some/full command line options
-help boolean Report command line options and exit. More
information on associated and general
qualifiers can be found with -help -verbose
-warning boolean Report warnings
-error boolean Report errors
-fatal boolean Report fatal errors
-die boolean Report dying program messages
-version boolean Report version number and exit
|
| Qualifier | Type | Description | Allowed values | Default | ||||||
|---|---|---|---|---|---|---|---|---|---|---|
| Standard (Mandatory) qualifiers | ||||||||||
| [-asequence] (Parameter 1) |
sequence | Nucleotide sequence filename and optional format, or reference (input USA) | Readable sequence | Required | ||||||
| [-bsequence] (Parameter 2) |
seqall | Genomic sequence | Readable sequence(s) | Required | ||||||
| [-outfile] (Parameter 3) |
outfile | Sim4 program output file | Output file | <*>.esim4 | ||||||
| Additional (Optional) qualifiers | ||||||||||
| -word | integer | Sets the word size (W) for blast hits | Any integer value | 12 | ||||||
| -extend | integer | Sets the word extension termination limit (X) for the blast-like stage of the algorithm. | Any integer value | 12 | ||||||
| -cutoff | integer | Sets the cutoff (E) in range [3,10]. | Integer up to 10 | 3 | ||||||
| -useramsp | toggle | False: esim4 calculates mspA, True: value from mspA command line argument. | Toggle value Yes/No | No | ||||||
| -amsp | integer | MSP score threshold (K) for the first stage of the algorithm. (If this option is not specified, the threshold is computed from the lengths of the sequences, using statistical criteria.) For example, a good value for genomic sequences in the range of a few hundred Kb is 16. To avoid spurious matches, however, a larger value may be needed for longer sequences. | Any integer value | 16 | ||||||
| -userbmsp | toggle | False: esim4 calculates mspB, True: value from mspB command line argument. | Toggle value Yes/No | No | ||||||
| -bmsp | integer | Sets the threshold for the MSP scores (C) when aligning the as-yet-unmatched fragments, during the second stage of the algorithm. By default, the smaller of the constant 12 and a statistics-based threshold is chosen. | Any integer value | 12 | ||||||
| -weight | integer | Weight value (H) (undocumented). 0 uses a default, >0 is a value | Any integer value | 0 | ||||||
| -diagonal | integer | Bound (K) for the diagonal distance within consecutive MSPs in an exon. | Any integer value | 10 | ||||||
| -strand | list | This determines the strand of the genome (R) with which the mRNA will be aligned. The default value is 'both', in which case both strands of the genome are attempted. The other allowed modes are 'forward' and 'reverse'. |
|
both | ||||||
| -format | integer | Sets the output format (A). Exon endpoints only (format=0), exon endpoints and boundaries of the coding region (CDS) in the genomic sequence, when specified for the input mRNA (-format=5), alignment text (-format=1), alignment in lav-block format (-format=2), or both exon endpoints and alignment text (-format=3 or -format=4). If a reverse complement match is found, -format=0,1,2,3,5 will give its position in the plus strand of the longer sequence and the minus strand of the shorter sequence. -format=4 will give its position in the plus strand of the first sequence (mRNA) and the minus strand of the second sequence (genome), regardless of which sequence is longer. The -format=5 option can be used with the S command line option to specify the endpoints of the CDS in the mRNA, and produces output in the exons file format required by PipMaker. | Integer from 0 to 5 | 0 | ||||||
| -cliptails | boolean | Trim poly-A tails (P). Specifies whether or not the program should report the fragment of the alignment containing the poly-A tail (if found). By default (-nocliptails) the alignment is displayed as computed. When this feature is enabled (-cliptails), sim4 will remove the poly-A tails and all format options will produce additional lav alignment headers. | Boolean value Yes/No | No | ||||||
| -smallexons | boolean | Requests an additional search for small marginal exons (N) (N=1) guided by the splice-site recognition signals. This option can be used when a high accuracy match is expected. The default value is N=0, specifying no additional search. | Boolean value Yes/No | No | ||||||
| -[no]ambiguity | boolean | Controls the set of characters allowed in the input sequences (B). By default (-ambiguity), ambiguity characters (ABCDGHKMNRSTVWXY) are allowed. By specifying -noambiguity, the set of acceptable characters is restricted to A,C,G,T,N and X only. | Boolean value Yes/No | Yes | ||||||
| -cdsregion | string | Allows the user to specify the endpoints of the CDS in the input mRNA (S), with the syntax: -cdsregion=n1..n2. This option is only available with the -format=5 flag, which produces output in the format required by PipMaker. Alternatively, the CDS coordinates could appear in a construct CDS=n1..n2 in the FastA header of the mRNA sequence. When the second file is an mRNA database, the command line specification for the CDS will apply to the first sequence in the file only. | Any string | |||||||
| -aoffset | integer | Undocumented (f) - some sort of offset in first sequence. | Any integer value | 0 | ||||||
| -boffset | integer | Undocumented (F) - some sort of offset in second sequence. | Any integer value | 0 | ||||||
| -toa | integer | Undocumented (t)- offset end of first sequence? | Any integer value | 0 | ||||||
| -tob | integer | Undocumented (T) - offset end of second sequence? | Any integer value | 0 | ||||||
| Advanced (Unprompted) qualifiers | ||||||||||
| (none) | ||||||||||
| Associated qualifiers | ||||||||||
| "-asequence" associated sequence qualifiers | ||||||||||
| -sbegin1 -sbegin_asequence |
integer | Start of the sequence to be used | Any integer value | 0 | ||||||
| -send1 -send_asequence |
integer | End of the sequence to be used | Any integer value | 0 | ||||||
| -sreverse1 -sreverse_asequence |
boolean | Reverse (if DNA) | Boolean value Yes/No | N | ||||||
| -sask1 -sask_asequence |
boolean | Ask for begin/end/reverse | Boolean value Yes/No | N | ||||||
| -snucleotide1 -snucleotide_asequence |
boolean | Sequence is nucleotide | Boolean value Yes/No | N | ||||||
| -sprotein1 -sprotein_asequence |
boolean | Sequence is protein | Boolean value Yes/No | N | ||||||
| -slower1 -slower_asequence |
boolean | Make lower case | Boolean value Yes/No | N | ||||||
| -supper1 -supper_asequence |
boolean | Make upper case | Boolean value Yes/No | N | ||||||
| -scircular1 -scircular_asequence |
boolean | Sequence is circular | Boolean value Yes/No | N | ||||||
| -squick1 -squick_asequence |
boolean | Read id and sequence only | Boolean value Yes/No | N | ||||||
| -sformat1 -sformat_asequence |
string | Input sequence format | Any string | |||||||
| -iquery1 -iquery_asequence |
string | Input query fields or ID list | Any string | |||||||
| -ioffset1 -ioffset_asequence |
integer | Input start position offset | Any integer value | 0 | ||||||
| -sdbname1 -sdbname_asequence |
string | Database name | Any string | |||||||
| -sid1 -sid_asequence |
string | Entryname | Any string | |||||||
| -ufo1 -ufo_asequence |
string | UFO features | Any string | |||||||
| -fformat1 -fformat_asequence |
string | Features format | Any string | |||||||
| -fopenfile1 -fopenfile_asequence |
string | Features file name | Any string | |||||||
| "-bsequence" associated seqall qualifiers | ||||||||||
| -sbegin2 -sbegin_bsequence |
integer | Start of each sequence to be used | Any integer value | 0 | ||||||
| -send2 -send_bsequence |
integer | End of each sequence to be used | Any integer value | 0 | ||||||
| -sreverse2 -sreverse_bsequence |
boolean | Reverse (if DNA) | Boolean value Yes/No | N | ||||||
| -sask2 -sask_bsequence |
boolean | Ask for begin/end/reverse | Boolean value Yes/No | N | ||||||
| -snucleotide2 -snucleotide_bsequence |
boolean | Sequence is nucleotide | Boolean value Yes/No | N | ||||||
| -sprotein2 -sprotein_bsequence |
boolean | Sequence is protein | Boolean value Yes/No | N | ||||||
| -slower2 -slower_bsequence |
boolean | Make lower case | Boolean value Yes/No | N | ||||||
| -supper2 -supper_bsequence |
boolean | Make upper case | Boolean value Yes/No | N | ||||||
| -scircular2 -scircular_bsequence |
boolean | Sequence is circular | Boolean value Yes/No | N | ||||||
| -squick2 -squick_bsequence |
boolean | Read id and sequence only | Boolean value Yes/No | N | ||||||
| -sformat2 -sformat_bsequence |
string | Input sequence format | Any string | |||||||
| -iquery2 -iquery_bsequence |
string | Input query fields or ID list | Any string | |||||||
| -ioffset2 -ioffset_bsequence |
integer | Input start position offset | Any integer value | 0 | ||||||
| -sdbname2 -sdbname_bsequence |
string | Database name | Any string | |||||||
| -sid2 -sid_bsequence |
string | Entryname | Any string | |||||||
| -ufo2 -ufo_bsequence |
string | UFO features | Any string | |||||||
| -fformat2 -fformat_bsequence |
string | Features format | Any string | |||||||
| -fopenfile2 -fopenfile_bsequence |
string | Features file name | Any string | |||||||
| "-outfile" associated outfile qualifiers | ||||||||||
| -odirectory3 -odirectory_outfile |
string | Output directory | Any string | |||||||
| General qualifiers | ||||||||||
| -auto | boolean | Turn off prompts | Boolean value Yes/No | N | ||||||
| -stdout | boolean | Write first file to standard output | Boolean value Yes/No | N | ||||||
| -filter | boolean | Read first file from standard input, write first file to standard output | Boolean value Yes/No | N | ||||||
| -options | boolean | Prompt for standard and additional values | Boolean value Yes/No | N | ||||||
| -debug | boolean | Write debug output to program.dbg | Boolean value Yes/No | N | ||||||
| -verbose | boolean | Report some/full command line options | Boolean value Yes/No | Y | ||||||
| -help | boolean | Report command line options and exit. More information on associated and general qualifiers can be found with -help -verbose | Boolean value Yes/No | N | ||||||
| -warning | boolean | Report warnings | Boolean value Yes/No | Y | ||||||
| -error | boolean | Report errors | Boolean value Yes/No | Y | ||||||
| -fatal | boolean | Report fatal errors | Boolean value Yes/No | Y | ||||||
| -die | boolean | Report dying program messages | Boolean value Yes/No | Y | ||||||
| -version | boolean | Report version number and exit | Boolean value Yes/No | N | ||||||
ID H45989; SV 1; linear; mRNA; EST; HUM; 495 BP.
XX
AC H45989;
XX
DT 18-NOV-1995 (Rel. 45, Created)
DT 04-MAR-2000 (Rel. 63, Last updated, Version 2)
XX
DE yo13c02.s1 Soares adult brain N2b5HB55Y Homo sapiens cDNA clone
DE IMAGE:177794 3', mRNA sequence.
XX
KW EST.
XX
OS Homo sapiens (human)
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC Eutheria; Euarchontoglires; Primates; Haplorrhini; Catarrhini; Hominidae;
OC Homo.
XX
RN [1]
RP 1-495
RA Hillier L., Clark N., Dubuque T., Elliston K., Hawkins M., Holman M.,
RA Hultman M., Kucaba T., Le M., Lennon G., Marra M., Parsons J., Rifkin L.,
RA Rohlfing T., Soares M., Tan F., Trevaskis E., Waterston R., Williamson A.,
RA Wohldmann P., Wilson R.;
RT "The WashU-Merck EST Project";
RL Unpublished.
XX
DR GDB; 3839990.
DR GDB; 4193257.
DR UNILIB; 555; 300.
XX
CC On May 8, 1995 this sequence version replaced gi:800819.
CC Contact: Wilson RK
CC Washington University School of Medicine
CC 4444 Forest Park Parkway, Box 8501, St. Louis, MO 63108
CC Tel: 314 286 1800
CC Fax: 314 286 1810
CC Email: est@watson.wustl.edu
CC Insert Size: 544
CC High quality sequence stops: 265
CC Source: IMAGE Consortium, LLNL
CC This clone is available royalty-free through LLNL ; contact the
CC IMAGE Consortium (info@image.llnl.gov) for further information.
CC Possible reversed clone: polyT not found
CC Insert Length: 544 Std Error: 0.00
CC Seq primer: SP6
CC High quality sequence stop: 265.
XX
FH Key Location/Qualifiers
FH
FT source 1..495
FT /organism="Homo sapiens"
FT /lab_host="DH10B (ampicillin resistant)"
FT /mol_type="mRNA"
FT /sex="Male"
FT /dev_stage="55-year old"
FT /clone_lib="Soares adult brain N2b5HB55Y"
FT /clone="IMAGE:177794"
FT /note="Organ: brain; Vector: pT7T3D (Pharmacia) with a
FT modified polylinker; Site_1: Not I; Site_2: Eco RI; 1st
FT strand cDNA was primed with a Not I - oligo(dT) primer [5'
FT TGTTACCAATCTGAAGTGGGAGCGGCCGCGCTTTTTTTTTTTTTTTTTTT 3'],
FT double-stranded cDNA was size selected, ligated to Eco RI
FT adapters (Pharmacia), digested with Not I and cloned into
FT the Not I and Eco RI sites of a modified pT7T3 vector
FT (Pharmacia). Library went through one round of
FT normalization to a Cot = 53. Library constructed by Bento
FT Soares and M.Fatima Bonaldo. The adult brain RNA was
FT provided by Dr. Donald H. Gilden. Tissue was acquired 17-18
FT hours after death which occurred in consequence of a
FT ruptured aortic aneurysm. RNA was prepared from a pool of
FT tissues representing the following areas of the brain:
FT frontal, parietal, temporal and occipital cortex from the
FT left and right hemispheres, subcortical white matter, basal
FT ganglia, thalamus, cerebellum, midbrain, pons and medulla."
FT /db_xref="taxon:9606"
FT /db_xref="UNILIB:555"
XX
SQ Sequence 495 BP; 73 A; 135 C; 169 G; 104 T; 14 other;
ccggnaagct cancttggac caccgactct cgantgnntc gccgcgggag ccggntggan 60
aacctgagcg ggactggnag aaggagcaga gggaggcagc acccggcgtg acggnagtgt 120
gtggggcact caggccttcc gcagtgtcat ctgccacacg gaaggcacgg ccacgggcag 180
gggggtctat gatcttctgc atgcccagct ggcatggccc cacgtagagt ggnntggcgt 240
ctcggtgctg gtcagcgaca cgttgtcctg gctgggcagg tccagctccc ggaggacctg 300
gggcttcagc ttcccgtagc gctggctgca gtgacggatg ctcttgcgct gccatttctg 360
ggtgctgtca ctgtccttgc tcactccaaa ccagttcggc ggtccccctg cggatggtct 420
gtgttgatgg acgtttgggc tttgcagcac cggccgccga gttcatggtn gggtnaagag 480
atttgggttt tttcn 495
//
|
ID Z69719; SV 1; linear; genomic DNA; STD; HUM; 33760 BP.
XX
AC Z69719;
XX
DT 26-FEB-1996 (Rel. 46, Created)
DT 13-JAN-2009 (Rel. 99, Last updated, Version 7)
XX
DE Human DNA sequence from clone XX-CNFG9 on chromosome 16
XX
KW C16orf33; HTG; POLR3K; RHBDF1.
XX
OS Homo sapiens (human)
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC Eutheria; Euarchontoglires; Primates; Haplorrhini; Catarrhini; Hominidae;
OC Homo.
XX
RN [1]
RP 1-33760
RA Kershaw J.;
RT ;
RL Submitted (09-JAN-2009) to the INSDC.
RL Wellcome Trust Sanger Institute, Hinxton, Cambridgeshire, CB10 1SA, UK.
RL E-mail enquiries: vega@sanger.ac.uk Clone requests: Geneservice
RL (http://www.geneservice.co.uk/) and BACPAC Resources
RL (http://bacpac.chori.org/)
XX
DR EMBL-CON; GL000124.
DR Ensembl-Gn; ENSG00000007384; Homo_sapiens.
DR Ensembl-Gn; ENSG00000161980; Homo_sapiens.
DR Ensembl-Gn; ENSG00000161981; Homo_sapiens.
DR Ensembl-Tr; ENST00000262316; Homo_sapiens.
DR Ensembl-Tr; ENST00000293860; Homo_sapiens.
DR Ensembl-Tr; ENST00000293861; Homo_sapiens.
DR Ensembl-Tr; ENST00000338527; Homo_sapiens.
DR Ensembl-Tr; ENST00000383018; Homo_sapiens.
DR Ensembl-Tr; ENST00000417043; Homo_sapiens.
DR Ensembl-Tr; ENST00000417493; Homo_sapiens.
DR Ensembl-Tr; ENST00000419764; Homo_sapiens.
DR Ensembl-Tr; ENST00000420545; Homo_sapiens.
DR Ensembl-Tr; ENST00000428730; Homo_sapiens.
DR Ensembl-Tr; ENST00000448893; Homo_sapiens.
DR Ensembl-Tr; ENST00000450643; Homo_sapiens.
DR Ensembl-Tr; ENST00000454039; Homo_sapiens.
DR GDB; 11502921.
DR GOA; E9PGJ1.
DR GOA; F5GWL4.
DR GOA; F8WBS4.
DR GOA; H0Y6L9.
DR InterPro; IPR022241; Rhomboid_SP.
DR InterPro; IPR022764; Peptidase_S54_rhomboid_dom.
[Part of this file has been deleted for brevity]
gagacagcag agtgctcagc tcatgaagga ggcaccagcc gccatgcctc tacatccagg 30840
tctcctgggg ttcccacctc cacaaaaacc cccactgcta ggagtgcagg caggagggga 30900
cctgagaacc gacagttata ggtcctgcgg gtgggcagtg ctgggtgttc tggtctgccc 30960
cacccctgtg tgcctagatc cccatctggg cctcaagtgg gtgggattcc aaaggaagag 31020
ccggagtagg cgtggggagg ggcaggccca ggctggacaa agagtctggc cagggagcgg 31080
cacattgccc tcccagagac agtggctcag tgtccaggcc ttccccaggc gcacagtggg 31140
ctcttgttcc cagaaagccc ctcgggggga tccaaacagt gtctccccca ccccgctgac 31200
ccctcagtgt atggggaaac cgtggcccac ggaaggcctc actgcctggg gtcacacagc 31260
atctgagtca ctgcagcagc ctcacagctg ccagcccagg cccagcccca tcaggagaca 31320
cccaaagcca cagtgcatcc caggaccagc tgggggggct gcgggcagga ctctcgatga 31380
ggctgaggga cgaggagggt caagggagcc actggcgcca tgcatgctga cgtcccctct 31440
ggctgcctgc agagcctggt gtggaagggc tgagtggggg atggtggaga gtcctgttaa 31500
ctcaggtttc tgctctgggg atgtctgggc acccatcaag ctggccgcgt gcacaggtgc 31560
agggagagcc agaaagcagg agccgatgca gggaggccac tggggacagc ccaggctgat 31620
gcttgggccc catgtgtctc caccacctac aaccctaagc aagcctcagc tttcccatct 31680
ggaaatcagg ggtcacagca gtgcctggca cagtagcagc ggctgactcc atcacagggt 31740
ggtgtagcct gtgggtactt ggcactctct gaggggcagg agctgggggg tgaaaggacc 31800
ctagagcata tgcaacaaga gggcagccct ggggacacct ggggacagaa ccctccaaag 31860
gtgtcgagtt tgggaagaga ctagagagaa gctctggcca gtccaggcat agacagtggc 31920
cacagccagt ggagagctgc atcctcaggt gtgagcagca accacctctg tactcaggcc 31980
tgccctgcac actcacagga ccatgctggc agggacaact ggcggcggag ttgactgcca 32040
accccggggc cagaaccatc aagcctgggc tctgctccgc ccaaggaact gcctgctgcc 32100
gaggtcagct ggagcaaggg gcctcacccc gggacacctt cccagacgtg tcctcagctc 32160
acatgagcct catcccaggg ggatgtggct cctccagcat ccccacccac acgctgctct 32220
ctgaccctca gtcttctgtt tgactcctaa tctgaagctc aatcctagat ctcccttgag 32280
aagggggtca ccagctgtct ggcagcccag cctccaggtc ttctggatta atgaagggaa 32340
agtcacctgg cctctctgcc ttgtctatta atggcatcat gctgagaatg atatttgcta 32400
ggccctttgc aaaccccaaa gtgctcttca accctcccag tgaagcctct tcttttctgt 32460
ggaagaaatg aggttcaggg tggagcaggg caggcctgag acctttgcag ggttctctcc 32520
aggtccccag caggacagac tggcaccctg cctcccctca tcaccctaga caaggagaca 32580
gaacaagagg ttccctgcta caggccatct gtgagggaag ccgccctagg gcctgtagac 32640
acaggaatcc ctgaggacct gacctgtgag ggtagtgcac aaaggggcca gcacttggca 32700
ggaggggggg gggcactgcc ccaaggctca gctagcaaat gtggcacagg ggtcaccaga 32760
gctaaacccc tgactcagtt gggtctgaca ggggctgaca tggcagacac acccaggaat 32820
caggggacac caagtgcagc tcagggcacc tgtccaggcc acacagtcag aaaggggatg 32880
gcagcaagga cttagctaca ctagattctg ggggtaaact gcctggtatg ctggtcactg 32940
ctagtcccca gtctggagtc tagctgggtc tcaggagtta ggcgaaaaca ccctccccag 33000
gctgcaggtg ggagaggccc acatcccctg cacacgtctg gccagaggac agatgggcag 33060
cccagtcacc agtcagagcc ctccagaggt gtccctgact gaccctacac acatgcaccc 33120
aggtgcccag gcacccttgg gctcagcaac cctgcaaccc cctcccagga cccaccagaa 33180
gcaggatagg actagagagg ccacaggagg gaaaccaagt cagagcagaa atggcttcgg 33240
tcctcagcag cctggctcag cttcctcaaa ccagatcctg actgatcaca ctggtctgtc 33300
taacccctgg gaggggtcct ctgtatccat cttacagata aggaaactga ggctcagaga 33360
agcccatcac tgcctaaggt cccagggcct ataagggagc tcaaagcctt gggccaggtc 33420
tgcccaggag ctgcagtgga agggaccctg tctgcagacc cccagaagac aaggcagacc 33480
acctgggttc ttcagccttg tggctgtgga cggctgtcag acccttctaa gaccccttgc 33540
cacctgctcc atcaggggca tctcagttga agaaggaagg actcaccccc aaaatcgtcc 33600
aactcagaaa aaaaggcaga agccaaggaa tccaatcact gggcaaaatg tgatcctggc 33660
acagacactg aggtggggga actggagccg gtgtggcgga ggccctcaca gccaagagca 33720
actgggggtg ccctgggcag ggactgtagc tgggaagatc 33760
//
|
seq1 = H45989, 495 bp seq2 = Z69719 (Z69719), 33760 bp 1-193 (25686-25874) 92% <- 194-407 (26279-26492) 98% <- 408-487 (27391-27469) 88% |
| Program name | Description |
|---|---|
| est2genome | Align EST sequences to genomic DNA sequence |
| needle | Needleman-Wunsch global alignment of two sequences |
| needleall | Many-to-many pairwise alignments of two sequence sets |
| stretcher | Needleman-Wunsch rapid global alignment of two sequences |
Please report all bugs to the EMBOSS bug team (emboss-bug © emboss.open-bio.org) not to the original author.
None