SEQWORDS documentation
TY   SCOP
XX
CL   Alpha and beta proteins (a/b)
XX
FO   NAD(P)-binding Rossmann-fold domains
XX
SF   NAD(P)-binding Rossmann-fold domains
XX
FA   Lactate & malate dehydrogenases, N-terminal domain
XX
TE   NAD(P)-binding Rossmann-fold
TE   Lactate & malate dehydrogenases
TE   Lactate dehydrogenase
TE   Malate dehydrogenase
//
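The keywords file shown above is record-based, so it can be read with a few lines of code. The following is a minimal sketch, not the parser SEQWORDS itself uses. It assumes one record per line, `XX` as a spacer, `//` as the entry terminator, and `TE` as the only repeatable record; the readings of the two-letter codes (TY type, CL class, FO fold, SF superfamily, FA family, TE search term) are inferred from the example, and the names `KeywordEntry` and `parse_keywords` are hypothetical.

```python
from dataclasses import dataclass, field

# The example entry from this page, one record per line.
SAMPLE_KEYWORDS = """\
TY   SCOP
XX
CL   Alpha and beta proteins (a/b)
XX
FO   NAD(P)-binding Rossmann-fold domains
XX
SF   NAD(P)-binding Rossmann-fold domains
XX
FA   Lactate & malate dehydrogenases, N-terminal domain
XX
TE   NAD(P)-binding Rossmann-fold
TE   Lactate & malate dehydrogenases
TE   Lactate dehydrogenase
TE   Malate dehydrogenase
//
"""


@dataclass
class KeywordEntry:
    """One entry of the keywords file (hypothetical container)."""
    fields: dict = field(default_factory=dict)  # single-valued records: TY, CL, FO, SF, FA
    terms: list = field(default_factory=list)   # repeated TE records


def parse_keywords(text):
    """Split keywords-file text into entries terminated by '//'."""
    entries = []
    current = KeywordEntry()
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        code, _, value = line.partition(" ")
        value = value.strip()
        if code == "XX":        # spacer record, carries no data
            continue
        if code == "//":        # end of the current entry
            entries.append(current)
            current = KeywordEntry()
        elif code == "TE":      # assumed: one search term per TE record
            current.terms.append(value)
        else:
            current.fields[code] = value
    return entries


entries = parse_keywords(SAMPLE_KEYWORDS)
for entry in entries:
    print(entry.fields["FA"], "->", entry.terms)
```

Keeping the `TE` terms in a list separate from the single-valued classification records makes it easy to loop over the terms when searching, while the classification can be copied unchanged to each hit.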
ID   ACEA_ECOLI STANDARD; PRT; 434 AA.
AC   P05313;
DT   01-NOV-1988 (Rel. 09, Created)
DT   01-NOV-1988 (Rel. 09, Last sequence update)
DT   15-DEC-1998 (Rel. 37, Last annotation update)
DE   ISOCITRATE LYASE (EC 4.1.3.1) (ISOCITRASE) (ISOCITRATASE) (ICL).
GN   ACEA OR ICL.
OS   Escherichia coli.
OC   Bacteria; Proteobacteria; gamma subdivision; Enterobacteriaceae;
OC   Escherichia.
RN   [1]
RP   SEQUENCE FROM N.A.
RC   STRAIN=K12;
RX   MEDLINE; 89083515.
RA   Byrne C.R., Stokes H.W., Ward K.A.;
RT   "Nucleotide sequence of the aceB gene encoding malate synthase A in
RT   Escherichia coli.";
RL   Nucleic Acids Res. 16:10924-10924(1988).
RN   [2]
RP   SEQUENCE FROM N.A.
RC   STRAIN=K12;
RX   MEDLINE; 88262573.
RA   Rieul C., Bleicher F., Duclos B., Cortay J.-C., Cozzone A.J.;
RT   "Nucleotide sequence of the aceA gene coding for isocitrate lyase in
RT   Escherichia coli.";
RL   Nucleic Acids Res. 16:5689-5689(1988).
RN   [3]
RP   SEQUENCE FROM N.A.
RX   MEDLINE; 89008064.
RA   Matsuoka M., McFadden B.A.;
RT   "Isolation, hyperexpression, and sequencing of the aceA gene encoding
RT   isocitrate lyase in Escherichia coli.";
RL   J. Bacteriol. 170:4528-4536(1988).
RN   [4]
RP   SEQUENCE FROM N.A.
RC   STRAIN=K12 / MG1655;
RX   MEDLINE; 94089392.
RA   Blattner F.R., Burland V.D., Plunkett G. III, Sofia H.J.,
RA   Daniels D.L.;
RT   "Analysis of the Escherichia coli genome. IV. DNA sequence of the
RT   region from 89.2 to 92.8 minutes.";
RL   Nucleic Acids Res. 21:5408-5417(1993).
RN   [5]
RP   SEQUENCE OF 293-434 FROM N.A.
RX   MEDLINE; 88227861.
RA   Klumpp D.J., Plank D.W., Bowdin L.J., Stueland C.S., Chung T.,
RA   Laporte D.C.;
RT   "Nucleotide sequence of aceK, the gene encoding isocitrate
RT   dehydrogenase kinase/phosphatase.";
RL   J. Bacteriol. 170:2763-2769(1988).

[Part of this file has been deleted for brevity]

FT   CONFLICT 70 70 A -> R (IN REF. 2).
FT   CONFLICT 80 80 A -> R (IN REF. 1 AND 2).
FT   CONFLICT 116 116 I -> N (IN REF. 2).
FT   CONFLICT 144 144 F -> L (IN REF. 1).
FT   CONFLICT 305 312 LGEEFVNK -> WAKSSLISN (IN REF. 2).
FT   CONFLICT 307 307 E -> Q (IN REF. 1).
FT   STRAND 2 6
FT   TURN 7 9
FT   HELIX 11 23
FT   TURN 26 27
FT   STRAND 28 33
FT   TURN 37 38
FT   HELIX 39 47
FT   TURN 48 48
FT   STRAND 53 58
FT   HELIX 64 67
FT   TURN 68 69
FT   STRAND 72 75
FT   TURN 83 84
FT   HELIX 87 108
FT   TURN 110 111
FT   STRAND 113 116
FT   HELIX 121 134
FT   TURN 135 136
FT   TURN 140 141
FT   STRAND 143 145
FT   HELIX 148 162
FT   TURN 163 163
FT   HELIX 166 168
FT   STRAND 173 175
FT   TURN 179 181
FT   STRAND 182 184
FT   HELIX 186 188
FT   TURN 190 191
FT   HELIX 196 217
FT   TURN 218 219
FT   HELIX 225 242
FT   TURN 243 244
FT   STRAND 248 255
FT   STRAND 263 271
FT   TURN 272 273
FT   STRAND 274 278
FT   HELIX 286 311
SQ   SEQUENCE 312 AA; 32337 MW; 17741A3B5AD068BA CRC64;
     MKVAVLGAAG GIGQALALLL KTQLPSGSEL SLYDIAPVTP GVAVDLSHIP TAVKIKGFSG
     EDATPALEGA DVVLISAGVA RKPGMDRSDL FNVNAGIVKN LVQQVAKTCP KACIGIITNP
     VNTTVAIAAE VLKKAGVYDK NKLFGVTTLD IIRSNTFVAE LKGKQPGEVE VPVIGGHSGV
     TILPLLSQVP GVSFTEQEVA DLTKRIQNAG TEVVEAKAGG GSATLSMGQA AARFGLSLVR
     ALQGEQGVVE CAYVEGDGQY ARFFSQPLLL GKNGVEERKS IGTLSAFEQN ALEGMLDTLK
     KDIALGEEFV NK
//
> Q60150^.^1^312^SCOP^.^0^Alpha and beta proteins (a/b)^.^.^NAD(P)-binding Rossmann-fold domains^NAD(P)-binding Rossmann-fold domains^Lactate & malate dehydrogenases, N-terminal domain^KEYWORD^0.00^0.000e+00^0.000e+00
MKVAVLGAAGGIGQALALLLKTQLPSGSELSLYDIAPVTPGVAVDLSHIPTAVKIKGFSGEDATPALEGADVVLISAGVARKPGMDRSDLFNVNAGIVKNLVQQVAKTCPKACIGIITNPVNTTVAIAAEVLKKAGVYDKNKLFGVTTLDIIRSNTFVAELKGKQPGEVEVPVIGGHSGVTILPLLSQVPGVSFTEQEVADLTKRIQNAGTEVVEAKAGGGSATLSMGQAAARFGLSLVRALQGEQGVVECAYVEGDGQYARFFSQPLLLGKNGVEERKSIGTLSAFEQNALEGMLDTLKKDIALGEEFVNK
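Because the DHF output is FASTA-like, with a caret-delimited header followed by the sequence, it can be read much like a FASTA file. The sketch below is illustrative only. The function name `read_dhf` is hypothetical, the sequence is shortened to its first 20 residues to keep the example small, and the meaning attached to each header position (accession, start, end, classification scheme, search method) is an assumption inferred from the example hit rather than taken from a format specification.

```python
# Example hit from this page; the sequence is truncated for the sketch.
SAMPLE_DHF = (
    "> Q60150^.^1^312^SCOP^.^0^Alpha and beta proteins (a/b)^.^.^"
    "NAD(P)-binding Rossmann-fold domains^NAD(P)-binding Rossmann-fold domains^"
    "Lactate & malate dehydrogenases, N-terminal domain^KEYWORD^0.00^0.000e+00^0.000e+00\n"
    "MKVAVLGAAGGIGQALALLL\n"
)


def read_dhf(text):
    """Return a list of (header_fields, sequence) tuples from DHF text."""
    hits = []
    header, chunks = None, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous hit, if there was one.
            if header is not None:
                hits.append((header, "".join(chunks)))
            header = line[1:].strip().split("^")
            chunks = []
        else:
            chunks.append(line)
    if header is not None:
        hits.append((header, "".join(chunks)))
    return hits


hits = read_dhf(SAMPLE_DHF)
fields, sequence = hits[0]

# Positions below are read off the example header (assumed meanings).
accession = fields[0]
start, end = int(fields[2]), int(fields[3])
scheme = fields[4]
method = fields[13]
print(accession, start, end, scheme, method, len(sequence))
```

A "." in a header position appears to mark a value that does not apply to the hit, so downstream code should treat it as missing rather than as data.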
Standard (Mandatory) qualifiers:
  [-keyfile] infile This option specifies the name of the keywords file (input). This contains a list of keywords specific to a number of SCOP or CATH families and superfamilies used by SEQWORDS to search a sequence database.
  [-spfile] infile This option specifies the name of the sequence database (input) to search.
  [-outfile] outfile [test.hits] This option specifies the name of the DHF file (domain hits file) (output). A 'domain hits file' contains database hits (sequences) with domain classification information, in the DHF format (FASTA-like). The hits are relatives of a SCOP or CATH family (or other node in the structural hierarchies) and are found from a search of a sequence database. Files containing hits retrieved by PSIBLAST are generated by SEQSEARCH, hits retrieved by a sparse protein signature by SIGSCAN, and hits retrieved by various types of HMM and profile by LIBSCAN.

Additional (Optional) qualifiers: (none)

Advanced (Unprompted) qualifiers: (none)

Associated qualifiers:
  "-outfile" associated qualifiers
  -odirectory3 string Output directory

General qualifiers:
  -auto boolean Turn off prompts
  -stdout boolean Write first file to standard output
  -filter boolean Read first file from standard input, write first file to standard output
  -options boolean Prompt for standard and additional values
  -debug boolean Write debug output to program.dbg
  -verbose boolean Report some/full command line options
  -help boolean Report command line options. More information on associated and general qualifiers can be found with -help -verbose
  -warning boolean Report warnings
  -error boolean Report errors
  -fatal boolean Report fatal errors
  -die boolean Report dying program messages
Qualifier | Description | Allowed values | Default |
---|---|---|---|
Standard (Mandatory) qualifiers | | | |
[-keyfile] (Parameter 1) | This option specifies the name of the keywords file (input). This contains a list of keywords specific to a number of SCOP or CATH families and superfamilies used by SEQWORDS to search a sequence database. | Input file | Required |
[-spfile] (Parameter 2) | This option specifies the name of the sequence database (input) to search. | Input file | Required |
[-outfile] (Parameter 3) | This option specifies the name of the DHF file (domain hits file) (output). A 'domain hits file' contains database hits (sequences) with domain classification information, in the DHF format (FASTA-like). The hits are relatives of a SCOP or CATH family (or other node in the structural hierarchies) and are found from a search of a sequence database. Files containing hits retrieved by PSIBLAST are generated by SEQSEARCH, hits retrieved by a sparse protein signature by SIGSCAN, and hits retrieved by various types of HMM and profile by LIBSCAN. | Output file | test.hits |
Additional (Optional) qualifiers | | | |
(none) | | | |
Advanced (Unprompted) qualifiers | | | |
(none) | | | |
% seqwords
Generates DHF files from keyword search of UniProt.
Keywords file: seqwords.terms
Swissprot-format database file: seqwords.seq
Domain hits output file [test.hits]: seqwords.dhf
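The interactive session above can also be run without prompts by passing the three mandatory qualifiers on the command line together with `-auto`. The following sketch builds that command in Python and only prints it; the helper name `seqwords_command` is hypothetical, and actually running the command assumes that `seqwords` is installed and on the `PATH` and that the input files exist.

```python
import shlex
import subprocess


def seqwords_command(keyfile, spfile, outfile):
    """Build a non-interactive seqwords command line.

    Uses only qualifiers listed on this page: -keyfile, -spfile,
    -outfile and the general qualifier -auto (turn off prompts).
    """
    return [
        "seqwords",
        "-keyfile", keyfile,
        "-spfile", spfile,
        "-outfile", outfile,
        "-auto",
    ]


def run_seqwords(keyfile, spfile, outfile):
    """Run seqwords; raises CalledProcessError on a non-zero exit status."""
    return subprocess.run(seqwords_command(keyfile, spfile, outfile), check=True)


# File names taken from the example session.
cmd = seqwords_command("seqwords.terms", "seqwords.seq", "seqwords.dhf")
print(shlex.join(cmd))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when file names contain spaces.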
FILE TYPE | FORMAT | DESCRIPTION | CREATED BY | SEE ALSO |
---|---|---|---|---|
Domain hits file | DHF format (FASTA-like). | Database hits (sequences) with domain classification information. The hits are relatives of a SCOP or CATH family (or other node in the structural hierarchies) and are found from a search of a discriminating element (e.g. a protein signature, hidden Markov model, simple frequency matrix, Gribskov profile or Henikoff profile) against a sequence database. | SEQSEARCH (hits retrieved by PSIBLAST). SIGSCAN (hits retrieved by sparse protein signature). LIBSCAN (hits retrieved by various types of HMM and profile). | N.A. |
Keywords file | Text | Contains a list of keywords specific to a number of SCOP families and superfamilies used by SEQWORDS to search a sequence database. | N.A. | N.A. |
Program name | Description |
---|---|
contacts | Generate intra-chain CON files from CCF files |
domainalign | Generate alignments (DAF file) for nodes in a DCF file |
domainrep | Reorder DCF file to identify representative structures |
domainreso | Remove low resolution domains from a DCF file |
interface | Generate inter-chain CON files from CCF files |
libgen | Generate discriminating elements from alignments |
matgen3d | Generate a 3D-1D scoring matrix from CCF files |
psiphi | Calculates phi and psi torsion angles from protein coordinates |
rocon | Generates a hits file from comparing two DHF files |
rocplot | Performs ROC analysis on hits files |
seqalign | Extend alignments (DAF file) with sequences (DHF file) |
seqfraggle | Removes fragment sequences from DHF files |
seqsearch | Generate PSI-BLAST hits (DHF file) from a DAF file |
seqsort | Remove ambiguous classified sequences from DHF files |
siggen | Generates a sparse protein signature from an alignment |
siggenlig | Generates ligand-binding signatures from a CON file |
sigscan | Generates hits (DHF file) from a signature search |
sigscanlig | Searches ligand-signature library & writes hits (LHF file) |
See also http://emboss.sourceforge.net/