BLAST format databases are those generated using the tools distributed with NCBI-BLAST or with WU-BLAST.
To index a single BLAST database, change to the directory containing your BLAST format databases and run dbiblast:
    Index a BLAST database
    Database name: blastsw
    Database directory [.]:
    database base filename [blastsw]:
    Release number [0.0]:
    Index date [00/00/00]:
             N : nucleic
             P : protein
             ? : unknown
    Sequence type [unknown]: p
             1 : wublast and setdb/pressdb
             2 : formatdb
             0 : unknown
    Blast index version [unknown]: 2
The program will chug along for a while and will then generate the EMBLCD index files for the BLAST format database.
The following entry (or one like it that is more appropriate to your particular installation) should be put in your .embossrc:
    DB blastsw [
       type: P
       method: blast
       format: ncbi
       dir: \$emboss_db_dir/blastsw
       file: "blastsw"
       release: "38.9"
       comment: "BLAST format Swissprot"
    ]
showdb should show your newly configured database.
Because of the way BLAST works, many sites group their BLAST databases in the same directory. You can index these in situ with dbiblast, but this may require some extra steps if your databases are not all of the same type, because generating a subsequent set of index files will overwrite those that already exist. To avoid overwriting index files you can either index many databases with one set of index files, or use the indexdir options to place the indices in a different directory.
There are two requirements for indexing several databases together in one index. The first is that the databases are the same type (protein/nucleic acid) and generated with the same tool (pressdb or formatdb); the second is that all the ID and accession numbers in the combined databases are unique.
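The second requirement can be checked before running dbiblast. The following is a minimal sketch, not part of EMBOSS: it assumes you have already extracted the ID and accession strings of each database into plain lists by some other means, and the function name find_shared_ids is hypothetical. It also assumes identifiers should be compared case-insensitively, which may not match your data.

```python
from collections import defaultdict


def find_shared_ids(ids_by_db):
    """Return {identifier: sorted list of database names} for every
    identifier that occurs more than once across the given databases.

    ids_by_db maps a database name to an iterable of ID/accession strings.
    Identifiers are upper-cased before comparison (an assumption).
    """
    seen = defaultdict(list)
    for db_name, ids in ids_by_db.items():
        for identifier in ids:
            seen[identifier.upper()].append(db_name)
    # Keep only identifiers that were seen more than once, whether in
    # two different databases or twice in the same one.
    return {ident: sorted(dbs) for ident, dbs in seen.items() if len(dbs) > 1}


if __name__ == "__main__":
    # Illustrative data only; these identifiers are made up.
    example = {
        "dbone": ["P12345", "Q00001"],
        "dbtwo": ["Q00002", "p12345"],
    }
    for ident, dbs in find_shared_ids(example).items():
        print(f"{ident} appears in: {', '.join(dbs)}")
```

An empty result suggests the databases can share one set of index files, provided they also meet the first requirement.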
Run dbiblast as before but specify all the databases you wish to be included when prompted for the database filename.
    Index a BLAST database
    Database name: alldbs
    Database directory [.]:
    database base filename [alldbs]: dbone dbtwo dbthree dbfour
    Release number [0.0]:
    Index date [00/00/00]:
             N : nucleic
             P : protein
             ? : unknown
    Sequence type [unknown]: p
             1 : wublast and setdb/pressdb
             2 : formatdb
             0 : unknown
    Blast index version [unknown]: 2
These can then be configured as described in section 3.2.5 above by using the 'file:' and 'exclude:' tags as appropriate.
When you have databases of different types, databases generated with different programs, or databases whose ID/accession numbers are duplicated between them, the preferred strategy is probably to keep the source data for the individual databases in separate directories and index them there.
Alternatively you can place the index files in a separate directory. This requires that you run dbiblast with the -indexdirectory option and set the indexdir: tag in the database configuration to point to the directory containing the index files. The example below illustrates database configuration using the indexdir options.
    % dbiblast -indexdir=/databases/indices/mydb
    Index a BLAST database
    Database name: mydb
    Database directory [.]:
    database base filename [mydb]:
    Release number [0.0]:
    Index date [00/00/00]:
             N : nucleic
             P : protein
             ? : unknown
    Sequence type [unknown]: p
             1 : wublast and setdb/pressdb
             2 : formatdb
             0 : unknown
    Blast index version [unknown]: 2
The corresponding entry in ~/.embossrc (or emboss.default) would look like:
    DB mydb [
       type: P
       method: blast
       format: ncbi
       dir: \$emboss_db_dir/blastsw
       indexdir: /databases/indices/mydb
       file: mydb
       release: "1.0"
       comment: "My BLAST DB with an index in a different directory"
    ]
Again, multiple indices cannot coexist in the same directory, so take care when using the indexdir options that an existing database index is not overwritten.
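One way to guard against an accidental overwrite is to look for existing index files before running dbiblast. The sketch below is not part of EMBOSS. The helper name existing_index_files is hypothetical, and the list of file names is an assumption covering only the entry name and accession number files usually found in an EMBLCD index; compare it with what your own installation writes and extend it as needed.

```python
from pathlib import Path

# Assumed names of the basic EMBLCD index files (ID and accession indices).
# Other field indices may add further files; extend this tuple if required.
EMBLCD_INDEX_FILES = ("division.lkp", "entrynam.idx", "acnum.trg", "acnum.hit")


def existing_index_files(index_dir):
    """Return the sorted names of listed index files already present in index_dir.

    A missing directory is treated as containing no index files.
    """
    directory = Path(index_dir)
    if not directory.is_dir():
        return []
    return sorted(
        name for name in EMBLCD_INDEX_FILES if (directory / name).is_file()
    )


if __name__ == "__main__":
    target = "."  # directory you intend to write the new index into
    found = existing_index_files(target)
    if found:
        print(f"{target} already holds index files: {', '.join(found)}")
    else:
        print(f"No existing index files found in {target}")
```

If the check reports existing files, choose a different index directory or confirm that the old index is no longer needed.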