EMBOSS: Project Meeting (Apr 17th 2000)
Output files can cause problems when nothing is written to them, or when they are not needed. Peter proposed changes to the output file ACD attributes and qualifiers to handle these cases: an attribute to say a file is not required, and a qualifier to say it should be created anyway, to overwrite a possible older version.
Alan is implementing fuzztran to find patterns in translated DNA sequences.
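As a rough illustration of the idea behind searching translated DNA (not the fuzztran implementation itself), the sketch below translates a DNA sequence in the three forward frames with the standard genetic code and searches each translation. The function names are hypothetical, and a Python regular expression stands in for whatever pattern syntax fuzztran adopts.

```python
import re
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, frame=0):
    """Translate dna starting at offset frame (0, 1 or 2); unknown codons give X."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(frame, len(dna) - 2, 3)
    )


def find_in_translation(dna, pattern):
    """Return (frame, protein_position, match) with 1-based frame and position."""
    hits = []
    for frame in range(3):
        protein = translate(dna, frame)
        for m in re.finditer(pattern, protein):
            hits.append((frame + 1, m.start() + 1, m.group()))
    return hits
```

For example, find_in_translation("ATGAAATGGTAA", "KW") reports a single hit in frame 1 at protein position 2. Reverse-strand frames are left out to keep the sketch short.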
Sequence features can be read in most formats: GFF, EMBL, GenBank and SwissProt are implemented. In some cases the results differ after converting to another format and back; for example, "/note" qualifier values may not split correctly. Converting EMBL to GFF creates separate features, and needs a sequence tag name to regenerate the full join. In GFF, all details are output for the last entry of a join.
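The join problem can be pictured with a small sketch. It assumes each separate feature carries a shared tag value identifying the join it came from. The tuple layout and the function name are hypothetical and do not reflect the EMBOSS feature code or the exact GFF attribute syntax.

```python
def rebuild_joins(features):
    """Group (tag, start, end) features by tag and build EMBL-style locations.

    Features sharing a tag are assumed to be pieces of one original join.
    """
    groups = {}
    for tag, start, end in features:
        groups.setdefault(tag, []).append((start, end))

    locations = {}
    for tag, spans in groups.items():
        spans.sort()  # order the pieces by start position
        parts = ["%d..%d" % span for span in spans]
        if len(parts) == 1:
            locations[tag] = parts[0]
        else:
            locations[tag] = "join(%s)" % ",".join(parts)
    return locations
```

Without the shared tag, the two pieces of "cds1" in a call such as rebuild_joins([("cds1", 1, 10), ("cds1", 20, 30)]) could not be told apart from two unrelated features.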
HGMP have some additional utility programs not yet committed (special FASTA file formats, etc.).
Additional command line qualifiers could be useful for NCBI format FASTA files, where the order of items on the header line may not match what the parser expects by default. There is an example in dbSTS which Peter and Alan will investigate.
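One way to picture such a qualifier is a header parser that takes the field order as a parameter. The sketch below is only illustrative: the function, the field names and the example header are assumptions, not the EMBOSS parser or an actual dbSTS entry.

```python
def parse_header(line, fields=("db1", "id", "db2", "accession", "name")):
    """Split a '|'-delimited FASTA header into named fields.

    fields gives the expected order of the '|'-separated items; passing a
    different order plays the role of a command line qualifier.
    """
    body = line.rstrip("\n").lstrip(">")
    # Everything after the first space is treated as free-text description.
    tokens, _, description = body.partition(" ")
    record = dict(zip(fields, tokens.split("|")))
    record["description"] = description
    return record
```

With the default order, ">gi|12345|gb|U00001|LOCUS1 example entry" yields accession "U00001" and name "LOCUS1". A file that swaps the last two items could be read by passing fields=("db1", "id", "db2", "name", "accession") instead.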
Alan would like functions to read all files in a wildcard list. Peter will try to use ajFileNewInList to do this.
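The behaviour Alan asked for might look roughly like the sketch below, written in Python rather than against the AJAX library. It says nothing about how ajFileNewInList works; the comma-separated list syntax and both function names are assumptions made for illustration.

```python
import glob


def expand_wildcard_list(spec):
    """Expand a comma-separated list of wildcard patterns into file paths.

    Paths keep the order of the patterns; duplicates are dropped.
    """
    paths = []
    for pattern in spec.split(","):
        for path in sorted(glob.glob(pattern.strip())):
            if path not in paths:
                paths.append(path)
    return paths


def read_all(spec):
    """Return (path, contents) for every file matched by the wildcard list."""
    contents = []
    for path in expand_wildcard_list(spec):
        with open(path) as handle:
            contents.append((path, handle.read()))
    return contents
```

A call such as read_all("*.seq, extra/*.fasta") would then read every matching file once, in pattern order.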
Peter will work on indexing PIR format databases for the Netherlands EMBnet node.
Peter has implemented an ajSeqMakeUsa function in an ajtest application to check on generation of USAs from sequence objects.
PLPLOT has a new web site and a reactivated mailing list.
The next two Mondays are public holidays. Next meeting on Monday 8th May, 11:00am, usual place.