EMBOSS: Project Meeting (Mar 8th 1999)
Peter announced the new release on the emboss-dev mailing list last week. So far the only reply has been from Rodrigo, who has been having problems installing at the EBI. Martin has just picked up a copy and will check whether he can get the install to work. Peter will add the download address to the main web pages once we can confirm that other sites are able to install without problems.
There was discussion about where EMBOSS data files should be located. The present search path is:
Peter proposed extending the ACD attributes to include a "documentation" attribute for all data types, so that this could be used in building the command line syntax part of the application documentation. This would be optional at first as no ACD file has "doc:" defined. There are issues of how to document booleans, as they would normally appear on the command line as the opposite of the ACD file default.
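The boolean issue can be illustrated with a small sketch. This is not EMBOSS code: it is written in Python for brevity, the function name is hypothetical, and it assumes (the minutes do not say so) that a boolean which defaults to true is switched off with a "-no" prefix, while one which defaults to false is switched on by its plain name.

```python
def boolean_usage(name, default, doc):
    """Return the usage line for a boolean option.

    The flag shown is the one a user would actually type, which is the
    opposite of the default recorded in the definition file.
    """
    # Assumption: "-no<name>" negates a boolean that defaults to true.
    flag = f"-no{name}" if default else f"-{name}"
    default_text = "yes" if default else "no"
    return f"{flag}  {doc} (default: {default_text})"
```

The documentation text describes the option itself, so the generated line has to make clear that the flag shown reverses the stated default.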
Alan asked how often library functions were added, and how often the documentation should be printed. Peter proposed comparing the documentation each night in the SRS indexing job, and mailing differences to the emboss-dev list.
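A minimal sketch of the nightly comparison, assuming the previous and current documentation are available as plain text. It uses Python's standard difflib module; the function name and the "yesterday"/"today" labels are illustrative only, and the step that mails the result to the list is left out.

```python
import difflib


def doc_differences(old_text, new_text):
    """Return a unified diff of two documentation snapshots.

    An empty string means nothing changed, so no mail needs to be sent.
    """
    diff = difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile="yesterday",
        tofile="today",
        lineterm="",
    )
    return "\n".join(diff)
```

A nightly job would call this once per documentation file and mail only the non-empty results.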
Thon has added the ACD expression extensions to the ACD document. He will add the other new features (-options and graph definitions) and make a new version.
Gary has provided documentation for 5 new applications which are included in the Applications web pages. These are cutseq, maskseq, pasteseq, revseq and transeq. The documentation style was agreed, and will be used for other applications.
Peter will try to build the command line syntax for all programs automatically from their ACD definitions (see above).
The application documentation includes a "See also" section of related applications. It was agreed to maintain this by hand for now, but to aim for an automated list. The heading should include one or more application types so that each application can be given defined classes (e.g. restriction mapping, pattern matching, database searching) and a list of all applications in these classes can be built to replace the current manual list. In all cases the description should be the brief "Function" description from the documentation, so that the list will be similar when rebuilt automatically. The same text should appear as the "documentation:" attribute for the applications in their ACD files.
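One way the automated "See also" list could be built is sketched below in Python. The data layout (a dictionary per application holding its classes and its brief "Function" text) and the function name are assumptions made for illustration, not an agreed format.

```python
from collections import defaultdict


def build_see_also(apps):
    """Map each application to the other applications sharing a class.

    apps: {name: {"classes": [...], "function": "brief description"}}
    Returns {name: [(other_name, other_function), ...]} sorted by name.
    """
    # Index the applications by class.
    by_class = defaultdict(set)
    for name, info in apps.items():
        for cls in info["classes"]:
            by_class[cls].add(name)

    see_also = {}
    for name, info in apps.items():
        related = set()
        for cls in info["classes"]:
            related |= by_class[cls]
        related.discard(name)  # an application is not related to itself
        see_also[name] = [(other, apps[other]["function"]) for other in sorted(related)]
    return see_also
```

Because each entry carries the brief "Function" description, the generated list should read much like the hand-maintained one it replaces.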
Gary proposed "r2d2" as an application name to convert RNA to DNA. Peter pointed out that this needs no extra code. A version of seqret with "type: dna" as a sequence attribute will convert the input sequence automatically, though of course it would need to be copied to a new name to be active. The original seqret handles any sequence and must not be given a defined type.
Rodrigo has not yet provided the Icarus code to update documentation over HTTP. Peter will chase this up.
Alan has added two new functions, ajSysUnlink (delete a file) and ajSysCanon (set terminal canonical mode).
Alan will next work on helixturnhelix, sigcleave and antigenic, 3 EGCG applications which use local data files.
Gary will work on further simple sequence manipulation applications. Some are already named in the "See also" lists for the ones added last week.
Sinead will work on comparing sequence motifs to a sequence database, and comparing a sequence to a pattern database such as prosite. Peter has implemented the POSIX regular expression library from Henry Spencer as ajPosreg with the same functions as ajReg. Peter has provided a Sanger perl script to convert prosite.dat pattern lines into regular expressions.
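The conversion from a prosite pattern to a regular expression can be sketched as follows. This is not the Sanger perl script, which is not reproduced in these minutes; it is an independent Python illustration that assumes the pattern text has already been extracted from its line in prosite.dat, and it does not handle an anchor character placed inside square brackets.

```python
import re


def prosite_to_regex(pattern):
    """Convert a prosite-style pattern into a regular expression string."""
    p = pattern.strip().rstrip(".")      # patterns end with a full stop
    p = p.replace("-", "")               # "-" only separates elements
    # {ABC} means "any residue except A, B or C". Convert these first so
    # they are not confused with the {n} repeat counts introduced next.
    p = re.sub(r"\{([A-Z]+)\}", r"[^\1]", p)
    # (n) and (n,m) are repeat counts.
    p = re.sub(r"\((\d+(?:,\d+)?)\)", r"{\1}", p)
    p = p.replace("x", ".")              # lower-case x is any residue
    p = p.replace("<", "^").replace(">", "$")  # N- and C-terminal anchors
    return p
```

For example, the pattern "C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H." becomes "C.{2,4}C.{3}[LIVMFYWC].{8}H.{3,5}H", which can be passed to any regular expression library that supports bounded repeats.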
Thon is not working on applications at present, but will be updating the ACD documentation and working with Peter on library documentation.
Val is working on AJAX library functions for sequence output with additional derived information. Peter suggested linking this with AJAX functions to write text output as text, HTML, etc. For sequences, output could also be JavaScript as used by SRS applets.
Mark is working on clustal in two parts. "eclustal" is nearly complete. "clustree" needs graph library extensions.
Mark is also working on a general "pepinfo" application to calculate protein sequence properties. Peter will try to track down 2D gel calculations for isoelectric points which have some differences from the usual amino acid parameters.
Ian will drop his pattern matching work and concentrate on the ajGraph library with Alan.
Peter will be working on the ACD code and on library documentation.
The acedb team are looking into graphics libraries that are portable to non-Unix systems, for example GTK. Peter expressed interest, especially for map displays which could be used for sequence features, but was also concerned about licensing issues because EMBOSS libraries need to be compatible with the GNU Library license and not tied to the full GPL.
Gary proposed changing "int" to "long" throughout the string library. It was felt that as "int" data types were at least 32 bits on all currently supported platforms this was not necessary. Implementing it would cause problems with any function returning a "long" to an application (for example legacy code) that needs "int".
Rob is working in the SRS team on linking metabolic pathway databases. He will pass on to David Kreil at the EBI a request to look into converting ACD into SRS 6 application definitions in Icarus, and will check for any changes in SRS 6 that could affect the use of "getz" for database access.
Martin has just released the new version of AppLab and will make a comparison between ACD syntax and the definitions he plans for AppLab which will use XML files as meta definitions of applications.
Peter will work on adding options to "ajcompile" to automatically build XML for AppLab and Icarus for SRS from ACD files.