EMBOSS: Project Meeting (Mar 22nd 1999)


Sanger Centre: Peter Rice, Ian Longden, Ewan Birney
HGMP: Alan Bleasby, Val Curwen, Mark Faller, Sinead O'Leary, Gary Williams
EBI: Rob Andrews, Yevgeny ...
Apologies: Rodrigo Lopez, Martin Senger, Richard Bruskiewich

1. Matters Arising

See below.

2. ACD extensions

Peter has changed the "parameter" attribute to be a boolean value, although integers are accepted as "true" so that current ACD files will still work.

Peter has also implemented the "documentation" attribute for applications. A warning is issued if this is missing, but the program will run normally.

Peter has also implemented a first attempt at "-help" output, based on the format in last week's minutes. There is a new "help" attribute, available by default for all data types, which supplies the help text printed with the application usage.

A problem was found in the command line syntax generated from an ACD file when more than one set of associated qualifiers uses the same qualifier names. This causes trouble for normal qualifiers but not for parameters.

Peter raised this issue on the emboss-dev mailing list. Peter's proposed solution was to number associated qualifiers "1" for the first occurrence, "2" for the second, and so on regardless of the parameter number (if any) of the main qualifier.
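
As a rough illustration of the numbering proposal, the sketch below appends an occurrence count to each associated qualifier name. It is one reading of the proposal, not the ACD implementation: the function name, the data layout and the example qualifier names ("asequence", "sformat", "sbegin") are assumptions made for this example only.

```python
from collections import defaultdict

def number_associated(qualifier_sets):
    """Suffix each associated qualifier name with its occurrence count.

    qualifier_sets is a list of (main_qualifier, [associated names]) pairs.
    The count depends only on how often a name has been seen so far,
    not on any parameter number of the main qualifier.
    """
    seen = defaultdict(int)
    numbered = []
    for main, associated in qualifier_sets:
        names = []
        for name in associated:
            seen[name] += 1
            names.append(f"-{name}{seen[name]}")
        numbered.append((main, names))
    return numbered

# Hypothetical application with two sequence inputs sharing qualifier names.
example = [
    ("asequence", ["sformat", "sbegin"]),
    ("bsequence", ["sformat", "sbegin"]),
]
result = number_associated(example)
for main, names in result:
    print(main, names)
```

With this input the first sequence gets "-sformat1" and "-sbegin1" and the second gets "-sformat2" and "-sbegin2", so the two sets no longer share names.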

Ewan proposed an alternative based on the Wise2 command line. Each object on the command line (equivalent to a sequence in ACD terms) has a prefix which is added to any associated qualifiers. In Wise2 the prefix is a single letter, so that a sequence could have "-tformat".

After discussion, a consensus emerged to try Ewan's prefix method, but with an underscore after the prefix. The prefix could be generated automatically in most cases (usually as the first letter of the main qualifier), but could also be set explicitly through a default attribute "prefix:", so that sequences could always have the same prefix.
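
A minimal sketch of the prefix scheme as described here, assuming the prefix is the first letter of the main qualifier unless an explicit one is supplied. The function name and the qualifier names "target", "query", "format" and "begin" are hypothetical, and the real ACD processing may differ.

```python
def associated_names(main, associated, prefix=None):
    """Build prefixed names for one main qualifier's associated qualifiers.

    By default the prefix is the first letter of the main qualifier.
    An explicit prefix (standing in for a "prefix:" attribute) overrides it.
    """
    chosen = prefix if prefix is not None else main[0]
    # Prefix, underscore, then the associated qualifier name.
    return [f"-{chosen}_{name}" for name in associated]

# Automatically generated prefix from the main qualifier name.
auto = associated_names("target", ["format", "begin"])
# Explicit prefix, so this qualifier always uses "s" whatever it is called.
fixed = associated_names("query", ["format"], prefix="s")
print(auto)
print(fixed)
```

Here the automatic case yields "-t_format" and "-t_begin", while the explicit prefix yields "-s_format".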

Peter pointed out that ACD processing could use both methods, so he will implement the renumbering first to resolve the current problems and then look into a prefix for each associated qualifier. The preferred syntax will use the prefix, and will be reported by "-help".

The "-help" output has a problem with derived values because most of these cannot be resolved at the stage where "-help" output is needed (for example dependencies on sequence length). Peter proposed putting any of this information into the help text explicitly.

Peter has looked into implementing "region" as an ACD type for Gary's sequence manipulation programs. As this is very similar to defining features it was postponed until GFF and other feature definitions have been included. The application will continue to use a string for the region.

Gary is still looking into options for translation tables.

3. Library Documentation

Peter is still working on library documentation.

4. Graph Output

The graphics programs are not using the graph data type in ACD files. This still needs to be merged with the graphics options implemented to date. Ian and Peter will work on this.

A right axis option for XY graphs is needed.

Background colour options have been changed to black or white only to avoid problems in the PLplot library.

5. General progress on release 0.0.4

Alan is still testing CVS workarounds for the PLplot binary file problem. Ewan suggested using the "-k" option to change the handling of binary files. This controls keyword substitution, which can corrupt the data in binary files.

Peter raised the issue of how best to make the distribution using automake. Ian explained that automake adds known source files, but that other file types, data files, etc. are added by explicitly copying them. Ewan uses a full CVS export to make distributions, but this does not create the configure file. Ian and Peter will look into alternatives that keep the CVS directories out of the distribution.

6. Wise and EMBOSS

Ewan would like to link Wise2 applications with the AJAX library to use the additional sequence reading functions, but he has concerns about linking automatically to the graphics library at the same time.

Ewan proposed separating the graphics functions (ajgraph and ajhist) from the rest of AJAX. As the ajacd source defines graph objects this may need to be separated too.

Peter offered to try separating out as much of the library as is needed to get a simple ajSeqRead call to work, initially ajseq, ajstr and ajfile, although this may pull in most of the other AJAX source files to supply other essential functions.

Ewan argued in favour of multiple smaller libraries, if the dependencies can be resolved.

Ewan also recommended the GNU C library for some extra utilities, similar to those in AJAX.

Peter and Ewan agreed that the GNU licensing allows Ewan to borrow any library code for Wise2, and likewise allows Wise2's "dynlibsrc" library code to be reused in EMBOSS. They also agreed that there are clear benefits for both projects if Wise2 can link to AJAX, so that Ewan can contribute code directly to AJAX.


Rob asked about what SRS needs to do for EMBOSS. Peter suggested waiting for the new SRS release and looking into automatically generating Icarus application definitions. This may be possible with a new qualifier in the ACD processing, or it may need some extensions to the Icarus application definitions to make them closer to ACD conventions, especially for dependencies.

8. Next meeting

Next meeting Monday 29th March, usual time and place.

Peter Rice, Informatics Division, The Sanger Centre, Wellcome Trust Genome Campus, Hinxton Hall, Cambridge, CB10 1SA, UK.