EMBOSS: Project Meeting (Oct 4th 1999)


Sanger Centre: Peter Rice, Ian Longden
HGMP: Alan Bleasby, Gary Williams, Mark Faller, Val Curwen

1. Matters Arising

2. General progress on release 0.0.4

Renaming variables "this" and "bool" to "ajthis" and "ajbool" was agreed. Alan is still looking through the David Mathog changes. If we update the Henry Spencer regex libraries we need to pass the changes back to the maintainers.

Peter plans to split ajseq.c, which is getting too large to edit comfortably.

Peter has revised and rewritten the code for begin and end positions so that they are checked in all cases. This should not require any change to the way applications are currently written, but authors should test to make sure that they see the expected results.

Peter has revised sequence character processing to change gap characters to '-' on input, and to change gap characters to the appropriate codes for each sequence format on output. This was done for some formats before, but not for all.
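
As an illustration of the gap handling described above, here is a minimal sketch in C. It is not the AJAX library code: the function names, and the set of characters treated as gaps on input, are assumptions made for the example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical set of gap characters accepted on input. The real list
   used by the sequence reader is not given in these minutes. */
static const char input_gaps[] = ".~-";

/* On input: replace every recognised gap character in seq with '-'. */
void normalise_gaps_in(char *seq)
{
    for (; *seq != '\0'; seq++) {
        if (strchr(input_gaps, *seq) != NULL)
            *seq = '-';
    }
}

/* On output: replace '-' with the gap character the chosen sequence
   format expects (format_gap is supplied by the caller). */
void convert_gaps_out(char *seq, char format_gap)
{
    for (; *seq != '\0'; seq++) {
        if (*seq == '-')
            *seq = format_gap;
    }
}
```

With these assumptions, a sequence read in with mixed gap characters is held internally with '-' only, and is converted once more when it is written out in a format that uses a different gap code.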

The ".dbg" output file name for the "-debug" option could be expanded to ".debug". Not urgent.

Peter has changed the timing of the "-help" output processing so that "-h" is recognized as a command line option before the help output is written. This requires "help" to be the first command line qualifier defined, which was already the case.

Gary would like code to iterate through a feature table for feature processing.

3. Beta Release

Peter has added sequence types to all programs that need them. Where programs can read DNA or protein, but need two sequences with the same type, the type of the second sequence now depends on the first.
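
The idea of the second sequence type depending on the first can be sketched as follows. This is only an illustration: the type names, the crude residue test and both function names are assumptions for the example, not the way the ACD processing actually decides sequence types.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

typedef enum { SEQ_DNA, SEQ_PROTEIN } SeqType;

/* Crude guess (an assumption for this sketch): call the sequence DNA
   if every character is one of ACGTUN or the gap '-', else protein. */
SeqType guess_type(const char *seq)
{
    for (; *seq != '\0'; seq++) {
        int c = toupper((unsigned char)*seq);
        if (strchr("ACGTUN-", c) == NULL)
            return SEQ_PROTEIN;
    }
    return SEQ_DNA;
}

/* The first sequence may be either type; the second is accepted only
   if its type matches whatever the first turned out to be. */
int second_matches_first(const char *first, const char *second)
{
    return guess_type(first) == guess_type(second);
}
```

A short protein made up only of the letters above would be misclassified by this test, which is one reason authors are asked to check the specified types.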

Authors should test their programs to make sure that the specified types are correct.

There are cases where two sequences have the qualifiers "sequencea" and "sequenceb". This has the unfortunate side effect that no short qualifier name is available, because any abbreviation matches both names. Nobody has noticed because in all cases these are also parameters (they can be given without a qualifier), but there should be a standard such as "sequence" and "bsequence" to allow "-seq" and "-bseq" on the command line.
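
The naming problem can be shown with a small sketch of unique-prefix matching. The function match_qualifier and its return codes are hypothetical, written for this example; it assumes that an abbreviation is accepted only when exactly one qualifier name begins with it.

```c
#include <stddef.h>
#include <string.h>

/* Return the index of the single qualifier name that starts with
   abbrev, -1 if none does, or -2 if more than one does (ambiguous).
   An exact match on the full name always wins. */
int match_qualifier(const char *abbrev, const char *const names[], size_t n)
{
    size_t len = strlen(abbrev);
    int found = -1;
    size_t i;

    for (i = 0; i < n; i++) {
        if (strncmp(names[i], abbrev, len) != 0)
            continue;               /* not a prefix of this name */
        if (names[i][len] == '\0')
            return (int)i;          /* exact match */
        if (found == -1)
            found = (int)i;         /* first prefix match */
        else
            found = -2;             /* second match: ambiguous */
    }
    return found;
}
```

Under this assumption "seq" is ambiguous for the names "sequencea" and "sequenceb", but resolves to the first name for "sequence" and "bsequence", while "bseq" resolves to the second.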

There are many ACD files where integers (and floats) should have a minimum value specified. Authors should check their code and ACD files and add suitable limits to the ACD files.

Sequence prompts are not very clear for programs which read one, all, or a stream of sequences. New prompts are needed to make each type of sequence input clear.

4. PLPLOT Graphics Library

Ian has updates for PLPLOT to support PNG format, but cannot yet automake them. They use three additional libraries (gd, png and compression) that must be installed first, and tested for before building.

Ian has added a "data" device driver to dump the data values to a file.

Ian has added a message to tell the user where postscript output has been written.

The default graphics output device is currently hard-coded in the ACD files. There should be a variable in emboss.defaults to control this, which the ACD processing can use as a default value. The values in the ACD files should then be removed so that we have a single global default.

5. Any Other Business

Ian and Peter are investigating Will Gilbert's MSE editor as a possible EMBOSS application. It is written in C, but for VMS, and makes use of the VMS 'replacement' for curses.

6. Next meeting

Next meeting Monday 11th October, 11:00am, usual place.

Peter Rice, Informatics Division, The Sanger Centre, Wellcome Trust Genome Campus, Hinxton Hall, Cambridge, CB10 1SA, UK.