EMBOSS: Project Meeting (Mar 1st 1999)


Sanger Centre: Peter Rice, Richard Bruskiewich
HGMP: Gary Williams, Alan Bleasby, Thon de Boer, Mark Faller, Val Curwen
EBI: Rodrigo Lopez
Apologies: Ian Longden, Sinead O'Leary

1. Matters Arising

Val Curwen has joined HGMP.

2. Progress on Release 0.0.4

HGMP are successfully using the Sanger Centre's master CVS copy.

3. ACD extensions

Peter has extended the expression syntax to include the boolean expressions @(... == ...), @(... != ...), @(... > ...) and @(... < ...), and has implemented nested expressions. The latter was trivial: an "if" was changed to a "while".
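The "if" to "while" change can be illustrated with a toy evaluator. This is only a sketch under stated assumptions, not the code in ajacd.c: the function names resolve_one and resolve_all are hypothetical, operands are restricted to integers (real ACD expressions refer to other values), and results are written as "1" or "0" purely for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: replace the innermost "@(a OP b)" in s, in place,
 * by "1" or "0". Returns 1 if a replacement was made, 0 if no "@(" is
 * left, and -1 if the innermost expression cannot be parsed. */
int resolve_one(char *s)
{
    char *open = NULL, *p = s;
    long a, b;
    char op[3];
    int used = 0, result;

    assert(s != NULL);

    /* The last "@(" in the string has no nested expression inside it. */
    while ((p = strstr(p, "@(")) != NULL) {
        open = p;
        p += 2;
    }
    if (open == NULL)
        return 0;

    /* %n is only stored if the closing ")" was matched. */
    if (sscanf(open + 2, " %ld %2[=!<>] %ld )%n", &a, op, &b, &used) != 3
        || used == 0)
        return -1;

    if (strcmp(op, "==") == 0)      result = (a == b);
    else if (strcmp(op, "!=") == 0) result = (a != b);
    else if (strcmp(op, ">") == 0)  result = (a > b);
    else if (strcmp(op, "<") == 0)  result = (a < b);
    else return -1;

    /* Overwrite "@(...)" with a single digit and close the gap. */
    *open = (char)('0' + result);
    memmove(open + 1, open + 2 + used, strlen(open + 2 + used) + 1);
    return 1;
}

/* With "if (resolve_one(s) == 1)" only one level would be resolved;
 * looping until nothing is left handles any depth of nesting. */
int resolve_all(char *s)
{
    int r;
    while ((r = resolve_one(s)) == 1)
        ;
    return r;   /* 0 on success, -1 on a parse error */
}
```

Given a writable buffer holding "@(@(3 > 2) == 1)", resolve_all first rewrites the inner expression and then the outer one.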

The -options extension is still under consideration. It was decided to implement this as a global qualifier, and to add a new default attribute as a third alternative to "required" and "parameter".

Thon's ACD Language guide is now in HTML. Peter will send details of changes and of the new extensions implemented to date.

It was agreed that the prompting style should change so that there is no longer an extra blank line between prompts to the user. This is a simple change in the ajacd.c source.

It was also agreed that applications should not prompt indefinitely for missing data. A default of one additional request was agreed, with the limit set by an environment variable. This needs a simple change to the validation loop in each acdSet function in ajacd.c.
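A bounded validation loop of this kind might look like the following sketch. The variable name EMBOSS_ACDPROMPTS, the callback type and all function names are assumptions made for illustration; the minutes do not name the variable, and the real acdSet functions in ajacd.c are not shown here.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical variable name; the minutes do not specify one. */
#define RETRY_ENV "EMBOSS_ACDPROMPTS"

/* Number of additional requests allowed after the first prompt.
 * Falls back to the agreed default of 1 if the variable is unset
 * or does not hold a small non-negative integer. */
int max_extra_prompts(void)
{
    const char *s = getenv(RETRY_ENV);
    char *end;
    long v;

    if (s == NULL || *s == '\0')
        return 1;
    v = strtol(s, &end, 10);
    if (*end != '\0' || v < 0 || v > 100)
        return 1;
    return (int)v;
}

/* Hypothetical callback: prompts once, returns 1 if the reply is valid. */
typedef int (*ask_fn)(void *ctx);

/* Ask once, then at most `extra` more times. Returns 1 as soon as a
 * valid reply is obtained, 0 if every attempt failed. */
int prompt_with_limit(ask_fn ask, void *ctx, int extra)
{
    int attempt;

    assert(ask != NULL);
    for (attempt = 0; attempt <= extra; attempt++) {
        if (ask(ctx))
            return 1;
    }
    return 0;
}

/* Demo callback standing in for a real prompt: ctx points to the
 * number of invalid replies still to come. */
int ask_demo(void *ctx)
{
    int *bad_left = ctx;

    if (*bad_left > 0) {
        (*bad_left)--;
        return 0;
    }
    return 1;
}
```

With the default of one additional request, a user who gives one invalid reply gets a second chance, and a second invalid reply ends the prompting.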

Peter has created an emboss/data directory for local data files.

The prefix for environment variable names is now automatically defined by the embInit call as "emboss_" in upper case for shell variables and in lower case for definitions in the emboss.defaults and .embossrc files. For example, "emboss_data" is now defined in code simply as "data".
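The naming rule can be sketched as follows. The helper names prefixed_name and shell_value are hypothetical and are not part of the EMBOSS libraries; the sketch covers only the construction of the two spellings and a shell lookup, not the reading of emboss.defaults or .embossrc.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define NAME_PREFIX "emboss_"

/* Hypothetical helper: build the full name for a short name such as
 * "data". With upper != 0 the result is the shell form (EMBOSS_DATA),
 * otherwise the lower-case form used in definition files (emboss_data).
 * Returns 0 on success, -1 if the output buffer is too small. */
int prefixed_name(const char *name, int upper, char *out, size_t outlen)
{
    size_t plen = strlen(NAME_PREFIX);
    size_t nlen;
    char *p;

    assert(name != NULL && out != NULL);
    nlen = strlen(name);
    if (plen + nlen + 1 > outlen)
        return -1;

    memcpy(out, NAME_PREFIX, plen);
    memcpy(out + plen, name, nlen + 1);   /* copies the terminator too */

    for (p = out; *p != '\0'; p++) {
        *p = (char)(upper ? toupper((unsigned char)*p)
                          : tolower((unsigned char)*p));
    }
    return 0;
}

/* Hypothetical lookup of the shell variable for a short name;
 * returns NULL if it is not set. */
const char *shell_value(const char *name)
{
    char full[128];

    if (prefixed_name(name, 1, full, sizeof full) != 0)
        return NULL;
    return getenv(full);
}
```

Application code would then ask for "data" and never spell out the prefix itself.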

4. Documentation

An overview of the libraries is still urgently needed from Peter.

Rodrigo has icarus code to remotely update documentation, used by the EMBnet Technical Manager project committee. He will send details to Peter.

5. New applications

Alan is developing dan for DNA melting analysis. This includes 2 new library source files, ajbase for base conversions and ajdan for melting. Peter suggested that ajdan may be better in the NUCLEUS library as an algorithm implementation, but should stay in AJAX for now until the distinction is clearer.

Ian has spent a couple of days looking into the regular expression functions in AJAX and testing their suitability for nucleotide and protein analysis. The conclusion was that they seem adequate if Prosite motifs are converted to avoid the {2,5} form for variable gaps and to use "...?.?.?" instead. Peter is working on the later, POSIX-compliant Henry Spencer library, which supports {2,5} and other POSIX extensions. This may be useful in some cases, but performance is expected to be poorer. Sinead is also working on pattern matching, to search databases with regular expressions entered by users. Peter has a conversion of Prosite patterns to regular expressions and will send the details to Sinead.
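The gap rewrite described above can be sketched as a small string transformation. This is an illustration only, not Peter's conversion code: the function name expand_dot_gaps is hypothetical, it handles only the wildcard forms ".{m,n}" and ".{m}", and it does not treat character classes specially.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: rewrite ".{m,n}" as m dots followed by (n - m)
 * copies of ".?", and ".{m}" as m dots. Everything else is copied
 * unchanged. Returns 0 on success, -1 if the output buffer is too
 * small or a repeat count is out of range. */
int expand_dot_gaps(const char *in, char *out, size_t outlen)
{
    size_t o = 0;

    assert(in != NULL && out != NULL);
    while (*in != '\0') {
        int lo = 0, hi = 0, used = 0, matched = 0, i;

        if (in[0] == '\\' && in[1] != '\0') {
            /* Copy an escaped character (such as "\.") untouched. */
            if (o + 2 >= outlen)
                return -1;
            out[o++] = *in++;
            out[o++] = *in++;
            continue;
        }
        if (in[0] == '.' && in[1] == '{') {
            /* %n is only stored if the closing brace was matched. */
            if (sscanf(in + 2, "%d,%d}%n", &lo, &hi, &used) == 2 && used > 0) {
                matched = 1;
            } else {
                used = 0;
                if (sscanf(in + 2, "%d}%n", &lo, &used) == 1 && used > 0) {
                    hi = lo;
                    matched = 1;
                }
            }
        }
        if (!matched) {
            if (o + 1 >= outlen)
                return -1;
            out[o++] = *in++;
            continue;
        }
        if (lo < 0 || hi < lo || hi > 64)
            return -1;
        if (o + (size_t)lo + 2 * (size_t)(hi - lo) >= outlen)
            return -1;
        for (i = 0; i < lo; i++)
            out[o++] = '.';            /* mandatory positions */
        for (i = lo; i < hi; i++) {
            out[o++] = '.';            /* optional positions */
            out[o++] = '?';
        }
        in += 2 + used;                /* skip ".{", the counts and "}" */
    }
    if (o >= outlen)
        return -1;
    out[o] = '\0';
    return 0;
}
```

For the example in the minutes, a gap of 2 to 5 residues becomes two mandatory dots followed by three optional ones, so the pattern length grows with the upper bound of each gap.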

Richard has Perl code to work with GFF. He offered to migrate this to C for use in EMBOSS.

Richard proposed galen and lalen as global and local alignment engines. Some doubt was expressed over the meaning of lalen.

Peter proposed names for fast database search algorithms, bfirst and firster, but has no algorithms to justify the names.

6. Web Pages

Peter has updated the Web pages with the minutes, applications and library documentation.

Peter will create dummy documentation for all applications. ACD files are needed for all of these, and should be written as soon as possible and added to the documentation as a specification of program behaviour.

7. Any other Business

Rodrigo has contacts through EBI external services with Inge Jonassen and his PRATT pattern discovery package. This now includes 3D searching with DSSP. He would like to see EMBOSS include PRATT and will contact Inge.

Bill Pearson and John Collins are visiting EBI this week for the Biostandards workshop. John Collins has a new MPsrch implementation for Linux on Alpha using pthreads.

8. Next meeting

Next meeting Monday 8th March, usual time and place.

Peter Rice, Informatics Division, The Sanger Centre, Wellcome Trust Genome Campus, Hinxton Hall, Cambridge, CB10 1SA, UK.