EMBOSS: Project Meeting (Mon 27th November 2006)


Attendees

EBI: Peter Rice, Alan Bleasby, Jon Ison, Mahmut Uludag
Sanger:
Visitors:
Apologies: Shaun McGlinchey, Lisa Mullan, Rodrigo Lopez

1. Minutes of the last meeting

Minutes of the meeting of 13th November 2006 are here.

2. Software Development

2.1 Graphics library

Alan has modified Jemboss. Cygwin no longer defines the SHELL variable, which is required by the Jemboss makefile. Even where it is defined, Cygwin now has ksh, which is used by default and upsets the make. Users need to:
setenv CONFIG_SHELL /bin/tcsh
or
export CONFIG_SHELL=/bin/bash

Alan has successfully tested EMBOSS 4.0.0 with all patches on an Intel Mac.

Peter has worked through the parsing of NCBI-style IDs in FASTA files. There were various issues, including whether the database name from the input file, the -sdbname option, or the -osdbname option should be used in output. In some cases a database name could also be carried over from a previous entry, where NCBI format files had a mixture of sequences with and without databases. New attributes were required for sequence input and output objects to separate the command line database name settings from those read in the input.
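
The two problems described above can be illustrated with a minimal Python sketch. It is not the EMBOSS implementation. The names parse_id, output_db and ParsedId are hypothetical, the handling of the ID fields is simplified, and the precedence shown (output option, then input option, then the value read from the file) is only an assumption, as the minutes do not record which order was chosen.

```python
from typing import NamedTuple, Optional


class ParsedId(NamedTuple):
    db: Optional[str]   # database tag read from the input, if any
    name: str           # identifier used for the sequence


def parse_id(header: str) -> ParsedId:
    """Parse one FASTA header line; never reuses state from earlier records."""
    words = header.lstrip(">").split()
    token = words[0] if words else ""
    fields = token.split("|")
    # Drop a leading "gi|<number>" pair when present.
    if len(fields) >= 3 and fields[0] == "gi":
        fields = fields[2:]
    if len(fields) < 2:
        # Plain ID with no database tag.
        return ParsedId(db=None, name=token)
    db = fields[0] or None
    # Prefer the last non-empty field (entry name), else the accession.
    rest = [f for f in fields[1:] if f]
    name = rest[-1] if rest else token
    return ParsedId(db=db, name=name)


def output_db(parsed: ParsedId, sdbname: Optional[str] = None,
              osdbname: Optional[str] = None) -> Optional[str]:
    """Assumed precedence: output option, then input option, then file value."""
    return osdbname or sdbname or parsed.db
```

Because each header is parsed independently, a sequence without a database tag gets db=None rather than inheriting the tag of the entry before it. Keeping the command line values as separate arguments mirrors the separation of command line settings from values read in the input.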

2.2 New applications

Peter has added a new application, extractalign, to extract regions from a sequence alignment. It is a minor modification of extractseq. Another application to appear soon is wordfinder, a modification of supermatcher that searches for word-based near-identical hits in a (protein) database, to help the pathogen sequencing group at Sanger.

Peter investigated a request for a program that can handle pattern matches with very large result sets. The user is happy with the available programs. They ran slowly because he really does want an enormous number of hits.

Peter has worked through past requests for applications and features. He will collect them together so we can set priorities as a "Dear Santa" list, with help from the user community.

2.3 Other development

Mahmut reported that LSF has been upgraded and SoapLab job hangs have been fixed. Job submission is temporarily synchronous.

The database definition for EMBL has been changed to support historic ID numbers. To fix the 2.8.0 server some applications were copied from 4.0.0. This appears to work.

Error messages for database access could be improved, especially for SoapLab users who are guessing the USA syntax. It would be useful if documentation URLs could be given in the SoapLab error messages.

Peter had discussions with the Taverna/OMII people in Manchester about improving metadata support. SoapLab should report pairs of mutually exclusive options (direct_data and usa sequence input, for example) as a version of the mandatory value. We should also identify more carefully the minimum set of inputs needed for an application to run, so that these ports can be coloured and controlled more easily in Taverna. The discussion also included ways to notify Taverna that an application (or an entire SoapLab server) is obsolete and to point to an alternative server.

3. Administration

Alan reported that Jemboss has been patched to handle the new pattern ACD type. There is also a fix to the making of patch files so that new files are found as well as changed files.

Peter is having meetings with local EMBOSS users to discuss database configurations and application needs. The first meeting was with Babraham Research Institute last week. The ability to combine databases would be useful, as they have split EMBL into a number of subdatabases. They are looking for an interface to suit their users' needs, which include project management functions. Customizing wEMBOSS was recommended as something worth investigating.

Alan is still working on obtaining purify.

4. Documentation and Training

Jon has completed conversion of the text from the ACD syntax documentation, and the developers' course talks and practicals. Placeholders are in the text. Master contents and XML outlines will be completed next month.

Remaining work includes filling in a few gaps in the text, making the style consistent, and checking the automatically generated content. The nucleus library function names still need to be cleaned up, although in most cases they are reasonably consistent already.

5. User queries and answers

The list was reviewed. Everyone should review the current list at sf.net/projects/emboss/ and close those already dealt with. All requests have been assigned (mostly to Peter).

All new issues were considered to be resolved.

6. AOB

7. Date Of Next Meeting

The next meeting is on Monday 11th December. This will be the last meeting of 2006.