EMBOSS: Project Meeting (Mon 10th December 07)


Attendees

EBI: Peter Rice, Alan Bleasby, Jon Ison, Mahmut Uludag
Sanger:
Visitors:
Apologies: Martin Senger, Shaun McGlinchey, Rodrigo Lopez, Tim Carver

1. Minutes of the last meeting

The minutes of the meeting of 26th November 2007 are available.

2. Software Development

2.1 Applications

Alan has found and fixed the dbxflat indexing bug that only appears when EMBL release files are processed in a particular order. The problem was in sorting file pointers, and it affected only some entries in very large input files (over 2 Gbytes). The fix will be made available. Indexes that were built successfully are valid, so no reindexing is required.

EMBL 93 is due out this week. The corrected dbxflat will be tested on the new EMBL release.

Peter is working through the differences between PHYLIP 3.6 beta and the latest PHYLIP 3.67 to bring the EMBASSY package up to date. Many of the changes to 3.61 were already included in the current PHYLIPNEW release. The code has to be converted rather than wrapped, as the interactive menu system is not easy to control.

Alan is working through the differences between ViennaRNA 1.6.4 and the new 1.6.5 release. The code has to be converted rather than wrapped as interactive and batch use give different results.

2.2 Libraries

Peter has reworked file input and string handling to improve performance. File input now makes fewer string handling calls, and the string library source code uses more macros and fewer calls to other functions. These changes were prompted by gprof profiling of dbxflat runs.

Peter has fixed a bug in SwissProt output format which produced files that could not be read in as protein. The ID line written in the new SwissProt syntax was not recognized by the sequence reading functions, which then defaulted to EMBL format and marked the sequence as nucleotide.
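
The failure mode described above can be sketched in a few lines of Python. This is an illustration only, not the EMBOSS code (which is written in C): the function name, the regular expressions and the fallback rule are all assumptions made for the example, and the patterns are simplified versions of the old and new SwissProt ID line layouts.

```python
import re

# Older SwissProt layout (simplified):
#   ID   CYC_BOVIN      STANDARD;      PRT;   104 AA.
OLD_SWISS = re.compile(r"^ID\s+\S+\s+(STANDARD|PRELIMINARY);\s+PRT;\s+\d+ AA\.$")

# Newer UniProtKB/SwissProt layout (simplified):
#   ID   CYC_BOVIN               Reviewed;         104 AA.
NEW_SWISS = re.compile(r"^ID\s+\S+\s+(Reviewed|Unreviewed);\s+\d+ AA\.$")


def classify_id_line(line, patterns=(OLD_SWISS, NEW_SWISS)):
    """Hypothetical helper: guess (format, sequence type) from an ID line.

    Any ID line not matched by one of the SwissProt patterns falls back
    to EMBL/nucleotide, mimicking the default described in the minutes.
    """
    line = line.rstrip()
    if any(p.match(line) for p in patterns):
        return ("swissprot", "protein")
    if line.startswith("ID   "):
        return ("embl", "nucleotide")
    return (None, None)


new_style = "ID   CYC_BOVIN               Reviewed;         104 AA."

# A reader that only knows the old layout misreads the new ID line.
before_fix = classify_id_line(new_style, patterns=(OLD_SWISS,))

# A reader that knows both layouts classifies it as protein.
after_fix = classify_id_line(new_style)
```

Here `before_fix` comes out as EMBL/nucleotide and `after_fix` as SwissProt/protein, which is the difference between the broken and corrected behaviour as the minutes describe it.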

2.3 SoapLab

Mahmut has fixed a CLASSPATH problem in the Taverna SoapLab plugin. The current workaround hardcodes the path; a proper fix will be made in Taverna.

Mahmut has shown that remote debugging of Java processes can be easier than debugging through Eclipse for Taverna and SoapLab issues.

2.4 Other development

None.

3. Administration

3.1 Patch to 5.0.0

Alan and Peter will make a patch to include the bugfixes for psiphi, dbxflat and SwissProt sequence format.

4. Documentation and Training

4.1 Books

Jon, Alan and Peter had a meeting with the book publishers (Cambridge University Press) last week.

Jon has completed the draft for the books and has started the validation and editing stage.

Jon plans tutorials in the use and editing of the DocBook XML sources for the other book authors.

5. User queries and answers

No new queries outstanding.

The UniProt team asked about EMBOSS handling of long lines in sequence databases. The latest release of UniProtKB includes some lines that are longer than 255 characters. Apparently GCG's "embltogcg" application fails on these. Peter confirmed that EMBOSS has no problem and suggested providing a simple script to split long database lines for GCG users as they cannot expect a fix from Accelrys.
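
A script of the kind Peter suggested might look like the sketch below. It is not an agreed implementation: the function name and the wrapping rule are assumptions, the 255-character limit is taken from the report above, and whether GCG's "embltogcg" accepts continuation lines built this way has not been checked. The sketch assumes the usual flat-file layout in which each line starts with a two-letter line code and three spaces, and repeats that prefix on every continuation line.

```python
import sys

PREFIX_LEN = 5   # two-letter line code plus three spaces (assumed layout)
MAX_LEN = 255    # limit reported for GCG; adjust if it proves different


def split_long_line(line, max_len=MAX_LEN, prefix_len=PREFIX_LEN):
    """Return the line as a list of lines, each at most max_len characters."""
    if len(line) <= max_len:
        return [line]
    prefix, body = line[:prefix_len], line[prefix_len:]
    room = max_len - prefix_len
    pieces = []
    while len(body) > room:
        # Prefer to break at the last space that fits; otherwise cut hard.
        cut = body.rfind(" ", 0, room + 1)
        if cut <= 0:
            cut = room
        pieces.append(prefix + body[:cut].rstrip())
        body = body[cut:].lstrip()
    if body:
        pieces.append(prefix + body)
    return pieces


if __name__ == "__main__" and len(sys.argv) == 3:
    # Run as: python <this script> INPUT OUTPUT
    with open(sys.argv[1]) as src, open(sys.argv[2], "w") as dst:
        for raw in src:
            for piece in split_long_line(raw.rstrip("\n")):
                dst.write(piece + "\n")
```

Lines that are already short enough pass through unchanged, so the script can be run over a whole database file before it is given to the GCG tools.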

6. AOB

Happy birthday to Tim

7. Date Of Next Meeting

December 24th is Christmas Eve, so the next meeting will be on Monday 7th January 2008.