EMBOSS: Project Meeting (Tue 19th May 09)


Attendees

EBI: Peter Rice, Alan Bleasby, Jon Ison, Mahmut Uludag
Sanger:
Visitors:
Apologies:

1. Minutes of the last meeting

Minutes of the meeting of 5th May 2009 are here.

2. Software Development

2.1 Applications

Alan has confirmed that methylation of one strand blocks cutting by restriction enzymes on both strands, and has updated the restriction methylation code. The representation of methylase sites in the data file now has the site as a plain sequence with a lower-case 'm' before the methylation site. Ambiguity codes are not allowed in restriction sites, so the 'm' cannot be confused with the IUPAC ambiguity code M (A or C).
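
The notation described above can be illustrated with a minimal Python sketch. It is not the EMBOSS code: the function name parse_methylation_site, the assumption that bases are upper-case A, C, G or T, and the example string are all hypothetical.

```python
def parse_methylation_site(site):
    """Split a marked site into (plain_sequence, methylated_positions).

    Assumption: a lower-case 'm' marks the base that follows it as
    methylated, and all bases are upper-case A, C, G or T.
    Positions are 0-based indexes into the plain sequence.
    """
    bases = []
    methylated = []
    pending = False          # True after an 'm' until the next base is seen
    for ch in site:
        if ch == "m":
            if pending:
                raise ValueError("two 'm' markers in a row")
            pending = True
            continue
        if ch not in "ACGT":
            raise ValueError(f"unexpected character {ch!r}")
        if pending:
            methylated.append(len(bases))
            pending = False
        bases.append(ch)
    if pending:
        raise ValueError("'m' marker with no following base")
    return "".join(bases), methylated


# Hypothetical example: the 'm' marks the A of GATC as methylated.
sequence, positions = parse_methylation_site("GmATC")
```

Because ambiguity codes are excluded, the sketch can treat every 'm' as a marker and reject anything else that is not a plain base.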

Peter has updated the PHYLIPNEW code to PHYLIP release 3.68. Mahmut had found a bug in reading large input files, which is fixed in this PHYLIP release. Changes to applications derived from seqboot need to be checked, and all applications need to be run through the QA tests to check for changed results before committing to CVS.

2.2 Libraries

Alan has cleaned the coding style throughout most of AJAX. Code indentation should be 4 spaces, with 4 lines between the end of a function and the header of the next function. Internal if and for code blocks have blank lines before and after. Unconditional return statements have a preceding blank line. Some void functions had no return statement. Those with __noreturn in front of the function do not return - they call exit functions. In other cases the return statement was added.

Peter will add some of these standards to the code documentation parser so that the standards can be maintained more easily.
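
As an illustration of how one of these standards might be checked automatically, here is a rough Python sketch. It is not the code documentation parser; the function name check_function_spacing and the heuristic of treating a line containing only "}" in column 0 as the end of a function are assumptions.

```python
def check_function_spacing(source, expected=4):
    """Return 1-based line numbers of function-closing braces that are
    not followed by exactly `expected` blank lines.

    Assumption: a function ends on a line holding only '}' in column 0.
    The last such brace in the file is not checked, as nothing follows it.
    """
    lines = source.splitlines()
    problems = []
    for i, line in enumerate(lines):
        if line.rstrip() != "}":
            continue
        # Count the blank lines that follow the closing brace.
        j = i + 1
        while j < len(lines) and lines[j].strip() == "":
            j += 1
        if j == len(lines):
            continue
        if j - i - 1 != expected:
            problems.append(i + 1)
    return problems


# Two tiny C functions separated by only two blank lines.
example = (
    "void a(void)\n{\n    return;\n}\n"
    "\n\n"
    "void b(void)\n{\n    return;\n}\n"
)
badly_spaced = check_function_spacing(example)
```

A real check would need to understand the function header comment blocks, which this sketch ignores.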

Alan will clean up the remaining AJAX files (ajseqwrite and ajstr) and clean the NUCLEUS library. It was agreed that application code can remain in its current state as the main issue is library code examples and documentation for the books and website.

Peter has put the distance matrix input format patch on the FTP server. The code in ajphylo.c failed to read some distance matrix files where lines were long enough to be wrapped.
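
The wrapping problem can be illustrated with a minimal Python sketch that ignores line breaks altogether. It does not reproduce the ajphylo.c patch: the function name read_distance_matrix is hypothetical, and the sketch assumes a square matrix whose first token is the number of taxa and whose taxon names contain no white space.

```python
def read_distance_matrix(text):
    """Read a square distance matrix whose rows may be wrapped.

    Assumptions: the first token is the number of taxa, each row is a
    name followed by that many distances, and names have no white space.
    Line breaks are ignored, so a row may span any number of lines.
    """
    tokens = text.split()
    if not tokens:
        raise ValueError("empty input")
    n = int(tokens[0])
    pos = 1
    names = []
    rows = []
    for _ in range(n):
        if pos + 1 + n > len(tokens):
            raise ValueError("matrix is truncated")
        names.append(tokens[pos])
        rows.append([float(t) for t in tokens[pos + 1:pos + 1 + n]])
        pos += 1 + n
    return names, rows


# Rows for Alpha and Gamma are wrapped onto a second line.
wrapped = "3\nAlpha 0.0 1.0\n  2.0\nBeta 1.0 0.0 3.0\nGamma 2.0 3.0\n  0.0\n"
names, rows = read_distance_matrix(wrapped)
```

Counting values rather than lines is what makes the reader indifferent to where a long row is broken.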

Mahmut identified an issue with SRS as a data access method. If only part of the sequence data was read - for example by nthseq - the call to getz failed to complete. Peter will investigate.

2.3 Other

Alan has tested the latest gd library release. This has a configuration option for an installation outside /usr/local, but it either fails to find the libpng headers or fails to modify the compile line when they are found. The configure script has to use CFLAGS directly. On MacOSX the configure script continued despite warnings. Although MacOSX may not have libpng installed, gd finds a png.h file in the X11 SDK and includes it in its header file.

Mahmut has looked into BioPerl parsing of EMBOSS output. A user requested the start and end positions of a local alignment, which was easily resolved. The same user now wants to retrieve the full sequence ID. In the default alignment output, based on FASTA program output, the ID is truncated to 6 characters. Peter will consider changing the format, or introducing a new format for BioPerl to parse. BioPerl is capable of setting the alignment format when launching an application.

Peter and Mahmut attended a meeting on next generation sequencing bioinformatics in Cambridge. The University plans to set up a wiki to discuss the issues raised, and will organise a follow-up meeting towards the end of the year.

3. Administration

Alan reported that the disc array which was giving errors is very old and it is best to replace the discs with new, larger discs. All discs should be the same size to enable full use of disc space. Peter will order the new discs.

4. Documentation and Training

4.1 Books

Alan has rewritten the FAQ file. A few sections will need to be deleted, for example libgd/libpng installation and the tutorial link, when the book html is made available online.

Alan has reviewed and updated the Administrator Guide.

Jon has converted the "GCG to EMBOSS conversion" section to XML.

At a meeting with the publishers, it was agreed that the book text would be completed by 24th August (40 days after the 6.1.0 release).

Jon will test conversion of the text to HTML and send a sample chapter to the publishers to test for format and editing issues.

4.2 Website

Peter has been updating the "Planning" pages on the new wiki.

5. User queries and answers

Jon reported on some recent feature requests.

Peter is investigating the current MEGA sequence format, which has changed from the version EMBOSS reads and writes. A local installation (free) will be needed to test the current formats read and written by MEGA, which runs on Linux under the WINE Windows emulator.

Jon will look into some questions on HMMER posted directly to the sourceforge tracker.

Alan has a question from David Judge about the writing of the one line description to standard error. He is using Artemis, and should put -auto on the command line to disable writing of the description.

Alan reported that returning "const char*" pointers to internal strings may be confusing for new developers. The configure option --enable-devwarnings is required to warn of casts to non-const types.

6. AOB

Peter will be attending BOSC and ISMB in Stockholm. BOSC will include a talk on EMBOSS and a SoapLab2 talk from Martin Senger. ISMB will include an EMBOSS poster, a 25-minute technology track demonstration and a lunchtime birds of a feather (BoF) session. Last year's BoF in Toronto formed the basis of the funding application. We hope for more good ideas this year.

7. Date Of Next Meeting

Peter is away on holiday next week. The next meeting will be on Monday 1st June.