EMBOSS: Project Meeting (Mon 30th March 09)


Attendees

EBI: Peter Rice, Alan Bleasby, Jon Ison, Mahmut Uludag
Sanger:
Visitors:
Apologies:

1. Minutes of the last meeting

Minutes of the meeting of 13th March 2009 are here.

2. Software Development

2.1 Applications

There is a request to make the GFF feature outputs from wordmatch optional. This can be done with a minor change to the ACD file to allow -noaoutfeat and -noboutfeat on the command line. It was felt best to keep all three output files to simplify third-party EMBOSS interfaces.

Peter has updated pepwindow and pepwindowall to correctly process the sequence start and end positions. Both applications were missing the final residue; this is now fixed. The amino acid data files can be (re)normalized to have a mean of zero and a standard deviation of 1.0 to give consistent results. Some of the Nakai values (from aaindexextract) are unnormalized. Note that renormalizing a normalized set of values can give slightly different numbers. The graph title of both programs has been made more general; it previously assumed hydrophobicity values.
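
The normalization described above can be sketched as follows. This is only an illustration, not the EMBOSS code: the function name, the sample values and the use of the population standard deviation are all assumptions. It also shows one plausible reason why renormalizing gives slightly different numbers, namely that values rounded for storage no longer have a mean of exactly zero.

```python
import math

def normalize(values):
    """Rescale values to mean 0 and standard deviation 1.

    Assumption: population standard deviation (divide by n).
    """
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]

# Made-up values, not taken from any amino acid data file.
raw = [1.8, -4.5, 2.5, -3.5, 4.2, 0.7]

once = normalize(raw)

# Simulate writing the normalized values to a file with 2 decimal places.
stored = [round(v, 2) for v in once]

# Normalizing the stored values again shifts them slightly.
again = normalize(stored)
drift = max(abs(a - b) for a, b in zip(stored, again))
```

Here drift is small but not zero, so two files derived from the same data can differ in the last digits depending on how many times they were normalized.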

Peter has modified applications that process patterns (prosite or regular expression) so that the suffix "1" is no longer added where there is only one pattern.

Mahmut reported that emma can fail to write a DND file if a SoapLab job has many sequences with the same name.

2.2 Libraries

Peter has added code for a "das" graphics output, intended for xy plots, to produce output compatible with DAS standards. The output has a feature for each data point, with the y-axis value as a score. This needs to fail gracefully for other graphs, which will need changes to the validation of graph devices to allow some devices to support only one type of graph. Changes are also needed to attach sequence details to the graph object so that these can be reported in the DAS output.

Mahmut has looked into Ensembl DAS stylesheets.

Peter has fixed the alignment code to use double precision calculation. For long alignments it was possible to lose track of the path at single precision. Function results are also changed to return double precision score values.
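
The effect of single precision on a long running total can be illustrated with a small sketch. It is not the alignment code itself; the helper names and the numbers are assumptions chosen to make the effect visible. Single precision is simulated by rounding each intermediate result to a 32-bit float.

```python
import struct

def to_single(x):
    """Round a Python float (a double) to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]

def accumulate(start, step, count, single):
    """Add step to start count times, optionally at single precision."""
    total = to_single(start) if single else start
    for _ in range(count):
        total = total + step
        if single:
            total = to_single(total)
    return total

# Once the total reaches 2**24, a 32-bit float can no longer represent
# every integer, so an increment of 1.0 is rounded away.
big = float(2 ** 24)
single_total = accumulate(big, 1.0, 10, single=True)
double_total = accumulate(big, 1.0, 10, single=False)
```

At single precision the total stays at 16777216.0, while at double precision it reaches 16777226.0. When two candidate scores differ only by such lost increments, a comparison between them can pick the wrong one.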

2.3 SoapLab

Mahmut reported that SoapLab is running well. One job failed, reporting an NFS error in finding an executable. This is now fixed.

The recent hit count is 2 million per month, though over 75% of these are in the load balancer. About 500k are real hits. Many of these are Taverna plugin startups requesting service metadata. The LSF job count shows that many hits are real jobs running.

2.4 Other

Alan has updated the configure script to check for X11 header files and to report if X11 is not found, especially for Linux and MacOSX systems.

Alan has built a new mEMBOSS with the alignment algorithm patch. The bundlewin utility fails to find exported functions if there is a tab at the start of the prototype in the header file.
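
The kind of parsing problem described above can be sketched as follows. This is a hypothetical illustration and not the bundlewin code: the patterns, the function exported_name and the prototype exampleFunc are all invented for the example. A pattern anchored at the first column misses a prototype that starts with a tab, while one that allows leading whitespace finds it.

```python
import re

# Hypothetical strict pattern: the return type must start in column 1.
STRICT = re.compile(r"^[A-Za-z_][\w \*]*?\b([A-Za-z_]\w*) *\(")

# Hypothetical tolerant pattern: leading spaces or tabs are skipped.
TOLERANT = re.compile(r"^\s*[A-Za-z_][\w\s\*]*?\b([A-Za-z_]\w*)\s*\(")

def exported_name(line, pattern):
    """Return the function name in a prototype line, or None if no match."""
    match = pattern.match(line)
    return match.group(1) if match else None

plain = "int exampleFunc(int a);"
tabbed = "\tint exampleFunc(int a);"

strict_plain = exported_name(plain, STRICT)
strict_tabbed = exported_name(tabbed, STRICT)      # missed: line starts with a tab
tolerant_tabbed = exported_name(tabbed, TOLERANT)  # found
```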

3. Administration

Alan has had a reply from Apple on access to "Apple Developer Connection" to get downloads of MacOSX updates.

4. Documentation and Training

4.1 Books

Jon reported that CUP may need HTML intermediate files as sources for the books.

Peter has updated the "command line data qualifiers" section. Examples were changed from EMBL to SwissProt so that ID and accession number are different (EMBL has changed its policy on identifiers since the original documentation was written).

Editors for the book text were discussed. Emacs was preferred to XMLmind.

4.2 Training

Alan reported that Lisa's recent training course at EBI went well.

4.3 Website

Alan has installed PHP and MySQL on the Wiki for form generation.

5. User queries and answers

Jon has updated the lists on SourceForge with recent bug reports and requests where some issues remain outstanding.

Alan has modified eprimer3 to report the original output. The application had been parsing and rewriting the output. An option to produce the parsed results was added.

The meaning of "reversed" in diffseq output needs checking. An insertion in the second sequence can give strange results.

Mahmut also reported that dotmatcher crashes with an unsigned integer condition when the second sequence is short. This will need a general check through all applications for similar problems.
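
The minutes do not say which expression is at fault, so the following is only a sketch of the general class of problem, with hypothetical function names and values. Subtracting a larger number from a smaller one in unsigned arithmetic wraps around to a very large value, which can then be used as a loop bound or array index. The 32-bit wraparound of C is simulated here with a modulus.

```python
def unsigned_sub(a, b):
    """Subtract as a 32-bit unsigned C integer would (wraps modulo 2**32)."""
    return (a - b) % 2 ** 32

def safe_window_count(seq_len, window):
    """Number of window start positions, guarding the short-sequence case."""
    if seq_len < window:
        return 0
    return seq_len - window + 1

# Hypothetical case: a 5-residue sequence and a window of 10.
wrapped = unsigned_sub(5, 10)        # a huge positive number, not -5
guarded = safe_window_count(5, 10)   # no windows fit
```

Checking the lengths before subtracting, as in safe_window_count, is one way to guard against this; a general review would look for unsigned subtractions where the second operand can exceed the first.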

6. AOB

None.

7. Date Of Next Meeting

Two team members are on holiday in two weeks' time. Peter is away at a conference on April 27th. The next meeting will be in May, rescheduled as May 4th is a bank holiday (May Day and Luke Skywalker Day).