EMBOSS: Project Meeting (Mon 9th May 11)


EBI: Peter Rice, Jon Ison, Mahmut Uludag
Apologies: Alan Bleasby, Michael Schuster

1. Minutes of the last meeting

The last 3 meetings were cancelled (vacations and two public holidays). Minutes of the meeting of 11th April 2011 are here.

2. Maintenance etc.

2.1 Applications

Peter fixed a bug in dbifasta, which failed when an ID line could not be parsed, and another bug that prevented one of the GCG format variants from being specified.

New applications documentation has been uploaded to the EMBOSS Wiki.

2.2 Libraries

Mahmut will look into a fix for HTTP redirects where the host address is missing. Peter suggested extending the version number so that EMBOSS will have a version number of the same length on Unix and Windows. This will greatly simplify the tests for output content in qatest.dat.
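For illustration only, the redirect case can be sketched as resolving the redirect target against the URL that was originally requested. This is not the EMBOSS library code (which is written in C); the function name resolve_redirect and the example URLs are assumptions made up for this sketch, which relies on Python's standard urljoin.

```python
from urllib.parse import urljoin


def resolve_redirect(request_url: str, location: str) -> str:
    """Return an absolute URL for a redirect target.

    request_url is the URL that was originally requested; location is the
    value of the Location header, which may lack the scheme and host.
    (Hypothetical helper name, for illustration only.)
    """
    # urljoin leaves absolute targets unchanged and resolves host-less
    # targets against the scheme and host of the original request.
    return urljoin(request_url, location)


if __name__ == "__main__":
    # Example URLs are invented for this sketch.
    print(resolve_redirect("http://example.org/data/query?id=1", "/mirror/entry.txt"))
```

A redirect target that already carries its own host is returned unchanged, so the same call can be used for every redirect.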

Mahmut asked whether EMBOSS should support multiple sequence regions in GFF files. Peter will investigate, but it is likely that this would complicate the inclusion of sequence data so the current interpretation of one sequence region per GFF file (until the next GFF header) will remain.

Peter will modify the handling of unknown feature tags in GFF so they will appear under their original name. They can be automatically converted to 'note="*tag: value"' form on input, and converted back on output.
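A minimal sketch of that round trip, assuming the note text is exactly an asterisk, the tag name, a colon and a space, then the value. The function names are hypothetical and this is not the EMBOSS feature code; it only shows that the conversion can be reversed.

```python
from typing import Optional, Tuple


def unknown_tag_to_note(tag: str, value: str) -> str:
    """Encode an unknown tag as note text of the form '*tag: value'."""
    return f"*{tag}: {value}"


def note_to_unknown_tag(note: str) -> Optional[Tuple[str, str]]:
    """Recover (tag, value) from an encoded note, or None for a plain note."""
    if not note.startswith("*"):
        return None
    # Split on the first ': ' only, so values may themselves contain ': '.
    tag, sep, value = note[1:].partition(": ")
    if not sep or not tag or " " in tag:
        return None
    return tag, value


if __name__ == "__main__":
    # 'my_tag' is an invented tag name used only for this example.
    encoded = unknown_tag_to_note("my_tag", "some value")
    print(encoded)
    print(note_to_unknown_tag(encoded))
```

An ordinary note that happens to start with an asterisk and a single word followed by a colon would be read back as a tag under this scheme, so a real implementation would need to decide how to treat that case.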


Peter has modified the qatest.pl script to work on Windows with the Visual Studio 2010 build. The script now auto-detects Windows using the output of embossversion, which now reports "windows" as the system. There are minor issues with the locations of the emboss.standard and emboss.default files and the server cache files, on which Alan will be consulted.

The modified script can now automatically find the test and documentation directories and so can be run from anywhere, always storing the results in the test/qa subdirectories.

Where EMBOSS reports the filename string, file names appear as a mixture of POSIX and Windows path styles. A function is needed to generate a clean Windows name for Windows installations.
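The kind of clean-up meant here can be sketched as follows. This is only an illustration in Python using the standard pathlib module, not a proposal for the EMBOSS C library; the function name clean_windows_name and the example path are assumptions.

```python
from pathlib import PureWindowsPath


def clean_windows_name(filename: str) -> str:
    """Rewrite a name with mixed '/' and '\\' separators in Windows style.

    PureWindowsPath accepts both separators on input and always renders
    backslashes, and it works on any platform because it never touches
    the file system. (Hypothetical helper name, for illustration only.)
    """
    return str(PureWindowsPath(filename))


if __name__ == "__main__":
    # Invented example path mixing both separator styles.
    print(clean_windows_name("C:/emboss/test/qa\\out.txt"))
```

Only the separators are normalised in this sketch; components such as ".." are left as they are.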

2.4 SoapLab

Mahmut has completed the move of SoapLab to the London servers. These can now be upgraded to EMBOSS 6.3.1. External Services have requested that message path names in the WSDL be changed to parameters, but this would break the BioCatalogue and EMBRACE registry scripts.

Mahmut has added a method (GetSomeResults) to return partial SoapLab job results. XML typing issues were resolved.

HTTP links to the Spinet interface were fixed.

2.5 Other

Mahmut is working on Jemboss input types. Feature inputs are not working, and output formats need to be added for the new data types.

3. New developments

3.1 Access methods

Mahmut has successfully retrieved sequences and features from a CHADO server. The code has been checked in.

One DAS access test is currently failing. Mahmut will try to improve DAS feature mapping.

3.2 Query language

Peter fixed a bug in query language processing where two regular expressions are tested, but strings are extracted from the second expression only for validation. More QA tests are needed to rigorously check parsing, especially for bad query strings.

3.3 EDAM

Jon is writing up EDAM, and completing the latest cleanup.

3.4 New applications

New applications for the release were discussed. Peter plans to add applications to use cross-references from UniProt and EMBL/GenBank, GO, and the NCBI taxonomy to refine results and searches.

We should also add applications that populate and report on assembly input objects from BAM or SAM format files.

4. Administration

Peter has a 10-minute talk on new EMBOSS developments at BOSC and has submitted an ISMB technology track demonstration.

Jon has submitted an EDAM presentation to the Bio-Ontologies SIG.

Peter has ordered a replacement backup drive for emboss5.

5. Documentation and Training

6. User queries and answers

All done.

7. AOB


8. Date Of Next Meeting

The next EMBOSS meeting will be on Monday 16th May.