EMBOSS: Project Meeting (Mon 16th August 2010)


Attendees

EBI: Peter Rice, Jon Ison, Mahmut Uludag
Visitors: Martin Senger (KAUST)
Apologies: Alan Bleasby

1. Minutes of the last meeting

Minutes of the meeting of 26th July 2010 are here.

2. Maintenance etc.

2.1 Applications

Peter is investigating a reported bug in einverted where an inverted repeat with a very short interval may be incorrectly reported. The original code is still available from Sanger and can be used to test for the original behaviour of the application.

Jon has updated the relations: attributes in all ACD files for EMBOSS to use IDs in the form "EDAM:0123456" followed by the namespace and full name. ACD files for all EMBASSY applications are also being updated, reviewing input and output relations and adding topic and operation terms to the application block.

Jon will review the protein structure EMBASSY packages to remove obsolete applications, and to migrate others into the main EMBOSS package.

2.2 Libraries

Peter has added debug calls to ajnam.c to report the file in which a variable, database or resource was defined. Both Peter and Mahmut have recently needed to trace multiple definitions in emboss.default, .embossrc and include files.

Mahmut reported a bug from external services where an ajnam.c error message fails to name the variable which caused the problem.

2.3 SoapLab

Mahmut has checked in accumulated changes to the SoapLab code, merging them with the main branch. He has also taken a first look at the gSOAP library.

Martin plans a SoapLab 2.3.0 release this week, with full support for EMBOSS 6.3.1 and for the new EMBOSS 6.4.0 ACD relations attribute syntax. Mahmut's typed services will be included, fully supported and documented.

Martin and Mahmut have decided not to update the Taverna plugin for SoapLab until Taverna has completed its move to an OSGi internal framework. SoapLab can be deployed in a Taverna-compatible configuration so that existing users remain supported.

3. New developments

3.1 Dbx indexing

Peter has tested a cleaned-up ajindex.c and embindex.c. Many references to string internals have been replaced by string macros. Temporary arrays and objects are created once rather than each time they are used. Very few of these arrays are needed in practice. Functions that accepted char* now use AjPStr parameters. Internal page caches are now only written when the page is cleared from the cache, saving a large number of page rewrites. Using a table for cached pages allows rapid lookup and a much larger cache size, again reducing the number of page rewrites. The dbx indexing applications now use more memory, because of the optionally increased cache size, but run much faster, mainly because of the reduced number of index file write operations.
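
For illustration only, the caching change described above can be sketched as a write-back cache: pages are held in a table keyed by page number, and a modified page is written to the index file only when it is cleared from the cache or at a final flush. This is not the ajindex.c code. The class name, the least-recently-used eviction policy and the dictionary standing in for the index file are all assumptions made for the sketch.

```python
from collections import OrderedDict


class WriteBackPageCache:
    """Toy page cache: dirty pages are written only on eviction or flush.

    Hypothetical sketch; names and the LRU policy are assumptions,
    not taken from ajindex.c.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()  # page number -> (data, dirty flag)
        self.disk = {}              # stands in for the index file
        self.disk_writes = 0        # counts simulated file writes

    def _write_to_disk(self, pageno, data):
        self.disk[pageno] = data
        self.disk_writes += 1

    def _evict_if_full(self):
        # Clear least recently used pages; write them only if modified.
        while len(self.pages) > self.capacity:
            pageno, (data, dirty) = self.pages.popitem(last=False)
            if dirty:
                self._write_to_disk(pageno, data)

    def read(self, pageno):
        if pageno in self.pages:            # table lookup by page number
            self.pages.move_to_end(pageno)
            return self.pages[pageno][0]
        data = self.disk.get(pageno, "")
        self.pages[pageno] = (data, False)  # freshly read pages are clean
        self._evict_if_full()
        return data

    def write(self, pageno, data):
        # Update the cached copy only; the file write is deferred.
        self.pages[pageno] = (data, True)
        self.pages.move_to_end(pageno)
        self._evict_if_full()

    def flush(self):
        # Write every remaining dirty page once, then mark it clean.
        for pageno, (data, dirty) in list(self.pages.items()):
            if dirty:
                self._write_to_disk(pageno, data)
                self.pages[pageno] = (data, False)


if __name__ == "__main__":
    cache = WriteBackPageCache(capacity=2)
    for i in range(100):
        cache.write(0, f"version {i}")  # 100 updates to the same page
    cache.flush()
    print("file writes:", cache.disk_writes)
```

In this sketch a page that is updated many times while cached costs a single file write, which is the effect the minutes attribute the speed-up to. A larger capacity keeps more pages resident at the cost of more memory.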

New application dbxreport tests that the index is clean and has all pages used. In debug mode it also reports the full contents of all pages. This code can be extended to selectively report on any specified page, probably in a new application.

New application dbxstat reports on index terms used a given number of times (for example: more than 1000, fewer than 2, or 200 to 999 times). This can be used to test that the expected number of entries is retrieved.
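
As a rough sketch of the kind of report described for dbxstat, the selection can be thought of as filtering a table of term usage counts by an optional lower and upper bound. The function name, its parameters and the sample terms below are hypothetical and do not reflect the actual dbxstat options or index format.

```python
from collections import Counter


def terms_in_range(term_counts, minimum=None, maximum=None):
    """Return the terms whose usage count lies within the given bounds.

    Hypothetical helper: both bounds are inclusive, and either may be
    None to leave that side open.
    """
    selected = {}
    for term, count in term_counts.items():
        if minimum is not None and count < minimum:
            continue
        if maximum is not None and count > maximum:
            continue
        selected[term] = count
    return selected


if __name__ == "__main__":
    # Made-up index terms, one occurrence per indexed entry.
    counts = Counter(["kinase", "kinase", "kinase", "globin", "actin", "actin"])
    print(terms_in_range(counts, maximum=1))             # used fewer than 2 times
    print(terms_in_range(counts, minimum=2, maximum=3))  # used 2 to 3 times
```

Summing the counts of the selected terms gives the number of entries a query on those terms would be expected to retrieve.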

Preliminary tests show that indexing is significantly faster and the index files have no unused or unlinked pages.

3.2 Dbxref

Jon has revised the syntax of dbxref.dat and updated the documentation at the top of the file. The code in ajresource has been updated to handle the new line types and fields.

3.3 EDAM

Jon has added up to 50 new high-level terms to simplify the hierarchy of concepts.

3.4 New applications

Jon has committed stub code and ACD files for new applications.

The db applications search dbxref.dat by key, EDAM category, datatypes returned, query field types, etc.

The edam applications provide a semantic search, using EDAM terms in the ACD relations attributes, of all EMBOSS and EMBASSY applications.

The onto applications mirror the functionality of the Ontology Lookup Service to navigate OBO ontologies (including EDAM) and could be used to build a generic ontology browser.

4. Administration

Peter has corrections to BAM and SAM sequence reading to be added to the next patch.

5. Documentation and Training

5.1 Books

Jon is waiting for news from the publishers.

5.2 Website

Jon has added Chipster to the list of EMBOSS interfaces.

6. User queries and answers

All handled.

7. AOB

Martin thanked the team for making him welcome for his 3 week visit.

8. Date Of Next Meeting

Next week Peter is on vacation. The following week is a public holiday. The next meeting will be on Monday 6th September.