EMBOSS: Project Meeting (Mon 16th August 2010)
Attendees
EBI:
Peter Rice,
Jon Ison,
Mahmut Uludag
Visitors:
Martin Senger (KAUST)
Apologies:
Alan Bleasby
1. Minutes of the last meeting
Minutes of the meeting of 26th July 2010 are
here.
2. Maintenance etc.
2.1 Applications
Peter is investigating a reported bug in einverted where
an inverted repeat with a very short interval may be incorrectly
reported. The original code is still available from Sanger and can be
used to test for the original behaviour of the application.
Jon has updated the relations: attributes in all ACD
files for EMBOSS to use IDs in the form "EDAM:0123456" followed by the
namespace and full name. ACD files for all EMBASSY applications are
also being updated, reviewing input and output relations and adding
topic and operation terms to the application block.
Jon will review the protein structure EMBASSY packages to
remove obsolete applications, and to migrate others into the main
EMBOSS package.
2.2 Libraries
Peter has added debug calls to ajnam.c to report the
file in which a variable, database or resource was
defined. Both Peter and Mahmut have recently needed to
trace multiple definitions in emboss.default, .embossrc
and include files.
Mahmut reported a bug from external services where
an ajnam.c error message fails to name the variable which
caused the problem.
2.3 SoapLab
Mahmut has checked in accumulated changes to the SoapLab code,
merging them with the main branch. He has also taken a first look
at the gSOAP library.
Martin plans a SoapLab 2.3.0 release this week,
with full support for EMBOSS 6.3.1 and for the new EMBOSS 6.4.0
ACD relations attribute syntax. Mahmut's typed services will also be
fully supported and documented in this release.
Martin and Mahmut have decided not to update the Taverna
plugin for SoapLab until Taverna has completed its move to an OSGi
internal framework. SoapLab can be deployed in a Taverna-compatible
configuration so that existing users remain supported.
3. New developments
3.1 Dbx indexing
Peter has tested a cleaned up ajindex.c
and embindex.c. Many references to string internals have been
replaced by string macros. Temporary arrays and objects are created
once rather than each time they are used. Very few of these arrays are
needed in practice. Functions that accepted char* now use AjPStr
parameters. Internal page caches are now only written when the page is
cleared from the cache, saving a large number of page rewrites. Using
a table for cached pages allows rapid lookup and a much larger cache
size, again reducing the number of page rewrites. The dbx
indexing applications now use more memory - because of the optionally
increased cache size - but run much faster mainly because of the
reduced number of index file write operations.
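The write-on-eviction idea described above can be sketched as follows. This is a minimal illustration only, not the ajindex.c code: the names PageCache, cache_get, cache_put_byte and cache_flush, the page size, the table size and the in-memory array standing in for the index file are all assumptions made for the example.

```c
#include <assert.h>
#include <string.h>

#define PAGE_SIZE   64   /* illustrative, not the dbx page size */
#define CACHE_SLOTS 8    /* illustrative cache capacity */
#define DISK_PAGES  32   /* size of the simulated index file */

typedef struct {
    long pageno;                    /* page held in this slot, -1 if empty */
    int  dirty;                     /* modified since it was last written? */
    unsigned char data[PAGE_SIZE];
} CachePage;

typedef struct {
    CachePage slots[CACHE_SLOTS];   /* table indexed by pageno % CACHE_SLOTS */
    unsigned char disk[DISK_PAGES][PAGE_SIZE]; /* stands in for the index file */
    long nwrites;                   /* number of simulated page writes */
} PageCache;

void cache_init(PageCache *c)
{
    int i;
    memset(c, 0, sizeof *c);
    for (i = 0; i < CACHE_SLOTS; i++)
        c->slots[i].pageno = -1;
}

/* Empty a slot, writing it to the simulated file only if it was modified. */
void cache_evict(PageCache *c, CachePage *p)
{
    if (p->pageno >= 0 && p->dirty) {
        memcpy(c->disk[p->pageno], p->data, PAGE_SIZE);
        c->nwrites++;
    }
    p->pageno = -1;
    p->dirty = 0;
}

/* Table lookup: return the cached page, loading it on a miss. */
CachePage *cache_get(PageCache *c, long pageno)
{
    CachePage *p;
    assert(pageno >= 0 && pageno < DISK_PAGES);
    p = &c->slots[pageno % CACHE_SLOTS];
    if (p->pageno != pageno) {
        cache_evict(c, p);          /* the only place a rewrite can happen */
        memcpy(p->data, c->disk[pageno], PAGE_SIZE);
        p->pageno = pageno;
    }
    return p;
}

/* Modify one byte of a page; the change stays in memory until eviction. */
void cache_put_byte(PageCache *c, long pageno, int offset, unsigned char v)
{
    CachePage *p;
    assert(offset >= 0 && offset < PAGE_SIZE);
    p = cache_get(c, pageno);
    p->data[offset] = v;
    p->dirty = 1;
}

/* Write all remaining modified pages, for example when closing the index. */
void cache_flush(PageCache *c)
{
    int i;
    for (i = 0; i < CACHE_SLOTS; i++)
        cache_evict(c, &c->slots[i]);
}
```

In this sketch, repeated calls to cache_put_byte on the same page leave nwrites unchanged. The counter only increases when a colliding page displaces a modified one or when cache_flush is called, which is the saving in page rewrites the paragraph describes. A larger table reduces collisions at the cost of memory.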
New application dbxreport tests that the index is clean and that all
pages are used. In debug mode it also reports the full contents of all
pages. This code can be extended to selectively report on any specified
page, probably in a new application.
New application dbxstat reports on index terms that are used a given
number of times (for example: more than 1000, less than 2, or 200 to
999 times). This can be used to test that the expected number of
entries is retrieved.
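The kind of range query described for dbxstat can be sketched as a simple filter over per-term usage counts. This is an illustration under stated assumptions, not the dbxstat implementation: the TermCount structure, the terms_in_range function and the inclusive minimum and maximum parameters are hypothetical, and a real index would be read from the dbx index files rather than from an array.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical record: an index term and how many entries use it. */
typedef struct {
    const char *term;
    long count;
} TermCount;

/* Count the terms used between mincount and maxcount times, inclusive. */
size_t terms_in_range(const TermCount *terms, size_t n,
                      long mincount, long maxcount)
{
    size_t i;
    size_t hits = 0;

    assert(mincount <= maxcount);
    for (i = 0; i < n; i++) {
        if (terms[i].count >= mincount && terms[i].count <= maxcount)
            hits++;
    }
    return hits;
}
```

With inclusive bounds, the three examples from the text map to (1001, LONG_MAX) for "more than 1000", (0, 1) for "less than 2", and (200, 999) for "200 to 999 times".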
Preliminary tests show that indexing is
significantly faster and the index files have no unused or unlinked
pages.
3.2 Dbxref
Jon has revised the syntax of dbxref.dat and updated the
documentation at the top of the file. The code in ajresource
has been updated to handle the new line types and fields.
3.3 EDAM
Jon has added up to 50 new high-level terms to simplify the
hierarchy of concepts.
3.4 New applications
Jon has committed stub code and ACD files for new applications.
The db applications search dbxref.dat by key, EDAM
category, datatypes returned, query field types, etc.
The edam applications provide a semantic search, using EDAM
terms in the ACD relations attributes, of all EMBOSS and EMBASSY
applications.
The onto applications mirror the functionality of the Ontology
Lookup Service to navigate OBO ontologies (including EDAM) and could
be used to build a generic ontology browser.
4. Administration
Peter has corrections to BAM and SAM sequence reading to be
added to the next patch.
5. Documentation and Training
5.1 Books
Jon is waiting for news from the publishers.
5.2 Website
Jon has added Chipster to the list of EMBOSS interfaces.
6. User queries and answers
All handled.
7. AOB
Martin thanked the team for making him welcome during his three-week
visit.
8. Date Of Next Meeting
Next week Peter is on vacation. The following week is a public holiday.
The next meeting will be on Monday 6th September.