EMBOSS: Project Meeting (Monday 17th January 2011)
1. Minutes of the last meeting
Minutes of the meeting of 10th January 2011 are
2. Maintenance etc.
Mahmut reported memory leaks in the DOM parser when processing
comments at the start of an XML file.
Mahmut will move the code for the wsdbfetch access
method to ajtextdb.c to make it available to all datatypes.
Mahmut proposed discarding the gsoap library code as
EMBOSS has the required functionality without calling this library.
Mahmut will move SOAP access code from the ajax/ajaxdb
directory to ajax/core to make it generally available. HTTP
access code was moved to ajax/core a few months ago.
Alan has updated ajnam.c to meet the EMBOSS coding standards.
3. New developments
3.1 EMBOSS configuration
Alan is writing code to generate a server cachefile for
BioMart. It may be helpful to add a "cachedirectory:" attribute to
hold further information.
Mahmut is writing code to generate a server cachefile for
dassources. These data sources can answer sequence or feature
queries. Results can be obtained directly via HTTP instead of by
downloading a dasgff file as at present. This raises some issues
for servers that need both "sequence" and "feature" types.
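As a rough illustration of querying a DAS source directly over HTTP, the sketch below builds "sequence" and "features" request URLs in the general DAS 1.x style (server prefix, source name, command, segment range). The function name, server URL, source name and segment are hypothetical placeholders, and this is not the EMBOSS implementation.

```python
from urllib.parse import quote, urlencode


def das_query_url(server, source, command, segment_id, start=None, stop=None):
    """Build a DAS 1.x style request URL (illustrative sketch only)."""
    if command not in ("sequence", "features"):
        raise ValueError("unsupported DAS command: " + command)
    # A segment is either a bare identifier or "id:start,stop".
    segment = segment_id
    if start is not None and stop is not None:
        segment = "%s:%d,%d" % (segment_id, start, stop)
    # Keep ':' and ',' readable in the query string.
    query = urlencode({"segment": segment}, safe=":,")
    return "%s/%s/%s?%s" % (server.rstrip("/"), quote(source), command, query)
```

A single source that serves both query types would be addressed with the same prefix and source name, changing only the command.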
Peter will check on server configuration error messages
e.g. "method not recognised". Some error messages have been cleaned up
to only be reported once, so it is possible some return codes need to
be tested and new messages added.
Peter will write applications to describe servers
(showserver) to assist developers writing cachefile code. Other
applications will be needed to describe individual databases and
servers in full detail.
3.2 DBX index files
Peter outlined a proposal to compress dbx index files. The data in
each index page starts at the first byte and leaves the end of the
page unused. By inspecting each page type it may be possible to
identify the end of the data and to pack pages in the index by moving
them up. All page references will need to be identified and altered to
the new page offsets. It will also be necessary to uncompress index
files so that index updating code can be used. Peter will
consider the implications and report back next week.
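The packing idea can be sketched on a toy page format. Everything about the layout here is an assumption made for illustration (a 64-byte page holding a 4-byte reference count followed by 8-byte page offsets, with the rest unused); real dbx pages are laid out differently and have several page types.

```python
import struct

PAGE_SIZE = 64  # hypothetical page size for this toy format


def make_page(refs):
    """Build one padded page: uint32 count, then uint64 page offsets."""
    data = struct.pack("<I", len(refs))
    data += b"".join(struct.pack("<Q", r) for r in refs)
    return data.ljust(PAGE_SIZE, b"\0")


def read_refs(buf, offset):
    """Return the page offsets referenced by the page starting at offset."""
    (n,) = struct.unpack_from("<I", buf, offset)
    return [struct.unpack_from("<Q", buf, offset + 4 + 8 * i)[0]
            for i in range(n)]


def used_length(buf, offset):
    """End of the data in a page, found by inspecting its header."""
    (n,) = struct.unpack_from("<I", buf, offset)
    return 4 + 8 * n


def pack_index(index):
    """Move pages up over the unused tails and rewrite page references."""
    old_offsets = range(0, len(index), PAGE_SIZE)
    new_offset = {}
    pos = 0
    for old in old_offsets:          # first pass: compute new offsets
        new_offset[old] = pos
        pos += used_length(index, old)
    out = bytearray()
    for old in old_offsets:          # second pass: rewrite references
        refs = [new_offset[r] for r in read_refs(index, old)]
        out += struct.pack("<I", len(refs))
        out += b"".join(struct.pack("<Q", r) for r in refs)
    return bytes(out)


def unpack_index(packed):
    """Restore fixed-size pages so that updating code could run again."""
    starts = []
    pos = 0
    while pos < len(packed):         # walk the packed pages in order
        starts.append(pos)
        pos += used_length(packed, pos)
    old_offset = {s: i * PAGE_SIZE for i, s in enumerate(starts)}
    return b"".join(make_page([old_offset[r] for r in read_refs(packed, s)])
                    for s in starts)
```

The two passes in pack_index reflect the point made above: no reference can be rewritten until the new offset of every page is known.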
3.3 Text data
Peter has added "text" and "textout" as new ACD datatypes. These
allow the entry text to be returned from any database that does not
have a type-specific parser. This will be especially useful in
combination with definitions in DRCAT.
The new application textget returns the text of a type: "text" entry.
The large binary index files have been deleted as the CVS server cannot cope.
Alan suggested large files could be served by some other
download mechanism, perhaps rsync from the FTP server so that a
simple script can be provided for developers to update data and index
files in addition to a "cvs update".
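A minimal sketch of such an update script is given below. The rsync location is a placeholder, since no host or path has been decided, and the function names are made up for illustration.

```python
import subprocess

# Hypothetical location; no real host or path has been agreed.
REMOTE = "rsync://ftp.example.org/emboss-data/"


def rsync_command(remote, local_dir, dry_run=False):
    """Return the rsync command line used to fetch the large files."""
    cmd = ["rsync", "-av", "--partial"]
    if dry_run:
        cmd.append("--dry-run")  # list what would change, transfer nothing
    cmd += [remote, local_dir]
    return cmd


def update_checkout(local_dir, remote=REMOTE, dry_run=False):
    """Run "cvs update", then fetch data and index files with rsync."""
    subprocess.run(["cvs", "update"], cwd=local_dir, check=True)
    subprocess.run(rsync_command(remote, local_dir, dry_run), check=True)
```

Keeping the command construction separate from the call that runs it makes it easy to check the options with a dry run first.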
5. Documentation and Training
Alan has sent his amendments to the Admin book. Peter
and Jon will review their books this week.
Jon is writing an EDAM paper.
6. User queries and answers
Peter suggested ISMB "Technology Track" demos in Vienna for
EMBOSS and EDAM. The EDAM talk by Matus in Boston 2010 was well received.
8. Date Of Next Meeting
The next EMBOSS meeting will be on Monday 24th January.