EMBOSS: Project Meeting (Mon 14th March 11)
Functions to provide table operations for query processing required new code to resize the hash table for an AjPTable object. This has been implemented as ajTableResize. This function is also called when a table grows beyond a reasonable number of entries for a given hash table size. The code involves saving the keys and value pointers to arrays, resizing the hash array, and simply repopulating the table (reusing the previous linked list nodes).
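The resize steps described above can be sketched with a small chained hash table. This is only an illustration, not the ajTableResize source: the Table and Node types and the table_new, table_put, table_get and table_resize functions are hypothetical names, keys are assumed to be C strings, and the sketch saves node pointers to a single array rather than separate key and value arrays.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical types for illustration; not the AjPTable definition. */
typedef struct Node {
    const char *key;
    void *value;
    struct Node *next;
} Node;

typedef struct Table {
    size_t size;     /* number of hash buckets */
    size_t length;   /* number of stored entries */
    Node **buckets;
} Table;

unsigned long hash_str(const char *s)
{
    unsigned long h = 5381;
    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h;
}

Table *table_new(size_t size)
{
    Table *t;
    assert(size > 0);
    t = malloc(sizeof *t);
    assert(t);
    t->size = size;
    t->length = 0;
    t->buckets = calloc(size, sizeof *t->buckets);
    assert(t->buckets);
    return t;
}

/* Insert a key/value pair, or replace the value of an existing key. */
int table_put(Table *t, const char *key, void *value)
{
    size_t h = hash_str(key) % t->size;
    Node *p;
    for (p = t->buckets[h]; p; p = p->next)
        if (strcmp(p->key, key) == 0) {
            p->value = value;
            return 1;
        }
    p = malloc(sizeof *p);
    if (!p)
        return 0;
    p->key = key;
    p->value = value;
    p->next = t->buckets[h];
    t->buckets[h] = p;
    t->length++;
    return 1;
}

void *table_get(const Table *t, const char *key)
{
    Node *p;
    for (p = t->buckets[hash_str(key) % t->size]; p; p = p->next)
        if (strcmp(p->key, key) == 0)
            return p->value;
    return NULL;
}

/* Resize the bucket array, relinking the existing nodes. Returns 1 on success. */
int table_resize(Table *t, size_t newsize)
{
    Node **saved, **fresh;
    size_t i, n = 0;

    if (newsize == 0)
        return 0;
    saved = malloc((t->length ? t->length : 1) * sizeof *saved);
    fresh = calloc(newsize, sizeof *fresh);
    if (!saved || !fresh) {
        free(saved);
        free(fresh);
        return 0;
    }
    /* Step 1: save every node while the old chains are still intact. */
    for (i = 0; i < t->size; i++)
        for (Node *p = t->buckets[i]; p; p = p->next)
            saved[n++] = p;
    /* Step 2: swap in the new bucket array. */
    free(t->buckets);
    t->buckets = fresh;
    t->size = newsize;
    /* Step 3: repopulate, reusing the old nodes (no per-entry allocation). */
    for (i = 0; i < n; i++) {
        size_t h = hash_str(saved[i]->key) % newsize;
        saved[i]->next = t->buckets[h];
        t->buckets[h] = saved[i];
    }
    free(saved);
    return 1;
}
```

Because the nodes are relinked rather than reallocated, a resize in this sketch needs only two temporary allocations whatever the number of entries.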
Table merge operations can now be implemented simply. Where two tables have the same hash and compare functions, and the same hash table size, keys in the same hash position can be compared. Matching keys can be moved to the start of each linked list, leaving a known list of matching and non-matching keys in each table. These can then be processed to leave the resulting merged table and the remaining key/value pairs.
Cleanup of the remaining keys and values would require further code to delete both keys and values and to free table nodes. Mahmut suggested adding destructor functions to the table definition. This very neatly solves the problem by allowing an ajTableDel function to be implemented to clean up any table for which destructor functions are defined.
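A minimal sketch of the destructor idea follows. It is not the planned ajTableDel code: the Store type, the DelFunc signature and the store_new, store_add and store_del functions are hypothetical, and a plain linked list stands in for the hash table to keep the example short. The counting_free destructor exists only so the effect can be observed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical destructor signature; the real one may differ. */
typedef void (*DelFunc)(void *item);

typedef struct Entry {
    void *key;
    void *value;
    struct Entry *next;
} Entry;

typedef struct Store {
    Entry *head;
    DelFunc keydel;   /* called on every key at deletion, if set */
    DelFunc valdel;   /* called on every value at deletion, if set */
} Store;

/* Demo destructor: frees the item and counts how often it was called. */
int freed_count = 0;

void counting_free(void *item)
{
    free(item);
    freed_count++;
}

Store *store_new(DelFunc keydel, DelFunc valdel)
{
    Store *s = malloc(sizeof *s);
    assert(s);
    s->head = NULL;
    s->keydel = keydel;
    s->valdel = valdel;
    return s;
}

/* The store takes ownership of key and value. Returns 1 on success. */
int store_add(Store *s, void *key, void *value)
{
    Entry *e = malloc(sizeof *e);
    if (!e)
        return 0;
    e->key = key;
    e->value = value;
    e->next = s->head;
    s->head = e;
    return 1;
}

/* Generic delete: no knowledge of the key or value types is needed here. */
void store_del(Store **ps)
{
    Store *s;
    Entry *e, *next;

    if (!ps || !*ps)
        return;
    s = *ps;
    for (e = s->head; e; e = next) {
        next = e->next;
        if (s->keydel)
            s->keydel(e->key);
        if (s->valdel)
            s->valdel(e->value);
        free(e);
    }
    free(s);
    *ps = NULL;
}
```

Passing NULL for either destructor leaves those items untouched, which would suit tables that do not own their keys or values.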
Peter and Mahmut are working on function names to define or set the destructor functions for tables with standard and user-defined key and value types.
Michael also proposed adding a reference count to the AjPTable object so that copying tables for objects in the Ensembl API could simply increment the reference count. This requires only one new copy constructor for any table key type, and a test of the reference count when a table is deleted.
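The reference counting proposal might look like the sketch below. The RcTable type, its use field and the rctable_new, rctable_ref and rctable_del functions are hypothetical names chosen for illustration; the table contents are omitted because only the counting logic is of interest.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted object; table contents omitted. */
typedef struct RcTable {
    unsigned int use;   /* number of owners sharing this object */
} RcTable;

RcTable *rctable_new(void)
{
    RcTable *t = malloc(sizeof *t);
    assert(t);
    t->use = 1;
    return t;
}

/* "Copy" by reference: no data is duplicated, only the count changes. */
RcTable *rctable_ref(RcTable *t)
{
    assert(t);
    t->use++;
    return t;
}

/* Drop one reference; free the object only when the last owner lets go. */
void rctable_del(RcTable **pt)
{
    RcTable *t;

    if (!pt || !*pt)
        return;
    t = *pt;
    *pt = NULL;          /* the caller's pointer is always cleared */
    if (--t->use > 0)
        return;          /* still shared by another owner */
    free(t);
}
```

Every owner calls the same delete function, so callers do not need to know whether they hold the original or a shared reference.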
Michael also proposed a reference count and a data destructor function for AjPList objects.
Peter will implement all these suggestions as soon as possible.
Mahmut is looking into the bigWig and bigBed formats.
Alan has fixed a memory leak in DOM parsing when handling doctype metadata. A few minor memory leaks remain in ajdom.
Mahmut is testing SoapLab services on the new EBI London Data Centres.
A bug in SoapLab on Tomcat 7 has been fixed.
field: "sv SeqVersion ! Sequence version or GI number"
The spaces around the "!" delimiter are required. This may be changed to simply stop processing at the delimiter, as parsing only happens when the field (or other) attribute is used for a database. The configuration files are read as lists of unparsed strings.
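As a rough illustration of the current rule, the sketch below splits an attribute string at the first " ! " (with both spaces). It assumes the delimiter is the "!" shown in the example field line and that the text after it is a free-text description. The split_field function is a hypothetical helper, not EMBOSS code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: copy the text before " ! " into def and the text
 * after it into desc. Returns 1 if the delimiter was found, otherwise 0
 * (def then holds the whole string and desc is empty).
 */
int split_field(const char *attr, char *def, size_t deflen,
                char *desc, size_t desclen)
{
    const char *delim;
    size_t n;

    assert(attr && def && desc && deflen > 0 && desclen > 0);

    delim = strstr(attr, " ! ");   /* the surrounding spaces are required */
    n = delim ? (size_t)(delim - attr) : strlen(attr);
    if (n >= deflen)
        n = deflen - 1;            /* truncate rather than overflow */
    memcpy(def, attr, n);
    def[n] = '\0';

    desc[0] = '\0';
    if (delim)
        snprintf(desc, desclen, "%s", delim + 3);

    return delim != NULL;
}
```

With this rule a "!" without surrounding spaces is treated as ordinary text, which is the behaviour that stopping at the bare delimiter would change.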
Constants in the Ensembl API code are no longer defined by macros, making debugging easier.
The Ensembl API is tested by an application which exercises the major functions and writes a FASTA sequence file which is compared to previous versions.
Mahmut has removed an unused function from EB-eye access.
Mahmut is moving the handling of database identifier, return, filter and accession attributes from AjPSeqin (sequence specific) to AjPQuery (general use as part of AjPTextin).
Mahmut suggested moving SQL access from sequence to text access so that it is usable for features and other data types, for example for CHADO access.
Changes to the data branch and a possible identifier branch will be discussed after this release.
All query elements now have EDAM annotation. Some are quite general and may be extended later. Where necessary, new terms were added to EDAM.