EMBOSS: Project Meeting (Mon 2nd November 09)
Peter will check a problem with extractfeat that was reported on SourceForge as a mEMBOSS problem but also needs testing under Linux.
Mahmut is checking the valid ranges for gap penalties to make them consistent across all applications. Some applications set an upper limit on the gap open or gap extend penalty, and the upper limits may be higher for some applications. Peter suggested allowing the higher limits as some users may define their own comparison matrices using higher match values - for example when converting a floating point matrix to integer by multiplying all values by 10.
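The matrix conversion Peter described can be sketched as follows. This is a minimal illustration, not EMBOSS code: the function names, the sample scores and the upper limit passed to the check are all hypothetical. It shows why a scaled matrix calls for gap penalties scaled by the same factor, which can push them past a limit chosen with unscaled matrices in mind.

```python
def scale_matrix(matrix, factor=10):
    """Return an integer copy of a floating point comparison matrix."""
    return {pair: int(round(score * factor)) for pair, score in matrix.items()}


def within_limit(penalty, max_allowed):
    """Check a gap penalty against an application's upper limit."""
    return penalty <= max_allowed


# Hypothetical floating point scores for a tiny two-letter alphabet.
float_matrix = {("A", "A"): 1.9, ("A", "C"): -0.4, ("C", "C"): 2.5}
int_matrix = scale_matrix(float_matrix)

# To keep alignments equivalent, gap penalties are scaled by the same factor.
gap_open = 10.0
scaled_gap_open = gap_open * 10

# The limit of 50.0 is an invented value for illustration only.
unscaled_ok = within_limit(gap_open, 50.0)
scaled_ok = within_limit(scaled_gap_open, 50.0)
```

With these illustrative numbers the unscaled penalty passes the check and the scaled one does not, which is the case a higher limit would accommodate.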
Mahmut noted that where the 'valid' attribute is defined in an ACD file the precision (number of decimal places) may be inconsistent. Mahmut has committed two test cases for needleall to test the new score functions in embaln.c. These are the current QA tests for needleall.
A user has requested that wrapper applications should be made obvious so that interfaces know when another package needs to be installed. Peter suggested extending ACD or reusing some of the SoapLab extensions to define the third party applications wrapped by EMBOSS. He will make a complete list by searching through the source code.
Peter has extended the internal definitions for each graphics device to include the default size (from reading through the code) and a boolean to note whether a user-defined size will in fact be used. Functions to test the size of the plot now return the user-defined size (which plplot returns) or the fixed size (which plplot has actually used), as appropriate for each device.
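The size logic described above can be sketched as follows. This is only an illustration under stated assumptions: the class, function, device names and sizes are hypothetical and do not reflect the actual definitions in the AJAX graphics code.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class GraphDevice:
    """Hypothetical per-device record: default size plus a size flag."""
    name: str
    default_size: Tuple[int, int]  # (width, height), invented values
    uses_user_size: bool           # True if a user-defined size is honoured


def plot_size(device: GraphDevice,
              user_size: Optional[Tuple[int, int]] = None) -> Tuple[int, int]:
    """Return the size the plot will really have on this device."""
    if device.uses_user_size and user_size is not None:
        return user_size
    return device.default_size


# Two invented devices: one honours a requested size, one ignores it.
resizable = GraphDevice("resizable_example", (800, 600), True)
fixed = GraphDevice("fixed_example", (720, 540), False)

requested = (1024, 768)
resizable_result = plot_size(resizable, requested)
fixed_result = plot_size(fixed, requested)
```

A caller asking for the plot size therefore gets the requested size only on devices that honour it, and the fixed default everywhere else.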
Peter has removed the AjPGraphxml definition from ajgraph. This was only used for Hugh Morgan's 'grout' graphics code which was never included in the release. The only application using XML graphics output was the test application giep and inspection shows that this code used a '#ifdef' block to redefine the graph object and to use alternative code for XML and plplot graphics.
This allows the removal of the 'plplot' level within an AjPGraph object, greatly simplifying the graphics code. All applications run and pass their QA tests. The revised code will be committed later today.
Alan will remove gdom and its glib dependency from the configuration. These were only used by the 'grout' graphics output.
Alan noted that plplot updates would need to be carefully planned as building it into mEMBOSS is complex and may require updating of the gd graphics library DLL.
Peter outlined plans for a 'light' version of the AjPAlign and AjPSeqset objects with fewer sequence attributes so that next-generation assemblies could be loaded into memory. In each case only a few functions require extending to cope with both versions of the object, for example functions that return a sequence.
BioMart registry access starts with a description of the server at EBI, defining 'marts' from EBI and possibly other sources. These external marts could also have a registry with new entries.
Alan noted that the BioMart Perl API code uses a cache directory with a misspelled name, possibly to reduce the scope for name clashes.
Using BioMart may involve adding configuration directories and files.
Alan described access to 'marts' using remote SQL or web services. First efforts will use SQL. The registry returns a table of metadata in compressed XML. This implies we also need to add a compression library.
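Handling compressed XML metadata could look roughly like the sketch below. It assumes the data is a zlib-compatible stream, and the element and attribute names are invented for the example; the real registry layout and compression format would need to be checked against the BioMart servers.

```python
import zlib
import xml.etree.ElementTree as ET


def load_compressed_xml(blob):
    """Decompress a zlib stream and parse the result as XML."""
    xml_bytes = zlib.decompress(blob)
    return ET.fromstring(xml_bytes)


# Stand-in for a metadata value fetched from a mart (hypothetical content).
sample_xml = b'<registry><mart name="example_mart" visible="1"/></registry>'
blob = zlib.compress(sample_xml)

root = load_compressed_xml(blob)
mart_names = [mart.get("name") for mart in root.findall("mart")]
```

The same two steps, decompress then parse, are what a compression library would have to support alongside the existing XML parsing.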
Alan recommended including the zlib library with functions renamed with a prefix and the installed library prefixed with 'e' as for expat. We need to avoid a name clash with functions linked in from zlib via plplot.
Jon is continuing with the revision of the EDAM ontology to conform to the documented standards. Peter will check the term types, relations and rules in the overall structure. Duplication has been removed: for example, sequence alignment was both a record and a datatype, and the record term is not required. The beta release will be simpler and easier to use.
Paul Gordon has suggested adding terms in WSDL files for syntax (a requirement of SAWSDL), and given examples of the 'schema lifting' hierarchy he uses for BioMOBY service definitions. SAWSDL is only meaningful if input and output schemas are annotated at the data level. The main method can be annotated at the top, but multiple methods cause problems.
Alan suggested ordering an OEM version of Windows 7 with a new disk drive. When building on 64-bit Windows systems, Jemboss works under XP but needs 32-bit Java installed. We can try using 64-bit Java in Jemboss.
Alan has sent a reminder email to systems about the server specification.
Jon reported that the package structure XML files need editing to cover wrapped applications.