EMBOSS: Project Meeting (Mon 27th September 10)
Alan will check again for a consistent solution. Testing is limited as we no longer have test systems for AIX, HP-UX, Compaq or Solaris with the native compiler. We can also seek help from the emboss-dev mailing list to test on other platforms.
Alan reported on the new configure options for gsoap library configuration.
Building with gsoap requires C source files to be generated from a given set of WSDL definitions (e.g. EBI's Wsebeye) by two programs defined in gsoap.m4. This macro is used when --with-gsoap is given to configure, and invokes the two programs to create the C source files in ajaxdb before AJAX is compiled.
gsoap also checks HAVE_CONFIG_H, so config.h has to be set up for three cases: EMBOSS, EMBASSY, and EMBASSY with EMBOSS installed. We also need to get the search order for config.h files right.
Alan needs to do more work to get gsoap working on Windows. It will be useful for remote data access as the EBI SRS server is the usual method on Windows but has a very poor response time.
The gsoap code generation creates '//' comments, which some compilers will not accept.
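One possible workaround for the '//' comments, offered only as a sketch and not as anything agreed at the meeting, would be to post-process the generated C files before compiling. The hypothetical helper below rewrites each '//' comment as a '/* ... */' comment while leaving string literals, character literals and existing block comments alone. It assumes no comment ends in a backslash line continuation.

```python
def convert_line_comments(source: str) -> str:
    """Rewrite C '//' comments as '/* ... */' comments (illustrative sketch)."""
    out = []
    i = 0
    n = len(source)
    state = "code"  # one of: code, block, string, char
    while i < n:
        ch = source[i]
        nxt = source[i + 1] if i + 1 < n else ""
        if state == "code":
            if ch == "/" and nxt == "/":
                # Comment runs to end of line; keep the newline itself.
                end = source.find("\n", i)
                if end == -1:
                    end = n
                # Avoid closing the new block comment early.
                body = source[i + 2:end].strip().replace("*/", "* /")
                out.append("/* " + body + " */" if body else "/* */")
                i = end
                continue
            if ch == "/" and nxt == "*":
                state = "block"
                out.append("/*")
                i += 2
                continue
            if ch == '"':
                state = "string"
            elif ch == "'":
                state = "char"
            out.append(ch)
            i += 1
        elif state == "block":
            if ch == "*" and nxt == "/":
                state = "code"
                out.append("*/")
                i += 2
                continue
            out.append(ch)
            i += 1
        else:
            # Inside a string or character literal: copy escapes verbatim.
            if ch == "\\" and nxt:
                out.append(ch + nxt)
                i += 2
                continue
            if (state == "string" and ch == '"') or (state == "char" and ch == "'"):
                state = "code"
            out.append(ch)
            i += 1
    return "".join(out)
```

A step like this could run after the gsoap programs have written their output and before AJAX is compiled. Whether that is preferable to changing compiler flags is a separate decision.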
Michael reported configure problems on MacOSX which will be fixed later today.
Mahmut has checked in a partial implementation of DAS sequence access and is looking at how to handle features.
The UCSC genome browser and CHADO are possible new data sources. The "Data Resources" wiki page has details of UCSC access through a REST API, though it is mainly MySQL, using a completely different schema with a new table for each new database. Michael described CHADO's data table, which is too flexible to use easily: it is configured to store generic objects, but retrieving specific objects from it is difficult.
Peter has extended the "USA" syntax to give more general access for any data type, including sequence, feature and OBO. Query fields need to be made general (not only the sequence-specific ones hardcoded in EMBOSS).
Text access can now be made general for any data type. Where there is no text (e.g. Ensembl data retrieval as an object) we should report that text is not supported.
PURLs have been created for all EDAM terms. The submission script works, but there is a server timeout if too many are created.
The script uses a directory of XML files, with one term per file, and can be rerun to make sure that all terms have been submitted.
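The submission script itself is not reproduced in these minutes. As a rough sketch of how the timeout might be avoided, the code below walks a directory of one-term-per-file XML documents in small batches and returns the files that failed, so that only those need resubmitting. The names submit_terms, batch_size and pause_seconds are assumptions, and the submit callback is a placeholder for whatever call actually creates a PURL.

```python
import time
from pathlib import Path
from typing import Callable, Iterator, List, Sequence


def batches(items: Sequence[Path], size: int) -> Iterator[Sequence[Path]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def submit_terms(
    directory: str,
    submit: Callable[[str], None],  # hypothetical: sends one term's XML
    batch_size: int = 50,           # assumed value, tune to the server
    pause_seconds: float = 0.0,     # optional pause between batches
) -> List[Path]:
    """Submit every *.xml file in `directory`; return the files that failed."""
    files = sorted(Path(directory).glob("*.xml"))
    failed: List[Path] = []
    for group in batches(files, batch_size):
        for path in group:
            try:
                submit(path.read_text(encoding="utf-8"))
            except OSError:
                # TimeoutError and connection errors are OSError subclasses.
                failed.append(path)
        time.sleep(pause_seconds)
    return failed
```

Calling submit_terms again on just the returned paths would be one way to make resubmission cheap. This sketch rescans the whole directory, so it relies on resubmitting an existing term being harmless, as the minutes suggest.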
The scripts are committed to the edamontology CVS server.
Online EDAM documentation is updated for the new formats, including HTML, XML and JSON representations of OBO.
The edamclean utility is updated to write modified files.
Peter noted there are two copies of EDAM on the NCBO portal, as the beta_08 release is still there.
Jon is now working through the EDAM topic terms to simplify them. A maximum depth of 3 levels should be sufficient. Data curators need a moderate number of well-structured terms.
Once completed, Jon will review the annotation of 1700 services in BioCatalogue.
Jon reported on Dmitri's development of the "BioNemus" WSDL editor, which reads in an OWL version of EDAM to define data structures. EDAM will continue to be maintained in OBO, but this can be used to generate a purely semantic OBO and OWL version for BioNemus to use.
Jon will take a break before checking the next book.
Indexing is an issue still to be addressed.