EMBOSS: Project Meeting (Mon 4th October 2010)
Attendees
EBI:
Peter Rice,
Alan Bleasby,
Jon Ison,
Mahmut Uludag,
Michael Schuster
Visitors:
Apologies:
1. Minutes of the last meeting
Minutes of the meeting of 27th September 2010 are
here.
2. Maintenance etc.
2.1 Applications
Alan has a version of eprimer3 working with the new
version of primer3_core. This application needs a new name. As
primer3 has no version-specific naming, the new name
will be eprimer32, and it will stay the same even if primer3 version
3.3 has the same interface. The handling of external applications
requires a new name to be defined for primer3_core for use in
the eprimer32.acd file.
2.2 Libraries
Michael has continued to clean up the EFUNC and EDATA messages
for the ensembl library, and to work on Intel compiler warnings.
2.3 Other
Alan has replaced the compiler configuration code for the main
package and all the EMBASSY packages. The configure.in code only
tested gcc and the proprietary cc compiler. This code has been
replaced by case statements allowing tests for other compilers
(e.g. the Intel icc compiler).
Similar changes simplify the operating-system dependencies in the
configure.in scripts.
Alan has also looked further into large file configuration options.
3. New developments
3.1 The gsoap library
Mahmut has been working with the gsoap library, which
works with web services described via a URL. Two utility applications
in gsoap convert a URL to a C header file, and use the
resulting header to make stub C source code files for a SOAP client.
First tests use the EBI services wsdbfetch and wsebeye.
Alan has investigated options for compiling and
linking gsoap with the EMBOSS libraries. Certain functions in
the stub file are required. Other functions must be in the ajaxdb
library. The URL used to generate the stub code must be included in
the gsoap.m4 file and cannot be user-supplied.
Alan noted that Fedora provides a shared gsoap
library. By default gsoap only builds as a static library. It is
possible to kludge single-pass linking with the GNU linker but only
for one version of the library build. If the library appears twice
then libtool is likely to object.
Possible options include extracting libgsoap code into ajaxdb,
but this will probably run into licensing issues. There are about 15k
lines of code in total in gsoap. Alternatively, a kludge
library could provide the
functions "soap_putheader", "registration_putheader"
with a callback to a renamed stub function.
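The kludge-library idea can be sketched roughly as below. This is only an illustration of the callback pattern, not the agreed design: the struct soap placeholder, the PutHeaderFn type, the signature given to "registration_putheader" and the renamed stub "ajsoap_putheader" are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder for the gsoap context; the real struct is far larger. */
struct soap { int header_written; };

/* Hypothetical callback type for the renamed stub function. */
typedef int (*PutHeaderFn)(struct soap *soap);

static PutHeaderFn putheader_callback = NULL;

/* Assumed role: record which renamed stub function should be called. */
void registration_putheader(PutHeaderFn fn)
{
    putheader_callback = fn;
}

/* Provided by the kludge library; forwards to the registered stub. */
int soap_putheader(struct soap *soap)
{
    assert(soap != NULL);
    if (putheader_callback == NULL)
        return -1;              /* no stub registered yet */
    return putheader_callback(soap);
}

/* Hypothetical renamed stub, standing in for the generated code. */
int ajsoap_putheader(struct soap *soap)
{
    soap->header_written = 1;
    return 0;
}
```

With this arrangement the library only needs the symbol from the kludge library at link time, and the generated stub is attached at run time by a call such as registration_putheader(ajsoap_putheader).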
We can also consider the use of axis or csoap. The
latter has a libxml2 dependency and is limited to the old SOAP
1.1 protocol, with no new csoap release in the last 5 years.
3.2 DAS access methods
Mahmut has looked into gsoap and expat to handle
the XML C bindings for reading DAS features. Alan suggested
looking into DOM parsing with an XSD file as possibly easier to
implement. There is an example application domdemo in "make
check". One issue with gsoap is that we may be
unable to find a satisfactory way to link and distribute it, so it is
better to avoid using it for anything not directly SOAP-specific.
3.3 EBI changes to web services
Mahmut is working with EBI external services on the new EBI
interface for web services through their test server. The test
application is called dbfetchexplore. The new interface returns
network and query information. "runtestmethods" runs a test query.
3.4 EDAM
Jon has started a cleanup of the EDAM topic branch to cover
topics needed for the annotation of the current set of services in the
BioCatalogue.
3.5 Text access in EMBOSS
Peter has converted all single file-based access methods
in ajseqdb to handle general text inputs. A new "ajtextdb.c"
source file handles text input using an AjPTextin input object. This
has the attributes used for general text access by the existing
sequence and OBO access methods. The AjPSeqin and AjPOboin objects
include an AjPTextin as their "Input" element. Each access method
defined for a database is first tested against text methods and then
against methods specific to the data type. Where an access method is
to be called, the code expects to find either a text access method
using the AjPTextin input element, or a type-specific method using the
AjPSeqin or AjPOboin object. Text access results in an open file
buffer with the pointer set to the start of an entry. A parser
(defined by the database "format") processes the data in the file.
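The lookup order described above (text methods first, then type-specific methods) can be illustrated with a minimal sketch. The structs below are simplified stand-ins for AjPTextin and AjPSeqin, and the method names "textdemo" and "seqdemo", the table layout and the function seqin_access are hypothetical, not the actual ajtextdb or ajseqdb code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for the AJAX input objects (illustrative only). */
typedef struct { const char *Qry; int Opened; } Textin;
typedef struct { Textin *Input; int SeqOpened; } Seqin;

typedef int (*TextAccessFn)(Textin *textin);
typedef int (*SeqAccessFn)(Seqin *seqin);

typedef struct { const char *Name; TextAccessFn Access; } TextMethod;
typedef struct { const char *Name; SeqAccessFn Access; } SeqMethod;

/* Hypothetical access methods: each just marks its input as opened. */
static int text_access_demo(Textin *t) { t->Opened = 1; return 1; }
static int seq_access_demo(Seqin *s)   { s->SeqOpened = 1; return 1; }

static const TextMethod text_methods[] = {
    {"textdemo", text_access_demo}, {NULL, NULL}
};
static const SeqMethod seq_methods[] = {
    {"seqdemo", seq_access_demo}, {NULL, NULL}
};

/* Returns 1 if a text method handled the request, 2 if a
 * sequence-specific method did, 0 if the method is unknown or failed. */
int seqin_access(Seqin *seqin, const char *method)
{
    size_t i;

    assert(seqin != NULL && seqin->Input != NULL && method != NULL);

    /* Text methods are tried first, using the embedded text input. */
    for (i = 0; text_methods[i].Name != NULL; i++)
        if (strcmp(text_methods[i].Name, method) == 0)
            return text_methods[i].Access(seqin->Input) ? 1 : 0;

    /* Otherwise fall back to methods specific to the data type. */
    for (i = 0; seq_methods[i].Name != NULL; i++)
        if (strcmp(seq_methods[i].Name, method) == 0)
            return seq_methods[i].Access(seqin) ? 2 : 0;

    return 0;
}
```

An OBO input would follow the same pattern with its own type-specific table, sharing the text method table unchanged.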
Text-based access will enable the data resources in the dbxref file to
be easily defined as EMBOSS data sources, at least as text entries
through a URL query. Jon is updating the query lines given by
the resource providers to standardise the semantics and naming using
EDAM terms to define a set of interoperable field names.
The query-handling code was made more general to handle OBO terms as
well as sequences. It has now been made completely general, with only
an AjPQuery object defining the field name, query, and a link operator
between queries. The link operator can be "Else" (id, else if that
fails try accession) or "or" to continue adding more results. Further
operators can be added in future, and processed explicitly for access
methods such as SRS. The query language will need to be extended to
allow these to be defined on the command line through a USA (or the
equivalent for other input data types).
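As an illustration of the two link operators, the sketch below evaluates a list of query fields against a small in-memory table. It is a guess at the semantics under stated assumptions: the QueryField and Entry structs, the field names "id" and "acc", and the function query_run are hypothetical and do not reflect the real AjPQuery layout. An "Else" term is assumed to run only when nothing has matched so far, and an "or" term always adds its matches.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { LINK_ELSE, LINK_OR } QueryLink;

/* Hypothetical query term: field name, query text and link operator. */
typedef struct {
    const char *Field;
    const char *Query;
    QueryLink Link;
} QueryField;

/* Hypothetical database entry with an id and an accession. */
typedef struct { const char *Id; const char *Acc; } Entry;

static int entry_matches(const Entry *e, const QueryField *f)
{
    if (strcmp(f->Field, "id") == 0)
        return strcmp(e->Id, f->Query) == 0;
    if (strcmp(f->Field, "acc") == 0)
        return strcmp(e->Acc, f->Query) == 0;
    return 0;
}

/* Marks matching entries in hit[] and returns the number of hits. */
size_t query_run(const Entry *entries, size_t nentries,
                 const QueryField *fields, size_t nfields, int *hit)
{
    size_t i, j, nhits = 0;

    assert(entries != NULL && fields != NULL && hit != NULL);

    for (i = 0; i < nentries; i++)
        hit[i] = 0;

    for (j = 0; j < nfields; j++) {
        /* "Else": only tried when the earlier terms found nothing. */
        if (fields[j].Link == LINK_ELSE && nhits > 0)
            continue;

        /* "or" (or a first/fallback term): add any new matches. */
        for (i = 0; i < nentries; i++) {
            if (!hit[i] && entry_matches(&entries[i], &fields[j])) {
                hit[i] = 1;
                nhits++;
            }
        }
    }
    return nhits;
}
```

Keeping the operator on each term, rather than in the evaluation loop, is what would let further operators be added later without changing the query object.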
The code is working and will be further QA tested and passed through
the valgrind test suite before it is committed.
4. Administration
4.1 Hardware
We are waiting for systems to reinstall the replacement emboss7
server. Peter will send them a reminder.
5. Documentation and Training
5.1 Books
Jon has the copy editor's version of the Developers Guide. The
processing by the typesetters has introduced formatting errors which
are clear from a comparison to the Word document original. We hope the
copy editor can handle the amendments.
6. User queries and answers
All done.
7. AOB
None.
8. Date Of Next Meeting
The next EMBOSS meeting will be on Monday 11th October.