EMBOSS: Project Meeting (Mon 22nd November 10)


EBI: Peter Rice, Alan Bleasby, Jon Ison, Michael Schuster
Apologies: Mahmut Uludag

1. Minutes of the last meeting

There was no meeting last week because of a clash with a conference.

Minutes of the meeting of 8th November 2010 are here.

2. Maintenance etc.

2.1 Applications

Alan has modified tfm to use the new EMBOSS_DOCROOT variable value for the doc/ root directory so that text and html versions will both be able to find the appropriate application documentation files.

Peter has updated list and select types in ACD to support the use of '*' for all values. This is useful for several applications as a default or a user-supplied value.

Peter has added applications to index EDAM, OBO files and DRCAT.

Mahmut is working on a vectorstrip issue.

2.2 Libraries

Alan has updated the ajdom parser with extra functions, for example to return the text field of a text object or to return the node of the first child of an element to a search list to avoid additional calls. In some cases memory was not fully cleared but was seen by valgrind as only indirectly lost. These cases are now cleaned up.

Valgrind also reports an error from get_addrinfo which should be added to the suppression list.

Peter has modified the ajindex code to automatically convert keys to lower case and to replace spaces with underscores. A new function ajStrFmtQuery does the transformations.
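As an illustration of the two transformations described above, here is a minimal standalone sketch in C. It is not the ajStrFmtQuery implementation and does not use the AJAX string types; the function name normalise_query_key and the in-place char buffer are assumptions made for the example.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, NOT the EMBOSS ajStrFmtQuery code.
 * Normalises an index key in place: every space becomes an
 * underscore and every other character is lower-cased. */
void normalise_query_key(char *key)
{
    size_t i;

    if (key == NULL)
        return;

    for (i = 0; key[i] != '\0'; i++) {
        if (key[i] == ' ')
            key[i] = '_';
        else
            key[i] = (char) tolower((unsigned char) key[i]);
    }
}
```

With this sketch, a key such as "Homo Sapiens" would be stored and looked up as "homo_sapiens", so queries match regardless of case or spacing.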

Peter has extended the possible inputs for lists and selects in ACD to support the use of '*' for "all options". This is only allowed where the maximum number of values matches the number of possible values.
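The restriction on '*' can be pictured as a simple check. The sketch below is an assumption about the shape of the rule, not the ACD code itself; the function name wildcard_selects_all and its parameters are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical check, NOT the EMBOSS ACD implementation.
 * Returns 1 when the user value is exactly "*" and the list or
 * select permits as many values as it has options, so that "*"
 * can stand for "all options". Returns 0 otherwise. */
int wildcard_selects_all(const char *value, size_t max_values,
                         size_t n_options)
{
    if (value == NULL || strcmp(value, "*") != 0)
        return 0;

    return max_values == n_options ? 1 : 0;
}
```

Under this reading, a select with four options and a maximum of four values accepts '*', while one limited to a single value does not.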

2.3 Other

Peter has added 7 files from the NCBI taxonomy which will be indexed to allow the use of NCBI taxon ids in EMBOSS.

Michael has some configure changes to support the Intel icc compiler.

3. New developments

3.1 Data access methods

Michael has cleaned up the Ensembl code for EFUNC and EDATA documentation.

Ensembl now has a version 60 which will need migration.

Generating a cachefile will need access to DB definition internals. This may be a common issue with BioMart access, and possibly also DAS.

One approach is an application to connect once to a server and dump a list of databases which can become a public list with a link in the emboss.standard configuration file.

For Ensembl, it would be helpful to have a list of database aliases to avoid duplicating DB definitions for all possible names. This may also apply to BioMart and DAS.

3.2 EDAM

Jon has prepared a new EDAM beta to be released early in December. The hierarchy of terms has been improved to make navigation easier, especially for the data and operation name spaces.

The topic name space has been simplified into fields of study for use by BioCatalogue.

Peter will add new applications that use the EDAM relationships for searches. Each relationship is specific to certain name spaces or branches.

Jon has removed the inverse relationships as these can be automatically generated or inferred by reasoners. For example, is_format_of is retained, but has_format has been removed. If users need them, we can regenerate them for any release.
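To show why the inverse relationships can be regenerated mechanically, here is a minimal sketch in C. The struct layout, the function name invert_relation and the lookup table are assumptions for illustration; only the is_format_of / has_format pair comes from the minutes.

```c
#include <stddef.h>
#include <string.h>

/* One ontology statement: subject --name--> object. */
struct relation {
    const char *subject;
    const char *name;
    const char *object;
};

/* Hypothetical table pairing each retained relationship with the
 * inverse that was removed. Only this pair is taken from the minutes. */
struct inverse_name {
    const char *forward;
    const char *inverse;
};

static const struct inverse_name inverse_names[] = {
    { "is_format_of", "has_format" }
};

/* Builds the inverse of *in into *out by swapping subject and object
 * and looking up the inverse name. Returns 1 on success, 0 when the
 * relationship has no known inverse. */
int invert_relation(const struct relation *in, struct relation *out)
{
    size_t i;
    size_t n = sizeof inverse_names / sizeof inverse_names[0];

    for (i = 0; i < n; i++) {
        if (strcmp(in->name, inverse_names[i].forward) == 0) {
            out->subject = in->object;
            out->name = inverse_names[i].inverse;
            out->object = in->subject;
            return 1;
        }
    }
    return 0;
}
```

Given an illustrative statement such as "FASTA is_format_of Sequence" (term names chosen only for the example), the function yields "Sequence has_format FASTA", which is the kind of statement that could be regenerated for a release if users need it.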

EDAM term names are being standardized with the consistent use of prediction, detection, etc.

Matus is updating EDAM format terms using new terms found in DRCAT.

Matus will continue to work on EDAM. A paper is in preparation. Matus will visit next week, then go to the Berlin semantic web conference with Jon.

Two other ontology projects at EBI are interested in OWL representations of OBO data.

3.3 Large data files

The index files for EDAM, DRCAT and NCBI taxonomy are large and cause problems in CVS.

Jon has checked with EBI's ontology lookup service (OLS) who have access to a developer machine to avoid the need to use anonymous CVS on Open Bio.

4. Administration

4.1 Fedora

Alan has installed Fedora 14 on all machines.

The EMBOSS package distributed with Fedora 14 failed to run any application. There is a problem with the va_list architecture when AMD64 is defined. There was also an issue with using an external plplot, but the package now uses the EMBOSS-supplied version.

4.2 Hardware

Mahmut's machine needs a new Ethernet card.

After the recent air conditioning failure in the EBI machine room, the EMBOSS server rebooted successfully.

5. Documentation and training

5.1 Books

Jon has completed the markup of index terms. Indexes are generated for a basic set of terms, with a list of equivalent terms and phrases to be combined, for example datatype-specific and associated qualifiers.

The indexes will be sent today.

6. User queries and answers

All done.

7. AOB


8. Date Of Next Meeting

The next EMBOSS meeting will be on Monday 29th November.