EMBOSS: Project Meeting (Monday 20th Sep 2010)

Attendees

EBI: Peter Rice, Alan Bleasby, Jon Ison, Mahmut Uludag, Michael Schuster
Visitors:
Apologies:

1. Minutes of the last meeting

Minutes of the meeting of 6th September 2010 are here.

2. Maintenance etc.

2.1 Applications

Jon has specified about 30 new applications, in addition to the ontology applications added recently.

Peter has started to write code for the ontology applications.

Alan asked that template applications be defined in bundlewin.c to avoid problems when building the win32 binaries. It may be possible to check emboss/Makefile.am for applications defined in the bin_PROGRAMS line.
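The check against emboss/Makefile.am could be sketched roughly as follows. This is a minimal illustration, not an agreed design: the helper names are hypothetical, and it assumes that an application counts as "defined" in bundlewin.c if its name appears there as a whole word, which may not match how that file actually records applications. Make variables such as $(EXTRA) inside bin_PROGRAMS are not expanded.

```python
import re


def bin_programs(makefile_text):
    """Return the program names listed in bin_PROGRAMS assignments."""
    # Join backslash-continued lines into one logical line.
    joined = re.sub(r"\\[ \t]*\n", " ", makefile_text)
    names = []
    for line in joined.splitlines():
        match = re.match(r"\s*bin_PROGRAMS\s*\+?=\s*(.*)", line)
        if match:
            names.extend(match.group(1).split())
    return names


def missing_from_bundle(makefile_text, bundle_text):
    """List programs whose name never appears as a whole word in bundle_text.

    Assumption: a whole-word match is enough to say the application is
    defined in the bundle source.
    """
    return [
        name
        for name in bin_programs(makefile_text)
        if not re.search(r"\b" + re.escape(name) + r"\b", bundle_text)
    ]
```

Reading the two files and printing the result of missing_from_bundle would give a list of applications to add before the win32 build.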

2.2 Libraries

Alan has renamed the ajMartHttp functions in ajseqdb and moved them to a new ajhttp source file. Four functions were moved or replaced, and now work on strings rather than extracting elements from a sequence query object. URL parsing code has also been added to ajhttp.

Alan has added proxy authentication by username and password. This is defined in the EMBOSS_PROXY value by appending "Basic username password". Basic authentication sends the username and password base64-encoded, which is an encoding rather than encryption. There are more complex alternatives (Digest and NTLM), based on challenge and response, which could be implemented but have not so far been requested by any user. Proxy servers using such mechanisms can also support the Basic authentication protocol.
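To illustrate what Basic authentication puts on the wire, here is a minimal sketch. The credentials format ("username:password", base64-encoded, sent in a Proxy-Authorization header) is standard HTTP; the layout of the EMBOSS_PROXY value used in parse_proxy_setting ("host:port Basic username password", whitespace separated) is an assumption made for the example, as are both function names. It is not the ajhttp code.

```python
import base64


def basic_credentials(username, password):
    """Build the value of a Proxy-Authorization header for Basic auth."""
    token = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(token).decode("ascii")


def parse_proxy_setting(value):
    """Split a proxy setting into (proxy, header value or None).

    Assumed layout: 'host:port Basic username password'. A value with no
    'Basic username password' suffix yields no credentials.
    """
    fields = value.split()
    if not fields:
        return None, None
    if len(fields) == 4 and fields[1] == "Basic":
        return fields[0], basic_credentials(fields[2], fields[3])
    return fields[0], None
```

Because the token is only encoded, anyone who can see the request can recover the password, which is the main reason the challenge and response schemes exist.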

Peter has written OBO (open bio-ontology) functions to read, process and write OBO ontologies. The input specification uses an adaptation of USA syntax for OBO objects and OBO query fields. Output is in OBO format but can be easily extended.
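For readers unfamiliar with the format, the read and write steps can be pictured with a small sketch. It handles only the basic shape of an OBO file ("[Term]" style stanza headers followed by "tag: value" lines, with "!" starting a comment) and ignores the file header, escapes and quoted strings. The function names are hypothetical and this is not how ajobo is implemented.

```python
def parse_obo(text):
    """Parse OBO text into a list of (stanza_type, tags) pairs.

    tags maps each tag name to the list of its values, in file order.
    Header lines before the first stanza are skipped.
    """
    stanzas = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("!"):
            continue
        if line.startswith("[") and line.endswith("]"):
            current = (line[1:-1], {})
            stanzas.append(current)
        elif current is not None and ":" in line:
            tag, _, value = line.partition(":")
            # Simplification: treat ' ! ' as the start of a trailing comment.
            value = value.split(" ! ")[0].strip()
            current[1].setdefault(tag.strip(), []).append(value)
    return stanzas


def write_obo(stanzas):
    """Write stanzas back out as OBO text (comments are not preserved)."""
    lines = []
    for stanza_type, tags in stanzas:
        lines.append(f"[{stanza_type}]")
        for tag, values in tags.items():
            lines.extend(f"{tag}: {value}" for value in values)
        lines.append("")
    return "\n".join(lines)
```

Keeping the parsed form independent of the output routine is what would make other output formats easy to add later.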

The code in ajobo and ajobodb has been fully documented and passes all documentation validations. The revised function names can be adapted for other data types.

Michael continues to update the ensembl functions to be EFUNC and EDATA compatible. Some issues have arisen, noted below.

Global and static variables are not documented. Peter will look into this and could add a documentation block for these variables, with tests that they are documented wherever they are found.

Enums should have typedefs to enable compiler checking for duplicate values.

2.3 Other

Alan has found a way to configure gsoap which Mahmut will test before it is committed.

Mahmut has added test cases for the new AJAX functionality.

Mahmut has a new SoapLab test server on the LSF test cluster. This will be moved to the EBI production cluster.

3. New developments

3.1 Access methods

Mahmut described three DAS access formats, for source, server and registry output. The registry can be ignored as the server output is almost equivalent. Access is similar to biomart, returning a source description and a query URL. The DAS server host name can be used, but the registry always gives a query.

DAS source is specific to one server, and gives a URL directly of type dassource or dasserver. This is currently used for sequence data. DAS servers are also sources of other information, especially feature annotation.
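As a rough picture of "a source description and a query URL", the sketch below pulls both out of a DAS sources document. The element and attribute names (SOURCE with a title, CAPABILITY with type and query_uri) follow our understanding of the DAS 1.6 sources format and should be checked against the specification. The function name and the sample server in the usage note are invented for illustration.

```python
import xml.etree.ElementTree as ET


def das_sources(xml_text, capability="das1:sequence"):
    """Return (title, query_uri) for each source offering the capability."""
    root = ET.fromstring(xml_text)
    found = []
    for source in root.iter("SOURCE"):
        for cap in source.iter("CAPABILITY"):
            if cap.get("type") == capability:
                found.append((source.get("title"), cap.get("query_uri")))
    return found
```

Given a document containing a SOURCE titled "example" with a das1:sequence CAPABILITY whose query_uri is http://das.example.org/das/example/sequence, the function would return that title and URL as one pair. Asking for "das1:features" instead would select the feature annotation sources.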

Peter has started adding OBO access and OBO applications (see above). Similar code will be needed for the database catalogue and the NCBI taxonomy.

Peter suggested a discussion next week on possible server definitions to generalise access to biomart, ensembl, SRS and other sources where many databases have a common server location and access method.

Alan and Michael suggested caching server information would be essential. As an example, SQL access discovers databases matching the EMBOSS supported API version, but this takes a long time over a slow connection. Other queries (e.g. the number of databases) can be resolved rapidly. A cache would need to include the software version and be able to check for new versions. Perhaps this could be combined with an application similar to showdb which could refresh the local cache. Use of the cache by any application could be turned off by an environment variable.
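One possible shape for such a cache is sketched below, purely as a starting point for discussion. The JSON file layout, the function names and the environment variable name EMBOSS_NOSERVERCACHE are all hypothetical; nothing here has been agreed or implemented.

```python
import json
import os

# Hypothetical variable name: any non-empty value turns the cache off.
CACHE_DISABLE_VAR = "EMBOSS_NOSERVERCACHE"


def save_cache(path, software_version, databases):
    """Store the database list together with the software version."""
    entry = {"software_version": software_version, "databases": databases}
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(entry, handle)


def load_cache(path, software_version, environ=os.environ):
    """Return the cached database list, or None if the cache is unusable.

    The cache is ignored when it is disabled by the environment variable,
    missing or unreadable, or was written for a different software version.
    """
    if environ.get(CACHE_DISABLE_VAR):
        return None
    try:
        with open(path, encoding="utf-8") as handle:
            entry = json.load(handle)
    except (OSError, ValueError):
        return None
    if not isinstance(entry, dict):
        return None
    if entry.get("software_version") != software_version:
        return None
    return entry.get("databases")
```

A None result would tell the caller to run the slow discovery query again, and a showdb-like application could call save_cache to refresh the file.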

3.2 EDAM

Jon has updated the copy of edamclean in CVS.

Jon is looking into OWL as a maintenance format for use with the BioNemus web service interfaces. This could help with cardinality and type information for EDAM terms.

4. Administration

4.1 Windows

Alan has compiled the current CVS code on Windows 32-bit. A substitute for ftruncate was needed, and a 64-bit fseek. Some functions in ajobo.h were undefined.

4.2 Hardware

A replacement emboss7 server has been ordered from Dell and should arrive soon (note: it arrived during the meeting).

5. Documentation and Training

5.1 Books

Alan has added a documentation CVS repository.

Jon has updated the XML documentation to describe tests and flags for minimum and maximum values in ACD.

CUP have provided contact details for a copy editor.

5.2 Training

Vicky Schneider has asked about the timing of possible online training modules. These can be developed for release when the books are published, as they include text from the books.

6. User queries and answers

All done.

7. AOB

Peter has contacted the SAB about possible meeting dates. The end of October looks to be a good time.

8. Date Of Next Meeting

The next EMBOSS meeting will be on Monday 27th September.