EMBOSS: Project Meeting (Monday 30th October 2006)

Attendees

EBI: Peter Rice, Alan Bleasby, Jon Ison, Shaun McGlinchey, Mahmut Uludag, Rodrigo Lopez
Sanger:
Visitors:
Apologies: Lisa Mullan

1. Minutes of the last meeting

Minutes of the meeting of 2nd October 2006 are here.

2. Software Development

2.1 Graphics library

Alan is continuing work on ajgxml.c to replace the obsolete ajgraphxml.c functions, using giep as an example application. The number of function calls is being extended. When complete the old (and unused) ajgraphxml can be removed. All information is currently stored in PLPLOT data structures which will need to move to a new AJAX graph data structure.

The ajdom library is performing well so far. One issue is that, because DOM libraries need to store all information in memory, we have to consider better ways to extend large numbers of strings. Peter suggested modifying the string code to double the string size when appending. This is already done for formatted writes to a string, but simple string appends extend the string by only a few bytes each time (typically 32).
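
As an illustration of the doubling strategy Peter suggested, here is a minimal sketch in C. It does not use the AJAX string code: the GrowStr type and the growstr_append function are hypothetical names invented for this example, and the 32-byte starting size is only borrowed from the typical increment mentioned above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable string; NOT the AJAX AjPStr type. */
typedef struct {
    char  *buf;   /* NUL-terminated text, or NULL before first use */
    size_t len;   /* characters in use, excluding the NUL */
    size_t cap;   /* bytes allocated, including room for the NUL */
} GrowStr;

/* Append txt, doubling the allocation until the result fits.
 * Returns 0 on success, -1 if memory could not be allocated. */
int growstr_append(GrowStr *s, const char *txt)
{
    size_t add, need, cap;
    char *tmp;

    assert(s != NULL && txt != NULL);

    add  = strlen(txt);
    need = s->len + add + 1;              /* +1 for the terminating NUL */

    if (need > s->cap) {
        cap = s->cap ? s->cap : 32;       /* assumed starting size */
        while (cap < need)
            cap *= 2;                     /* geometric growth */
        tmp = realloc(s->buf, cap);       /* realloc(NULL, n) acts as malloc */
        if (tmp == NULL)
            return -1;
        s->buf = tmp;
        s->cap = cap;
    }

    memcpy(s->buf + s->len, txt, add + 1);  /* copies the NUL as well */
    s->len += add;
    return 0;
}
```

With doubling, the number of reallocations grows with the logarithm of the final string length, whereas fixed increments of a few bytes make it grow linearly. The cost is that up to half of the allocated space may be unused, which matters when a DOM tree holds many strings in memory.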

Peter is continuing to test the graph title modifications.

2.2 Other development

Shaun is testing the Java version of the dispatcher with Alberto. The webservice project has been migrated to Maven2, which is good for development and deployment of web services and support of plugins for JAX-WS, Axis, etc. There are a few problems with dependencies in conflicting jar files. There will be a demonstration at the next meeting.

Mahmut has updated the EMBRACE Wiki internal pages for all SoapLab services, and there are now 140 documented test cases for SoapLab service operation.

Mahmut has started working on typed SoapLab services using Peter Ernst's SoapLab code, modified to use a single sequence type. This is now working on the test server. Using one sequence type should work well.

Alan raised the question of WS-I web service metadata documentation. Shaun will look into this. Peter will raise the issue at the new EMBRACE executive board meeting and work with the EMBRACE technology work package members. Rodrigo has similar issues with the REST services, which have the same basic WSDL as the external (and EMBRACE) web services but use different methods. An example is the lack of a clear definition of the output style from RESTful services (XML, HTML, raw text) where service invocation is limited to URL syntax. Maintaining metadata and WSDL files independently of the application provider is especially difficult.

3. Administration

No Items

4. Documentation and Training

4.1 Books

Jon has a template for editing the book text. The developers' guide will concentrate on the core functions and data structures, as there is no need for a fully comprehensive reference in book form. The RelaxNG schema with standard tags is sufficient so far.

Jon has also written a DocBook guide with a list of tags that needs to be reviewed by the other authors. Only about 20% of the DocBook tags are used. Some of these are mandatory, others optional. Content text is now being added, mainly by adding markup to existing text and checking for consistency of style.

Peter will help by generating text from ACD, application, data structure and function documentation with DocBook markup. Many of the new required tags already exist (e.g. short documentation strings for functions) but are not yet fully populated. We can generate a lot of them from existing descriptions, and extend existing documentation for some new tags.

4.3 Loan machines

Alan has received Intel Mac loan hardware and may also test under the "Leopard" developer version of the operating system.

5. User queries and answers

The list was reviewed. Everyone should review the current list at sf.net/projects/emboss/ and close those already dealt with. All requests have been assigned (mostly to Peter).

EBI support are looking into an issue with the SRS server which has become limited to 30 entries when returning the text of sequence entries.

All new issues were considered to be resolved.

6. AOB

6.1 EBI Search Services

Rodrigo reported on possible new ways for EMBOSS to use EBI data search services.

6.2 EMBL subentry indexing and retrieval

Rodrigo proposed extending EMBOSS data indexing to handle nucleotide sequence data for whole chromosomes as a database of features, indexing each feature as an entry. It would also be very useful to be able to extract features with overhangs at each end. Ensembl can do some of this but has a high overhead for users. The next release of EMBL will include a very large number of entries longer than 1 million base pairs, including expanded CON division entries and third party annotations of whole genomes and chromosomes.
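
To make the overhang idea concrete, the following C sketch widens a feature's coordinates by a requested number of bases on each side and clamps the result to the parent sequence. Everything here is an assumption for illustration: the SeqRange type and range_with_overhang function are hypothetical, positions are taken to be 1-based and inclusive, and strand and multi-segment (join) features are ignored.

```c
#include <assert.h>

/* Hypothetical 1-based, inclusive range on a parent sequence. */
typedef struct {
    long start;
    long end;
} SeqRange;

/* Widen a feature's range by 'before' bases upstream and 'after' bases
 * downstream, clamped to the parent sequence [1, seqlen]. */
SeqRange range_with_overhang(SeqRange feat, long before, long after,
                             long seqlen)
{
    SeqRange r;

    assert(seqlen >= 1);
    assert(feat.start >= 1 && feat.start <= feat.end && feat.end <= seqlen);
    assert(before >= 0 && after >= 0);

    /* Clamp at the first base of the parent sequence. */
    r.start = (feat.start - before < 1) ? 1 : feat.start - before;

    /* Compare against the remaining length to avoid overflow. */
    r.end = (after > seqlen - feat.end) ? seqlen : feat.end + after;

    return r;
}
```

For a feature at 100..200 on a 1000-base entry, an overhang of 50 on each side gives 50..250, and an overhang larger than the remaining sequence simply stops at the ends of the entry. A real implementation would also have to decide what "before" and "after" mean for a feature on the reverse strand.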

Sequence entries would need some naming convention. The Protein ID could be used for coding sequences, at least as an alternative name. Something has to be invented for most other features.

The USA syntax would need extending to request sequence beyond the end of a feature. Negative values are already used for offsets from the end. Perhaps explicit positive numbers could be used - but they would need special processing. Most simply this could use new associated qualifiers and new attributes of the AjPSeqin datatype which ACD processing could populate from the USA. Peter will investigate further possibilities.
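
One reason explicit positive numbers would need special processing is that the standard C conversion functions treat "+25" and "25" identically, so the sign character has to be inspected before the number is converted. The sketch below shows one possible way to do that. It is not the USA parser: the PosKind and PosSpec types, the parse_pos function, and the meaning given to a leading "+" are all hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

typedef enum {
    POS_ABSOLUTE,    /* "25":  counted from the start */
    POS_FROM_END,    /* "-25": counted back from the end */
    POS_BEYOND_END   /* "+25": hypothetical, bases past the end of a feature */
} PosKind;

typedef struct {
    PosKind kind;
    long    value;   /* magnitude, always >= 0 */
} PosSpec;

/* Parse one position token. Returns 0 on success, -1 on malformed input. */
int parse_pos(const char *tok, PosSpec *out)
{
    const char *digits;
    char *endp;
    long v;

    assert(tok != NULL && out != NULL);

    /* Record the explicit sign before strtol can swallow it. */
    digits = tok;
    if (*tok == '+') {
        out->kind = POS_BEYOND_END;
        digits++;
    } else if (*tok == '-') {
        out->kind = POS_FROM_END;
        digits++;
    } else {
        out->kind = POS_ABSOLUTE;
    }

    /* Require a digit here, which rejects "", "+", "+-5" and " 5". */
    if (*digits < '0' || *digits > '9')
        return -1;

    errno = 0;
    v = strtol(digits, &endp, 10);
    if (errno == ERANGE || *endp != '\0')
        return -1;               /* out of range or trailing characters */

    out->value = v;
    return 0;
}
```

Keeping the sign as a separate "kind" rather than folding it into the number is what lets later code distinguish an ordinary position from a request that extends beyond a feature. Passing the extension through new associated qualifiers instead, as suggested above, would avoid changing the meaning of existing positions at all.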

7. Date Of Next Meeting

The next meeting is on Monday 13th November.

It was agreed to keep the same meeting schedule for next year.