EMBOSS: Project Meeting (Mon 5th January 09)
Attendees
EBI:
Peter Rice,
Alan Bleasby,
Mahmut Uludag
Sanger:
Visitors:
Apologies:
Jon Ison
1. Minutes of the last meeting
Minutes of the meeting of 8th December 2008 are
here.
2. Software Development
2.1 Applications
Peter reported an interesting set of requests from a user of
revseq. The user asked why the case of the original GenBank
sequence was not preserved. This is because when EMBL and GenBank
changed case some years ago we put in a conversion to upper case in
the GenBank and EMBL parsers. It was agreed that this should be
removed for the next release. The user also asked whether the output
filename could be derived from the input filename rather than the
input sequence ID. Peter proposed adding an alternative output
file name default in ACD processing, with a command line option and an
environment variable to toggle between the standard and alternative
behaviours. Some interfaces could benefit from the ability to define
the output filename prefix. Thirdly, the user asked whether
revseq could update the description of the output sequence so
that it is clear that it is a reverse-complement of the original. It
was agreed that this could be done (and noted that the added tag needs
to be removed if the sequence is reverse complemented again).
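The tag handling agreed above can be illustrated with a minimal Python sketch. It is not the revseq implementation: the tag text "Reversed:" and the function names are assumptions made for illustration, since the minutes do not specify the wording of the tag. The sketch also preserves the case of the input sequence, in line with the decision to drop the upper-case conversion.

```python
# Hypothetical tag text: the minutes do not say what wording revseq will use.
REVERSED_TAG = "Reversed:"

# Complement table for both cases, so the original case is preserved.
_COMPLEMENT = str.maketrans("ACGTUacgtu", "TGCAAtgcaa")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a nucleotide sequence, keeping case."""
    return seq.translate(_COMPLEMENT)[::-1]


def toggle_reversed_tag(description: str) -> str:
    """Add the tag to a description, or remove it if it is already present."""
    if description.startswith(REVERSED_TAG):
        return description[len(REVERSED_TAG):].lstrip()
    return f"{REVERSED_TAG} {description}".rstrip()


def revseq_sketch(seq: str, description: str) -> tuple[str, str]:
    """Reverse-complement a sequence and update its description to match."""
    return reverse_complement(seq), toggle_reversed_tag(description)
```

Applying revseq_sketch twice returns both the original sequence and the original description, which is the behaviour noted in the discussion.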
Peter is also working on improvements to the display of
translated sequences by showseq and sixpack suggested by
a user. Sequence ranges (exons) now display in the correct reading
frame (the current behaviour is to force them into frame
1). Presentation of three letter amino acid codes in the reverse
direction was fixed. Some tests remain before committing the new code.
2.2 Libraries
Peter has been working through outstanding bug reports in the
trackers on SourceForge.
For phylogenetic applications (PHYLIPNEW) reading distance matrix
files failed for some formats written by other applications. Distance
matrix input now works for multiple matrices in square,
upper-triangular and lower-triangular formats.
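As an illustration of the three layouts, the following Python sketch reads one or more PHYLIP-style distance matrices and expands each to a full symmetric matrix. It is not the PHYLIPNEW code. It assumes one matrix row per line, names without spaces, whitespace-separated values, and triangular layouts that omit the diagonal; the function name is hypothetical.

```python
def parse_distance_matrices(text):
    """Parse PHYLIP-style distance matrices from text.

    Returns a list of (names, matrix) pairs, where matrix is a full
    symmetric n x n list of lists of floats.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    results = []
    pos = 0
    while pos < len(lines):
        # Each matrix starts with a line giving the number of taxa.
        n = int(lines[pos].split()[0])
        rows = [ln.split() for ln in lines[pos + 1 : pos + 1 + n]]
        if len(rows) != n:
            raise ValueError("truncated matrix")
        pos += 1 + n

        names = [row[0] for row in rows]
        values = [[float(v) for v in row[1:]] for row in rows]
        counts = [len(v) for v in values]

        matrix = [[0.0] * n for _ in range(n)]
        if counts == [n] * n:
            # Square: every row carries all n distances.
            matrix = values
        elif counts == list(range(n)):
            # Lower-triangular: row i carries distances to rows 0..i-1.
            for i in range(n):
                for j in range(i):
                    matrix[i][j] = matrix[j][i] = values[i][j]
        elif counts == list(range(n - 1, -1, -1)):
            # Upper-triangular: row i carries distances to rows i+1..n-1.
            for i in range(n):
                for k, dist in enumerate(values[i]):
                    j = i + 1 + k
                    matrix[i][j] = matrix[j][i] = dist
        else:
            raise ValueError("unrecognised matrix layout")
        results.append((names, matrix))
    return results
```

The layout is inferred from the number of values on each row, so the same function accepts a file that mixes layouts across successive matrices.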
Various problems in Stockholm format (used by HMMER and PFAM) have
been identified and resolved.
2.3 SoapLab
Typed services were using the JAX-WS stack. Mahmut now has
typed services sharing the same web application as Axis services and
deployed on the development server. Improvements to the XSD and WSDL
definitions include short application descriptions in the WSDL
file. Most methods are now implemented, except 'waitfor' and
'getresults' for partial results. The current namespace in the result
XML is broken but it is easy to fix.
Documentation is being updated with a new introduction, a description
of EMBRACE-compliance, and a section on other SOAP services for EMBOSS
(Jemboss, WsEmboss).
A beta release of the typed services will be made available soon.
Peter has added an acdxsd utility with stubs to generate
XSD sections for each input and output datatype and other
qualifiers. Sequences can use a general include. Values need to be
determined for required qualifiers. For outputs we will assume SoapLab
will enforce a standard format. acdxsd will be SoapLab-specific
unless other users request alternatives. A new output format of DASGFF
could be used for features and reports. We need to decide on a similar
format for sequence and alignment outputs from SoapLab.
3. Administration
Alan has set up a new EMBOSS wiki at Open Bio. He will work
through the administrator documentation. It is based on
MediaWiki. Candidate pages for the wiki include proposed new
applications and features, and a detailed set of GCG replacement
applications.
4. Documentation and Training
4.1 Books
Alan has been proofreading the latest drafts.
Peter has been working on autogenerating text sections.
4.2 Training
The dates for the proposed Madrid course are not yet known, but it
will probably not take place until spring.
5. User queries and answers
See above for discussion of revseq features.
6. AOB
None
7. Date Of Next Meeting
The next meeting is on Monday 19th January.