EMBOSS: Project Meeting (Mon 4th April 11)
1. Minutes of the last meeting
Minutes of the meeting of 28th March 2011 are
2. Maintenance etc.
Peter reported that applications extractfeat
and coderet have been updated to support the new
feature/sub-feature data structure (see below). Work is still needed
Peter has implemented an ALIAS configuration for the
emboss.standard and emboss.default files. It will also be implemented
for server cachefiles to allow server-specific aliases. The alias name
can be used in place of a real database name. Error messages helpfully
report the original USA in most cases and so far do not need changing.
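As an illustration only, alias handling of this kind can be sketched in Python. The alias table, the function name and the simple "database:entry" form of the USA are assumptions made for the example; this is not the EMBOSS implementation or its configuration syntax.

```python
# Hypothetical alias table: alias name -> real database name (assumed values).
ALIASES = {"sp": "swissprot"}


def resolve_usa(usa, aliases=ALIASES):
    """Return (resolved_usa, original_usa) for a 'database:entry' style USA.

    The original USA is returned unchanged so that error messages can
    still report what the user actually typed.
    """
    dbname, sep, entry = usa.partition(":")
    if sep and dbname in aliases:
        return aliases[dbname] + sep + entry, usa
    # No database part, or no alias defined for it: leave the USA as it is.
    return usa, usa


resolved, original = resolve_usa("sp:P12345")
```

Keeping the original USA alongside the resolved one is what allows existing error messages to stay unchanged.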
Peter has implemented the GFF3 "ID" and "Parent" tags as
sub-features. Code to read EMBL and GFF3 now uses a sub-feature
structure in an AjPFeature object. Other input and output formats have
The poorly formatted GFF3 output from previous EMBOSS releases can
still be read using the older parsers, but these may not fully
understand the new sub-feature structure.
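The relationship between GFF3 "ID" and "Parent" tags and a sub-feature hierarchy can be sketched as follows. This is a minimal Python illustration, not the EMBOSS C code: the function names, the dictionary layout and the example records are all assumptions, and AjPFeature itself is not modelled.

```python
from collections import defaultdict


def parse_attributes(column9):
    """Split a GFF3 attribute column such as 'ID=x;Parent=y' into a dict."""
    attrs = {}
    for pair in column9.strip().split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attrs[key] = value
    return attrs


def group_subfeatures(gff3_lines):
    """Return (top_level, children).

    top_level holds features with no Parent tag; children maps a parent
    ID to the list of features that name it in their Parent tag.
    """
    top_level = []
    children = defaultdict(list)
    for line in gff3_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # not a feature line
        attrs = parse_attributes(cols[8])
        feature = {
            "type": cols[2],
            "start": int(cols[3]),
            "end": int(cols[4]),
            "id": attrs.get("ID"),
        }
        parents = attrs.get("Parent")
        if parents:
            # A Parent tag may list several comma-separated IDs.
            for parent_id in parents.split(","):
                children[parent_id].append(feature)
        else:
            top_level.append(feature)
    return top_level, children


# Invented example records for illustration.
example = [
    "##gff-version 3",
    "seq1\tEMBL\tmRNA\t100\t900\t.\t+\t.\tID=mrna1",
    "seq1\tEMBL\texon\t100\t300\t.\t+\t.\tID=exon1;Parent=mrna1",
    "seq1\tEMBL\texon\t500\t900\t.\t+\t.\tID=exon2;Parent=mrna1",
]
top, kids = group_subfeatures(example)
```

Here the mRNA is the only top-level feature and the two exons are attached to it as sub-features through their Parent tag.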
Mahmut is looking for examples of DASGFF feature
outputs. The dassources test application can return raw DAS
results which can be searched for feature tags. The older DAS 1.5
standard used "Group" tags to define a feature hierarchy. Under DAS
1.6 this should be replaced by "Parent" and "Part" tags but no
examples have been found to date from current DAS servers.
Mahmut asked whether failure to read a sequence or a feature
table could be set to produce messages explaining the reason for the
failure. Peter explained that in many cases code may be
testing formats, or testing database retrieval in whichdb, and will
need to suppress any error messages. We could set a feature or
sequence internal error message for the last known error, but this
will need to be done for each "return ajFalse" in the input code to
avoid unwanted retrieval of an older message.
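The "last known error" idea, and the reason every failure path has to set it, can be sketched in Python. All names and format checks below are hypothetical and much simpler than the real input code; the point is only that probing stays silent and that the stored message is cleared or overwritten on each attempt.

```python
class InputStatus:
    """Hypothetical holder for the last input error (not an EMBOSS type)."""

    def __init__(self):
        self.last_error = None


# Toy format checks: format name -> (test function, failure reason).
CHECKS = {
    "gff3": (lambda text: text.startswith("##gff-version 3"),
             "gff3: missing '##gff-version 3' header"),
    "fasta": (lambda text: text.startswith(">"),
              "fasta: first line does not start with '>'"),
}


def try_format(text, fmt, status):
    """Silently test one format, recording a reason on every failure path."""
    status.last_error = None  # clear first so an older message cannot leak
    if not text:
        status.last_error = "empty input"
        return False
    if fmt not in CHECKS:
        status.last_error = "unknown format: " + fmt
        return False
    check, reason = CHECKS[fmt]
    if not check(text):
        status.last_error = reason
        return False
    return True


def read_any(text, formats=("gff3", "fasta")):
    """Try each format in turn; report a reason only if all of them fail."""
    status = InputStatus()
    for fmt in formats:
        if try_format(text, fmt, status):
            return fmt, None
    return None, status.last_error
```

If any failure path returned without setting the message, a caller could pick up a reason left over from an earlier, unrelated attempt, which is the unwanted retrieval described above.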
Alan has been investigating ways to implement the QA testing
for mEMBOSS. There are obvious differences under Windows. QA tests
that preset environment variables will need to be modified, with a
suggested field for variable settings that can be interpreted
differently in Unix and Windows scripts. Pre- and post-processing
commands that use Unix commands such as cp or rm will need either a
Windows-specific version or a prefix with COPY or REMOVE that can be
interpreted on each platform. The latter option was
preferred. Relative paths need to be converted to absolute paths under
Windows. Peter also noted that some of the regular expressions
need to be modified to allow Windows output to pass (examples include
reported commands and relative paths).
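The preferred COPY/REMOVE prefix approach might look like the following sketch. It is written in Python for illustration, whereas qatest.pl is a Perl script; the function name, the command syntax beyond the COPY and REMOVE keywords, and the base directory argument are assumptions.

```python
import os
import shutil


def run_portable(command, base_dir):
    """Interpret 'COPY src dst' or 'REMOVE path' without calling cp or rm.

    Relative paths are resolved against base_dir and made absolute, so
    the same test definition can run under Unix and Windows.
    """
    parts = command.split()
    if not parts:
        raise ValueError("empty command")
    action, args = parts[0].upper(), parts[1:]
    paths = [os.path.abspath(os.path.join(base_dir, arg)) for arg in args]
    if action == "COPY" and len(paths) == 2:
        shutil.copyfile(paths[0], paths[1])
    elif action == "REMOVE" and len(paths) == 1:
        os.remove(paths[0])
    else:
        raise ValueError("unsupported command: " + command)
    return paths
```

A test definition would then carry a line such as "COPY input.seq copy.seq" in place of a platform-specific cp or copy command.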
Peter will try installing Perl under Windows to test
modifications to the qatest.pl script. Alan has looked at
"Strawberry Perl". Michael suggested "Active Perl" as an
alternative, as it is the most used for Ensembl on Windows.
Mahmut has updated SoapLab services for the London move and
tested using the EMBASSY and EMBOSS QA tests. One application failed
when using the URL access method for database retrieval but has been
corrected by redefining the database using dbfetch. The services are
running a very old EMBOSS version (2.9.0). The mwcontam service
failed as SoapLab and EMBOSS use different separators for values. This
will be fixed in SoapLab.
The most recent applications are now working, but as initial
'hacks'. The handling of new data types could be improved.
Mahmut has looked into the JUnit testing used
by Artemis. The current JUnit test simply checks a basic
3. New developments
3.1 Access methods
Alan continues to work on BioMart caches.
Michael continues to work on the updates to support Ensembl release 61.
Jon and Matus Kalas attended a Software Ontology (SWO) meeting
last week in Manchester. The scope for SWO was agreed, and will
include EDAM data and operations terms, plus new terms required by SWO
itself. The SWO ontology will be maintained in OWL.
Jon is exploring ways for EDAM to be used by the BioCatalogue
Jon and Matus are considering publication options for EDAM.
The next release of EDAM will include the concept of "core datatypes"
to distinguish datatype-related information from general
parameters. The release will also include a separate branch (name
space) for identifiers.
Alan reported that the Open-Bio anonymous CVS server is rebuilt
and now available again. The rsync server needs to be announced to
5. Documentation and Training
Peter has updated a few obviously outdated pages on the
website, including the grant number.
Jon will put up a private copy of the new website generated
from the book texts for testing. The link will be through his home
Peter noted the need to check for URLs referenced in the books to
make sure they are available as URLs or redirects from the web
server at emboss.open-bio.org
6. User queries and answers
Mahmut attended a next-generation sequencing meeting in Cambridge last
Peter will contact the Advisory Board with an update on progress.
8. Date Of Next Meeting
The next EMBOSS meeting will be on Monday 11th April.