EMBOSS: Project Meeting (Fri 12th November 2004)
Attendees
RFCGR/HGMP:
Alan Bleasby,
Jon Ison,
Gary Williams,
Lion:
Sanger:
Tim Carver
EBI:
Peter Rice,
Lisa Mullan,
Amandine Schuurman
Visitors:
Apologies:
1. Minutes of the last meeting
Minutes of the meeting of 1st October 2004 are
here
2. Software Development
2.1 ACD ontology
Peter described Amandine's student project. Amandine is a
visitor from Namur in Belgium. She has been annotating ACD files with
the relations between output files (and their contents) and input
files and parameters.
Cases where the relationships are complex (for example, dependencies
on selections and lists, or on boolean options) have been clearly
annotated.
The results will be translated into ACD markup in a future release,
and will also indicate (helped by acdvalid analysis) which
applications are candidates for splitting into two or more
simply-defined programs.
2.2 Developers Course Requests
Alan reported the following requests from attendees on this
month's developers' course:
- An option to only read part of a sequence rather than reading all
and selecting a range. Possibly too hard to implement for most formats
and access methods.
- GenBank feature tables - some have the wrong translation in the
CDS where EMBL is right, for example cases with complement(join())
which are incorrectly reverse-complemented before being joined. Need
to find examples and check what EMBOSS gives as a translation, then
test whether the validation is too time-consuming.
- add a minimum number of sequences for seqset input - apparently
not enforced.
- ranges in ACD need a minimum and maximum value.
- The file line-reading functions need a delimiter for end-of-line,
for example the use of control-A in FASTA format dbEST files from
NCBI. Not clear exactly how this could be implemented, as it is
really format-dependent.
- translation - need a way for users to specify a translation table
to use by default. Perhaps as a hardcoded name to look for, or as a
variable.
- debug calls are reporting the wrong number of frees in the exit
calls for some data structures.
- seqset input generates a memory leak in valgrind - need more
valgrind tests for simple inputs.
- a request from a Sanger user - to be able to read annotation
from blast output and the source entries.
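The complement(join()) issue in the list above can be illustrated with
a small sketch. This is not EMBOSS code; the helper names and the toy
sequence are hypothetical, and it assumes the faulty entries arise from
reverse-complementing each segment separately and then joining the
pieces in the listed order.

```python
# Illustrative only: hypothetical helpers, not part of EMBOSS.
# Coordinates are 1-based and inclusive, as in EMBL/GenBank feature tables.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def revcomp(seq):
    """Return the reverse complement of a nucleotide string."""
    return seq.translate(COMPLEMENT)[::-1]


def join(seq, segments):
    """Concatenate the listed segments in the order given."""
    return "".join(seq[start - 1:end] for start, end in segments)


def complement_of_join(seq, segments):
    """complement(join(...)): join first, then reverse-complement the whole."""
    return revcomp(join(seq, segments))


def join_of_complements(seq, segments):
    """Assumed faulty order: reverse-complement each segment, then join."""
    return "".join(revcomp(seq[start - 1:end]) for start, end in segments)


seq = "AAACCCGGGTTT"
segments = [(1, 3), (7, 9)]  # stands in for complement(join(1..3,7..9))

correct = complement_of_join(seq, segments)
faulty = join_of_complements(seq, segments)
print(correct, faulty)
```

The two orders put the segments in opposite order in the result, so any
CDS translation derived from them will differ. A check along these
lines could be run on suspect entries before deciding whether routine
validation is affordable.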
2.3 Domainatrix
Jon has added a function for all versus all alignments in
Nucleus.
2.4 Access Methods
Peter has added a database access method for Seqhound services
from BluePrint.
Peter has also added a similar database access method for
Entrez utilities from NCBI.
Both are provided as beta versions in CVS.
2.5 Genetic Codes
Gary has checked and updated the genetic code files - NCBI had
updated some of the minor start codons.
2.6 Jemboss
Tim has updated the libraries for Apache Axis to version 1.4.2, to
work with Java 1.5.
2.7 Diverge request
Gary reported a request for a program like GCG's diverge.
3. Administration
No items.
4. Documentation & Training
4.1 PDB parsing paper
Jon has a solution for mmCIF output by using a Brookhaven
utility to convert to PDB. He will look into providing mmCIF output
format, and adding residue-level descriptions.
4.2 ASEAN Workshop
Peter and Lisa reported on the ASEAN course at CBI in Beijing.
5. User queries and answers
Gary's current list reviewed. Most new issues were already answered.
6. AOB
7. Date Of Next Meeting
Next meeting at 9.30 on January 14th, 2005.