EMBOSS: Project Meeting (Mon 14th February 2011)
Attendees
EBI:
Peter Rice,
Mahmut Uludag,
Jon Ison,
Michael Schuster
Visitors:
Apologies:
Alan Bleasby
1. Minutes of the last meeting
Minutes of the meeting of 7th February 2011 are
here.
2. Maintenance etc.
2.1 Applications
Peter has fixed a bug in showdb which was reporting
taxonomy databases twice. The code has been rewritten to iterate
generically over the database types.
Peter will look into a request for applications to allow
a -bothstrands option to automatically process both strands of
an input sequence. This will require changes to the application logic
on a per-program basis.
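
As an illustration only of what processing both strands involves, the sketch below runs the same analysis on a nucleotide sequence and on its reverse complement. It is not EMBOSS code: the names reverse_complement and process_both_strands are hypothetical, and only the bases A, C, G, T and N are handled.

```python
# Illustrative sketch only; helper names are hypothetical, not EMBOSS functions.
# Assumption: plain DNA using A, C, G, T and N (upper or lower case).
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def process_both_strands(seq, analyse):
    """Apply the same analysis to the forward and the reverse strand.

    Results are keyed by strand so the caller can report them separately.
    """
    return {
        "+": analyse(seq),
        "-": analyse(reverse_complement(seq)),
    }


def count_g(seq):
    """Toy analysis used for the example: count G residues."""
    return seq.count("G")


both = process_both_strands("AAC", count_g)
```

Each application would still have to decide how to report the two sets of results, for example how reverse-strand coordinates are mapped back to the input sequence, which is why the change is needed on a per-program basis.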
Peter will look into handling small word sizes in diffseq.
2.2 Libraries
Peter proposed extending code for handling AjPTable
objects to automatically merge tables with common key and value
structures. This can be made very efficient by first resizing the
tables to have the same hash array size. Table merging will greatly
simplify the code to handle the new query language operations.
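
The idea can be sketched with a toy chained hash table. This is a minimal sketch under stated assumptions, not the AjPTable implementation: the class ChainedTable and the function merge_into are hypothetical, and keeping the destination value when a key occurs in both tables is an assumed policy. Once both tables have the same number of buckets, a key found in bucket i of one table can only occur in bucket i of the other, so the merge compares matching buckets and never searches the whole destination table.

```python
# Toy model of merging two chained hash tables; not the AjPTable code.
class ChainedTable:
    def __init__(self, size=8):
        self.buckets = [[] for _ in range(size)]

    def _index(self, key):
        return hash(key) % len(self.buckets)

    def put(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))

    def get(self, key, default=None):
        for k, v in self.buckets[self._index(key)]:
            if k == key:
                return v
        return default

    def resize(self, size):
        """Rehash every entry into a new bucket array of the given size."""
        if size == len(self.buckets):
            return
        old = self.buckets
        self.buckets = [[] for _ in range(size)]
        for bucket in old:
            for k, v in bucket:
                self.buckets[hash(k) % size].append((k, v))

    def __len__(self):
        return sum(len(b) for b in self.buckets)


def merge_into(dest, src):
    """Merge src into dest, keeping dest's value for shared keys (assumed policy).

    Note that src is resized in place to match dest before merging.
    """
    src.resize(len(dest.buckets))
    for d_bucket, s_bucket in zip(dest.buckets, src.buckets):
        present = {k for k, _ in d_bucket}
        for k, v in s_bucket:
            if k not in present:
                d_bucket.append((k, v))


first = ChainedTable(4)
first.put("a", 1)
first.put("b", 2)
second = ChainedTable(16)
second.put("b", 20)
second.put("c", 3)
merge_into(first, second)
```

After the merge, first holds the keys a, b and c, with b keeping its original value 2.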
Peter will look into a user request from EBI External Services
to support the fastm sequence format variant which stores sets
of short protein sequence fragments.
Peter will look into extending SAM and BAM formats to support features.
2.3 Jemboss
Mahmut will look into a user query on whether string values from ACD in
Jemboss should be converted to numeric data. It is likely that the
current handling is correct.
2.4 Other
Peter is collecting fixes for a patch release.
3. New developments
3.1 EMBOSS configuration
Peter and Mahmut will attend a DAS workshop next month
and give a presentation on the DAS client implementation in EMBOSS.
3.2 Ensembl access
Michael reported that the Ensembl registry code now has
datatypes for core, variation, etc. and needs adaptors defined for
each datatype.
Database adaptors use internal object caches, with stable identifiers
and several aliases for each database.
To select the correct organism it would help to have a set of patterns
for the organism-specific Ensembl identifiers. The aim is to allow
'ensembl:id' to automatically detect a suitable database to match the
ID.
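
A minimal sketch of such pattern matching follows. It assumes the usual Ensembl stable identifier layout of 'ENS', an optional three-letter species code, a feature-type letter and eleven digits with an optional version suffix. The function name guess_species, the small prefix table and the use of lower-case species names as database names are all hypothetical; a real table would come from the registry.

```python
import re

# Hypothetical lookup table: species code in the identifier -> database name.
# Human identifiers carry no species code (for example ENSG...).
SPECIES_PREFIXES = {
    "": "homo_sapiens",
    "MUS": "mus_musculus",
    "DAR": "danio_rerio",
    "RNO": "rattus_norvegicus",
}

# ENS + optional species code + feature type (gene, transcript, protein,
# exon) + 11 digits + optional ".version".
STABLE_ID = re.compile(
    r"^ENS(?P<species>[A-Z]{3})?(?P<feature>[GTPE])(?P<number>\d{11})"
    r"(?:\.(?P<version>\d+))?$"
)


def guess_species(query):
    """Return a database name for 'ensembl:id' or a bare id, else None."""
    identifier = query.split(":", 1)[-1]  # drop an optional 'ensembl:' prefix
    match = STABLE_ID.match(identifier)
    if match is None:
        return None
    return SPECIES_PREFIXES.get(match.group("species") or "")


human = guess_species("ensembl:ENSG00000139618")
mouse = guess_species("ENSMUSG00000017167.5")
```

Identifiers that match the general layout but carry an unknown species code return None here, so the caller can fall back to an explicit database name.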
3.4 EDAM
Jon will attend an EDAM/BioXSD workshop in Amsterdam with
Matus. New format terms have been added. The workshop will consider
adding regular expressions for values associated with data
identifier terms.
3.5 DRCAT
Peter has added 'Taxon' records for all entries giving the NCBI
taxid and name for the most general taxon covered by the data
resource. General resources are classified as '1 all'.
Peter has renamed the 'tax-nam' field to 'tax-tax' to reuse the
field name most popular for SRS servers to describe the taxon
name. The index now covers the scientific name, GenBank name and
common name.
4. Administration
Peter noted the brief report from the EMBL SAC review of EBI
services.
5. Documentation and Training
5.1 Books
Jon has sent in the corrections to the Developer's Guide, and
updated the XML source files for this and the Administrator's Guide.
Peter will send the User's Guide corrections today.
5.2 Website
Peter noted recent spam posted to the EMBOSS wiki, and asked
for help in monitoring recent changes and removing any further spam.
6. User queries and answers
Jon noted a user query on mapping circular features
in cirdna where the labelling is hard to see. Peter will
investigate.
7. AOB
Peter reported on the recent DebianMed package developers
meeting in Germany.
Peter will go to the meeting of a new COST consortium on
next-generation sequence analysis in March.
8. Date Of Next Meeting
The next EMBOSS meeting will be on Monday 21st February. Peter will be away.