EMBOSS: Project Meeting (Mon 7th Nov 11)
Attendees
EBI:
Peter Rice,
Jon Ison,
Mahmut Uludag,
Michael Schuster
Visitors:
Apologies:
Alan Bleasby
1. Minutes of the last meeting
The meetings on 24th and 31st October were cancelled as Peter was on vacation.
The minutes of the meeting on 17th October 2011 are
here.
2. Maintenance etc.
2.1 Applications
Peter has updated pepwheel following suggestions from
David Mathog to improve the handling of leucine zipper helices where
the residues overlap every 7 residues and rapidly reach the borders of
the plot.
Further changes have been suggested and will be considered for
implementation before the release. They require changes to the command
line interface which are best done with the other changes planned for
6.5.0 for prettyplot.
2.2 Libraries
Peter updated the handling of ambiguity codes by fuzznuc
so that ambiguity codes in the input sequence(s) can be
matched. Each ambiguity code in a pattern now includes itself in its
own expansion (for example, S expands to C, G or S). Escaping
the code with a backslash prevents expansion so that, for example,
'\S' will match only an S in the input sequences.
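The expansion rule described above can be illustrated with a minimal sketch. This is not the fuzznuc implementation (which is written in C); it is a Python illustration that assumes the pattern is translated to a regular expression, and the names IUPAC, pattern_to_regex and matches are hypothetical.

```python
import re

# Standard IUPAC nucleotide ambiguity codes and the bases they stand for.
IUPAC = {
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def pattern_to_regex(pattern):
    """Translate a nucleotide pattern into a regular expression string.

    Assumption for illustration: each ambiguity code expands to its
    bases plus the code itself; a backslash suppresses the expansion.
    """
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            # Escaped code: match the literal character only.
            parts.append(re.escape(pattern[i + 1].upper()))
            i += 2
            continue
        code = ch.upper()
        if code in IUPAC:
            # The code is included in its own expansion, e.g. S -> [CGS].
            parts.append("[" + IUPAC[code] + code + "]")
        else:
            parts.append(re.escape(code))
        i += 1
    return "".join(parts)


def matches(pattern, sequence):
    """Return True if the whole sequence matches the pattern."""
    return re.fullmatch(pattern_to_regex(pattern), sequence.upper()) is not None
```

With this sketch the pattern 'AS' matches the input sequences AC, AG and AS, whereas 'A\S' matches only AS.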
Michael has recoded Ensembl object adaptors to have one central
function to avoid redundant code. Features are mapped to slices.
Michael noted that a strndup function call in the latest
BAM code is specific to the GNU C library. It compiles on other
systems but fails at run time, for example on MacOSX using a BSD-style
C library.
Michael noted that the include statements need to start with
the ajdefine.h file to set up memcheck validation which is
specified in the config.h file. Peter will update the
source files.
Michael will soon commit changes to redefine domain headers
for EMBASSY-related utilities as enumerated types.
Michael noted some EMBASSY packages generate many warnings when
the configure files are synchronized with the main EMBOSS configure
and the devwarnings options are used. These include shadowing variable
names in emnu and type issues with
strlen. Peter proposed committing the revised configure files
and removing the more serious warnings. The configure files could be
changed to turn off these warnings later if they are considered
harmless.
2.3 Other
3. New developments
3.1 Assemblies
Mahmut is improving BAM and SAM format support, preserving
header tags from alignment content. The updated code has been tested
on various public example files.
3.2 EDAM and DRCAT
Jon has cleaned up the operations branch in EDAM and will
provide a copy for validation and updating of EMBOSS ACD files. Some
cleanup in the data branch is still required.
3.3 Remote input
Peter has implemented HTTP and FTP URLs as valid queries for any
data type. The input string has to be treated as the whole URL. We can
implement new qualifiers, similar to -iformat, to specify the query and
the offset (we need an alternative to the %offset syntax on Windows in
any case) but it is not easy to find suitable names for these qualifiers.
3.4 Variation data
Peter has implemented variationget to read and write VCF
variation data files in 4.0 and 4.1 formats. The code needs further
testing and will be committed in a few days.
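As a rough illustration of distinguishing the two supported versions, a VCF file declares its version on its first line, in the form ##fileformat=VCFv4.1. The sketch below is not the EMBOSS code; it is a Python illustration, and the names SUPPORTED_VERSIONS, vcf_version and is_supported are hypothetical.

```python
# Versions mentioned in the minutes (assumed spelling of the header values).
SUPPORTED_VERSIONS = {"VCFv4.0", "VCFv4.1"}

FILEFORMAT_PREFIX = "##fileformat="


def vcf_version(first_line):
    """Return the version string declared on the first line of a VCF file."""
    line = first_line.strip()
    if not line.startswith(FILEFORMAT_PREFIX):
        raise ValueError("first line is not a ##fileformat header")
    return line[len(FILEFORMAT_PREFIX):]


def is_supported(first_line):
    """Return True if the declared version is one of the supported ones."""
    return vcf_version(first_line) in SUPPORTED_VERSIONS
```

A reader could call is_supported on the first line before parsing the remaining header and data lines, and reject other versions early.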
Mahmut has looked into the binary BCF format which uses a BAM
index for VCF and GFF files with indexing of intervals. The BCF code
in samtools is not complex and can be used as a model for this
format.
3.5 EDAM
4. Administration
All systems recovered after the weekend shutdown.
5. Documentation and Training
Jon still needs a final review of the new website, and we need
to seek permission to use the new book covers as a logo.
Peter will request EBI E-Learning accounts to set up test
courses. The primary contact will be Jon.
6. User queries and answers
All done.
7. AOB
None.
8. Date Of Next Meeting
The next EMBOSS meeting will be on Monday 7th November.