EMBOSS: Project Meeting (Mon 7th September 09)
Peter has added new file output functions that write strings and characters directly, without first copying them to an internal buffer. This saves a significant proportion of the time needed to write a large short read sequence file. The functions that write newlines need careful testing on other systems, because we write the files in binary mode and so have to manage the newline formats ourselves. Alan requested that any system-specific functions needed should go into ajsys.c.
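As a rough illustration of the idea (not the actual EMBOSS code), the sketch below writes a string straight to a stream opened in binary mode and then appends an explicit newline sequence chosen by the caller. The function name write_line_direct and its parameters are hypothetical and do not correspond to any existing AJAX function.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of EMBOSS: write len bytes of str
 * directly to fp, then the caller-supplied newline sequence.
 * Because the stream is assumed to be in binary mode, the C library
 * does no newline translation, so "\n" or "\r\n" must be passed
 * explicitly. Returns 0 on success, -1 on a short write. */
static int write_line_direct(FILE *fp, const char *str, size_t len,
                             const char *newline)
{
    size_t nl;

    assert(fp != NULL && str != NULL && newline != NULL);

    nl = strlen(newline);

    /* No intermediate copy: the caller's bytes go straight to stdio. */
    if (fwrite(str, 1, len, fp) != len)
        return -1;
    if (fwrite(newline, 1, nl, fp) != nl)
        return -1;

    return 0;
}
```

Keeping the newline sequence as a parameter is one way to isolate the platform-specific choice in a single place, in the spirit of putting system-specific code into ajsys.c.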
About 50 new EDAM terms were needed, e.g. for missing functions to match the existing EDAM fields.
Some further cleanup of EDAM was completed at the same time.
Jon has added 80 potential applications to the wiki to cover EDAM functions that were not used in annotating the existing ACD files.
Jon has committed the ACD files for applications that will handle lists of databases (finddb, idtell, seqxrefall, showdball).
Michael Schuster has contributed an updated code library for SQL access to Ensembl servers. Alan will review the code.
Peter noted that there is a BioRuby interface to BioMart that can be used as a template for access of BioMart resources by EMBOSS.
Peter has a description of Roche SFF next generation sequencing data file formats from Peter Cock. These will be added to the Wiki.
Peter is looking into the Open Bio project OBDA, which standardizes the way projects access indexed flatfiles, remote servers and BioSQL data resources. It may be interesting for BioSQL. Details will be added to the Wiki.
Mahmut is looking into new applications for trimming next generation sequence data. Possibilities include adding pattern matching methods to vectorstrip.
Peter will add wiki notes on fast string matching methods for genome-scale string searches that were described at ISMB in Stockholm. The searches were extremely fast - a few seconds for substring matches.
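The minutes do not say which methods were presented, so the following is only a sketch of one well-known index-based approach, a suffix array, chosen here as an assumption to show why substring lookups can be fast once an index exists. All names (sa_build, sa_find) are hypothetical, and the naive construction shown is far too slow for real genomes; production tools use specialised construction algorithms or compressed indexes.

```c
#include <stdlib.h>
#include <string.h>

/* Text seen by the qsort comparator (qsort takes no context argument). */
static const char *sa_text;

static int sa_compare(const void *a, const void *b)
{
    size_t i = *(const size_t *)a;
    size_t j = *(const size_t *)b;
    return strcmp(sa_text + i, sa_text + j);
}

/* Build a suffix array: the start offsets of all n suffixes of text,
 * sorted lexicographically. Naive construction, for illustration only.
 * The caller frees the result. */
static size_t *sa_build(const char *text, size_t n)
{
    size_t *sa = malloc(n * sizeof *sa);
    size_t i;

    if (sa == NULL)
        return NULL;
    for (i = 0; i < n; i++)
        sa[i] = i;
    sa_text = text;
    qsort(sa, n, sizeof *sa, sa_compare);
    return sa;
}

/* Binary search over the sorted suffixes. Returns the offset of one
 * occurrence of pat in text, or -1 if pat does not occur. Each lookup
 * needs only O(log n) suffix comparisons. */
static long sa_find(const char *text, const size_t *sa, size_t n,
                    const char *pat)
{
    size_t m = strlen(pat);
    size_t lo = 0, hi = n;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        int c = strncmp(text + sa[mid], pat, m);

        if (c == 0)
            return (long)sa[mid];   /* this suffix starts with pat */
        if (c < 0)
            lo = mid + 1;           /* suffix sorts before pat */
        else
            hi = mid;               /* suffix sorts after pat */
    }
    return -1;
}
```

The cost is paid once when the index is built; after that each query touches only a logarithmic number of suffixes, which is the general reason index-based searches scale to genome-sized texts.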
Alan and Peter will have a meeting with the EBI systems group later today to discuss server configurations and pricing.
When the new systems arrive, the old workstations will remain in service running Windows XP. Alan will request new static addresses so that we have the option of installing the new workstations under new names. Two of the old mimservers are dead so their numbers could be reassigned.
Alan reported that one of the backup drives is showing problems and that the drive on emboss5 should be replaced. Peter will order a replacement and other sundry items (3 SATA cables and 2 replacement optical mice).
Alan will contact OBF through their system tickets to restart the copying of CVS commits to the public server.
Peter has decided to refactor all function names in the ajgraph source file because the original naming (from before release 1.0.0) does not make sense in the context of the books. Functions with similar ajGraph names may handle AjPGraph objects, handle plplot data objects, or simply change or use the plplot internals.
Jon and Peter will go to the EMBRACE ontologies meeting in Amsterdam in mid-September.
Peter will be at a large data meeting in Beijing in October.