EMBOSS: Project Meeting (Mon 24th August 09)
Mahmut is looking into retention of quality scores and their use in vectorstrip. The current algorithm is not well suited to detecting adaptors at the 3' end of short read data.
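As an illustration of the 3' adaptor problem only, the sketch below looks for the leftmost position where the tail of a read matches the start of an adaptor, and trims the sequence and its quality string together so the scores are retained in step. It is not the vectorstrip algorithm; the function names, the adaptor string in the usage note, and the parameters min_overlap and max_error_rate are all assumptions made for this example.

```python
def find_3prime_adaptor(read, adaptor, min_overlap=3, max_error_rate=0.1):
    """Return the index where the adaptor appears to start, or len(read) if none.

    Hypothetical helper: compares each suffix of the read against the
    beginning of the adaptor and accepts the leftmost position whose
    mismatch count is within the allowed error rate.
    """
    n = len(read)
    for start in range(n - min_overlap + 1):
        overlap = min(n - start, len(adaptor))
        window = read[start:start + overlap]
        mismatches = sum(1 for a, b in zip(window, adaptor) if a != b)
        if mismatches <= int(max_error_rate * overlap):
            return start
    return n


def trim_record(seq, qual, adaptor):
    """Trim sequence and quality string at the same position."""
    cut = find_3prime_adaptor(seq, adaptor)
    return seq[:cut], qual[:cut]
```

For example, trim_record("ACGTACGTAGATCGG", "I" * 15, "AGATCGGAAGAGC") keeps the first eight bases and the first eight quality characters, because the last seven bases match the start of the (illustrative) adaptor exactly.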
Peter has profiled reading and writing of fastq-sanger format short read data and identified a number of speedups, both in I/O processing and in avoiding unnecessary string processing. The AjPSeq sequence structure has become bloated. The next code commit will include changes to the initialisation, copying, deletion and use of AjPSeq to make each attribute optional. Some of the more heavyweight attributes (e.g. lists of citations and cross-references) are only relevant for certain input formats.
In collaboration with the other Open-bio projects, most notably BioPython, we now have a standard set of deliberately defective FASTQ files to use for testing diagnostic warning and error messages.
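To show the kind of defect such files are meant to exercise, here is a minimal sketch of a strict fastq-sanger parser that raises a diagnostic for each malformed record. It is an assumption-laden example, not the EMBOSS or BioPython parser: FastqError and parse_fastq_sanger are hypothetical names, records are assumed to be unwrapped (exactly four lines each), and the checks shown are only a subset of what the shared test files cover.

```python
class FastqError(ValueError):
    """Raised for a malformed FASTQ record (hypothetical name)."""


def parse_fastq_sanger(text):
    """Parse unwrapped four-line fastq-sanger records from a string.

    Returns a list of (name, sequence, phred_scores) tuples.
    Sanger quality characters are ASCII 33..126, i.e. Phred 0..93.
    """
    lines = text.splitlines()
    # Ignore trailing blank lines at the end of the file.
    while lines and not lines[-1].strip():
        lines.pop()

    records = []
    for i in range(0, len(lines), 4):
        chunk = lines[i:i + 4]
        where = f"record starting at line {i + 1}"
        if len(chunk) < 4:
            raise FastqError(f"{where}: truncated record")
        header, seq, plus, qual = chunk
        if not header.startswith("@"):
            raise FastqError(f"{where}: header does not start with '@'")
        if not plus.startswith("+"):
            raise FastqError(f"{where}: third line does not start with '+'")
        name = header[1:]
        if len(plus) > 1 and plus[1:] != name:
            raise FastqError(f"{where}: '+' line does not repeat the header")
        if len(seq) != len(qual):
            raise FastqError(f"{where}: sequence and quality lengths differ")
        scores = [ord(c) - 33 for c in qual]
        if any(s < 0 or s > 93 for s in scores):
            raise FastqError(f"{where}: quality character outside Sanger range")
        records.append((name, seq, scores))
    return records
```

A well-formed record such as "@r1\nACGT\n+\nIIII\n" parses to a single tuple with four Phred scores of 40, while dropping one quality character or the whole quality line raises FastqError with the line number of the offending record.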
Peter has arranged to get together with EBI systems after the next meeting to discuss EMBOSS server options.
Peter has copied standard databases (EMBL, UniProt, RefSeq) and some software (Galaxy) onto the new RAID disks. Software installations will wait until the new workstations are delivered and installed. Galaxy is a standalone system written in Python. Data is in /shared/data and packages are in /shared/software on the RAID disks.
Peter will set up a Doodle poll for possible SAB meeting dates.
Jon will leave stylesheet editing until we have comments back from the publishers.