EMBOSS: Project Meeting (Apr 26th 1999)


Attendees

Sanger Centre: Peter Rice, Ian Longden, Richard Bruskiewich
HGMP: Alan Bleasby, Gary Williams, Val Curwen, Thon de Boer, Mark Faller, Sinead O'Leary
EBI:
Apologies: Rodrigo Lopez, Martin Senger

1. Matters Arising

2. General progress on release 0.0.4

Peter has checked through Dave Judge's course manual for the EMBnet course in Cuba, where Dave would like to use EMBOSS and Staden as the sequence analysis packages. Most of the GCG examples were there to show how GCG works, so any EMBOSS applications can substitute. Dave did request local and global alignment programs. Peter will look for suitable reusable code.

Guy Slater at HGMP has some EST clustering and frame finding code. Peter will check with him on whether this can be included in EMBOSS.

Thon is working on updates to the ACD documentation, and will also work on documenting the EMBOSS default settings and database definitions.

Richard has the CVS tree set up, and plans to work on the GFF code in June/July.

Alan is updating sequence positions in sigcleave (etc.). He has implemented and integrated a number of string pattern matching algorithms. Some code for ranges and mismatches is still needed. Patterns are preprocessed.
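
As an illustration of what pattern preprocessing can look like, here is a minimal sketch of Boyer-Moore-Horspool exact matching, in which a shift table is built once from the pattern before the text is scanned. This is not Alan's code: the minutes do not say which algorithms were implemented, and the function names build_shift_table and horspool_search are hypothetical. Ranges and mismatches are not handled, and positions are reported 0-based.

```python
def build_shift_table(pattern):
    """Preprocess the pattern: for each character except the last one,
    record how far the window may slide when that character is aligned
    with the end of the window. (Hypothetical helper, for illustration.)"""
    m = len(pattern)
    return {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}


def horspool_search(text, pattern):
    """Return the 0-based start positions of every exact occurrence of
    pattern in text, including overlapping ones."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    shift = build_shift_table(pattern)
    hits = []
    pos = 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            hits.append(pos)
        # Slide by the precomputed distance for the last character in the
        # current window; characters absent from the table allow a full jump.
        pos += shift.get(text[pos + m - 1], m)
    return hits


if __name__ == "__main__":
    # Search for an EcoRI-style site in a short made-up sequence.
    print(horspool_search("ACGTACGTGAATTCAGAATTC", "GAATTC"))
```

The preprocessing cost is paid once per pattern, which pays off when the same pattern is scanned against many sequences.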

Val is working on sequence pretty output formats and looking into restriction enzyme patterns.

Mark is working on clustalw and pepinfo.

Ian has almost finished prettyplot. There are some minor problems with the PLplot graphics library in portrait mode.

Gary has the translate program using the NCBI tables, except for table zero, and is working on utilities to list available databases and programs. Gary will also work on the user documentation.
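
For readers unfamiliar with table-driven translation, the following is a minimal sketch using the standard genetic code (NCBI table 1) written in the NCBI layout, where the 64 amino acid letters follow the codon order T, C, A, G at each position. It is not Gary's implementation; the names STANDARD_AAS, build_codon_table and translate are hypothetical, and the choice of "X" for codons that cannot be resolved is an assumption made here.

```python
from itertools import product

# Standard genetic code (NCBI table 1); codons enumerated with bases in
# the order T, C, A, G at each of the three positions.
STANDARD_AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def build_codon_table(aas=STANDARD_AAS):
    """Map each of the 64 codons to its one-letter amino acid code."""
    codons = ("".join(bases) for bases in product("TCAG", repeat=3))
    return dict(zip(codons, aas))


def translate(dna, table=None):
    """Translate a DNA string in frame 1. Trailing bases that do not form
    a full codon are ignored; unknown codons become 'X' (an assumption)."""
    if table is None:
        table = build_codon_table()
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(table.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


if __name__ == "__main__":
    print(translate("ATGGCCTAA"))
```

Other tables can be supported in the same way by passing a different 64-letter string to build_codon_table.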

Sinead is working on motif searching and primer programs.

Database formats were discussed. HGMP would like to index database subsets that they use with FASTA and BLAST. Peter suggested indexing the blast2 index files (he has the format from NCBI). There may be other ways to index whole files and keep using the Staden/EMBL-CD indices.

Database supersets are also needed. Peter will think about this.

Gary suggested looking into the EMBL "CON" division for whole genomes.

3. Library documentation

Peter is working through the string library documentation and will have this in a new format with examples before the next meeting.

4. Next meeting

Next Monday is a holiday, so Monday 10th May, usual time and place.
Peter Rice, Informatics Division, The Sanger Centre, Wellcome Trust Genome Campus, Hinxton Hall, Cambridge, CB10 1SA, UK.