EMBOSS: Project Meeting (Monday 16th April 2007)

Attendees

EBI: Peter Rice, Alan Bleasby, Jon Ison, Mahmut Uludag
Sanger:
Visitors:
Apologies: Shaun McGlinchey, Martin Senger, Rodrigo Lopez, Tim Carver

1. Minutes of the last meeting

Minutes of the meeting of 2nd April 2007 are here.

2. Software Development

2.1 Applications

Alan has a temporary fix for the file pointer problem in edialign. It appears that FILE* pointers are local to the library DLL. More testing is needed to see whether this problem also affects other applications.

2.2 plplot Library

Alan is looking into building the new EMBOSS plplot library code on Windows. The build currently reports errors for functions that are not found, and there are some issues with threading.

2.3 Other Libraries

Peter is ready to commit changes to read and write full EMBL and GenBank entries.

Peter has (again) fixed the efficiency of setting long string values. Testing showed that reading very long sequences caused far too many string extensions and copies. The string library functions now correctly double the length of long strings; the code added for this was previously being bypassed. When reading EMBL and GenBank formats, the sequence string is now set to the length given on the first line of the entry, so it does not need to be expanded while the sequence is read.
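
The effect of the growth policy described above can be illustrated with a small sketch. This is not the EMBOSS string library code (which is written in C); it is a toy Python model, and the class GrowableBuffer, the function read_sequence, the 16-byte starting size and the 60-base line length are all assumptions made for illustration. It counts how often a buffer has to be reallocated when a long sequence is appended line by line under three policies: fixed-size extension, doubling, and reserving the full length in advance.

```python
class GrowableBuffer:
    """Toy growable buffer that counts reallocations and bytes copied.

    Hypothetical model only; it does not reproduce the EMBOSS string API.
    """

    def __init__(self, reserve=16, doubling=True, step=16):
        self.capacity = max(1, reserve)
        self.data = bytearray(self.capacity)
        self.length = 0
        self.doubling = doubling      # True: double capacity; False: add `step`
        self.step = step
        self.reallocations = 0
        self.bytes_copied = 0

    def _grow(self, needed):
        # Work out the new capacity under the chosen policy.
        new_capacity = self.capacity
        while new_capacity < needed:
            if self.doubling:
                new_capacity *= 2
            else:
                new_capacity += self.step
        # Allocate a new block and copy the existing contents across.
        new_data = bytearray(new_capacity)
        new_data[:self.length] = self.data[:self.length]
        self.bytes_copied += self.length
        self.reallocations += 1
        self.data = new_data
        self.capacity = new_capacity

    def append(self, chunk):
        needed = self.length + len(chunk)
        if needed > self.capacity:
            self._grow(needed)
        self.data[self.length:needed] = chunk
        self.length = needed

    def value(self):
        return bytes(self.data[:self.length])


def read_sequence(lines, **kwargs):
    """Append each sequence line to a buffer built with the given policy."""
    buf = GrowableBuffer(**kwargs)
    for line in lines:
        buf.append(line)
    return buf


# 1000 lines of 60 bases each: 60000 bases in total (made-up test data).
LINES = [b"acgt" * 15] * 1000

fixed = read_sequence(LINES, doubling=False)    # extend by a fixed step
doubled = read_sequence(LINES, doubling=True)   # double when full
reserved = read_sequence(LINES, reserve=60000)  # length known in advance

for name, buf in (("fixed", fixed), ("doubled", doubled), ("reserved", reserved)):
    print(name, buf.reallocations, buf.bytes_copied)
```

In this model the fixed-step policy reallocates on almost every line, doubling reallocates only a handful of times, and reserving the expected length up front avoids reallocation altogether, which corresponds to using the length given on the first line of an EMBL or GenBank entry.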

2.4 Web services

Mahmut is working with Martin on some technical details of the use of Maven in Taverna, and on designs for plugins using typed services for EBI's fasta and interproscan services.

2.5 Other development

Peter has updated all the test databases with a script that picks up the latest versions of all entries. Some entries have been removed; for example, three related sequences are now in a single entry in EMBL/GenBank. Wildcard searches in the program examples depended on sequence IDs which are no longer available for EMBL entries. These have been replaced by accession number wildcards or by list files.

Peter would like to review the list of test databases. We currently include wormpep, which is becoming dated, as an example FASTA format database. We should probably include REFSEQ. We also need to generate GCG database files for the dbigcg and dbxgcg tests; the University of Cambridge can help generate these files.

3. Administration

3.1 Release 4.1.0

Alan noted that we still need a patch file for the first fixes.

4. Documentation and Training

4.1 Books

Jon has checked through the ACD datatype and attribute documentation. He will now concentrate on ACD syntax, ACD files and command-line behaviour, which will form two chapters and an appendix section.

5. User queries and answers

No new issues.

6. AOB

None.

7. Date Of Next Meeting

The next meeting is on Monday 30th April.