EMBOSS: Project Meeting (Mon 6th Jun 11)
Minutes of the meeting of 23rd May 2011 are here.
seqxref reports the cross-references.
seqxrefget adds an extra step to identify the database type and generate a retrieval command using entret, textget, etc.
A possible extension to these applications is to add SO (Sequence Ontology) references for the feature types.
The URL datatype is for reporting URLs from DRCAT that do not resolve to readable text when called from within EMBOSS.
The variation datatype is to support reading variation data from Ensembl or perhaps from other file formats. Michael suggested dbSNP entries, though their format is subject to change.
Mahmut has made GFF3 output more flexible. Tag name validation is not required, so EMBOSS can read any tag and write it on output. Tags for more limited formats (for example, EMBL) can be converted on output but are stored internally with their original names.
The extra semicolon at the end of the tag field has been removed. GFF validation utilities objected to the last tag ending with a semicolon.
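As an illustration of the tag field change, here is a minimal Python sketch of a tag field writer. It is not the EMBOSS implementation: the function name and the example tags are hypothetical, and the escaping of reserved characters that GFF3 requires is left out for brevity.

```python
def format_gff3_attributes(tags):
    """Join (name, value) pairs into a GFF3-style tag field.

    Hypothetical helper for illustration only. Pairs are separated by ';',
    so the last tag is never followed by a semicolon. Tag names are not
    validated, so unknown tags pass through unchanged.
    """
    return ";".join(f"{name}={value}" for name, value in tags)


# Example tags are invented; "custom_tag" stands for any non-standard tag.
field = format_gff3_attributes(
    [("ID", "gene0001"), ("Name", "example"), ("custom_tag", "kept")]
)
print(field)
```

Using a separator-based join, rather than appending ';' after every tag, is what avoids the trailing semicolon that the validators rejected.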
Peter noted that the offset syntax filename%1234 is not accepted in mEMBOSS. It is rarely used, but should be supported by an alternative syntax. Suggestions are welcome.
Peter has revised the query language for lists of identifiers. To simplify parsing, these lists now need a delimiter, either '|' (OR) or the equivalent ','. Spaces made testing for operators tricky, because spaces are also allowed in the query syntax for keywords and other searches (where an underscore is accepted as an equivalent).
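The delimiter rule for identifier lists can be sketched in Python as follows. This is only an illustration of splitting on '|' or ','; it is not the EMBOSS query parser, and the function name and example identifiers are hypothetical.

```python
import re


def split_identifier_list(query):
    """Split a list of identifiers on '|' or ',' (both treated as OR).

    Hypothetical helper for illustration only. Whitespace around each
    identifier is dropped and empty items are ignored.
    """
    parts = re.split(r"[|,]", query)
    return [part.strip() for part in parts if part.strip()]


# Invented identifiers; either delimiter gives the same list.
print(split_identifier_list("P12345|Q67890,amy_human"))
```

Because the delimiters are explicit, a space never has to be interpreted as a separator, which is the ambiguity the revised syntax removes.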
Peter has updated emboss index access to store matches by their file number and offset rather than by identifier. Accession number searches were storing the accession number, but searches by 'id' and by secondary fields were storing the primary ID. By storing the AjPBtid as the table key, the accession and id queries can be safely combined.
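Why keying on file number and offset makes the results combinable can be shown with a small Python sketch. The `Match` type and the hit sets below are hypothetical stand-ins, not the AjPBtid structure or the EMBOSS table code.

```python
from typing import NamedTuple


class Match(NamedTuple):
    """Hypothetical key for one indexed entry: where it lives on disk."""
    file_number: int
    offset: int


# Invented results. An accession search and an id search that find the
# same entry produce the same key, whatever identifier was searched.
accession_hits = {Match(0, 1024), Match(1, 2048)}
id_hits = {Match(1, 2048), Match(2, 0)}

both = accession_hits & id_hits      # entries matched by both queries
either = accession_hits | id_hits    # entries matched by at least one

print(sorted(both))
print(len(either))
```

If one search stored accession numbers and the other stored primary IDs, the same entry would appear under two different keys and the set operations would not line up.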
Michael is working on Ensembl 62 updates, especially to variation data.
Michael commented that the applications which generate server cache files need a standard naming and user interface. Peter will make a suggestion at the next meeting.
Ensembl access needs improvement to make use of field names and the new query language operators.
Peter has resolved some 300 compiler warnings for Vienna2 to give clean build results.
The list of new applications on the Wiki has been updated.
OBO relations are transitive: relations are inherited by all descendants of a term.
Final checks are being made using OBO-Edit.
Once committed, Peter will check the relations in all ACD files.
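The inheritance of relations by descendant terms, noted above, can be sketched in Python. This is an illustration under assumed data structures, not the OBO-Edit or EMBOSS implementation; the term and relation names are invented.

```python
def inherited_relations(term, parents, relations):
    """Collect relations declared on a term and on all of its ancestors.

    Hypothetical helper for illustration only.
    parents:   maps a term to a list of its direct parent terms
    relations: maps a term to the set of relations declared on it
    """
    collected = set()
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guards against revisiting shared ancestors
        seen.add(current)
        collected |= relations.get(current, set())
        stack.extend(parents.get(current, []))
    return collected


# Invented ontology fragment.
parents = {"protein_sequence": ["sequence"], "sequence": ["data"]}
relations = {
    "sequence": {("has_format", "sequence_format")},
    "data": {("has_topic", "topic")},
}

print(sorted(inherited_relations("protein_sequence", parents, relations)))
```

A check of ACD files could use the same walk: a relation is valid for a term if it is declared on the term itself or on any ancestor.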
mEMBOSS is now included in the standard QA testing script, for both the Visual Studio build and the installed mEMBOSS. embossversion is used to find the install and testing directories.
Peter has added new files to the .cvsignore lists.