EMBOSS: Project Meeting (Mon 15th February 10)
Ideally, the onus should be on the developer to resolve these issues when the ACD file is first written. Peter proposed that ACD should first check whether either range value is calculated and, if so, require the behaviour to be defined, either through trueminimum or through an attribute that causes a failure when the value is out of range. Peter pointed out that a failure message is hard to derive automatically, as calculated values depend on other ACD qualifier values. The proposed solution is to define a message to be issued if the ranges fail.
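As an illustration only, the proposal can be sketched as follows. This is one reading of the discussion, not the agreed ACD behaviour: it assumes that trueminimum means "when calculated limits conflict, the minimum wins", and the names fail_range and range_message are hypothetical stand-ins for the attribute and message still to be defined.

```python
class RangeConflict(ValueError):
    """Raised when calculated limits conflict and the developer asked for failure."""


def resolve_range(minimum, maximum, *, true_minimum=False,
                  fail_range=False, range_message=""):
    """Return a usable (minimum, maximum) pair for possibly calculated limits.

    All keyword names are hypothetical; only 'trueminimum' appears in the minutes.
    """
    if minimum <= maximum:
        # Limits are consistent, nothing to resolve.
        return minimum, maximum
    if fail_range:
        # Developer-defined message, since one cannot be derived automatically.
        raise RangeConflict(range_message or "calculated range limits conflict")
    if true_minimum:
        # Assumption: the minimum is authoritative, so raise the maximum to it.
        return minimum, minimum
    # Assumed default: the maximum is authoritative, so lower the minimum to it.
    return maximum, maximum
```

The point of the sketch is that the developer, not ACD, chooses between the three outcomes and supplies the wording of the failure message.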
There is also a need for these functions to process single delimiters, for example in tab-separated values where missing values can be represented by consecutive tabs. These are used in BioMart output, and in SAM/BAM sequence data formats.
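The difference between the two behaviours can be shown with a small sketch. It is written in Python for brevity and is not the EMBOSS implementation; the function names and the sample row are hypothetical.

```python
def split_fields(line, delimiter="\t"):
    """Split on every single delimiter, so consecutive tabs yield empty fields."""
    return line.rstrip("\n").split(delimiter)


def split_tokens(line):
    """Split on runs of whitespace, so consecutive tabs collapse into one."""
    return line.split()


# Hypothetical row whose second column is missing.
row = "gene1\t\tchr13\t42\n"

print(split_fields(row))  # ['gene1', '', 'chr13', '42']
print(split_tokens(row))  # ['gene1', 'chr13', '42']
```

With run-collapsing tokenisation the missing value disappears and every later column shifts left, which is why a single-delimiter mode is needed for this kind of input.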
Alan has rewritten jembossctl to accept matching quotes around single values. This will address the problem for authorized servers where a username and password are required.
Mahmut has a related fix that rewrites the command line as a string array to address the issue for local unauthorized servers.
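The string-array approach can be illustrated with a minimal sketch, again in Python rather than the languages used by Jemboss. It assumes the problem values are ones a single command string would need to quote, such as values containing spaces; the child script and argument values are hypothetical.

```python
import subprocess
import sys

# Hypothetical child process: prints each argument it receives on its own line.
ECHO_ARGS = "import sys\nfor arg in sys.argv[1:]:\n    print(arg)\n"


def run_with_argv(args):
    """Run a command given as a list; each element arrives as exactly one argument."""
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return result.stdout.splitlines()


# No quoting is needed: the value with spaces stays a single argument.
received = run_with_argv([sys.executable, "-c", ECHO_ARGS,
                          "a value with spaces", "second"])
print(received)
```

Because no shell parses the command, there is no quoting step to get wrong, which is the attraction of the array form.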
Both solutions will be implemented.
To define a BioMart database we can use the URL for the host, port and path. The dataset name can be a dbalias attribute.
Queries can specify the BioMart software version to avoid future incompatibilities.
The results are usually tab separated values. Some attributes containing sequence data can be returned as FASTA. It is unclear how the header information is formatted.
Queries can be verified, can include tab-delimited column labels, return a count (number of matches), select unique results, and add a time stamp (and a [success] tag at the end).
Filters can include a list of values. Implementing this will require an extension to the EMBOSS USA syntax.
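For reference, the query options discussed above can be sketched as a BioMart XML query built in Python. The XML attribute names (formatter, header, uniqueRows, count, completionStamp, datasetConfigVersion) follow the BioMart web service query format as we understand it and should be checked against the BioMart documentation; mapping the "software version" to datasetConfigVersion is an assumption, and the dataset, filter and attribute names used below are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_biomart_query(dataset, attributes, filters=None, *, header=False,
                        unique=False, count=False, completion_stamp=False,
                        config_version="0.6"):
    """Build a BioMart-style XML query string (a sketch, not a tested client)."""
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",                    # tab-separated results
        "header": "1" if header else "0",      # column labels in the first row
        "uniqueRows": "1" if unique else "0",  # select unique results
        "count": "1" if count else "",         # return only the number of matches
        "datasetConfigVersion": config_version,
    })
    if completion_stamp:
        # Asks the server to append a [success] tag to the results.
        query.set("completionStamp", "1")
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    for name, values in (filters or {}).items():
        # A list of filter values is sent as one comma-separated string.
        ET.SubElement(ds, "Filter", {"name": name, "value": ",".join(values)})
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    return ET.tostring(query, encoding="unicode")


print(build_biomart_query("example_dataset", ["example_attribute"],
                          {"example_filter": ["id1", "id2"]},
                          header=True, completion_stamp=True))
```

The comma-separated filter value is the part that has no equivalent in the current USA syntax.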
Database definitions will need a way to define the attributes to be returned. Peter will propose an extension to the database definition.
Jon will look into the servers needed for the list of cross-reference databases and other databases included in EDAM.
BioMOBY has a large number of datatypes defined. These can be clustered into a few hundred, most of which were already in EDAM. The remainder were added with cross-references to BioMOBY.
In discussions with other ontology experts at EBI, there are plans to make EDAM compatible with a new ontology covering tools, algorithms, data formats and some types.
The ONDEX project at Rothamsted is interested in expanding EDAM to cover their definitions of relationships between entities.
Two EMBRACE workshops are planned, in April and June, where further discussions will take place.
Jon is looking into packages for ontology management, including Protégé and one commercial product.
Alan has reminded the systems group about a testbed for database indexing over the network. No reply yet.
Alan has reminded Apple that we would like an extension of the machine loan, and that we are awaiting more information about the pickup of the machines to be returned. Also no reply yet.
Jon has made minor edits to correct invalid XML. In the interfaces and links sections, Pise has been superseded by Mobyle and Galaxy has been added.
Stylesheets are now relatively simple to manage, with various presentational issues resolved.
A new mock homepage is available for review.