EMBOSS: Project Meeting (Mon 16th May 11)
File paths are converted to backslashes on output with the new functions ajFileGetPrintNameC and ajFileGetPrintNameS. Directories and filenames containing forward slashes will now work in most cases in mEMBOSS. This allows emboss.standard and the server cache files to work unchanged in mEMBOSS.
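As a rough illustration of the conversion described above (a Python sketch, not the EMBOSS C code), paths can keep forward slashes internally and be rewritten with backslashes only at the point where they are printed. The function name print_name and the windows flag are hypothetical and are not part of the AJAX library.

```python
def print_name(path: str, windows: bool = True) -> str:
    """Return the form of `path` to use in printed output.

    Hypothetical helper: the stored path keeps its forward slashes and is
    converted to backslashes only when it is written out on Windows.
    """
    if not windows:
        return path
    return path.replace("/", "\\")


# The stored name is unchanged; only the printed form differs.
stored = "test/data/emboss.standard"
print(print_name(stored))
```

Converting only on output means definition files that contain forward slashes can be shared unchanged between platforms.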
Pre- and post-processing commands work with few changes. Environment variables are set with the SET preprocessing command; the EXPORT command is ignored. The commands cp, 'rm -rf' and rm are converted to their Windows equivalents when written to the qatest.bat script file. The copy command with a wildcard always writes to the screen via standard error, but this does not interfere with the test results.
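The command translation could look something like the following Python sketch. The minutes do not list the exact Windows equivalents, so the mapping used here (copy, rmdir /s /q, del) is an assumption, and the function name translate is hypothetical.

```python
from typing import Optional


def translate(command: str) -> Optional[str]:
    """Translate one pre/post-processing command for a Windows batch file.

    Returns None for commands that are dropped. The Windows command names
    below are assumed equivalents, not taken from the EMBOSS QA code.
    """
    words = command.split()
    if not words:
        return None
    name, args = words[0].lower(), words[1:]
    if name == "export":
        return None  # EXPORT is ignored
    if name == "set":
        return command.strip()  # SET is passed through unchanged
    if name == "rm" and args[:1] == ["-rf"]:
        name, args = "rmdir /s /q", args[1:]  # assumed equivalent of 'rm -rf'
    elif name == "rm":
        name = "del"  # assumed equivalent of rm
    elif name == "cp":
        name = "copy"  # assumed equivalent of cp
    # File arguments also need backslashes in the batch file.
    args = [arg.replace("/", "\\") for arg in args]
    return " ".join([name] + args)


for line in ["cp ../data/*.fasta .", "rm -rf tmpdir", "export X=1"]:
    print(repr(translate(line)))
```

The 'rm -rf' case is checked before plain rm so that the recursive form is not translated as a simple file delete.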
The clustalw standard output passed through by emma is not visible when emma is run as a QA test, but appears as expected from the command line. Alan is investigating.
The variation adaptor has a new subclass to support the 1000 Genomes Project. There are new general iterator methods to avoid 1 million or more objects being created at once in Perl. Memory management in C is less expensive, so implementation in C is not urgent.
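The idea behind the iterator methods can be sketched in Python with a generator that builds one object at a time rather than a full list. This is only an illustration of the technique; the Ensembl Perl API is not shown in these minutes, and the names Variation, iter_variations and count_in_range are hypothetical.

```python
from typing import Iterable, Iterator, NamedTuple, Tuple


class Variation(NamedTuple):
    """Hypothetical stand-in for a variation object."""
    name: str
    position: int


def iter_variations(rows: Iterable[Tuple[str, int]]) -> Iterator[Variation]:
    """Yield one Variation per row instead of building a full list."""
    for name, position in rows:
        yield Variation(name, position)


def count_in_range(rows: Iterable[Tuple[str, int]], start: int, end: int) -> int:
    """Count variations with start <= position <= end, one object at a time."""
    return sum(1 for v in iter_variations(rows) if start <= v.position <= end)


# One million rows are processed, but only one Variation exists at a time.
rows = ((f"var{i}", i) for i in range(1_000_000))
print(count_in_range(rows, 10, 19))
```

Because each object can be released as soon as the caller moves on, peak memory stays roughly constant regardless of how many rows are fetched.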
The server cache file is written by the application showensembl, which reduces the number of SQL queries needed to return a sequence object from 560 to 18. The DBALIAS attributes are generated and are working well for Ensembl. Aliases include all names with underscores, one of which is the NCBI taxon. All names and aliases are in lower case.
Ensembl identifiers include the exon id, transcript id and translation id. They may also include identifiers generated from the gene stable id.
Havana data works well in EMBOSS, with plotorf able to find the longest ORF in a Havana conserved region.
The improved efficiency of the C API could be useful for variant effect predictors. These now account for up to 75% of the Ensembl hits.
Extensions to servertell and dbtell could be useful. These could include a way to link related databases through some additional attributes in their cache file definitions, for example Ensembl databases reporting different sequence object types from the same species.
The replacement backup drive has been installed for emboss5.