If anything below interests you, please volunteer to work on it! If you do decide to start work on any of these areas, please let the mailing list know first: many people may wish to collaborate with you or suggest easier ways of doing things.
If you've an idea for a library or an application that isn't on this list, please let the mailing list know and it will be added to this web page, even if you haven't time to work on it yourself.
IMPORTANT NOTE: If you've a request for a new feature for an existing EMBOSS application, please use our site on SourceForge.
Name | Priority | Status | Description | Comments | Code or documentation |
---|---|---|---|---|---|
BLAST wrapper | High | Inactive | Wrapper to the BLAST suite of programs | Wrappers to BLAST and FASTA have often been requested. Probably individual applications for BLASTP, BLASTN etc. Could code from the wEMBOSS BLAST wrapper be used? | |
Wrapper to other database search programs | High | Inactive | Wrapper to the BLAST suite of programs | Probably individual applications for BLASTP, BLASTN etc. | |
fasta wrapper | High | Inactive | Integration of Bill Pearson's FASTA, TFASTA, etc as EMBASSY wrapper. | Wrappers to BLAST and FASTA have often been requested. Bill Pearson mentioned he would like to do this, but it needs some way for sequences to be fetched again (e.g. saving file number and offset for 'any' sequence access method). The code is part of the way there. Could code from the wEMBOSS BLAST wrapper be used? | |
rnafolding | High | Inactive | Integration of external RNA folding applications. | The Zuker package may still be about the best. | |
hitmatch | Medium | Inactive | Replacement for EGCG's equickmatch, using blastn output | Needs a blast or fasta output parser. Should read in the query and database sequences, and perform a full NW or SW alignment, word-based if possible as they should be near-perfect matches. The aim is to report only those matches above a given threshold, and report the full alignments. If possible, with only the *differences* marked instead of the similarities. No documentation for equickmatch is available. | |
alignutils | Medium | Inactive | Sequence alignment utilities, to replace EGCG sortconsensus. | Could implement various alignment site-scoring algorithms. | alignutils documentation is available. |
dodayhoffstat | Medium | Inactive | Replacement for EGCG's dodayhoffstat | Relatively easy to do. | dodayhoffstat documentation is available. |
mapplot | Medium | Inactive | For displaying restriction plots. | mapplot was specifically requested. | |
dottie | Medium | Inactive | A general interactive dot plot application. | Could use what's available, e.g. Erik Sonnhammer's dotter. A new implementation would require interactive graphics. | |
nucstats | Medium | Inactive | To report nucleic acid "vital statistics", e.g. ACGT composition etc. | See pepstats for ideas. | |
plasmid drawing | Medium | Inactive | To draw plasmids with restriction sites. | As a replacement for MapPlot from GCG. Perhaps modify cirdna? See TACG (http://tacg.sourceforge.net/) for ideas. | |
fastacheck | Low - maybe remove | Inactive | Replacement for EGCG's fastacheck | Simple to do, but now that FASTA has better statistics it is probably not needed. Functionality to read FASTA statistics and select hits might still be useful. | fastacheck documentation is available. |
gapframe | Low | Inactive | Adjust gap positions to be only at codon boundaries in a DNA sequence with known CDS position(s). | Easy to do but requirement might be too low to justify it. | |
homologies | Medium | Inactive | Table of the pairwise distances of aligned sequences. The EMBASSY allversusall application does this and could be moved to EMBOSS. | ||
Feature utilities | Medium | Inactive | Operation on a feature table file to extract selected features to another file. | Should be turned into a quite extensive set of library functions. | |
Cluster | Low | Inactive | This program is still in the 'test' set of programs. | Sanger stopped using it, so it is probably not needed. The easiest route to clustering functionality at the application level might be to use e.g. SANBI's stuff (but what about the license?). AJAX clustering routines would be useful. | |
ALIEN | Medium | Inactive | Multiple alignment program. | Many multiple alignment programs are available and could be wrapped. | |
Gene ID programs | Medium | Inactive | Would be useful. | Is there non-commercial code for this? | |
genetrans | Medium | Inactive | Replacement for EGCG's genetrans | Functionality is possibly redundant with existing EMBOSS apps, though: check! | genetrans documentation is available. |
ALIEN wrapper | Medium | Inactive | Support for 3rd party Multiple alignment program ALIEN | Requested via EMBOSS mailing list. Many multiple alignment programs are available and could be wrapped. ALIEN was specifically requested, but there are many other popular ones, e.g. TCOFFEE. | |
acdquery | Medium | Inactive | Application to return ACD attributes e.g. sequence.length | Arising from Marc Colet meeting. This would help in interpreting an ACD file. Much of the code exists; adapt seqinfo, acdtrace or entrails? Must decide what to do. | |
MFOLD equivalent | Medium | Inactive | Equivalent to MFOLD for RNA secondary structure prediction | Requested via EMBOSS mailing list. GCG has this, we don't! No details for this - but it's been repeatedly requested. | |
snplocator | Medium | Inactive | Application to locate SNPs in coding sequences | Requested via EMBOSS mailing list. No details - but it was asked for. | |
Feature display | Medium | Inactive | Graphical display of selected features from a feature table. | Possible with plplot but probably better with a new graphics library. | Notes are available. |
Application for codon usage / composition bias | Medium | Inactive | Application for codon usage / composition bias | Requested via EMBOSS mailing list. | Notes are available. |
polyatails | Medium | Inactive | Searches a cDNA for known poly(A) signals in the context of the poly(A) tail, using different regular expressions. | Coral del Val from the Cancer Research Centre (Heidelberg) has submitted code. It is based on the paper of Beaudoing et al., Genome Research vol. 10, 1001-1010. | Notes are available. ACD file is available. C source code is available. |
showdata | Medium | Inactive | For showing codon usage tables. | Requested via EMBOSS mailing list. | Notes are available. |
backtranambig | Medium | Inactive | Back-translate a protein sequence to ambiguous codons. | Requested via EMBOSS mailing list. | Notes are available. |
alignfromhsp | Medium | Inactive | Build alignment from BLAST HSP | Requested via EMBOSS mailing list. | Notes are available. |
jess | Medium | Inactive | Functional site detection in protein structures | Requested via conversation at EBI. From the Thornton group into EMBOSS, perhaps as an EMBASSY package? | Notes are available. Packaged code is available. |
plotsimilarity | Medium | Inactive | | Requested via EMBOSS mailing list. | Notes are available. |
pscan replacement | Medium | Inactive | A replacement for pscan | Requested by Dave Judge via Alan. Retire pscan. Maybe replace with wrapper to interproscan? | |
nucstats | Medium | Inactive | A "nucstats" or some such, to report nucleic acid "vital statistics", e.g. ACGT composition etc. (see pepstats). | Requested via EMBOSS mailing list via AJB. | |
plasmiddraw | Medium | Inactive | Replacement for MapPlot from GCG to draw plasmids with restriction sites. | Requested via EMBOSS mailing list. | Notes are available. |
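
The nucstats entries above ask for a report of nucleic acid "vital statistics" such as ACGT composition. As a rough illustration of what such a report might compute, here is a minimal Python sketch. The function name `nucleotide_composition` and the layout of the returned dictionary are hypothetical; this is not EMBOSS or AJAX code, and a real application would follow pepstats and the ACD conventions.

```python
from collections import Counter


def nucleotide_composition(sequence):
    """Return length, per-base counts, per-base fractions and GC fraction.

    Hypothetical helper for illustration only. Case is ignored and
    non-letter characters (gaps, digits, whitespace) are skipped.
    Ambiguity codes such as N are counted as residues.
    """
    seq = sequence.upper()
    counts = Counter(base for base in seq if base.isalpha())
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no residues")
    fractions = {base: n / total for base, n in counts.items()}
    # Counter returns 0 for bases that never occur, so this is safe.
    gc = (counts["G"] + counts["C"]) / total
    return {
        "length": total,
        "counts": dict(counts),
        "fractions": fractions,
        "gc": gc,
    }


if __name__ == "__main__":
    report = nucleotide_composition("ACGTacgt")
    print(report["length"], report["gc"])
```
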
Name | Priority | Status | Description | Comments | Code or documentation |
---|---|---|---|---|---|
AJAX code refactoring | High | Active | Function & parameter renaming and major documentation revision | In preparation for future EMBOSS developments. | |
neural-nets | Low | Inactive | Neural net routines and applications. | Lots of free packages; Jose Valverde was working on this and recommended using XNN in 2002. Might be better alternatives now. Neureka is available from ftp://ftp.ii.uib.no/pub/neureka/. Not a high priority. | |
GAs | Low | Inactive | Genetic Algorithm routines and applications. | | |
Name | Priority | Status | Description | Comments | Code or documentation |
---|---|---|---|---|---|
Perl API | Medium | Inactive | Requested at the EMBOSS Industry Workshop 2006. An API to the applications could be generated automatically by the JACD tool that Jon Ison is working on. | ||
EMBOSS eclipse extensions | Medium | Inactive | Requested at the EMBOSS Industry Workshop 2006. The Eclipse package is very highly used in industry. Could look at bio-eclipse for ideas. | ||
R statistics | Medium | Inactive | Provide R statistics package as an EMBASSY package | Arising from Marc Colet meeting. R is powerful and widely used (e.g. for microarray analysis) but is difficult to use. An EMBASSY wrapper could improve usability. Claude Beazley (now at Sanger) is using R and might be interested in helping with this. Any alternatives? |
Name | Priority | Status | Description | Comments | Code or documentation |
---|---|---|---|---|---|
QA tests | Complete | Complete | QA application testing using a set of standard outputs and simple parsing of the results. | Scripts to test the output of EMBOSS programs against expected results. | |
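
The QA entry above describes testing applications against a set of standard outputs with simple parsing of the results. The general idea can be sketched in Python as follows. This is an illustration under stated assumptions, not the actual EMBOSS QA scripts: the helper names `normalise` and `check_output` are hypothetical, and the example command runs the Python interpreter as a stand-in for an EMBOSS program.

```python
import subprocess
import sys


def normalise(text):
    """Drop blank lines and trailing whitespace so trivial differences are ignored."""
    return [line.rstrip() for line in text.splitlines() if line.strip()]


def check_output(command, expected_text):
    """Run a command and compare its standard output with the expected text.

    Returns (passed, actual_lines). The test passes only if the command
    exits with status 0 and the normalised output matches line for line.
    """
    result = subprocess.run(command, capture_output=True, text=True, check=False)
    actual = normalise(result.stdout)
    passed = result.returncode == 0 and actual == normalise(expected_text)
    return passed, actual


if __name__ == "__main__":
    # Stand-in command; a real test would invoke an application and read
    # the expected text from a stored reference file.
    ok, lines = check_output([sys.executable, "-c", "print('ACGT')"], "ACGT\n")
    print("PASS" if ok else "FAIL", lines)
```
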