Codon Usage

 

EMBOSS Codon Usage Tables

EMBOSS reads codon usage data in several formats. We aim to cover all the popular data formats used by other packages and applications. If you need a new format added, please contact the EMBOSS team.

EMBOSS can automatically detect any of the formats listed below. We hope to continue identifying new formats automatically. As with sequence formats, however, some may prove impossible to detect automatically, so we reserve the possibility of making some formats available only when specified on the command line.
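The detection behaviour described above can be pictured as checking the input against each known format in turn, with an explicitly named format taking priority. The following Python sketch illustrates that idea only. It is not EMBOSS code: the names `DETECTORS` and `choose_format`, the format names `toy5` and `toy2`, and both detection rules are hypothetical and were invented for this example.

```python
from typing import Optional


def _first_data_line(text: str) -> str:
    """Return the first non-blank line that is not a '#' comment."""
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            return stripped
    return ""


def looks_like_toy5(text: str) -> bool:
    # Toy rule (an assumption): five whitespace-separated fields per line.
    return len(_first_data_line(text).split()) == 5


def looks_like_toy2(text: str) -> bool:
    # Toy rule (an assumption): two whitespace-separated fields per line.
    return len(_first_data_line(text).split()) == 2


# Hypothetical (name, predicate) pairs, checked in this order.
DETECTORS = [("toy5", looks_like_toy5), ("toy2", looks_like_toy2)]


def choose_format(text: str, requested: Optional[str] = None) -> str:
    """Pick a format name, preferring an explicitly requested one."""
    known = [name for name, _ in DETECTORS]
    if requested is not None:
        # An explicit choice skips detection entirely.
        if requested not in known:
            raise ValueError(f"unknown format: {requested}")
        return requested
    for name, matches in DETECTORS:
        if matches(text):
            return name
    raise ValueError("format could not be detected; name it explicitly")
```

In this sketch a format that no predicate can recognise is still usable, because the caller can pass `requested` directly. That mirrors the case of a format available only when named on the command line.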

Current input formats

Input Format  Comments
cut           Codon usage table format. This is the format EMBOSS applications will write by default.
gcg           GCG format, as used by the GCG (Wisconsin) sequence analysis package.
cutg          The format used by the CUTG codon usage database.
cutgaa        The format used by the CUTG codon usage database, with amino acid codes.
spsum         CUTG database species summary format, with the raw numbers and with a named species and number of coding sequences in the header.
cherry        Mike Cherry codon usage database file. Based on the GCG format, with the species name and number of coding sequences included in the header.
transterm     TransTerm database file format. Read as GCG format.
codehop       FHCRC codehop program codon usage file. Frequency data only, with additional information at the end of the file.
staden        Staden package codon usage file with numbers only.
numstaden     Staden package codon usage file with numbers. A synonym for "staden", since recent releases of the Staden package adopted a single format.
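As an illustration of working with the default "cut" format, here is a minimal Python sketch of a reader. It assumes a layout this page does not spell out: `#`-prefixed comment lines followed by data lines with five whitespace-separated columns (codon, amino acid, fraction, frequency, number). The names `CodonEntry` and `parse_cut` are hypothetical, and the sample values are made up.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class CodonEntry:
    amino_acid: str   # one-letter amino acid code
    fraction: float   # assumed: share among codons for the same amino acid
    frequency: float  # assumed: normalised usage value
    number: int       # raw count


def parse_cut(text: str) -> Dict[str, CodonEntry]:
    """Parse an assumed five-column 'cut' layout into a dict keyed by codon."""
    table: Dict[str, CodonEntry] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        # Raises ValueError if the line does not have exactly five fields.
        codon, aa, fraction, frequency, number = line.split()
        table[codon.upper()] = CodonEntry(
            aa, float(fraction), float(frequency), int(number)
        )
    return table


# Made-up sample data in the assumed layout.
sample = """#Codon AA Fraction Frequency Number
GCA    A     0.250    10.000      5
GCC    A     0.750    30.000     15
"""

table = parse_cut(sample)
print(table["GCC"].number)
```

A line that does not match the assumed layout raises `ValueError`, so a malformed file fails loudly instead of producing a partial table.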

Current output formats

Output Format  Comments
cut            Codon usage table format. This is the format EMBOSS applications will write by default.
gcg            GCG format, as used by the GCG (Wisconsin) sequence analysis package.
cutg           The format used by the CUTG codon usage database.
cutgaa         The format used by the CUTG codon usage database, with amino acid codes.
spsum          CUTG database species summary format, with the raw numbers and with a named species and number of coding sequences in the header.
cherry         Mike Cherry codon usage database file. Based on the GCG format, with the species name and number of coding sequences included in the header.
transterm      TransTerm database file format. Read as GCG format.
codehop        FHCRC codehop program codon usage file. Frequency data only, with additional information at the end of the file.
staden         Staden package codon usage file with numbers only.
numstaden      Staden package codon usage file with numbers. A synonym for "staden", since recent releases of the Staden package adopted a single format.


Last edited: 16 Dec 2008 - Peter Rice