Codon Usage
EMBOSS reads codon usage data in several formats. We aim to cover all the popular data formats used by other packages and applications. If you need a new format added, please contact the EMBOSS team.
EMBOSS can automatically detect any of the formats listed below. We hope to identify new formats automatically as well but, as with sequence formats, some may prove impossible to detect automatically. We reserve the possibility of making such formats available only when they are specified on the command line.
Input Format | Comments |
---|---|
cut | Codon usage table format. This is the format EMBOSS applications will write by default. |
gcg | GCG format, as used by the GCG (Wisconsin) sequence analysis package. |
cutg | The format used by the CUTG codon usage database. |
cutgaa | The format used by the CUTG codon usage database, with amino acid codes. |
spsum | CUTG database species summary format, with the raw numbers and with a named species and number of coding sequences in the header. |
cherry | Mike Cherry codon usage database file. Based on the GCG format with species name and number of coding sequences included in the header. |
transterm | TransTerm database file format. Read as GCG format. |
codehop | FHCRC codehop program codon usage file. Frequency data only, with additional information at the end of the file. |
staden | Staden package codon usage file with numbers only. |
numstaden | Staden package codon usage file with numbers. A synonym for "staden" since recent releases of the Staden package adopted a single format. |
Output Format | Comments |
---|---|
cut | Codon usage table format. This is the format EMBOSS applications will write by default. |
gcg | GCG format, as used by the GCG (Wisconsin) sequence analysis package. |
cutg | The format used by the CUTG codon usage database. |
cutgaa | The format used by the CUTG codon usage database, with amino acid codes. |
spsum | CUTG database species summary format, with the raw numbers and with a named species and number of coding sequences in the header. |
cherry | Mike Cherry codon usage database file. Based on the GCG format with species name and number of coding sequences included in the header. |
transterm | TransTerm database file format. Read as GCG format. |
codehop | FHCRC codehop program codon usage file. Frequency data only, with additional information at the end of the file. |
staden | Staden package codon usage file with numbers only. |
numstaden | Staden package codon usage file with numbers. A synonym for "staden" since recent releases of the Staden package adopted a single format. |
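
For scripts that wrap EMBOSS and need to check a format name before passing it on, the names in the tables above can be collected into a small lookup. The following is a minimal Python sketch, not part of EMBOSS: the names `CODON_FORMATS`, `SYNONYMS` and `canonical_format` are hypothetical, the descriptions are abbreviated from the tables, and the case-insensitive matching is an assumption of this sketch rather than documented EMBOSS behaviour.

```python
# Format names copied from the tables above; descriptions are abbreviated.
# This lookup is illustrative only and is not part of EMBOSS itself.
CODON_FORMATS = {
    "cut": "EMBOSS codon usage table (default output)",
    "gcg": "GCG (Wisconsin) package",
    "cutg": "CUTG database",
    "cutgaa": "CUTG database with amino acid codes",
    "spsum": "CUTG species summary",
    "cherry": "Mike Cherry codon usage database",
    "transterm": "TransTerm database (read as GCG)",
    "codehop": "FHCRC codehop program",
    "staden": "Staden package",
}

# The table lists "numstaden" as a synonym for "staden".
SYNONYMS = {"numstaden": "staden"}


def canonical_format(name: str) -> str:
    """Return the canonical format name, or raise ValueError if unknown.

    Matching ignores case and surrounding whitespace (an assumption
    made for this sketch).
    """
    key = name.strip().lower()
    key = SYNONYMS.get(key, key)
    if key not in CODON_FORMATS:
        raise ValueError(f"unknown codon usage format: {name!r}")
    return key
```

With this lookup, `canonical_format("numstaden")` resolves to `"staden"`, while a name that does not appear in the tables raises `ValueError`.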