Ajax library general specifications

Aims

AJAX is a library of general functions to support, among others, the EMBOSS sequence analysis package. AJAX will be maintained by Alan Bleasby at Daresbury Laboratory, but may incorporate code from other public domain sources.

AJAX will be released under a "GNU" Library General Public Licence. All code contributions must be available for distribution under these terms.

This specification is based on decisions made during the design of the EMBOSS project. AJAX is free to interpret these aims in other ways.

All AJAX code will be written in ANSI standard C, and tested using (among others) the GNU gcc compiler.

AJAX routines will have names in the following form:

ajTypeName_subtypes
although "subtypes" will be used sparingly.

For example: ajStrNew, ajStrNewS and ajStrNewL for a new (default) string, a copied string and a new string of given length.

The first argument in all cases will be the object passed by reference with a variable name of "this", as for C++. If it can be changed, it will be a pointer to a pointer to the object, named "pthis", with an internal definition (and redefinitions when changed) of "this". "pthis/this" must be checked and deleted if it is being reused.
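The "this"/"pthis" convention above can be sketched as follows. This is only an illustration under stated assumptions: the type AjDemoStr and the functions ajDemoStrLen, ajDemoStrAssS and ajDemoStrDel are hypothetical names invented for this example and are not part of AJAX. Note that "this" is a legal identifier in ANSI C, although not in C++.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical string object, invented for this sketch. */
typedef struct AjDemoStr {
    size_t len;   /* current length, excluding the terminating NUL */
    char *txt;    /* heap-allocated, NUL-terminated text */
} AjDemoStr;

/* Read-only access: the object is passed by reference and named "this". */
size_t ajDemoStrLen(const AjDemoStr *this)
{
    return (this != NULL) ? this->len : 0;
}

/* Modifying access: "pthis" points to the caller's object pointer, and a
 * local "this" is defined from it (and redefined if the object changes).
 * Returns 1 on success and 0 if memory could not be allocated. */
int ajDemoStrAssS(AjDemoStr **pthis, const char *txt)
{
    AjDemoStr *this;
    size_t len;
    char *copy;

    assert(pthis != NULL && txt != NULL);
    this = *pthis;
    len = strlen(txt);
    copy = malloc(len + 1);
    if (copy == NULL)
        return 0;
    memcpy(copy, txt, len + 1);

    if (this == NULL) {              /* nothing to reuse: create the object */
        this = malloc(sizeof *this);
        if (this == NULL) {
            free(copy);
            return 0;
        }
        *pthis = this;               /* redefine the caller's pointer */
    } else {
        free(this->txt);             /* object is reused: delete old value */
    }
    this->txt = copy;
    this->len = len;
    return 1;
}

/* Destructor for the hypothetical object. */
void ajDemoStrDel(AjDemoStr *this)
{
    if (this != NULL) {
        free(this->txt);
        free(this);
    }
}
```

A caller would start with a NULL pointer, pass its address to ajDemoStrAssS, and may call the same function again later to reuse the object.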

Object Classes

One class will be defined in a single source file with the name ajobj.c and an include file ajobj.h

Destructors

The destructor will be:
ajObjDel (ajObj *this)
and must be the first routine in the source file.

Additional destructors may be needed, e.g. for arrays of objects where the array size will be needed to delete all objects.
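A minimal sketch of a destructor and an additional array destructor is given here. AjDemoObj, ajDemoObjDel, ajDemoObjDelArray, ajDemoObjNew and the counter ajDemoObjLive are all hypothetical names chosen for illustration. The counter exists only so that the behaviour can be checked, and the array is assumed to be a heap-allocated array of object pointers.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical object type standing in for "ajObj". */
typedef struct AjDemoObj {
    int value;
} AjDemoObj;

/* Number of live objects; kept only so the sketch can be verified. */
static int ajDemoObjLive = 0;

/* Destructor: the first routine in the source file. */
void ajDemoObjDel(AjDemoObj *this)
{
    if (this == NULL)
        return;
    free(this);
    ajDemoObjLive--;
}

/* Additional destructor for an array of object pointers. The array size
 * must be supplied because it cannot be recovered from the pointer. */
void ajDemoObjDelArray(AjDemoObj **array, int size)
{
    int i;

    assert(size >= 0);
    if (array == NULL)
        return;
    for (i = 0; i < size; i++) {
        ajDemoObjDel(array[i]);
        array[i] = NULL;
    }
    free(array);
}

/* Default constructor: follows the destructors in the source file. */
AjDemoObj *ajDemoObjNew(void)
{
    AjDemoObj *this = calloc(1, sizeof *this);

    if (this != NULL)
        ajDemoObjLive++;
    return this;
}
```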

For pointers to objects:

Constructors

The default constructor will be:
ajObj *ajObjNew (void)
and must immediately follow the destructor(s) in the source file.

Additional constructors will have the argument types listed after "New" as for C++ constructor resolution.

Other Object generators must call a constructor and then make whatever changes they need.

For pointers to objects:

If any constructor fails, throw an error message.
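The constructor rules above might look like this in practice. This is a sketch only: AjDemoObj, ajDemoDie and the constructors are hypothetical, the suffix letters "I" (int) and "S" (C string) are assumed for illustration, and since plain C has no exceptions, "throw an error message" is interpreted here as printing to stderr and exiting.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical object type standing in for "ajObj". */
typedef struct AjDemoObj {
    int size;
    char *name;
} AjDemoObj;

/* Assumed error policy: report the failure and stop the program. */
static void ajDemoDie(const char *msg)
{
    fprintf(stderr, "Error: %s\n", msg);
    exit(EXIT_FAILURE);
}

/* Destructor comes first in the source file. */
void ajDemoObjDel(AjDemoObj *this)
{
    if (this != NULL) {
        free(this->name);
        free(this);
    }
}

/* Default constructor: zero-initialised object. */
AjDemoObj *ajDemoObjNew(void)
{
    AjDemoObj *this = calloc(1, sizeof *this);

    if (this == NULL)
        ajDemoDie("ajDemoObjNew: out of memory");
    return this;
}

/* Additional constructor, argument type "I" (int) listed after "New".
 * It calls the default constructor and then makes its own changes. */
AjDemoObj *ajDemoObjNewI(int size)
{
    AjDemoObj *this = ajDemoObjNew();

    this->size = size;
    return this;
}

/* Additional constructor, argument type "S" (C string). */
AjDemoObj *ajDemoObjNewS(const char *name)
{
    AjDemoObj *this;

    assert(name != NULL);
    this = ajDemoObjNew();
    this->name = malloc(strlen(name) + 1);
    if (this->name == NULL)
        ajDemoDie("ajDemoObjNewS: out of memory");
    strcpy(this->name, name);
    return this;
}
```

Routing every generator through the default constructor keeps initialisation in one place, so a new field only needs a default in ajDemoObjNew.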

Iterators

Where useful, provide iterator functions such that:

All of these should be defined as functions, even for trivial cases where a macro may seem more efficient, to make maintenance easier.

Candidates would be Strings, Lists.
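As an illustration of iterators written as real functions rather than macros, here is a small sketch for a singly linked list. The types AjDemoNode, AjDemoList and AjDemoIter and the functions ajDemoListIter, ajDemoIterMore and ajDemoIterNext are hypothetical; the specification does not name the iterator functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical singly linked list of ints. */
typedef struct AjDemoNode {
    int item;
    struct AjDemoNode *next;
} AjDemoNode;

typedef struct AjDemoList {
    AjDemoNode *first;
} AjDemoList;

/* Iterator state: the node that will be returned next. */
typedef struct AjDemoIter {
    const AjDemoNode *here;
} AjDemoIter;

/* Creates an iterator positioned at the start of the list. */
AjDemoIter ajDemoListIter(const AjDemoList *this)
{
    AjDemoIter iter;

    iter.here = (this != NULL) ? this->first : NULL;
    return iter;
}

/* Returns non-zero while there are items left to visit. */
int ajDemoIterMore(const AjDemoIter *this)
{
    return this->here != NULL;
}

/* Returns the current item and advances to the following node. */
int ajDemoIterNext(AjDemoIter *this)
{
    int item;

    assert(this->here != NULL);
    item = this->here->item;
    this->here = this->here->next;
    return item;
}
```

A typical loop would be: create the iterator with ajDemoListIter, then call ajDemoIterNext while ajDemoIterMore returns non-zero. Even though each function is trivial, keeping them as functions means the list representation can change without touching callers.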

Classes

The following classes will be included in the first release of the ajax library. Others will be added as development proceeds.

Overview

EMBOSS applications use a command-line interface which is defined by a "command line definition". The current plan is for this to be specified in ICARUS, although alternatives such as a simple text file could be considered. Any information to be specified by the user must map to some known internal data structure, such as a file, a sequence, or an integer within a given range. Where new data items are required, there will be scope to define these through text strings until the data structures and the routines that handle them are clearly defined.

Certain core data structures will be modelled on those in the draft ANSI C++ standard and similar structures used in other packages. Examples of these are:

Differences between ANSI C and (draft) ANSI C++ include:

Strings

SRS 5.1 has a tried and tested set of string handling routines that provides dynamic allocation of reference-counted strings.

The code is distributed with SRS in files $SRSSOU/strv.c and $SRSSOU/strv.h and documented in the source code.

Data objects are called "STRv" (virtual strings). They are pointers to "STRo" (string objects) that have the same properties as C++ strings but in addition are reference counted so that multiple occurrences can point to a single text string in memory, but any individual string can still be modified.
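The idea of reference-counted "virtual" strings can be sketched as below. This is not the SRS implementation: DemoStrObj, DemoStr and the demoStr... functions are hypothetical names, and the layout is an assumption made for illustration. The sketch shows that a virtual copy only increments a use count, while a modification of a shared string first detaches a private copy.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical string object; NOT the actual SRS STRo layout. */
typedef struct DemoStrObj {
    int use;       /* number of handles sharing this object */
    size_t len;    /* length of txt, excluding the terminating NUL */
    char *txt;     /* heap-allocated, NUL-terminated text */
} DemoStrObj;

typedef DemoStrObj *DemoStr;   /* the "virtual string" handle */

/* Creates a new string holding a real copy of a C string. */
DemoStr demoStrNewS(const char *src)
{
    DemoStr s;
    size_t len;

    assert(src != NULL);
    len = strlen(src);
    s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    s->txt = malloc(len + 1);
    if (s->txt == NULL) {
        free(s);
        return NULL;
    }
    memcpy(s->txt, src, len + 1);
    s->len = len;
    s->use = 1;
    return s;
}

/* Virtual copy: shares the text and bumps the use count. */
DemoStr demoStrCopy(DemoStr src)
{
    src->use++;
    return src;
}

/* Releases one handle; the buffer is freed with the last one. */
void demoStrDel(DemoStr s)
{
    if (s == NULL)
        return;
    if (--s->use == 0) {
        free(s->txt);
        free(s);
    }
}

/* Appends a C string. A shared string is detached into a private copy
 * first, so other handles keep the old value. Returns 1 on success. */
int demoStrAppS(DemoStr *ps, const char *extra)
{
    DemoStr s = *ps;
    size_t add = strlen(extra);
    char *txt;

    if (s->use > 1) {
        DemoStr own = demoStrNewS(s->txt);
        if (own == NULL)
            return 0;
        s->use--;              /* this handle leaves the shared object */
        *ps = s = own;
    }
    txt = realloc(s->txt, s->len + add + 1);
    if (txt == NULL)
        return 0;
    memcpy(txt + s->len, extra, add + 1);
    s->txt = txt;
    s->len += add;
    return 1;
}
```

Modifying functions take a pointer to the handle because the handle itself may be replaced when the string is detached.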

AJAX code will be similar to this, but will have an independently developed library to retain control over the future direction of the library.

Special issues are:

Examples:

/* allocation */
STRv StrNew();
Creates a new STRv containing the empty string (virtually copied). See STRv creators: StrNew, StrCpy, StrCpyS, StrTemp.
STRv StrCpy(STRv src);
Creates a new STRv containing a virtual copy of the given STRv. See STRv creators: StrNew, StrCpy, StrCpyS, StrTemp.
void StrDel(STRv val);
Deletes a STRv (a virtual copy of the value). If the given STRv was the only instance of the value in use, the value buffer is freed. Must be used whenever an assigned STRv goes out of scope.
void StrGrow(STRv* val, int dlen);
Reserves space so that following operations on the STRv can modify its value to a longer string. This is just an optimization, as STRv's grow automatically if the value length exceeds the allocated buffer. See STRv optimizers: StrGrow, StrShrink.
void StrShrink(STRv* val);
Optimizes the value buffer to have exactly the length needed to contain the string value. See STRv optimizers: StrGrow, StrShrink.
void StrClear(STRv* s);
If the STRv buffer is unique, resets the content of the STRv to "" without freeing the buffer.
STRv StrSub(STRv s, int pos, int len);
Creates a new STRv containing a substring of the given STRv. See STRv creators: StrNew, StrCpy, StrCpyS, StrTemp.
STRv StrLeft(STRv s, int pos);
Creates a new STRv containing the characters of the given STRv to the left of the given position (excluding the one at the position). See STRv creators: StrNew, StrCpy, StrCpyS, StrTemp.
STRv StrRight(STRv s, int pos);
Creates a new STRv containing the characters of the given STRv to the right of the given position (including the one at the position). See STRv creators: StrNew, StrCpy, StrCpyS, StrTemp.
/* operations */
void StrSet(STRv* dest, STRv src);
Assigns to one STRv the value of another STRv (virtual copy). The old value of the STRv is released. See STRv modifiers: StrSet, StrIns, StrApp, StrAdd, StrCut. To assign a char* value instead, see StrSetS.
void StrIns(STRv* dest, STRv src);
Inserts in one STRv the value of another STRv at the beginning of the string. The destination STRv buffer is grown if needed. The old value of the STRv is released. See STRv modifiers: StrSet, StrIns, StrApp, StrAdd, StrCut. To insert a char* value instead, see StrInsS.
void StrApp(STRv* dest, STRv src);
Appends to one STRv the value of another STRv. The destination STRv buffer is grown if needed. The old value of the STRv is released. See STRv modifiers: StrSet, StrIns, StrApp, StrAdd, StrCut. To append a char* value instead, see StrAppS.
void StrAdd(STRv* dest, int pos, STRv src);
Inserts in one STRv the value of another STRv at a certain position. The position must be between 0 (works like StrIns) and the length of the string (works like StrApp). If outside this range, a PosOutOfBounds exception is thrown. The destination STRv buffer is grown if needed. The old value of the STRv is released. See STRv modifiers: StrSet, StrIns, StrApp, StrAdd, StrCut. To insert a char* value instead, see StrAddS.
void StrCut(STRv* dest, int pos, int len);
Cuts from one STRv the characters starting at a certain position and for a given length. The position must be between 0 and the length of the string. If outside this range, a PosOutOfBounds exception is thrown. The given length can be longer than the length of the string; in that case it is adjusted to the string length. The destination STRv buffer keeps its length, and must be explicitly trimmed with StrShrink if needed. The old value of the STRv is released. See STRv modifiers: StrSet, StrIns, StrApp, StrAdd, StrCut.
void StrSubst(STRv* dest, int pos, int len, STRv subst);
Substitutes a range of characters in one STRv, at a given position and of a certain length, with the value of a given STRv.
void StrDebug(STRv s);
Debug function to print the status of a STRv on stdout.
void StrUpper(STRv* s);
Converts a STRv into upper case.
void StrLower(STRv* s);
Converts a STRv into lower case.
/* queries */
BOOL StrEmpty(STRv str);
Returns whether the string is empty.
int StrLen(STRv str);
Returns the length of the string value.
BOOL StrEqual(STRv s1, STRv s2);
Compares 2 STRv's for equal values.
int StrCmp(STRv s1, STRv s2);
Compares 2 STRv's for lexicographic ordering.
int StrHash(STRv val, int max);
Calculates a hash value for a STRv.
/* user */
char* StrConst(STRv val);
Defines a STRv value to become a constant that is never released. The returned char* cannot be used for modifying the string, and it is guaranteed to be stable.
char* StrVal(STRv val);
Returns a C string containing a (real) copy of the STRv value. The user can do anything with it and has the responsibility of freeing it.
char* StrGet(char* dest, int sz, STRv src);
Writes the STRv value into an allocated C char buffer of a given size. If there is enough space in the buffer, the string will be null-terminated; otherwise only the portion which fits is copied.
#define Str(val) ((val)->arr)
#define _Str(val) ((val)->arr)
int StrShared(STRv s);
Returns the number of virtual copies of the given STRv (not counting the given copy).
/* c string operations */
STRv StrTemp(char* src);
Creates a temporary STRv from a C string. This STRv doesn't need to be deleted with StrDel, but the user is not entitled to perform modifying operations on this string (for example StrAppS(&StrTemp("bla"),"alb") is illegal). Basically, the only legal operation is to use StrTemp where the STRv is not modified. The maximum number of active temporary STRv is defined by the macro STRTEMP_NUM.
STRv StrCpyS(char* src);
Creates a new STRv from a C string (char*).
void StrSetS(STRv* dest, char* src);
Assigns to a STRv the value of a C string (char*). The old value of the STRv is released.
void StrInsS(STRv* dest, char* src);
Inserts at the beginning of a STRv the value of a C string (char*). The old value of the STRv is released.
void StrAppS(STRv* dest, char* src);
Appends to a STRv the value of a C string (char*). The old value of the STRv is released.
void StrAppN(STRv* dest, char* src, int len);
Appends to a STRv a given number of chars of a C string (char*). If the string is shorter, the number of chars is the length of the source string. The old value of the STRv is released.
void StrAddS(STRv* dest, int pos, char* src);
Inserts in a STRv the value of a C string (char*) at a given position. The old value of the STRv is released.
BOOL StrEqualS(STRv s1, char* s2);
Compares the value of a STRv with the value of a C string (char*).
int StrCmpS(STRv s1, char* s2);
Compares the value of a STRv with the value of a C string (char*) for lexicographic ordering.
int StrHashS (char* val, int dim);
Calculates a hash value for a C string.
STRv StrNCpyS(char* s, int len);
Returns a STRv containing the first 'len' characters of a C string (char*).
void StrNSetS(STRv* dest, char* s, int len);
Assigns to a STRv the first 'len' characters of a C string (char*). The old value of the STRv is released.
/* low-level */
STRv StrBufNew(int dim);
void StrBufChange(STRv* val, int beg, int end);
/* conversions */
STRv StrFromInt(int i);
Creates a STRv from an int value.
int StrToInt(STRv s);
Converts a STRv to an integer. The function uses atoi for the conversion.
STRv StrPtr(void* p); /* obsolete: not in *.c */
void* PtrStr(STRv s); /* obsolete: not in *.c */
/* functions similar to Buff... */
void StrCutLF (STRv *s);
Removes a line feed at the end of the string.
STRv StrPrintf (STRv* str, char *format, ...);
Formatted print into a STRv.
STRv StrEncode (STRv *str);
Changes all non-printable characters into printable escape expressions, in C format (like \n -> "\n"). See also: StrDecode, StrTranslate.
void StrDecode (STRv* str);
Converts C format escape expressions into ASCII codes (like "\n" -> \n). See also: StrEncode, StrTranslate.
void StrTranslate(STRv* s, char* from, char* to);
Changes all characters of the string which are contained in the 'from' character set into the corresponding character in the 'to' character set. The 2 character sets must have the same length. See also: StrEncode, StrDecode.
void StrFill(STRv* dest, char c, Int4 len);
Appends a certain number of fill characters to a STRv.
void StrTrim (STRv *s, char *skipSet);
Removes leading and trailing spaces or, if specified, other characters.
STRv StrFormat (STRv str, INT4 lineSize, INT4 firstIdent, INT4 allIdent, char *leftText);
Formats a string so that it fits on a device with fixed line size. Additional arguments control indentation and prefix.
STRv StrReplace (STRV *dest, char *from, char *to); /* not in strv.h */
Replaces all occurrences of the "from" substring with the "to" string.
/* for string initialization */
typedef struct { struct { int u; int l; int d; } DontInitalize; char arr[20]; } STRoStatic_20;
typedef struct { struct { int u; int l; int d; } DontInitalize; char arr[100]; } STRoStatic_100;
typedef struct { struct { int u; int l; int d; } DontInitalize; char arr[1000]; } STRoStatic_1000;
void StrInitStatic(STRv list, int size, int num);
Initialises an array of num "STRv"s which already contain character strings. Apparently used in early parts of the Object Manager (blub.c).
#define _StrInitStatic(list,num) StrInitStatic((STRv)(list),sizeof((list)[0]),num)
Front end to StrInitStatic that generates the "size" argument automatically.

Command Line

This is a key area, as it is one where EMBOSS is very different.

AJAX prerelease libraries

Libraries in Other Packages

SRS libraries

For comparison with the Ajax library design, SRS 5.1 has the following:

AceLib

Uses its own memory management (aceHandle to allocate, and aceFree to free) with "AceHandles" to refer to data structures. "AceHandle" is typedef'd as "void*". Probably most useful for anything that needs to link to acedb rather than directly in AJAX. Expect this to become involved in, for example, reading sequences from acedb databases, but perhaps as a separate filter program.

There are also other acedb functions which could be useful and should be explored later.

GNU C Library

Could be interesting. Documentation for release 1.09.1 is available at Sanger.

Other C libraries

Hanson, D.R. "C Interfaces and Implementations: Techniques for Creating Reusable Software"

Has examples of some interesting ideas and C code to implement them, especially for interesting data structures. Could be an alternative, or a supplement, to the SRS libraries.

C++

Refs:

Constants

Classes

Major changes in 1994.

new
constructors
typeinfo
RTTI
ios
I/O streams - also streambuf, istream, ostream, iomanip, stringstream, sstream. Obsolete: fstream, iostream. Deprecated: strstream.
string
strings - also wstring
bits
bit patterns - also bitstring for variable length
dynarray
dynamic arrays - also ptrdynarray
complex
complex arithmetic

C++ STL

Containers
bitset, vector, list, deque, queue, priority_queue, stack, set, map
Iterators
Pointers that iterate through containers using begin (first) & end (1 after last) functions

Algorithms

Refs:

String matching

common substrings

array searching

eigenvalues/eigenvectors

curve fitting

topological network element sort

critical path

Spanning tree

Sorting

Data structures