RISCOS.com

www.riscos.com Technical Support:
Acorn C/C++

 

C implementation details


This chapter is split into parts, each of which details certain aspects of Acorn C's implementation of the ANSI C standard.

  • The first part - Implementation details - gives details of those aspects of the compiler which the ANSI standard identifies as implementation-defined, and some other points of interest to programmers. They are grouped by subject; the section Implementation limits lists the points required to be documented as set out in appendix A.6 of the standard.
  • The second part - Standard implementation definition - discusses aspects of the compiler which are not defined by the ANSI standard, but are implementation-defined and must be documented.

    Appendix A.6 of the standard X3.159-1989 collects together information about portability issues; section A.6.3 lists those points which are implementation defined, and directs that each implementation shall document its behaviour in each of the areas listed. This part corresponds to appendix A.6.3, answering the points listed in the appendix, under the same headings and in the same order.

  • The third part - Extra features - describes some machine-specific features of the Acorn C compiler: #pragma directives, and special declaration keywords for functions and variables.

Implementation details

Identifiers

Identifiers can be of any length. They are truncated by the compiler to 256 characters, all of which are significant (the standard requires a minimum of 31).

The source character set expected by the compiler is 7 bit ASCII, except that within comments, string literals, and character constants, the full ISO 8859-1 8 bit character set is recognised. At run time, the C library processes the full ISO 8859-1 8 bit character set, except that the default locale is the C locale (see the chapter entitled Standard implementation definition). The ctype functions therefore all return 0 when applied to codes in the range 160-255. By calling setlocale(LC_CTYPE,"ISO8859-1") you can cause the ctype functions such as isupper() and islower() to behave as expected over the full 8 bit Latin alphabet, rather than just over the 7 bit ASCII subset.
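The effect of the locale on the ctype functions can be observed with a small sketch like the one below. The helper names count_upper_codes and select_latin1_ctype are hypothetical; only setlocale(), isupper() and the locale name come from the text above. Other C libraries may not recognise the name "ISO8859-1", in which case setlocale() returns NULL and the locale is left unchanged.

```c
#include <ctype.h>
#include <locale.h>

/* Count the character codes 0..255 that isupper() accepts in the
   current LC_CTYPE locale. In the default C locale this is 26. */
int count_upper_codes(void)
{
    int code, count = 0;
    for (code = 0; code <= 255; code++) {
        if (isupper(code)) count++;
    }
    return count;
}

/* Try to select the Latin-1 ctype tables using the locale name quoted
   in the text. Returns 1 on success, 0 if the name is not recognised. */
int select_latin1_ctype(void)
{
    return setlocale(LC_CTYPE, "ISO8859-1") != NULL;
}
```

Calling count_upper_codes() before and after a successful select_latin1_ctype() shows whether the codes above 127 have started to be classified.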

Upper and lower case characters are distinct in all identifiers, both internal and external.

In -pcc and -fc modes an identifier may also contain a dollar character.

Data elements

The sizes of data elements are as follows:

Type Size in bits
char 8
short 16
int 32
long 32
float 32
double 64
long double 64 (subject to future change)
all pointers 32

Integers are represented in two's complement form.

Data items of type char are unsigned by default, though they may be explicitly declared as signed char or unsigned char. (In -pcc mode there is no signed keyword, so chars are signed by default and may be declared unsigned if required.) Single-character constants are thus always positive.
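The sizes and the signedness of plain char can be checked from within a program rather than assumed. The following is a minimal sketch; the function names are hypothetical and only limits.h facilities are used.

```c
#include <limits.h>
#include <stddef.h>

/* Width of a type in bits, given its size in bytes from sizeof. */
unsigned long width_in_bits(size_t size_in_bytes)
{
    return (unsigned long)size_in_bytes * CHAR_BIT;
}

/* Non-zero if plain char behaves as an unsigned type on this compiler. */
int plain_char_is_unsigned(void)
{
    return CHAR_MIN == 0;
}
```

With the sizes tabulated above, width_in_bits(sizeof(long)) would be expected to give 32, and plain_char_is_unsigned() to return non-zero outside -pcc mode.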

Floating point quantities are stored in the IEEE format. In double and long double quantities, the word containing the sign, the exponent and the most significant part of the mantissa is stored at the lower machine address.

Limits: limits.h and float.h

The standard defines two header files, limits.h and float.h, which contain constant declarations describing the ranges of values which can be represented by the arithmetic types. The standard also defines minimum values for many of these constants.

The following table sets out the values in these two headers on the ARM, and a brief description of their significance. See the standard for a full definition of their meanings.

Number of bits in smallest object that is not a bit field (i.e. a byte):

CHAR_BIT 8

Maximum number of bytes in a multibyte character, for any supported locale:

MB_LEN_MAX 1

Numeric ranges of integer types:

The middle column gives the numerical value of each range's endpoint, while the right hand column gives the bit patterns (in hexadecimal) that would be interpreted as this value in C. When entering constants you must be careful about the size and signed-ness of the quantity. Furthermore, constants are interpreted differently in decimal and hexadecimal/octal. See the ANSI standard or any of the recommended textbooks on the C programming language for more details.

Range End-point Hex representation
CHAR_MAX 255 0xff
CHAR_MIN 0 0x00
SCHAR_MAX 127 0x7f
SCHAR_MIN -128 0x80
UCHAR_MAX 255 0xff
SHRT_MAX 32767 0x7fff
SHRT_MIN -32768 0x8000
USHRT_MAX 65535 0xffff
INT_MAX 2147483647 0x7fffffff
INT_MIN -2147483648 0x80000000
UINT_MAX 4294967295 0xffffffff
LONG_MAX 2147483647 0x7fffffff
LONG_MIN -2147483648 0x80000000
ULONG_MAX 4294967295 0xffffffff
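The warning about decimal and hexadecimal constants can be illustrated as follows. Under the ANSI rules an unsuffixed hexadecimal constant that does not fit in int takes the type unsigned int, whereas a decimal constant moves on to long. The sketch below assumes a 32 bit int, as in the table; the function names are hypothetical.

```c
#include <limits.h>

/* With a 32 bit int, the unsuffixed constant 0x80000000 does not fit in
   int, so it has type unsigned int. Negating an unsigned value wraps
   modulo 2^32, so the result is still positive, not INT_MIN. */
int negated_hex_constant_is_positive(void)
{
    return -0x80000000 > 0;
}

/* INT_MIN cannot be written as a single negative literal, so it is
   commonly built from an expression such as this one (this assumes a
   two's complement representation). */
int int_min_from_expression(void)
{
    return -INT_MAX - 1;
}
```

The bit pattern 0x80000000 in the table therefore describes how INT_MIN is stored, not how it should be written in source code.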

Characteristics of floating point:

FLT_RADIX 2
FLT_ROUNDS 1

The base (radix) of the ARM floating point number representation is 2, and floating point addition rounds to nearest.

Ranges of floating types:

FLT_MAX 3.40282347e+38F
FLT_MIN 1.17549435e-38F
DBL_MAX 1.79769313486231571e+308
DBL_MIN 2.22507385850720138e-308
LDBL_MAX 1.79769313486231571e+308
LDBL_MIN 2.22507385850720138e-308

Ranges of base two exponents:

FLT_MAX_EXP 128
FLT_MIN_EXP (-125)
DBL_MAX_EXP 1024
DBL_MIN_EXP (-1021)
LDBL_MAX_EXP 1024
LDBL_MIN_EXP (-1021)

Ranges of base ten exponents:

FLT_MAX_10_EXP 38
FLT_MIN_10_EXP (-37)
DBL_MAX_10_EXP 308
DBL_MIN_10_EXP (-307)
LDBL_MAX_10_EXP 308
LDBL_MIN_10_EXP (-307)

Decimal digits of precision:

FLT_DIG 6
DBL_DIG 15
LDBL_DIG 15

Digits (base two) in mantissa:

FLT_MANT_DIG 24
DBL_MANT_DIG 53
LDBL_MANT_DIG 53

Smallest positive values such that (1.0 + x != 1.0):

FLT_EPSILON 1.19209290e-7F
DBL_EPSILON 2.2204460492503131e-16
LDBL_EPSILON 2.2204460492503131e-16L
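The meaning of the epsilon values can be demonstrated with a short sketch, assuming IEEE arithmetic with round-to-nearest as described above. The function names are hypothetical; the volatile temporaries are there only to force the sums to be stored at their declared precision.

```c
#include <float.h>

/* Adding FLT_EPSILON to 1.0f gives a float that differs from 1.0f. */
int flt_epsilon_is_visible(void)
{
    volatile float sum = 1.0f + FLT_EPSILON;
    return sum != 1.0f;
}

/* Half of DBL_EPSILON is lost when added to 1.0: the exact sum lies
   halfway between two doubles and rounds back to 1.0. */
int half_dbl_epsilon_is_lost(void)
{
    volatile double sum = 1.0 + DBL_EPSILON / 2.0;
    return sum == 1.0;
}
```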

Structured data types

The standard leaves details of the layout of the components of structured data types to each implementation. The following points apply to the Acorn C compiler:

  • Structures are aligned on word boundaries.
  • Structures are arranged with the first-named component at the lowest address.
  • A component with a char type is packed into the next available byte.
  • A component with a short type is aligned to the next even-addressed byte.
  • All other arithmetic type components are word-aligned, as are pointers and ints containing bitfields.
  • The only valid types for bitfields are (signed) int and unsigned int. (In -pcc mode, char, unsigned char, short, unsigned short, long and unsigned long are also accepted.)
  • A bitfield of type int is treated as unsigned by default (signed by default in -pcc mode).
  • A bitfield must be wholly contained within the 32 bits of an int.
  • Bitfields are allocated within words so that the first field specified occupies the lowest addressed bits of the word. (When configured little-endian, lowest addressed means least significant; when configured big-endian, lowest addressed means most significant.)
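The alignment rules above can be inspected with offsetof. The structure below is a made-up example, not taken from any Acorn header, and the accessor functions are hypothetical.

```c
#include <stddef.h>

struct sample {
    char  tag;     /* packed into the next available byte */
    short count;   /* aligned to the next even address */
    char  flag;    /* packed into the next available byte */
    int   total;   /* word-aligned */
};

size_t offset_of_count(void) { return offsetof(struct sample, count); }
size_t offset_of_flag(void)  { return offsetof(struct sample, flag); }
size_t offset_of_total(void) { return offsetof(struct sample, total); }
size_t size_of_sample(void)  { return sizeof(struct sample); }
```

Applying the rules listed above by hand suggests offsets of 2, 4 and 8 and a total size of 12 bytes; other compilers are free to choose differently, which is why code should use offsetof rather than fixed numbers.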

Pointers

The following remarks apply to pointer types:

  • Adjacent bytes have addresses which differ by one.
  • The macro NULL expands to the value 0.
  • Casting between integers and pointers results in no change of representation.
  • The compiler warns of casts between pointers to functions and pointers to data (but not in -pcc mode).

Pointer subtraction

When two pointers are subtracted, the difference is obtained as if by the expression:

((int)a - (int)b) / (int)sizeof(type pointed to)

If the pointers point to objects whose size is no greater than four bytes, word alignment of data ensures that the division will be exact in all cases. For longer types, such as doubles and structures, the division may not be exact unless both pointers are to elements of the same array. Moreover the quotient may be rounded up or down at different times, leading to potential inconsistencies.
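The rule can be modelled in portable C as shown below. This is only a sketch: it uses char pointers and ptrdiff_t in place of the int casts in the expression above, so that it also behaves sensibly where pointers are wider than int, and the function name is hypothetical.

```c
#include <stddef.h>

/* Byte distance between two pointers divided by the element size.
   The result is exact when both pointers refer to elements of the
   same array. */
ptrdiff_t element_difference(const double *a, const double *b)
{
    ptrdiff_t bytes = (const char *)a - (const char *)b;
    return bytes / (ptrdiff_t)sizeof(double);
}
```

For two elements of the same array, such as &v[5] and &v[2], this agrees with the built-in subtraction &v[5] - &v[2].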

Arithmetic operations

The compiler performs all of the 'usual arithmetic conversions' set out in the standard.

The following points apply to operations on the integral types:

  • All signed integer arithmetic uses a two's complement representation.
  • Bitwise operations on signed integral types follow the rules which arise naturally from two's complement representation.
  • Right shifts on signed quantities are arithmetic.
  • Any quantity which specifies the amount of a shift is treated as an unsigned 8 bit value.
  • Any value to be shifted is treated as a 32 bit value.
  • Left shifts of more than 31 give a result of zero.
  • Right shifts of more than 31 give a result of zero from a shift of an unsigned or positive signed value; they yield -1 from a shift of a negative signed value.
  • The remainder on integer division has the same sign as the dividend, as defined for the function div().
  • If a value of integral type is truncated to a shorter signed integral type, the result is obtained by masking the original value to the length of the destination, and then sign extending.
  • Conversions between integral types never cause an exception to be raised.
  • Integer overflow does not cause an exception to be raised.
  • Integer division by zero causes an exception to be raised.
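Several of the integer rules above (shift amounts, large shifts and truncation) describe behaviour that ANSI C leaves undefined or implementation-defined, so they cannot be tested directly in portable code. The functions below are hypothetical models of the documented results, written so that they avoid shifts by 32 or more and right shifts of negative values. They assume a two's complement representation and reflect one reading of the list above.

```c
/* Left shift: the amount is reduced to 8 bits and the value to 32 bits;
   any amount above 31 gives zero. */
unsigned long model_shift_left(unsigned long value, unsigned int amount)
{
    amount &= 0xFFu;
    value &= 0xFFFFFFFFul;
    if (amount > 31u) return 0ul;
    return (value << amount) & 0xFFFFFFFFul;
}

/* Arithmetic right shift: copies of the sign bit are shifted in, so
   amounts above 31 give 0 for non-negative values and -1 otherwise. */
long model_shift_right_signed(long value, unsigned int amount)
{
    amount &= 0xFFu;
    if (amount > 31u) return value < 0 ? -1L : 0L;
    if (value < 0) return ~(~value >> amount);
    return value >> amount;
}

/* Truncation to signed char: mask to 8 bits, then sign extend. */
int model_truncate_to_schar(long value)
{
    int low = (int)(value & 0xFFL);
    return low >= 0x80 ? low - 0x100 : low;
}
```

For instance, model_truncate_to_schar(0x1FF) keeps the low byte 0xFF and sign extends it to -1.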

The following points apply to operations on floating types:

  • When a double or long double is converted to a float, rounding is to the nearest representable value.
  • Conversions from floating to integral types cause exceptions to be raised only if the value cannot be represented in a long int (or unsigned long int in the case of conversion to an unsigned int).
  • Floating point underflow is not detected; any operation which underflows returns zero.
  • Floating point overflow causes an exception to be raised.
  • Floating point divide by zero causes an exception to be raised.

Expression evaluation

The compiler performs the 'usual arithmetic conversions' (promotions) set out in the standard before evaluating any expression.

  • The compiler may re-order expressions involving only associative and commutative operators of equal precedence, even in the presence of parentheses (e.g. a + (b - c) may be evaluated as (a + b) - c).
  • Between sequence points, the compiler may evaluate expressions in any order, regardless of parentheses. Thus the side effects of expressions between sequence points may occur in any order.
  • Similarly, the compiler may evaluate function arguments in any order.
  • Any detail of order of evaluation not prescribed by the standard may vary between releases of the Acorn C compiler.
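Where the order of side effects matters, it can be fixed by splitting the expression into separate statements, each of which ends at a sequence point. The following sketch uses hypothetical names to show the idea.

```c
static int call_count = 0;

/* A function with a side effect: each call returns the next integer. */
static int next_value(void)
{
    return ++call_count;
}

/* Writing next_value() - next_value() would leave the order of the two
   calls to the compiler. Separate statements introduce sequence points,
   so the order below is fixed and the result is always -1. */
int ordered_difference(void)
{
    int first = next_value();
    int second = next_value();
    return first - second;
}
```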

Implementation limits

The standard sets out certain minimum translation limits which a conforming compiler must cope with; you should be aware of these if you are porting applications to other compilers. A summary is given here. The 'mem' limit indicates that no limit is imposed other than that of available memory.

Description Requirement Acorn C
Nesting levels of compound statements and iteration/selection control structures 15 mem
Nesting levels of conditional compilation 8 mem
Declarators modifying a basic type 31 mem
Expressions nested by parentheses 32 mem
Significant characters    
in internal identifiers and macro names 31 256
in external identifiers 6 256
External identifiers in one source file 511 mem
Identifiers with block scope in one block 127 mem
Macro identifiers in one source file 1024 mem
Parameters in one function definition/call 31 mem
Parameters in one macro definition/invocation 31 mem
Characters in one logical source line 509 no limit
Characters in a string literal 509 mem
Bytes in a single object 32767 mem
Nesting levels for #included files 8 mem
Case labels in a switch statement 257 mem
Members in a single struct or union, enumeration constants in a single enum 127 mem
Nesting of struct/union in a single declaration 15 mem

Standard implementation definition

Translation (A.6.3.1)

Diagnostic messages produced by the compiler are of the form

"source-file", line #: severity: explanation

where severity is one of

  • warning: not a diagnostic in the ANSI sense, but an attempt by the compiler to be helpful to you.
  • error: a violation of the ANSI specification from which the compiler was able to recover by guessing your intentions.
  • serious error: a violation of the ANSI specification from which no recovery was possible because the compiler could not reliably guess what you intended.
  • fatal (for example, 'not enough memory'): not really a diagnostic, but an indication that the compiler's limits have been exceeded or that the compiler has detected a fault in itself.

Environment (A.6.3.2)

The mapping of a command line from the ARM-based environment into arguments to main() is implementation-specific. The shared C library supports the following:

  • The arguments given to main() are the words of the command line (not including I/O redirections, covered below), delimited by white space, except where the white space is contained in double quotes. A white space character is any character for which isspace() is true. (Note that the RISC OS Command Line Interpreter filters out some of these.)

A double quote or backslash character (\) inside double quotes must be preceded by a backslash character. An I/O redirection will not be recognised inside double quotes.

The shared C library supports a pair of interactive devices, both called :tt, that handle the keyboard and the VDU screen:

  • No buffering is done on any stream connected to :tt unless I/O redirection has taken place. If I/O redirection other than to :tt has taken place, full file buffering is used except where both stdout and stderr have been redirected to the same file, in which case line buffering is used.

Using the shared C library, the standard input, output and error streams, stdin, stdout, and stderr can be redirected at runtime in the ways shown below. For example, if mycopy is a compiled and linked program which simply copies the standard input to the standard output, the following line:

*mycopy < infile > outfile 2> errfile

runs the program, redirecting stdin to the file infile, stdout to the file outfile and stderr to the file errfile.
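The source of mycopy is not given here; a minimal sketch of the copying loop such a program might use is shown below. The function name copy_stream is hypothetical.

```c
#include <stdio.h>

/* Copy everything from in to out. Returns the number of characters
   copied, or -1 if a write fails. */
long copy_stream(FILE *in, FILE *out)
{
    long copied = 0;
    int ch;
    while ((ch = getc(in)) != EOF) {
        if (putc(ch, out) == EOF) return -1;
        copied++;
    }
    return copied;
}
```

A main() that calls copy_stream(stdin, stdout) needs no knowledge of the redirections, since they are applied to the standard streams before main() is entered.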

The following shows the allowed redirections:

0< filename read stdin from filename
< filename read stdin from filename
1> filename write stdout to filename
> filename write stdout to filename
2> filename write stderr to filename
2>&1 write stderr to wherever stdout is currently going
>& filename write both stdout and stderr to filename
>> filename append stdout to filename
>>& filename append both stdout and stderr to filename
1>&2 write stdout to wherever stderr is currently going

Identifiers (A.6.3.3)

256 characters are significant in identifiers without external linkage. (Allowed characters are letters, digits, and underscores.)

256 characters are significant in identifiers with external linkage. (Allowed characters are letters, digits, and underscores.)

Case distinctions are significant in identifiers with external linkage.

In -pcc and -fc modes, the character '$' is also valid in identifiers.

Characters (A.6.3.4)

The characters in the source character set are ISO 8859-1 (Latin-1 Alphabet), a superset of the ASCII character set. The printable characters are those in the range 32 to 126 and 160 to 255. Any printable character may appear in a string or character constant, and in a comment.

The compiler has no support for multibyte character sets.

The ARM C library supports the ISO 8859-1 (Latin-1) character set, so the following points hold:

  • The execution character set is identical to the source character set.
  • There are four chars/bytes in an int. If the ARM processor is configured to operate with a little-endian memory system (as in RISC OS), the bytes are ordered from least significant at the lowest address to most significant at the highest address. If the ARM is configured to operate with a big-endian memory system, the bytes are ordered from least significant at the highest address to most significant at the lowest address.
  • A character constant containing more than one character has the type int. Up to four characters of the constant are represented in the integer value. The first character contained in the constant occupies the lowest-addressed byte of the integer value; up to three following characters are placed at ascending addresses. Unused bytes are filled with the NUL (\0) character.
  • There are eight bits in a character in the execution character set.
  • All integer character constants that contain a single character or character escape sequence are represented in the source and execution character set.
  • Characters of the source character set in string literals and character constants map identically into the execution character set.
  • No locale is used to convert multibyte characters into the corresponding wide characters (codes) for a wide character constant.
  • A plain char is treated as unsigned (but as signed in -pcc mode).
  • Escape codes are:
    Escape sequence Char value Description
    \a 7 Attention (bell)
    \b 8 Backspace
    \f 12 Form feed
    \n 10 Newline
    \r 13 Carriage return
    \t 9 Tab
    \v 11 Vertical tab
    \xnn 0xnn ASCII code in hexadecimal
    \nnn 0nnn ASCII code in octal
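The packing of multi-character constants described in the list above can be modelled for the little-endian case, where the lowest-addressed byte is the least significant. The function below is a hypothetical model, not a library routine, and other compilers may pack such constants differently.

```c
#include <stddef.h>

/* Model of the documented packing on a little-endian system: the first
   character lands in the least significant byte, later characters in
   ascending bytes, and unused bytes are zero. At most four characters
   are used. */
unsigned long model_multichar_le(const char *chars, size_t count)
{
    unsigned long value = 0;
    size_t i;
    if (count > 4) count = 4;
    for (i = 0; i < count; i++) {
        value |= (unsigned long)(unsigned char)chars[i] << (8 * i);
    }
    return value;
}
```

Under this model the constant 'ab' would have the value 0x6261 in an ASCII-based character set.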

Integers (A.6.3.5)

The representations and sets of values of the integral types are set out in the section Data elements. Note also that:

  • The result of converting an integer to a shorter signed integer, if the value cannot be represented, is as if the bits in the original value which cannot be represented in the final value are masked out, and the resulting integer sign-extended. The same applies when you convert an unsigned integer to a signed integer of equal length.
  • Bitwise operations on signed integers yield the expected result given two's complement representation. No sign extension takes place.
  • The sign of the remainder on integer division is the same as defined for the function div().
  • Right shift operations on signed integral types are arithmetic.

Floating point (A.6.3.6)

The representations and ranges of values of the floating point types have been given in the section Data elements. Note also that:

  • When a floating point number is converted to a shorter floating point one, it is rounded to the nearest representable number.
  • The properties of floating point arithmetic accord with IEEE 754.

Arrays and pointers (A.6.3.7)

The ANSI standard specifies three areas in which the behaviour of arrays and pointers must be documented. The points to note are:

  • The type size_t is defined as unsigned int.
  • Casting pointers to integers and vice versa involves no change of representation. Thus any integer obtained by casting from a pointer will be positive.
  • The type ptrdiff_t is defined as (signed) int.

Registers (A.6.3.8)

In the Acorn C compiler, you can declare any number of objects to have the storage class register. Depending on which variant of the ARM Procedure Call Standard is in use, there are between five and seven registers available. (There are six available in the default APCS variant, as used by RISC OS.) Declaring more than this number of objects with register storage class must result in at least one of them not being held in a register. It is advisable to declare no more than four. The valid types are:

  • any integer type
  • any pointer type
  • any integer-like structure (any one word struct or union in which all addressable fields have the same address, or any one word structure containing only bitfields).

Note that other variables, not declared with the register storage class, may be held in registers for extended periods; and that register variables may be held in memory for some periods.

Note also that there is a #pragma which assigns a file-scope variable to a specified register everywhere within a compilation unit.

Structures, unions, enumerations and bitfields (A.6.3.9)

The Acorn C compiler handles structures in the following way:

  • When a member of a union is accessed using a member of a different type, the resulting value can be predicted from the representation of the original type. No error is given.
  • Structures are aligned on word boundaries. Characters are aligned in bytes, shorts on even numbered byte boundaries and all other types, except bitfields, are aligned on word boundaries. Bitfields are subfields of ints, themselves aligned on word boundaries.
  • A 'plain' bitfield (declared as int) is treated as unsigned int (signed int in -pcc mode).
  • A bitfield which does not fit into the space remaining in the current int is placed in the next int.
  • The order of allocation of bitfields within ints is such that the first field specified occupies the lowest addressed bits of the word.
  • Bitfields do not straddle storage unit (int) boundaries.
  • The integer type chosen to represent the values of an enumeration type is int (signed int).
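The allocation order of bitfields can be observed by overlaying them on a whole word with a union, relying on the predictable union behaviour noted in the first point above. This is a sketch with hypothetical names, assuming a 32 bit unsigned int.

```c
/* Overlay two bitfields on a whole word to see where the first lands. */
union field_probe {
    struct {
        unsigned int first : 4;
        unsigned int rest  : 28;
    } bits;
    unsigned int word;
};

/* Returns the word with only the first-declared bitfield set to all
   ones. */
unsigned int first_field_mask(void)
{
    union field_probe probe;
    probe.word = 0;
    probe.bits.first = 0xFu;
    return probe.word;
}
```

With the first field in the lowest addressed bits, a little-endian configuration would be expected to give 0xF and a big-endian configuration 0xF0000000.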

Qualifiers (A.6.3.10)

An object that has volatile-qualified type is accessed if any word or byte of it is read or written. For volatile-qualified objects, reads and writes occur as directly implied by the source code, in the order implied by the source code.

The effect of accessing a volatile-qualified short is undefined.

Declarators (A.6.3.11)

The number of declarators that may modify an arithmetic, structure or union type is limited only by available memory.

Statements (A.6.3.12)

The number of case values in a switch statement is limited only by memory.

Preprocessing directives (A.6.3.13)

A single-character constant in a preprocessor directive cannot have a negative value.

The standard header files are contained within the compiler itself. The mechanism for translating the standard suffix notation to an Acorn filename is described in the chapter CC and C++.

Quoted names for includable source files are supported. The rules for directory searching are given in the chapter CC and C++.

The recognised #pragma directives and their meaning are described in the section #pragma directives.

The date and time of translation are always available, so __DATE__ and __TIME__ always give respectively the date and time.

Library functions (A.6.3.14)

The C library has or supports the following features:

  • The macro NULL expands to the integer constant 0.
  • If a program redefines a reserved external identifier, then an error may occur when the program is linked with the standard libraries. If it is not linked with standard libraries, then no error will be detected.
  • The assert() function prints the following message:

*** assertion failed: expression, file filename, line line-number

and then calls the function abort().

  • The functions isalnum(), isalpha(), iscntrl(), islower(), isprint(), isupper() and ispunct() usually test only for characters whose values are in the range 0 to 127 (inclusive). Characters with values greater than 127 return a result of 0 for all of these functions, except iscntrl() which returns non-zero for 0 to 31, and 128 to 255.

After the call setlocale(LC_CTYPE,"ISO8859-1") the following statements also apply to character codes and affect the results returned by the ctype functions:

  • codes 128 to 159 are control characters
  • codes 192 to 223 except 215 are upper case
  • codes 224 to 255 except 247 are lower case
  • codes 160 to 191, 215 and 247 are punctuation
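The classification rules listed above can be written out directly as predicates. The helpers below are illustrative only and are not part of the C library; they assume an ASCII-based character set for the ranges 'A' to 'Z' and 'a' to 'z'.

```c
/* Upper case: ASCII letters plus codes 192 to 223 except 215. */
int latin1_is_upper(int code)
{
    if (code >= 'A' && code <= 'Z') return 1;
    return code >= 192 && code <= 223 && code != 215;
}

/* Lower case: ASCII letters plus codes 224 to 255 except 247. */
int latin1_is_lower(int code)
{
    if (code >= 'a' && code <= 'z') return 1;
    return code >= 224 && code <= 255 && code != 247;
}

/* Punctuation in the top half: codes 160 to 191, 215 and 247. */
int latin1_high_is_punct(int code)
{
    return (code >= 160 && code <= 191) || code == 215 || code == 247;
}
```

The two excluded codes, 215 and 247, are the multiplication and division signs, which is why they fall under punctuation instead of letters.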

The mathematical functions return the following values on domain errors:

Function Condition Returned value
log(x) x <= 0 -HUGE_VAL
log10(x) x <= 0 -HUGE_VAL
sqrt(x) x < 0 -HUGE_VAL
atan2(x,y) x = y = 0 -HUGE_VAL
asin(x) abs(x) > 1 -HUGE_VAL
acos(x) abs(x) > 1 -HUGE_VAL

Where -HUGE_VAL is written above, a number is returned which is defined in the header h.math. Consult the errno variable for the error number.

The mathematical functions set errno to ERANGE on underflow range errors.

A domain error occurs if the second argument of fmod is zero, and -HUGE_VAL is returned.

The set of signals for the generic signal() function is as follows:

SIGABRT Abort
SIGFPE Arithmetic exception
SIGILL Illegal instruction
SIGINT Attention request from user
SIGSEGV Bad memory access
SIGTERM Termination request
SIGSTAK Stack overflow

The default handling of all recognised signals is to print a diagnostic message and call exit. This default behaviour applies at program start-up.

When a signal occurs, if func points to a function, the equivalent of signal(sig, SIG_DFL); is first executed.

If the SIGILL signal is received by a handler specified to the signal function, the default handling is reset.
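Because the default handling is restored before a handler is called, a handler that is meant to stay installed has to re-install itself. The sketch below shows one way of doing this for SIGTERM; the function and variable names are hypothetical.

```c
#include <signal.h>

static volatile sig_atomic_t term_seen = 0;

/* Re-install the handler first, since the library has already reset
   the signal to SIG_DFL, then record that the signal arrived. */
static void on_terminate(int sig)
{
    signal(sig, on_terminate);
    term_seen = term_seen + 1;
}

/* Returns 0 if the handler was installed, -1 otherwise. */
int install_terminate_handler(void)
{
    return signal(SIGTERM, on_terminate) == SIG_ERR ? -1 : 0;
}

/* Number of SIGTERM signals handled so far. */
int terminate_count(void)
{
    return (int)term_seen;
}
```

After install_terminate_handler(), two successive calls to raise(SIGTERM) are both caught, and terminate_count() reports 2.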

The C library also has the following characteristics relating to I/O:

  • The last line of a text stream does not require a terminating newline character.
  • Space characters written out to a text stream immediately before a newline character do appear when read back in.
  • No null characters are appended to a binary output stream.
  • The file position indicator of an append mode stream is initially placed at the end of the file.
  • A write to a text stream does not cause the associated file to be truncated beyond that point.
  • The characteristics of file buffering are as intended by section 4.9.3 of the standard.
  • A zero-length file (on which no characters have been written by an output stream) does exist.
  • The validity of filenames is defined by the host computer's filing system.
  • The same file can be opened many times for reading, but only once for writing or updating. A file cannot however be open for reading on one stream and for writing or updating on another.

Note also the following points about library functions:

remove() Cannot remove an open file.
rename() The effect of calling the rename() function when the new name already exists is dependent on the host filing system. Not all renames are valid: examples of invalid renames include

("net:file1","net:$.file2") and ("net:file1","adfs:file2").

fprintf() Prints %p arguments in hexadecimal format (lower case) as if a precision of 8 had been specified. If the variant form (%#p) is selected, the number is preceded by the character @.
fscanf() Treats %p arguments identically to %x arguments.

Always treats the character - in a %[ argument as a literal character.

ftell() and fgetpos()

Set errno to the value of EDOM on failure.

perror()

Generates the following messages:

Error Message
0 No error (errno = 0)
EDOM EDOM - function argument out of range
ERANGE ERANGE - function result not representable
ESIGNUM ESIGNUM - illegal signal number to signal() or raise()
others Error code number has no associated message
calloc(), malloc() and realloc() If the size of the area requested is zero, NULL is returned under RISC OS 3.10, and non-NULL is returned under RISC OS 3.50.
abort() Closes all open files, and deletes all temporary files.
exit() The status returned by exit is the same value that was passed to it. For a definition of EXIT_SUCCESS and EXIT_FAILURE refer to the header file stdlib.h.
getenv() Returns the value of the named RISC OS Environmental variable, or NULL if the variable had no value. For example:

root = getenv ("C$libroot");
if (root == NULL) root = "$.arm.clib";

system() Used either to CHAIN to another application or built-in command or to CALL one as a sub-program. When a program is chained, all trace of the original program is removed from memory and the chained program invoked. If a program is called (which is the default if no CHAIN: or CALL: precedes the program name - a change from Release 2), the calling program and data are moved in memory to somewhere safe and the callee loaded and started up. The return value from the system() call is -2 (indicating a failure to invoke the program) or the value of Sys$ReturnCode set by the called program (0 indicates success).
strerror() The error messages given by this function are identical to those given by the perror() function.
clock() Returns the time taken by the program since its invocation, as indicated by the host's operating system.
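Since the result of a zero-sized request to calloc(), malloc() and realloc() differs between RISC OS versions, code that treats NULL as failure may prefer never to ask for zero bytes. The wrapper below is a hypothetical helper showing one way to do that.

```c
#include <stdlib.h>

/* Request at least one byte, so that a NULL return always means that
   the allocation failed, however the library treats a size of zero. */
void *alloc_nonzero(size_t size)
{
    return malloc(size != 0 ? size : 1);
}
```

The caller frees the result with free() as usual.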

Extra features

This section describes the following machine-specific features of the Acorn C compiler:

  • #pragma directives
  • special declaration keywords for functions and variables.

#pragma directives

Pragmas recognised by the compiler come in two forms:

#pragma -letter«digit»

and

#pragma «no_»feature-name

A short-form pragma given without a digit resets that pragma to its default state; otherwise to the state specified.

For example:

#pragma -s1
#pragma no_check_stack

#pragma -p2
#pragma profile_statements

The set of pragmas recognised by the compiler, together with their default settings, varies from release to release of the compiler. The current list of recognised pragmas is:

Pragma name                   Short form   Short 'No' form   Command line option
warn_implicit_fn_decls        a1 *         a0                -Wf
check_memory_accesses         c1           c0 *              -zpc0|1
warn_deprecated               d1 *         d0                -Wd
continue_after_hash_error     e1           e0 *
(FP register variable)        f1-f4        f0 *
include_only_once             i1           i0 *
optimise_crossjump            j1 *         j0                -zpj0|1
optimise_multiple_loads       m1 *         m0                -zpm0|1
profile                       p1           p0 *              -p
profile_statements            p2           p0 *              -px
(integer register variable)   r1-r7        r0 *
check_stack                   s0 *         s1                -zps0|1
force_top_level               t1           t0 *
check_printf_formats          v1           v0 *
check_scanf_formats           v2           v0 *
side_effects                  y0 *         y1
optimise_cse                  z1 *         z0                -zpz0|1

In each case, the default setting is starred.

You can also globally set pragmas by options set in the command line passed to the cc program (see the chapter entitled Command lines); the preferred option to use is shown above. Where no option is shown for a pragma, it is because that pragma may only sensibly be used locally, and should be enabled/disabled around the particular program statements it is to affect.

Pragmas controlling the preprocessor

The pragma continue_after_hash_error in effect implements a #warning ... preprocessor directive. Pragma include_only_once asserts that the containing #include file is to be included only once, and that if its name recurs in a subsequent #include directive then the directive is to be ignored.

The pragma force_top_level asserts that the containing #include file should only be included at the top level of a file. A syntax error will result if the file is included, say, within the body of a function.

Pragmas controlling printf/scanf argument checking

The pragmas check_printf_formats and check_scanf_formats control whether the actual arguments to printf and scanf, respectively, are type-checked against the format designators in a literal format string.

Of course, calls using non-literal format strings cannot be checked. By default, all calls involving literal format strings are checked.

Pragmas controlling optimisation

The pragmas optimise_crossjump, optimise_multiple_loads and optimise_cse give fine control over where these optimisations are applied. For example, it is sometimes advantageous to disable cross-jumping (the common tail optimisation) in the critical loop of an interpreter; and it may be helpful in a timing loop to disable common subexpression elimination and the opportunistic optimisation of multiple load instructions to load multiples. Note that the correct use of the volatile qualifier should remove most of the more obvious needs for this degree of control (and volatile is also available in the Acorn C compiler's -pcc mode unless -strict is specified).

By default, functions are assumed to be impure, so function invocations are not candidates for common subexpression elimination. Pragma no_side_effects asserts that the following function declarations (until the next #pragma side_effects) describe pure functions, invocations of which can be common subexpressions. See also the __pure keyword.

Pragmas controlling code generation

Stack limit checking

The pragma no_check_stack disables the generation of code at function entry which checks for stack limit violation. In reality there is little advantage to turning off this check: it typically costs only two instructions and two machine cycles per function call. The one circumstance in which no_check_stack must be used is in writing a signal handler for the SIGSTAK event. When this occurs, stack overflow has already been detected, so checking for it again in the handler would result in a fatal circular recursion.
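A SIGSTAK handler written along these lines might look like the sketch below. The names are hypothetical, and the code assumes that SIGSTAK is provided as a macro by signal.h; the #ifdef keeps the fragment compilable on systems that have no such signal.

```c
#include <signal.h>

static volatile sig_atomic_t overflow_seen = 0;

#ifdef SIGSTAK   /* assumed to be a macro from signal.h where supported */
#pragma no_check_stack
/* Compiled without a stack limit check, so that entering the handler
   cannot itself trigger another stack overflow report. */
static void on_stack_overflow(int sig)
{
    (void)sig;
    overflow_seen = 1;
}
#pragma check_stack
#endif

/* Returns 0 if the handler was installed, -1 on failure, and 1 if the
   library has no SIGSTAK signal. */
int install_overflow_handler(void)
{
#ifdef SIGSTAK
    return signal(SIGSTAK, on_stack_overflow) == SIG_ERR ? -1 : 0;
#else
    return 1;
#endif
}

/* Non-zero once the handler has run. */
int overflow_was_seen(void)
{
    return (int)overflow_seen;
}
```

The pragma is switched back on immediately after the handler so that the rest of the file keeps its stack checks.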

Memory access checking

The pragma check_memory_accesses instructs the compiler to precede each access to memory by a call to the appropriate one of:

__rt_rdnchk where n is 1, 2, or 4, for byte, short, or long reads (respectively)
__rt_wrnchk where n is 1, 2, or 4, for byte, short, or long writes (respectively).

Global (program-wide) register variables

The pragmas f0-f4 and r0-r7 have no long form counterparts. Each introduces or terminates a list of extern, file-scope variable declarations. Each such declaration declares a name for the same register variable. For example:

#pragma r1             /* 1st global register */
extern int *sp;
#pragma r2             /* 2nd global register */
extern int *fp, *ap;   /* Synonyms */
#pragma r0             /* End of global declaration */
#pragma f1             /* 1st global FP register */
extern double pi;
#pragma f0             /* End of global declaration */

Any type that can be allocated to a register (see the chapter entitled Registers (A.6.3.8)) can be allocated to a global register. Similarly, any floating point type can be allocated to a floating point register variable.

Global register r1 is the same as register v1 in the ARM Procedure Call Standard (APCS); similarly, r2 equates to v2, and so on. Depending on the APCS variant, between five and seven integer registers (v1-v7, machine registers R4-R10) and four floating point registers (F4-F7) are available as register variables. (There are six integer registers available in the default APCS variant, as used by RISC OS.) In practice it is probably unwise to use more than three global integer register variables and two global floating point register variables.

Provided the same declarations are made in each compilation unit, a global register variable may exist program-wide.

Otherwise, because a global register variable maps to a callee-saved register, its value will be saved and restored across a call to a function in a compilation unit which does not use it as a global register variable, such as a library function.

A corollary of the safety of direct calls out of a compilation unit that uses global registers is that calls back into it are dangerous. In particular, a function that uses a global register, when called from a compilation unit which uses that register as a compiler-allocated register, will probably read the wrong values from its supposed global register variables.

Currently, there is no link-time check that direct calls are sensible. Even if there were, indirect calls via function arguments pose a hazard which is harder to detect. This facility must be used with care. Preferably, the declaration of a global register variable should be made in each compilation unit of the program. See also the __global_reg(n) keyword.

Special function declaration keywords

Several special function declaration keywords are available to tell the Acorn C compiler to treat the declared function in a special way. None of these are portable to other machines.

__value_in_regs

This allows the compiler to return a structure in registers rather than returning a pointer to the structure. For example:

typedef struct int64_struct {
  unsigned int lo;
  unsigned int hi;
} int64;

__value_in_regs extern int64 mul64(unsigned a, unsigned b);

See the chapter entitled ARM procedure call standard of the Desktop Tools guide for details of the default way in which structures are passed and returned.
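
A possible definition of mul64 is sketched below to show the keyword on a complete function. It is an illustration, not the library's code: it assumes unsigned int is 32 bits wide, the VALUE_IN_REGS wrapper macro is hypothetical, and the guard macro __CC_NORCROFT is assumed to be predefined by the compiler so that the same source also builds elsewhere.

```c
#ifdef __CC_NORCROFT                 /* assumed predefined macro */
#define VALUE_IN_REGS __value_in_regs
#else
#define VALUE_IN_REGS                /* expands to nothing elsewhere */
#endif

typedef struct int64_struct {
  unsigned int lo;
  unsigned int hi;
} int64;

/* Multiply two 32-bit values into a 64-bit result, built from four
   16-bit by 16-bit partial products (assumes 32-bit unsigned int). */
VALUE_IN_REGS int64 mul64(unsigned a, unsigned b)
{
    unsigned a_lo = a & 0xFFFFu, a_hi = (a >> 16) & 0xFFFFu;
    unsigned b_lo = b & 0xFFFFu, b_hi = (b >> 16) & 0xFFFFu;
    unsigned ll = a_lo * b_lo;
    unsigned lh = a_lo * b_hi;
    unsigned hl = a_hi * b_lo;
    unsigned hh = a_hi * b_hi;
    /* Sum of everything that lands in bits 16-31, including carries. */
    unsigned mid = (ll >> 16) + (lh & 0xFFFFu) + (hl & 0xFFFFu);
    int64 result;

    result.lo = ((mid & 0xFFFFu) << 16) | (ll & 0xFFFFu);
    result.hi = hh + (lh >> 16) + (hl >> 16) + (mid >> 16);
    return result;
}
```

With the keyword in effect, the two words of the result would come back in registers instead of being written through a pointer supplied by the caller.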

__pure

By default, functions are assumed to be impure (i.e. they have side effects), so function invocations are not candidates for common subexpression elimination. __pure has the same effect as pragma noside_effects, and asserts that the function declared is a pure function, invocations of which can be common subexpressions.
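
The following sketch shows where such an assertion might pay off. The names square and twice_square and the PURE wrapper macro are hypothetical, and the guard macro __CC_NORCROFT is assumed to be predefined by the compiler.

```c
#ifdef __CC_NORCROFT        /* assumed predefined macro */
#define PURE __pure
#else
#define PURE                /* expands to nothing elsewhere */
#endif

/* The result depends only on the argument and nothing else changes,
   so the function is a fair candidate for __pure. */
PURE int square(int x)
{
    return x * x;
}

int twice_square(int x)
{
    /* Both calls take the same argument. If square is known to be
       pure, the compiler may evaluate it once and reuse the result. */
    return square(x) + square(x);
}
```

A function that reads or writes global state, or performs I/O, should not be marked in this way, since eliminating one of its calls would change the program's behaviour.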

Special variable declaration keywords

__global_reg(n)

Allocates the declared variable to a global integer register variable, in the same way as #pragma rn. The variable must have an integral or pointer type. See also the section Global (program-wide) register variables.

__global_freg(n)

Allocates the declared variable to a global floating point register variable, in the same way as #pragma fn. The variable must have type float or double. See also the section Global (program-wide) register variables.

Note that the global register, whether specified by keyword or by pragma, must be specified in every declaration of the same variable. Thus:

int x;
__global_reg(1) x;

is an error.

© 3QD Developments Ltd 2013