Acorn C/C++

 

Assembly language interface


Interworking assembly language and C (writing programs with both assembly language and C parts) requires the use of both ObjAsm and CC and/or C++. Further explanation of the examples is provided in the chapter Interworking assembler with C of the Acorn Assembler guide.

Interworking assembly language and C can be very useful in constructing top quality RISC OS applications. Using this technique you can take advantage of many of the strong points of both languages. Writing the bulk of your application in C gives you the portability of C, the maintainability of a high level language and the power of the C libraries and language. Writing critical portions of code in assembler gives you all the speed of the Archimedes and all the features of the machine (for example, the complete floating-point instruction set).

The key to interworking C and assembler is writing assembly language procedures that obey the ARM Procedure Call Standard (APCS). This is a contract between two procedures, one calling the other. The called procedure needs to know which ARM and floating-point registers it can freely change without restoring them before returning, and the caller needs to know which registers it can rely on not being corrupted over a procedure call.

Additionally, both procedures need to know which registers contain input arguments and return arguments, and the arrangement of the stack has to follow a pattern that debuggers and so on can understand. For the specification of the APCS, see the chapter entitled ARM procedure call standard of the Desktop Tools guide.

This chapter explains how C uses the APCS, in terms of the appearance of assembly language optionally output by CC and the way the stack set up by the C run-time library works.

Register names

The following names are used in referring to ARM registers:

Name  Register  Usage
a1    R0        Argument 1, also integer result, temporary
a2    R1        Argument 2, temporary
a3    R2        Argument 3, temporary
a4    R3        Argument 4, temporary
v1    R4        Register variable
v2    R5        Register variable
v3    R6        Register variable
v4    R7        Register variable
v5    R8        Register variable
v6    R9        Register variable
sl    R10       Stack limit
fp    R11       Frame pointer
ip    R12       Temporary work register
sp    R13       Lower end of current stack frame
lr    R14       Link address on calls, or workspace
pc    R15       Program counter and processor status
f0    F0        Floating-point result
f1    F1        Floating-point work register
f2    F2        Floating-point work register
f3    F3        Floating-point work register
f4    F4        Floating-point register variable (must be preserved)
f5    F5        Floating-point register variable (must be preserved)
f6    F6        Floating-point register variable (must be preserved)
f7    F7        Floating-point register variable (must be preserved)

In this section, 'at [r]' means at the location pointed to by the value in register r; 'at [r,#n]' refers to the location pointed to by r+n. This accords with ObjAsm's syntax.

Register usage

The following points should be noted about the contents of registers across function calls.

  • Calling a function (potentially) corrupts the argument registers a1 to a4, ip, lr, and f0-f3. The calling function should save the contents of any of these registers it may need.
  • Register lr is used at the time of a function call to pass the return link to the called function; it is not necessarily preserved during or by the function call.
  • The stack pointer sp is not altered across the function call itself, though it may be adjusted in the course of pushing arguments inside a function. The limit register sl may change at any time, but should always represent a valid limit to the downward growth of sp. User code will not normally alter this register.
  • Registers v1 to v6, and the frame pointer fp, are expected to be preserved across function calls. The called procedure is responsible for saving and restoring the contents of any of these registers which it may need to use.

Control arrival

At a procedure call, the convention is that the registers are used as follows:

  • a1 to a4 contain the first four arguments. If there are fewer than four arguments, just as many of a1 to a4 as are needed are used.
  • If there are more than four arguments, sp points to the fifth argument; any further arguments will be located in succeeding words above [sp].
  • fp points to a backtrace structure.
  • sp and sl define a temporary workspace of at least 256 bytes available to the procedure.
  • sl contains a stack chunk handle, which is used by stack handling code to extend the stack in a non-contiguous manner.
  • lr contains the value which should be restored into pc on exit from the called procedure.
  • pc contains the entry address of the called procedure.

Passing arguments

All integral and pointer arguments are passed as 32-bit words. Floating-point arguments of type 'float' are passed as 32-bit values, and those of type 'double' as 64-bit values. These follow the memory representation of the IEEE single and double precision formats.

Arguments are passed as if by the following sequence of operations:

  • Push each argument onto the stack, last argument first.
  • Pop the first four words (or as many as were pushed, if fewer) of the arguments into registers a1 to a4.
  • Call the function, for example by the branch with link instruction:

    BL functionname

In many cases it is possible to use a simplified sequence with the same effect (eg load three argument words into a1-a3).

If more than four words of arguments are passed, the calling procedure should adjust the stack pointer after the call, incrementing it by four for each argument word which was pushed and not popped.

Return link

On return from a procedure, the registers are set up as follows:

  • fp, sp, sl, v1 to v6 and f4 to f7 have the same values that they contained at the procedure call.
  • Any result other than a floating-point value or a multi-word structure value is placed in register a1.
  • A floating-point result should be placed in register f0.

Structure values returned as function results are discussed below.

Structure results

A C function which returns a multi-word structure result is treated in a slightly different manner from other functions by the compiler. A pointer to the location which should receive the result is added to the argument list as the first argument, so that a declaration such as the following:

s_type afunction(int a, int b, int c)
{
  s_type d;
  /* ... */

  return d; 
}

is in effect converted to this form:

void afunction(s_type *p, int a, int b, int c)
{ 
  s_type d;
  /* ... */
  *p = d;

  return;
}

Any assembler-coded functions returning structure results, or calling such functions, must conform to this convention in order to interface successfully with object code from the C compiler.

Storage of variables

The code produced by the C compiler uses argument values from registers where possible; otherwise they are addressed relative to fp, as illustrated in Examples below.

Local variables, by contrast, are always addressed with positive offsets relative to sp. In code which alters sp, this means that the offset for the same variable will differ from place to place. The reason for this approach is that it permits the stack overflow procedure to recover by changing sp and sl to point to a new stack segment as necessary.

Function workspace

The values of sp and sl passed to a called function define an area of readable, writable memory available to the called function as workspace. All words below [sp] and at or above [sl,#-512] are guaranteed to be available for reading and writing, and the minimum allowed value of sp is sl-256. Thus the minimum workspace available is 256 bytes.

The C run-time system, in particular the stack extension code, requires up to 256 bytes of additional workspace to be left free. Accordingly, all called functions which require no more than 256 bytes of workspace should test that sp does not point to a location below sl, in other words that at least 512 bytes remain. If the value in sp is less than that in sl, the function should call the stack extension function x$stack_overflow. Functions which need more than 256 bytes of workspace should amend the test accordingly, and call x$stack_overflow1, as described below. The following examples illustrate a method of performing this test.

Note that these are the C-specific aliases for the kernel functions _kernel_stkovf_split_0frame and _kernel_stkovf_split_frame respectively, described in the chapter The shared C library in the RISC OS 3 Programmer's Reference Manual.

Examples

The following fragments of assembler code illustrate the main points to consider in interfacing with the C compiler. If you want to examine the code produced by the compiler in more detail for particular cases, you can request an assembler listing by enabling the Assembler option on the CC SetUp menu.

This is a function gggg which expects two integer arguments and uses only one register variable, v1. It calls another function ffff.

        AREA    |C$$code|, CODE, READONLY
        IMPORT  |ffff|
        IMPORT  |x$stack_overflow|
        EXPORT  |gggg|
gggx    DCB     "gggg", 0                         ; name of function, 0 terminated
        ALIGN                                     ; padded to word boundary
gggy    DCD     &ff000000 + gggy - gggx           ; dist. to start of name

        ; Function entry: save necessary regs. and args. on stack 
gggg    MOV     ip, sp
        STMFD   sp!, {a1, a2, v1, fp, ip, lr, pc}
        SUB     fp, ip, #4                        ; points to saved pc

        ; Test workspace size
        CMPS    sp, sl
        BLLT    |x$stack_overflow|

        ; Main activity of function
        ; .... 
        ADD     v1, v1, #1                        ; use a register variable
        BL      |ffff|                            ; call another function
        CMP     v1, #99                           ; rely on reg. var. after call 

        ; .... 
        ; Return: place result in a1, and restore saved registers
        MOV     a1, result
        LDMEA   fp, {v1, fp, sp, pc}^

If a function will need more than 256 bytes of workspace, it should replace the two-instruction workspace test shown above with the following:

        SUB     ip, sp, #n
        CMP     ip, sl
        BLLT    |x$stack_overflow1|

where n is the number of bytes needed. Note that x$stack_overflow1 must be called if more than 256 bytes of frame are needed, and that on the call ip must contain the required new value of sp (that is, sp - n), as set up by the SUB instruction above.

A function which expects a variable number of arguments should store its arguments in the following manner, so that the whole list of arguments is addressable as a contiguous array of values:

        MOV   ip, sp                        ; copy value of sp
        STMFD sp!, {a1, a2, a3, a4}         ; save 4 words of args.
        STMFD sp!, {v1, v2, fp, ip, lr, pc} ; save whichever of v1-v6 are needed
        SUB   fp, ip, #20                   ; fp points to saved pc
        CMPS  sp, sl                        ; test workspace
        BLLT  |x$stack_overflow|

Some complete program examples are described in the chapter Interworking assembler with C of the Acorn Assembler guide.

© 3QD Developments Ltd 2013