You may be wondering, "What is assembly language and why bother to learn it?"
Assembly language is a convenient way of writing machine code: the actual instructions used directly by the ARM or StrongARM processor in your computer. A program written in assembly language is the most efficient type of program; it should run faster than code produced by other methods, and take up less memory.
It's also the most laborious to write and debug, though, so why bother? It may be that you want to move on from Basic but can't be bothered to learn C or C++, the other common programming options. It may be that you want to write something which will run faster than a Basic program (BBC Basic on a RISC OS machine is actually pretty fast, but not as fast as machine code), or you may wish to write a module (which certainly can't be written in Basic). Alternatively, you may not want someone else to be able to dissect your programming, which they can do fairly easily with a Basic program.
You've nothing to lose by having a dabble, so read on.
You can load a machine code file into a text editor of the type which can show you a hexadecimal listing of its contents or, alternatively, you could start up Basic and load the file into spare memory with the *Load command, then look at it with the *Memory command. Either way, you'll see something like this:
Address  :  7 6 5 4  B A 9 8  F E D C  3 2 1 0 : ASCII Data
00009054 : E92D4000 EB000011 68BD8000 E3A00006 : .@-....h..
00009064 : E3A03C01 EF02001E 68BD8000 E58C2000 : .<....h.
00009074 : E1A0C002 EA000031 E92D4000 EB000007 : .1...@-...
00009084 : 68BD8000 E3A00006 E3A03C01 EF02001E : .h...<...
00009094 : 68BD8000 E58C2000 E1A0C002 EA000027 : .h. .'..
000090A4 : E24F20AC E2422000 E28F3060 E2833000 : O. B`0.0
000090B4 : E59F4084 E4920004 E4931004 E0211004 : @......!
000090C4 : E1500001 128F0010 139EF201 E28F0068 : ..P.....h.
000090D4 : E1500003 1AFFFFF6 E1A0F00E 00000000 : ..P.....
Everything in this listing is in hexadecimal notation. Each line shows the contents of 16 bytes of memory, split into groups of four bytes or 32 bits. Each number in the left-hand column is the address of the first byte in the first group on the line, and the numbers along the top line are the lowest digits of the addresses of the bytes below. Each 32-bit group, incidentally, is called a word. It's important to understand that every machine code instruction consists of one word and that the address of the first byte of the word must be divisible by four. The four bytes are then said to be word-aligned.
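As a rough illustration, here is a short Python sketch (not part of the original article or its tools) that checks word alignment and splits one word from the dump into its four bytes, lowest address first. The function names are invented for this example, and the dot-for-unprintable rule in the ASCII column is an assumption about how the dump is drawn.

```python
def is_word_aligned(address: int) -> bool:
    """A word-aligned address is divisible by four."""
    return address % 4 == 0


def word_to_bytes(word: int) -> list[int]:
    """Split a 32-bit word into four bytes, lowest address first.

    The least significant byte of a word lives at the lowest address.
    """
    return [(word >> shift) & 0xFF for shift in (0, 8, 16, 24)]


def ascii_column(word: int) -> str:
    """Render a word roughly as the dump's ASCII column does.

    Assumption: anything that is not a printable character becomes a dot.
    """
    return "".join(chr(b) if 32 <= b < 127 else "." for b in word_to_bytes(word))


first_word = 0xE92D4000                 # the word at &00009054 in the dump
print(is_word_aligned(0x9054))
print([f"{b:02X}" for b in word_to_bytes(first_word)])
print(ascii_column(first_word))
```

The bytes come out as 00, 40, 2D, E9, which is why the ASCII column for this word contains the characters '@' and '-'.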
You'll no doubt agree that this listing is pretty well incomprehensible. The processor knows what the contents of the words mean, but to us poor humans it conveys nothing. Even the right-hand section, which shows the result of treating the individual bytes as ASCII code, tells us nothing.
We need a simpler way of looking at the code; one which will tell us what each instruction means.
00009054 : .@- : E92D4000 : STMDB R13!,{r14 }
00009058 : ... : EB000011 : BL &000090A4
0000905C : .h : 68BD8000 : LDMVSIA R13!,{pc }
00009060 : .. : E3A00006 : MOV R0,#6
00009064 : .< : E3A03C01 : MOV R3,#&0100
00009068 : ... : EF02001E : SWI XOS_Module
0000906C : .h : 68BD8000 : LDMVSIA R13!,{pc }
You can see from the addresses on the left-hand side that each line now contains just one four-byte word, corresponding to one machine code instruction. The second column shows what the four bytes would represent if they were ASCII code and the third column shows the bytes themselves: still as incomprehensible as before.
When we look at the fourth and fifth columns, though, things start to get a little clearer. Each line in the fourth column contains a mnemonic (a brief description of what the instruction does), followed in the fifth column by which numbers or which of the processor's registers are involved.
Clearly, the simplest way to write machine code is by writing the mnemonics and getting them converted. The software which produced the above listing is called a disassembler. A package which does the job in reverse, turning mnemonics into machine code, is called an assembler, and a program consisting of mnemonics is said to be written in assembly language.
The assembly language equivalent of Basic's REM statement is the comment. A comment starts with a semicolon (;) and, like a REM statement, can either occupy a line on its own or be added onto the end of an instruction.
It's a good idea to put plenty of comments in your assembly language programs for two reasons. Firstly, assembly language is a lot more inscrutable than Basic, and it may be difficult to comprehend the workings of a program that you wrote several months ago without them. Secondly, they do not inflate the 'final product' of your programming, which is the machine code itself. You can put in as many comments as you like, but none of them will appear in the assembled code.
Within the ARM and StrongARM processors there are sixteen number-stores called registers, each of which can hold a 32-bit number. These are referred to as R0, R1, R2 etc. through to R15. In fact, there are more than sixteen, because the processor actually has four modes of operation and some of the higher number registers are replaced by alternative ones when the processor is switched into a different mode, but that need not concern us in this article.
Register R15 is the program counter (PC). This always holds the address of the next instruction to be read from memory. Usually, this is increased by four (to point to the next word) each time an instruction is read. Sometimes, though, the instruction may be to branch, i.e. jump, to a different address. When this happens, the number in R15 is replaced by the address where the program jumps to.
Register R14 is called the link register. Sometimes, we may wish to call a subroutine, a bit like calling a procedure in Basic. When the subroutine has finished, we will want the program to jump back to the point at which the subroutine was called and continue from there.
This is achieved with a branch linking instruction; you can see one at address &9058 in the listing above, with the mnemonic BL. Before the jump to the subroutine, the number in the program counter (the return address) is copied into the link register. To return just involves copying the link register back into the program counter.
The other higher-numbered registers do not have specific functions within the processor but are often given special jobs by the software. Register R13, for example, is normally used as the stack pointer, containing an address within a section of memory used for temporary storage. Most of the number-crunching is done by the lower-numbered registers, R0 to R6.
Some of the assembly language instructions are concerned with copying numbers from one register to another, or loading a number into a register. The instruction at &9060, for example, moves the number 6 into register R0.
10 REM > Assem01
20 REM simple assembly language program
30 ON ERROR REPORT:PRINT " at line ";ERL:END
40 DIM code% 12
50 P%=code%
60 [OPT 3
70 MOV R0,#7
80 SWI "OS_WriteC"
90 MOV PC,R14
100 ]
The assembler is turned on and off by the square brackets, [ and ], in lines 60 and 100, and the assembly language instructions are put between them. Before we get to that bit, line 40 sets up a block of memory to hold the machine code instructions (12 bytes are enough for the three instructions in this simple program) and line 50 sets up the resident integer variable P%. This variable is used by the assembler to represent the program counter and determines (in this case) where in memory each instruction is put.
The OPT instruction in line 60 controls the way the assembler operates. More about that later; for now, leave it set to OPT 3.
We'll look at the rest of the program later. In the meantime, try running it. Observe what is on the screen, then press space or click the mouse to get rid of it and get back to here.
You should have seen a command window showing something like the following:
00008FB8            OPT 3
00008FB8 E3A00007   MOV R0,#7
00008FBC EF000000   SWI "OS_WriteC"
00008FC0 E1A0F00E   MOV PC,R14
The program started up the assembler and assembled each instruction at the address pointed to by P%, putting what it was doing on the screen each time. The first instruction was placed at address &8FB8, which happened to be the value of code% and the start of the memory block set up by the DIM instruction. After each address comes the hexadecimal machine code instruction followed by the original mnemonic which created it.
After each instruction has been assembled, P% is incremented by four so that the next instruction is assembled four bytes further on. Don't worry if your address numbers were different; it's not important, but note that the addresses are always word-aligned; they all end in 0, 4, 8 or C because they're divisible by four.
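To show how a mnemonic relates to the hexadecimal word beside it, here is a small Python sketch that picks apart the word for a MOV with an immediate constant. It relies on the standard ARM data-processing layout (destination register in bits 12 to 15, an eight-bit value in bits 0 to 7 and a four-bit rotate field in bits 8 to 11), which is background knowledge and not something the listing itself shows. The function names are invented for the example.

```python
def ror32(value: int, amount: int) -> int:
    """Rotate a 32-bit value right by the given number of bits."""
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & 0xFFFFFFFF


def decode_mov_immediate(word: int) -> tuple[int, int]:
    """Return (destination register, constant) for a MOV Rd,#n word.

    Only the immediate form of MOV is handled; any other word
    raises ValueError.
    """
    # Bits 27-21 are 00 1 1101 for MOV with an immediate; bit 20 (S) is ignored.
    if (word >> 20) & 0xFE != 0x3A:
        raise ValueError("not a MOV with an immediate constant")
    rd = (word >> 12) & 0xF
    imm8 = word & 0xFF
    rotate = ((word >> 8) & 0xF) * 2     # the rotate field counts in steps of two
    return rd, ror32(imm8, rotate)


print(decode_mov_immediate(0xE3A00007))  # the MOV R0,#7 assembled above
print(decode_mov_immediate(0xE3A03C01))  # MOV R3,#&0100 from the disassembly
```

The second word shows the rotate field at work: the eight-bit value 1 rotated right by 24 bits gives &100.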
All that this program does is assemble a bit of machine code; it doesn't run it. To do that, prepare to run file Assem02, which is identical to Assem01 except that it has two extra lines on the end:
110 REPEAT UNTIL GET
120 CALL code%
This time, when you run the program, you should see the same assembled listing as before, though with different address numbers because the Basic program is longer. The cursor will be flashing away underneath. The program has paused at the loop in line 110. When you press a key it will move on to line 120, which is an instruction to run the machine code, starting at address code%.
Now run the file and see what happens; but first, a word of warning. Machine code does not have the error-trapping capabilities of Basic. If there is an error in your assembly language program (as there is bound to be at some point), it is highly likely that your computer will crash and have to be reset. If you're following an on-screen guide such as this, make sure that you can get back to where you were in it. Also make sure that you don't have any unsaved work on your desktop.
You should have found that, when you pressed a key, the machine beeped, then asked you to press Space or click the mouse to return to the desktop. It's time to examine the three instructions to find out what they're doing:
MOV R0,#7
SWI "OS_WriteC"
MOV PC,R14
The first instruction, the MOV command, is one of the simplest of all assembly language instructions: a command to move something into a register. The hash (#) shows that what follows is an immediate constant, i.e. the number itself, so MOV R0,#7 puts the number 7 into R0. If the instruction had been:
MOV R0,R1
it would instead have copied the contents of R1 into R0.
There is a limitation to the number that can be moved into a register in this way. The machine code instruction consists of 32 bits, and only eight of them are available to hold the number, so the number itself can only have eight bits. The instruction
MOV R0,#&FF
can therefore be assembled, but the following cannot, because &1FF needs nine bits:
MOV R0,#&1FF
The second instruction is, of course, a software interrupt (SWI), used to call the operating system. In Basic, a SWI may be called using the SYS command. As you may know, the command:
SYS "OS_Splodge", a%, b%, c% TO x%, y%, z%
calls the named SWI with the values of a%, b% and c% in R0, R1 and R2, and puts the values returned in R0, R1 and R2 into x%, y% and z%.
The first two instructions are the equivalent of the Basic command:
SYS "OS_WriteC", 7
If you copy Assem02 from the CD onto your hard disc and change the number after the hash from 7 to 65, you should find that, instead of beeping, the program prints a letter 'A' on the screen. The number 65 is, of course, the ASCII code for A, and the two instructions are now the equivalent of VDU 65.
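The eight-bit limit on immediate constants can be explored with a short Python sketch. It assumes the usual ARM rule, which goes a little further than the text here: the eight-bit value may also be rotated right by an even number of bits, so some larger numbers such as &100 are allowed while &1FF is not. The function name is invented for the example.

```python
def can_encode_immediate(value: int) -> bool:
    """Report whether a 32-bit value fits an ARM immediate constant.

    The value must be an eight-bit number rotated right by an even
    number of bits (0, 2, 4 ... 30).
    """
    value &= 0xFFFFFFFF
    for rotate in range(0, 32, 2):
        # Rotating left undoes the processor's rotate right.
        undone = ((value << rotate) | (value >> (32 - rotate))) & 0xFFFFFFFF
        if undone < 0x100:
            return True
    return False


for constant in (7, 65, 0xFF, 0x100, 0x1FF):
    print(hex(constant), can_encode_immediate(constant))
```

A check like this is handy when the assembler rejects a constant and it isn't obvious why.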
When Basic executed the CALL command, it treated your program as a subroutine, getting to it with a Branch Linking or BL instruction. This caused the address in the program counter (i.e. the first instruction to be executed after your program had finished) to be placed into R14 (this is often known as the return address). To get back to this address, we simply have to copy the number in R14 back into R15 which, you will recall, is the program counter.
The assembler recognises "PC" as referring to R15, so we use it to remind ourselves that we are talking about the program counter. The instruction MOV PC,R14 simply copies the contents of R14 into the program counter, which is all that is required to pass control back to Basic.
Take a look at file Assem03:
10 REM > Assem03
20 REM simple assembly language program
30 ON ERROR REPORT:PRINT " at line ";ERL:END
40 DIM code% 24
50 P%=code%
60 [OPT 3
70 MOV R0,#ASC("A")
80 .loop
90 SWI "OS_WriteC"
100 ADD R0,R0,#1
110 CMP R0,#ASC("Z")+1
120 BNE loop
130 MOV PC,R14
140 ]
150 REPEAT UNTIL GET
160 CALL code%
This program works in the same way as the previous one, but we've added some more instructions. The memory block created by the DIM command in line 40 has been enlarged to 24 bytes for this reason. In fact, there's no harm in creating a large block, perhaps of several thousand bytes, while you're experimenting, provided you have the RAM to spare; it can mean that you avoid the risk of running out.
We saw how the previous program could be modified by changing the immediate constant in the MOV instruction from 7 to 65 so that it printed a letter A. Line 70 does the same thing, but the simple number '65' has been replaced by ASC("A"), which means 'the ASCII code for A', to make the listing more readable. The result is just the same, but it makes it easier to follow what the program is doing.
The word loop with a dot in front of it in line 80 is a label. This is a way of marking the point in the program where something occurs so that we could refer to it at some other place. We might use a label in one of two ways:
In this program, the label is being used as part of a repeated loop, to mark the point where the program jumps back to.
A label with a dot may either occupy a line on its own, as in this case, or be placed in front of an instruction, separated from it by a space.
The label is, in fact, a Basic variable which is created and given the current value of the program counter, P%. Putting a dot in front of it is similar to typing:
loop=P%
Line 90 operates in the same way as in the previous program, calling the SWI "OS_WriteC" to print a letter A on the screen.
Line 100 introduces a new instruction, ADD, which, not surprisingly, adds two numbers together. It has to be followed by three parameters, referred to as the destination, operand one and operand two, such as:
ADD R0,R1,R2
The destination (R0) and operand one (R1) must be registers. Operand two could be either a register or an immediate constant.
The actual instruction in line 100 has an immediate constant for operand two and the register where the answer is stored is the same as the one where the other number is taken from. There is nothing wrong with this. The instruction means, "Take the number in R0, add 1 to it and put the result back into R0," in other words, "increment the number in R0 by 1."
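The CMP instruction in line 110 compares operand 1 with operand 2 by subtracting operand 2 from operand 1. The result is thrown away, but the processor's four flags are set according to it:
| Negative (N): | Set if operand 2 is greater than operand 1 (i.e. the result of the subtraction is negative) |
| Zero (Z): | Set if operand 2 is equal to operand 1 (i.e. the result of the subtraction is zero) |
| Carry (C): | Set if operand 1 is greater than or equal to operand 2, treating them as unsigned numbers |
| Overflow (V): | Set if a mathematical overflow occurred |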
In fact, various mathematical instructions can set these flags, but they only do so if they have the suffix S on the end of their mnemonic. The CMP instruction doesn't need the S suffix because its only purpose is to set the flags.
When the CMP instruction is executed for the first time, operand 1 (R0) is the ASCII code for 'A' (65) and it subtracts 91, a number one greater than the ASCII code for 'Z', from it. The result is clearly negative, so the N flag would be set. Each time round the loop, however, R0 has been increased by 1, and eventually reaches the value 91. When this happens, the result of the comparison becomes zero, the N flag is clear and the Z flag is set.
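The two cases just described can be checked with a small Python model of CMP. This is only a sketch of the flag rules for a 32-bit subtraction, with an invented function name; it is not how the processor is actually built.

```python
def cmp_flags(op1: int, op2: int) -> dict[str, bool]:
    """Model CMP op1,op2: subtract op2 from op1 and report the flags."""
    op1 &= 0xFFFFFFFF
    op2 &= 0xFFFFFFFF
    result = (op1 - op2) & 0xFFFFFFFF

    def negative(x: int) -> bool:
        return bool(x & 0x80000000)      # top bit set means negative

    return {
        "N": negative(result),
        "Z": result == 0,
        # Carry is set when the subtraction needs no borrow,
        # i.e. op1 >= op2 as unsigned numbers.
        "C": op1 >= op2,
        # Overflow: the operands have different signs and the sign of the
        # result differs from the sign of operand 1.
        "V": negative(op1) != negative(op2) and negative(result) != negative(op1),
    }


print(cmp_flags(ord("A"), ord("Z") + 1))      # first time round the loop
print(cmp_flags(ord("Z") + 1, ord("Z") + 1))  # last time round the loop
```

On the first comparison N is set and Z is clear; on the last, N is clear and Z is set, as the paragraph above says.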
If we wanted the program to branch every time it reached this point, the instruction would be:
B loop
The other two letters, NE, mean that this instruction is our first example of conditional execution.
Any instruction can be executed conditionally, the condition being determined by the suffix. There are sixteen possibilities:
| EQ | Equal (Z flag set) |
| NE | Not equal (Z flag clear) |
| CS | Carry flag set |
| CC | Carry flag clear |
| MI | Minus (N flag set) |
| PL | Positive or zero (N flag clear) |
| VS | Overflow (V) flag set |
| VC | Overflow (V) flag clear |
| HI | Unsigned higher (C flag set and Z flag clear) |
| LS | Unsigned lower or the same (C flag clear or Z flag set) |
| GE | Signed higher or the same (N flag the same as the V flag) |
| LT | Signed lower (N flag not the same as the V flag) |
| GT | Signed higher (Z flag clear and the N flag the same as the V flag) |
| LE | Signed lower or the same (Z flag set or the N flag not the same as the V flag) |
| AL | Always |
| NV | Never |
Obviously, you never need to use the AL suffix because unconditional execution doesn't need a suffix. The use of NV is also frowned upon because the bit-pattern it sets up might be used for some other condition some day.
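The sixteen suffixes can be written out as a lookup table in Python. This is just a sketch to make the flag combinations concrete; the names CONDITIONS and condition_passes are invented for the example.

```python
# Each suffix maps to a test on the four flags N, Z, C and V.
CONDITIONS = {
    "EQ": lambda f: f["Z"],
    "NE": lambda f: not f["Z"],
    "CS": lambda f: f["C"],
    "CC": lambda f: not f["C"],
    "MI": lambda f: f["N"],
    "PL": lambda f: not f["N"],
    "VS": lambda f: f["V"],
    "VC": lambda f: not f["V"],
    "HI": lambda f: f["C"] and not f["Z"],
    "LS": lambda f: (not f["C"]) or f["Z"],
    "GE": lambda f: f["N"] == f["V"],
    "LT": lambda f: f["N"] != f["V"],
    "GT": lambda f: (not f["Z"]) and f["N"] == f["V"],
    "LE": lambda f: f["Z"] or f["N"] != f["V"],
    "AL": lambda f: True,
    "NV": lambda f: False,
}


def condition_passes(suffix: str, flags: dict[str, bool]) -> bool:
    """Report whether an instruction with this suffix would be executed."""
    return bool(CONDITIONS[suffix](flags))


# Flags as left by comparing 65 with 91 (operand 1 is the smaller).
flags = {"N": True, "Z": False, "C": False, "V": False}
print([s for s in CONDITIONS if condition_passes(s, flags)])
```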
The BNE instruction in line 120 means 'Branch if not equal'. As long as the CMP instruction in the previous line is comparing two different numbers, it will keep the Z flag clear. When the value in R0 reaches ASC("Z")+1, i.e. 91, the instruction will be comparing identical numbers, so the Z flag will be set. Under this condition, the branch instruction will not be executed and the program finishes.
We could have achieved the same result with the instruction:
BMI loop
because the N flag stays set for as long as the number in R0 is less than 91.
You've probably worked out by now what this program does, even if you haven't run it to have a look. Each time round the loop, a character is printed whose ASCII code is one higher than the one before. It starts at 'A' and ends with 'Z'. In other words, it prints the alphabet.
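For anyone who wants to trace the loop without risking a crash, here is the same logic stepped through in Python. It is a line-by-line sketch of the assembly language in Assem03, with each statement standing in for one instruction; collecting the characters in a list takes the place of OS_WriteC.

```python
def run_alphabet_loop() -> str:
    """Step through the Assem03 loop, returning what it would print."""
    output = []
    r0 = ord("A")                        # MOV R0,#ASC("A")
    while True:                          # .loop
        output.append(chr(r0))           # SWI "OS_WriteC"
        r0 += 1                          # ADD R0,R0,#1
        z_flag = r0 == ord("Z") + 1      # CMP R0,#ASC("Z")+1
        if z_flag:                       # BNE loop is skipped once Z is set
            break
    return "".join(output)               # MOV PC,R14


print(run_alphabet_loop())
```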
Take a look at file Assem04. It works in a similar manner to the first program, except that the ASCII code used to call the SWI is loaded from a memory location instead of being moved into a register as an immediate constant.
10 REM > Assem04
20 REM simple assembly language program
30 ON ERROR REPORT:PRINT " at line ";ERL:END
40 DIM code% 100
50 P%=code%
60 [OPT 3
70 .data
80 EQUD &41
90 .start
100 ADR R1,data
110 LDR R0,[R1]
120 SWI "OS_WriteC"
130 MOV PC,R14
140 ]
150 REPEAT UNTIL GET
160 CALL start
The code byte is stored at a location pointed to by the label data, and is put there by the EQUD command. This is one of several assembler commands which put data into memory rather than assemble machine code instructions. The full list is:
| EQUB: | stores one byte, | e.g. EQUB &41 |
| EQUW: | stores two bytes, | e.g. EQUW &0D0A |
| EQUD: | stores four bytes (one word), | e.g. EQUD &56F4D31A |
| EQUS: | stores a string, | e.g. EQUS "This is a string" |
A word of explanation here. These terms were originally devised for the 8-bit 6502 assembler built into the Basic interpreter in the BBC Microcomputer in the early 1980s. In those days, the term 'word' was used to mean 16 bits or two bytes; hence the term EQUW for two bytes. The expression EQUD meant 'double word', or four bytes. When the 32-bit ARM processor was developed, it was decided that it would be better for 'word' to mean four bytes, or 32 bits. The old expressions, however, have been retained for compatibility.
Line 80 could have been written as:
80 EQUB &41:ALIGN
The ALIGN command means, "If P% is not divisible by four, then increment it until it is." It is very important to use it if you add something to the memory which doesn't consist of a multiple of four bytes. We might, for example, wish to put in a string, terminated by a zero:
EQUS "This is a string":EQUB 0:ALIGN
Coming back to the original form of line 80, the expression EQUD &41 is effectively the same as EQUD &00000041. Four-byte numbers are always stored in memory with the least significant byte in the lowest of the four addresses. The byte pointed to by label data will contain &41, and the next three bytes will contain zeros.
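The byte layouts produced by these directives can be mimicked in Python with the struct module. This is a sketch only: the function names simply echo the directives, the block is assumed to start at a word-aligned address, and the padding added by align is shown as zeros, which may not be what the real assembler leaves in memory.

```python
import struct


def equb(value: int) -> bytes:
    return struct.pack("<B", value & 0xFF)          # one byte


def equw(value: int) -> bytes:
    return struct.pack("<H", value & 0xFFFF)        # two bytes, low byte first


def equd(value: int) -> bytes:
    return struct.pack("<I", value & 0xFFFFFFFF)    # four bytes, low byte first


def equs(text: str) -> bytes:
    return text.encode("ascii")                     # the characters, no terminator


def align(block: bytes) -> bytes:
    """Pad a block to a multiple of four bytes.

    Assumes the block starts word-aligned, so its length plays the
    part of P%. Padding is shown as zeros for illustration.
    """
    return block + bytes((-len(block)) % 4)


print(equd(0x41).hex(" "))
print(equw(0x0D0A).hex(" "))
string_block = align(equs("This is a string") + equb(0))
print(len(string_block))
```

The string and its terminator take 17 bytes, so ALIGN moves on three more bytes to the next word boundary.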
The address of the location to be loaded from is contained within the square brackets. In this case, it's pointed to by R1. It's also possible to add an offset (either a second register or an immediate constant) which comes after the register number but is still within the brackets. We'll see how that is used later.
The form of the instruction in line 110 tells the processor to load from the address pointed to by R1, and to put the contents into R0.
We set up R1 in line 100 with an ADR instruction. You may be wondering how, if we can't get a complete 32-bit address into the LDR instruction, we're able to do it with ADR. The answer is that ADR is actually a pseudo-instruction: one which the assembler effectively creates out of another instruction. In this case, it calculates the difference between the required address and the current value of the program counter and sets up an instruction to add or subtract this difference to or from the PC. In other words, the address is stated relative to the PC, not in absolute terms.
When the processor fetches an instruction from memory, it decodes it while it's fetching the next one and executes it while fetching the one after that. What this means is that, while an instruction is being executed, the program counter has already moved on eight bytes to fetch the instruction after next. If the instruction does something relative to the PC, such as a branch or ADR, it is doing it relative not to its own address but to an address eight bytes further on. The assembler always takes this into account when setting up such instructions, but it's best to bear it in mind.
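The eight-byte allowance can be seen in the branch instructions of the earlier disassembly. The Python sketch below works out where a branch goes from its address and its machine code word. It assumes the standard ARM branch layout (a signed 24-bit offset, counted in words, in the bottom three bytes), which is background knowledge and not something stated in this article; the function name is invented.

```python
def branch_target(address: int, word: int) -> int:
    """Work out the destination of a B or BL instruction.

    address is where the instruction itself lives; the offset in the
    word is relative to that address plus eight.
    """
    offset_words = word & 0x00FFFFFF
    if offset_words & 0x00800000:        # sign-extend the 24-bit offset
        offset_words -= 0x01000000
    return (address + 8 + offset_words * 4) & 0xFFFFFFFF


# The BL at &9058 in the disassembly, shown there as BL &000090A4.
print(hex(branch_target(0x9058, 0xEB000011)))
# A backward conditional branch, the word 1AFFFFF6 at &90D8 in the dump.
print(hex(branch_target(0x90D8, 0x1AFFFFF6)))
```

Leaving out the "+ 8" would give &909C for the first branch, two instructions short of the address the disassembler reports.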
LDRB R0,[R1]
Note, by the way, that the B suffix goes after any condition code on the instruction. If, for example, the above instruction was only to be executed if the zero flag was clear, its mnemonic would be:
LDRNEB R0,[R1]
60 [OPT 3
70 .start
80 ADR R1,data
90 LDR R0,[R1]
100 SWI "OS_WriteC"
110 MOV PC,R14
120 .data
130 EQUD &41
140 ]
There is certainly nothing wrong with the assembly language instructions here, but you will find if you try to run Assem05 that you get an error message saying, "Unknown or missing variable at line 80".
It's easy to work out what's going wrong. In line 80, the program has to do something with the value of label data which, you will recall, is a Basic variable. In the previous listing, this variable was created in line 70, given the current value of P% and used in line 100. By the time the program reached line 100, it already knew the value of variable data.
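In this program, though, the variable data is used in line 80 but not created until the program reaches line 120. How do we get round this problem? The answer is to assemble the code twice. On the first pass, the assembler is told to ignore any label it hasn't yet met and carry on, so that every label is created as it is reached.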
By the time we get to the end of the first pass, we should have met all the labels. We can then go back and assemble the code again, exactly as it was before, except that this time all the references to labels should work (provided, of course, that we don't include a reference to a label that doesn't exist!).
The easiest way to run a piece of code twice in Basic is with a FOR ... NEXT loop, and this is what we do in Assem06:
10 REM > Assem06
20 REM simple assembly language program
30 ON ERROR REPORT:PRINT " at line ";ERL:END
40 DIM code% 100
50 FOR pass%=0 TO 3 STEP 3
60 P%=code%
70 [OPT pass%
80 .start
90 ADR R1,data
100 LDR R0,[R1]
110 SWI "OS_WriteC"
120 MOV PC,R14
130 .data
140 EQUD &41
150 ]
160 NEXT
170 REPEAT UNTIL GET
180 CALL start
The individual bits of the value of OPT control different aspects of the assembler:
| Bit 0: | If clear, the assembled listing is not shown on the screen; if set, it is shown. |
| Bit 1: | If clear, unknown labels are ignored; if set, they cause an error. |
| Bit 2: | If clear, P% acts as both the program counter and a pointer to where the machine code is assembled. If set, offset assembly is used. P% then acts as the program counter but O% controls where in memory the instruction is placed. Both variables are normally incremented together. |
| Bit 3: | If set, a range check is applied to ensure that we don't try to assemble more code than will fit into the data block which we created to hold it. We can set L% to the upper limit and assembly will stop if P% (or O%) exceeds it. |
In this program, the FOR ... NEXT loop creates two passes, the first with OPT set to zero and the second with it set to 3. On the first pass, the listing is not shown on the screen (we don't want to see it twice!) and unknown labels are ignored. On the second pass, the listing is shown and any references to non-existent labels cause an error.
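Since OPT is just a number whose bits are tested individually, its meaning for any value can be spelled out with a few lines of Python. This is only a sketch of the four bits described above; the function and key names are invented.

```python
def describe_opt(value: int) -> dict[str, bool]:
    """Break an OPT value into the four settings it controls."""
    return {
        "show_listing": bool(value & 1),           # bit 0
        "report_unknown_labels": bool(value & 2),  # bit 1
        "offset_assembly": bool(value & 4),        # bit 2
        "range_check": bool(value & 8),            # bit 3
    }


for opt in (0, 3):                                 # the two passes used here
    print(opt, describe_opt(opt))
```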
Note that P% is set to code% inside the loop, so that it is reset at the start of the second pass. It is important for both passes to start in the same place.
This time, the program will work.
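The reason two passes are enough can be seen in a toy model written in Python. It is not how the Basic assembler is implemented; it only keeps track of addresses and labels, treating every item as one four-byte word. The names SOURCE and run_pass are invented for the sketch.

```python
# Each entry is either a label or an instruction with an optional label reference.
SOURCE = [
    ("label", "start"),
    ("instr", "ADR R1,data", "data"),
    ("instr", "LDR R0,[R1]", None),
    ("instr", 'SWI "OS_WriteC"', None),
    ("instr", "MOV PC,R14", None),
    ("label", "data"),
    ("instr", "EQUD &41", None),
]


def run_pass(source, origin, labels, report_unknown):
    """Make one pass, recording labels and resolving references to them."""
    address = origin                     # plays the part of P%
    listing = []
    for item in source:
        if item[0] == "label":
            labels[item[1]] = address    # like typing label=P%
            continue
        _, text, ref = item
        if ref is not None and ref not in labels:
            if report_unknown:
                raise KeyError(f"Unknown or missing variable: {ref}")
            target = None                # first pass: carry on regardless
        else:
            target = labels.get(ref)
        listing.append((address, text, target))
        address += 4
    return listing


labels = {}
run_pass(SOURCE, 0x8FB8, labels, report_unknown=False)            # pass one
listing = run_pass(SOURCE, 0x8FB8, labels, report_unknown=True)   # pass two
print(hex(labels["data"]), listing[0])
```

Both passes start from the same address, so the value recorded for data on the first pass is still correct when the ADR refers to it on the second.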
If we didn't want to see the assembled listing on either pass, we could, of course, change line 50 to read:
50 FOR pass%=0 TO 2 STEP 2
An Absolute file is loaded into memory starting at &8000 and run from there. We would have to assemble the code in the data block, but its contents would have to be as though it started at &8000.
We can do this using offset assembly, with bit 2 of OPT set on both passes, as in Assem07:
10 REM > Assem07
20 REM simple assembly language program with offset assembly
30 ON ERROR REPORT:PRINT " at line ";ERL:END
40 DIM code% 100
50 FOR pass%=4 TO 7 STEP 3
60 P%=&8000:O%=code%
70 [OPT pass%
80 .start
90 ADR R1,data
100 LDR R0,[R1]
110 SWI "OS_WriteC"
120 SWI "OS_Exit"
130 .data
140 EQUD &41
150 ]
160 NEXT
170 REPEAT UNTIL GET
180 OSCLI ("Save MyFile "+STR$~code%+" "+STR$~O%)
190 *SetType MyFile Absolute
This time, we set P% to &8000 and O% to code%. Watch the assembled listing on the screen as you run the program: instead of the numbers on the left-hand side referring to addresses within Basic's variable workspace, they now start at &8000.
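Instead of calling the code and displaying the letter A on the screen, the last part of the program saves the code as a file, after you've pressed a key (you may wish to dispense with line 170). The OSCLI command sets up a command line string of the form:
Save MyFile <start address> <end address>
with both addresses in hexadecimal. The start address is code% and the end address is O%, which by now points just past the last byte assembled.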
The file will be saved in your currently selected directory and will run if you double-click on it.
There is, incidentally, an important difference between the assembly language in this file and that in the previous one. Because the program is not CALLed from Basic, but run as an absolute file, it doesn't have a return address passed to it in R14, so it can't finish with MOV PC,R14. Instead, it calls SWI OS_Exit, which passes control straight back to the operating system.
If you've finished with the address once you've loaded the data from it, there's nothing wrong with the following:
ADR R0,data
LDR R0,[R0]
R0 is first set up to point to the address. The address is then overwritten by the data itself. This gets rid of the extra register but it still takes two instructions, the first of which sets up the address by referring to the program counter.
We can combine the two instructions into one, which looks like this:
LDR R0,data
An example:
LDR R0,[R1,R2]
We might have a label called data which points to the start of eight bytes of data. We want to load the first four-byte word into R0 and the second into R1:
ADR R2,data
LDR R0,[R2]
LDR R1,[R2,#4]
This is especially useful if we want to load repeatedly from successive addresses, using a loop.
Look at file Assem08:
70 [OPT pass%
80 ;set up R1 to point to text string
90 ADR R1,string
100 .loop
110 LDRB R0,[R1];load one character
120 ADD R1,R1,#1;increment R1 ready for next character
130 CMP R0,#0;check for terminating zero
140 ;next two instructions executed only if end of string not yet reached
150 SWINE "OS_WriteC"
160 BNE loop
170 MOV PC,R14
180 .string
190 EQUB &0A:EQUS "This is a string":EQUB &0A:EQUB 0:ALIGN
200 ]
From now on, we'll only show the part of the program between the square brackets which turn the assembler on and off, except where necessary. This is because the Basic parts remain the same as before. One change which has been made starting with Assem08, though, is that the REPEAT UNTIL GET loop has been removed: the program assembles the code and executes it immediately.
You'll also notice that we've started adding comments because the code is getting more complicated.
Let's take a look at the string first. This is contained in several statements in line 190, starting with a LF character to create a blank line. The text of the string is in the EQUS statement (you could change it to anything you like!) and is followed by another LF. After this comes a null character (zero) which marks the end of the string. Last of all, we have an ALIGN instruction to ensure that whatever comes next is word-aligned. It's not actually necessary in this case, because nothing follows the string, but it's a good habit to get into.
This version of the program works in the simplest possible way. Register R1 is set up to point to the first character, which is loaded into R0. Note that we use LDRB, not LDR, as we are only loading one eight-bit ASCII character, which goes into the bottom byte of R0. After loading, we increment R1 by one to point to the next character.
We check the character we've just loaded, using the CMP instruction, to see if it is zero. If it is not, we print it and branch back. Note that the instructions in lines 150 and 160 which do this only do so if the character is not zero, due to the NE suffix on their mnemonics. Once the terminating zero has been loaded, we get to line 170 and the program exits.
This isn't actually indexed addressing; we're just using an address in R1 and incrementing it each time we want to read another character. It's possible that we might want to keep R1 pointing to the start of the string, perhaps so that we can load it again. To see how we could do this, look at file Assem09:
70 [OPT pass%
80 ;set up R1 to point to start of text string
90 ADR R1,string
100 MOV R2,#0;set up R2 to index first character of string
110 .loop
120 LDRB R0,[R1,R2];load one character
130 ADD R2,R2,#1;increment R2 ready for next character
140 CMP R0,#0;check for terminating zero
150 ;next two instructions executed only if end of string not yet reached
160 SWINE "OS_WriteC"
170 BNE loop
180 MOV PC,R14
190 .string
200 EQUB &0A:EQUS "This is a string":EQUB &0A:EQUB 0:ALIGN
210 ]
This time, we set up R1 to point to the start of the string and R2 to select an individual character within the string; we say that R2 indexes a character, starting with the one that's zero bytes in (i.e. the first one).
In line 120, we load a byte from the address obtained from the values of R1 + R2. If we haven't reached the terminating zero, we increment R2 for the next character.
70 [OPT pass%
80 ;set up R1 to point to one byte before start of text string
90 ADR R1,string-1
100 .loop
110 LDRB R0,[R1,#1]!;load one character
120 CMP R0,#0;check for terminating zero
130 ;next two instructions executed only if end of string not yet reached
140 SWINE "OS_WriteC"
150 BNE loop
160 MOV PC,R14
170 .string
180 EQUB &0A:EQUS "This is a string":EQUB &0A:EQUB 0:ALIGN
190 ]
Note the pling (!) on the end of the LDRB instruction in line 110. We derive the address to load from by adding the immediate constant (1 in this case) to the value of R1. After doing the loading, this value is written back into R1, due to the presence of the pling. The effect of this is that R1 is incremented each time the instruction is executed.
Because R1 + 1 points to the next character to be loaded, R1 has to be set up initially to point to one byte before the string starts.
An alternative technique is post-indexed addressing, which is used in listing Assem11:
70 [OPT pass%
80 ;set up R1 to point to start of text string
90 ADR R1,string
100 .loop
110 LDRB R0,[R1],#1;load one character, then increment R1
120 CMP R0,#0;check for terminating zero
130 ;next two instructions executed only if end of string not yet reached
140 SWINE "OS_WriteC"
150 BNE loop
160 MOV PC,R14
170 .string
180 EQUB &0A:EQUS "This is a string":EQUB &0A:EQUB 0:ALIGN
190 ]
The instruction in line 110 still has two parameters in its operand, but one of them is now outside the square brackets. In this case, the data to be loaded is pointed to by R1 on its own, and R1 is incremented by having the second parameter added to it after the loading has been done. There is no pling suffix because write back is implicit in post-indexed addressing.
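The difference between the two write-back forms is purely one of ordering, which the following Python sketch tries to make visible. The function names and the tiny `memory` block are hypothetical, invented for illustration; only the order of "add" and "load" is taken from the text above.

```python
def ldrb_pre_writeback(memory, r1, offset):
    """LDRB R0,[R1,#offset]! : add the offset, load, keep the new address."""
    r1 = r1 + offset
    return memory[r1], r1

def ldrb_post(memory, r1, offset):
    """LDRB R0,[R1],#offset : load from R1 as it stands, then add the offset."""
    r0 = memory[r1]
    return r0, r1 + offset

def read_string(load, memory, r1):
    """Run the print loop with the given load; return the text and final R1."""
    chars = []
    while True:
        r0, r1 = load(memory, r1, 1)
        if r0 == 0:              # CMP R0,#0
            return "".join(chars), r1
        chars.append(chr(r0))    # SWINE "OS_WriteC"

memory = bytearray(b"?ABC\x00")  # the string "ABC" starts at address 1
STRING = 1

# Pre-indexed: R1 must start one byte before the string.
pre_text, pre_r1 = read_string(ldrb_pre_writeback, memory, STRING - 1)
# Post-indexed: R1 starts at the string itself.
post_text, post_r1 = read_string(ldrb_post, memory, STRING)
```

Both loops read the same text. In this model the pre-indexed loop finishes with R1 pointing at the terminating zero, whereas the post-indexed loop finishes with R1 one byte beyond it.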
The equivalent in assembly language is the subroutine, called using a BL instruction.
Take a look at the following listing. You won't find it as a file to be run from the CD, for a reason which will become apparent shortly.
10 REM > Assem12a
20 REM use of subroutine to multiply by six
30 ON ERROR REPORT:PRINT " at line ";ERL:END
40 DIM code% 100
50 FOR pass%=0 TO 3 STEP 3
60 P%=code%
70 [OPT pass%
80 LDR R0,buf;get number passed from Basic via buffer
90 BL times_six;call subroutine to multiply
100 STR R0,buf;deposit answer in buffer for Basic to find
110 MOV PC,R14
120 ;
130 ;subroutine to multiply value of R0 by six
140 .times_six
150 ADD R0,R0,R0,LSL #1;multiply by three
160 MOV R0,R0,LSL #1;multiply by two
170 MOV PC,R14
180 ;
190 .buf EQUD 0
200 ]
210 NEXT
220 REPEAT
230 INPUT "Give me a number "a%
240 !buf=a%
250 CALL code%
260 PRINT !buf
270 UNTIL FALSE
This is a program to multiply a number, entered by the user, by six and print it on the screen. To avoid writing a long and complicated assembly language program, most of the work is done by Basic and the machine code part just does the multiplication.
The assembled code includes a one-word buffer, pointed to by label buf. Because buf is a Basic variable, it can be used in the Basic part of the program which follows the assembly. After assembling the code, the loop is entered. The number INPUTted is put into the buffer for the machine code to find when it is called. The machine code multiplies the number by six (we'll see how later) and puts the answer back into the buffer for the Basic part to find and print. The program is terminated by pressing Escape.
Looking now at the assembly language, in line 80 the number is loaded into R0 from the buffer, using PC-relative addressing. The following instruction calls the subroutine. You will recall from earlier that the BL instruction causes the address of the next instruction to be copied into R14, known as the link register, so that the program can return to the right point when the subroutine has finished.
Don't worry for now how the subroutine works; we'll look at it later. For now, think of it as a 'black box' which returns a value in R0 six times the original.
When the program returns to line 100, the new value of R0 is stored in the buffer and the machine code part exits, back to Basic.
Have you spotted a problem here? The reason that this particular listing is shown in this article but isn't included in the files to be run is that, if you did run it, your computer would crash.
As far as Basic is concerned, the whole machine code part of the program is a subroutine. When it got to the CALL command, it branched to the assembled code with a BL instruction, putting its return address in R14. We give control back to Basic by copying R14 back into the program counter.
Unfortunately, we've used a BL instruction ourselves in line 90, putting the address of the following STR instruction into R14 and thereby overwriting the return address previously put there by Basic. When we get to the final exit point at line 110, the address of the line 100 instruction will still be there. Instead of handing control back to Basic, the program will jump back to the instruction at line 100 and go into an infinite loop, possibly requiring you to reset your computer.
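The loop can be traced with a small Python model that follows only the program counter and R14. It is a rough sketch: the Basic line numbers of the Assem12a listing stand in for real addresses, and `BASIC_RETURN` and `run_assem12a` are made-up names.

```python
BASIC_RETURN = "back in Basic"   # hypothetical marker for Basic's return address

def run_assem12a(max_steps=20):
    """Step through Assem12a, tracking only the PC and R14."""
    # The instruction found at each Basic line number of the listing.
    program = {
        80: "LDR", 90: "BL times_six", 100: "STR", 110: "MOV PC,R14",
        150: "ADD", 160: "MOV", 170: "MOV PC,R14",
    }
    next_line = {80: 90, 90: 100, 100: 110, 150: 160, 160: 170}
    r14 = BASIC_RETURN            # put there by Basic's CALL
    pc = 80
    trace = []
    for _ in range(max_steps):
        trace.append(pc)
        op = program[pc]
        if op.startswith("BL"):
            r14 = next_line[pc]   # our return address overwrites Basic's
            pc = 150              # first instruction of .times_six
        elif op == "MOV PC,R14":
            if r14 == BASIC_RETURN:
                return trace, True
            pc = r14
        else:
            pc = next_line[pc]
    return trace, False           # gave up: control never got back to Basic

trace, returned = run_assem12a()
```

The trace runs through lines 80, 90, 150, 160, 170, 100 and 110 once, and then bounces between 100 and 110 until the step limit is reached.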
In effect, we have nested subroutines. We may wish to go further and have subroutines which call other subroutines and so on, so we need a way to store R14 and recover it later. If a subroutine uses other registers, we may wish to store them as well and recover their values when the subroutine finishes.
What we need is a stack: an area of memory used for temporary storage.
In software, our column of blocks is, of course, a section of memory. The stack can grow either upwards from the bottom or downwards from the top. In fact, the analogy starts to break down here because most stacks grow downwards, and if our column of blocks were like a computer stack, it would have to be hanging from the ceiling with new blocks being stuck on the bottom!
The address where new data can be stored on the stack is contained in the stack pointer. It is customary to use R13 for this purpose. It could point to either the first free address or the last address that was used.
If we are running our program from Basic with the CALL command, we can use part of Basic's stack. This can occupy the space between the top of the variables and HIMEM, so there should be plenty to spare unless your machine is short of memory and the program only just fits. If you are writing a program to be run as an Absolute file, like our earlier listing, Assem07, you will have to set up your own block of memory to act as the stack and set R13 to point to it.
STMIA R4!,{R0-R3}
To read the data out again, you could use:
LDMDB R4!,{R0-R3}
The list of registers between the curly brackets can include individual registers separated by commas, e.g. {R0,R3,R5}, or a continuous range, using a hyphen as in the example above, or a combination of the two.
As we saw earlier, a stack can either start at the bottom of the memory block and work upwards (an ascending stack) or at the top and work downwards (a descending stack). It can be full, where the stack pointer (usually R13) points to the address where the last register was stored; or empty, where R13 points to the first free location.
You can implement these options with the following pseudo-instructions:
| STMEA, LDMEA | empty ascending stack |
| STMED, LDMED | empty descending stack |
| STMFA, LDMFA | full ascending stack |
| STMFD, LDMFD | full descending stack (the most commonly used type of stack) |
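The four combinations differ only in which way the stack pointer moves and whether it moves before or after the store. The Python sketch below models that for one register at a time; the `Stack` class, its method names and the start address &8000 are all assumptions made for illustration.

```python
class Stack:
    """Model of one stack; `full` and `descending` select the type."""

    def __init__(self, sp, full, descending):
        self.mem = {}                 # address -> stored word
        self.sp = sp                  # plays the part of R13
        self.full = full
        self.step = -4 if descending else 4

    def push(self, value):            # like a one-register STMxx R13!,{Rn}
        if self.full:
            self.sp += self.step      # move to a free word first
            self.mem[self.sp] = value
        else:
            self.mem[self.sp] = value # SP already points at a free word
            self.sp += self.step

    def pop(self):                    # like a one-register LDMxx R13!,{Rn}
        if self.full:
            value = self.mem.pop(self.sp)
            self.sp -= self.step
        else:
            self.sp -= self.step
            value = self.mem.pop(self.sp)
        return value

START = 0x8000
results = {}
for name, full, descending in [("EA", False, False), ("ED", False, True),
                               ("FA", True, False), ("FD", True, True)]:
    s = Stack(START, full, descending)
    for v in (1, 2, 3):
        s.push(v)
    sp_after_push = s.sp
    popped = [s.pop() for _ in range(3)]
    results[name] = (sp_after_push, popped, s.sp)
```

Whichever type is chosen, the values come off in the reverse of the order in which they went on, and the stack pointer finishes where it started. A real STM or LDM with several registers in its list transfers them all in one go, which this one-word model does not attempt to show.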
The great advantage of these pseudo-instructions is that the same suffix is used for pushing something onto the stack and for pulling it off again; only the first two letters change, from ST to LD. For example, you can push all the registers except the program counter and stack pointer onto the stack with:
STMFD R13!,{R0-R12,R14}
and pull them all off again with:
LDMFD R13!,{R0-R12,R14}
Now look at file Assem12:
10 REM > Assem12
20 REM use of subroutine to multiply by six and use of stack
30 ON ERROR REPORT:PRINT " at line ";ERL:END
40 DIM code% 100
50 FOR pass%=0 TO 3 STEP 3
60 P%=code%
70 [OPT pass%
80 STMFD R13!,{R14};save R14 on stack
90 LDR R0,buf;get number passed from Basic via buffer
100 BL times_six;call subroutine to multiply
110 STR R0,buf;deposit answer in buffer for Basic to find
120 LDMFD R13!,{R14};restore R14 from stack
130 MOV PC,R14
140 ;
150 ;subroutine to multiply value of R0 by six
160 .times_six
170 ADD R0,R0,R0,LSL #1;multiply by three
180 MOV R0,R0,LSL #1;multiply by two
190 MOV PC,R14
200 ;
210 .buf EQUD 0
220 ]
230 NEXT
240 REPEAT
250 INPUT "Give me a number "a%
260 !buf=a%
270 CALL code%
280 PRINT !buf
290 UNTIL FALSE
As you can see, two extra instructions have been added, at lines 80 and 120. The value of R14, passed to our program by Basic, is now protected by being stored on the stack, so it doesn't matter if we use R14 when calling the subroutine, or make any other use of it, for that matter, provided we pull it off the stack again when we've finished.
Although we're only storing one register on the stack, it is still worth using the 'store multiple' and 'load multiple' instructions to do so because they make it easier to control the stack pointer register, R13. Before R14 is pushed onto the stack, R13 is decremented to point to an unused address for it to go into. We might wish to push more registers onto the stack before we pull R14 off again. They would go into addresses below the one where we just pushed R14, and should be pulled off again (in reverse order) before we pull R14 off at line 120.
In this example, we restore the value of R14, then transfer it to the program counter in line 130 to return to Basic. It's not really necessary to do this as two separate steps; there is no reason why we couldn't pull the return address off the stack and put it straight into the program counter, instead of going via R14. Lines 120 and 130 could therefore be combined into:
LDMFD R13!,{PC}
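To see why saving R14 cures the problem, here is a Python trace of Assem12 with lines 120 and 130 combined as just described. As before, it is only a model: Basic line numbers stand in for addresses, a Python list stands in for the stack at R13, and `BASIC_RETURN` and `run_assem12` are invented names.

```python
BASIC_RETURN = "back in Basic"   # hypothetical marker for Basic's return address

def run_assem12(max_steps=20):
    """Step through Assem12, tracking the PC, R14 and the stack."""
    r14 = BASIC_RETURN            # put there by Basic's CALL
    stack = []                    # stands in for the stack addressed by R13
    pc = 80
    trace = []
    for _ in range(max_steps):
        trace.append(pc)
        if pc == 80:              # STMFD R13!,{R14}
            stack.append(r14)
            pc = 90
        elif pc == 90:            # LDR R0,buf
            pc = 100
        elif pc == 100:           # BL times_six
            r14 = 110             # R14 is overwritten, but a copy is safe
            pc = 170
        elif pc == 110:           # STR R0,buf
            pc = 120
        elif pc == 120:           # LDMFD R13!,{PC}
            target = stack.pop()
            if target == BASIC_RETURN:
                return trace, True, stack
            pc = target
        elif pc in (170, 180):    # ADD and MOV inside times_six
            pc += 10
        elif pc == 190:           # MOV PC,R14
            pc = r14
    return trace, False, stack

trace, returned, stack = run_assem12()
```

This time the trace visits each instruction once and ends with the model back in Basic, leaving the stack empty.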
Repeating the earlier example, we can save all the registers except the program counter and stack pointer at the start of a subroutine with:
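STMFD R13!,{R0-R12,R14}
and restore them and return from the subroutine in a single instruction with:
LDMFD R13!,{R0-R12,PC}
All the work of the times_six subroutine in Assem12 is done in lines 170 and 180. The first line looks very elaborate: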
ADD R0,R0,R0,LSL #1
Basically, this is an instruction to add R0 (as operand one) to R0 (as operand two) and put the result in R0 (as the destination register); in other words, to double the value of R0. There's an extra bit on the end of the instruction, though: the LSL #1 part.
LSL stands for 'Logical Shift Left', and means that all the bits of operand two are shifted to the left, in this case by one place, replacing the lowest bit with zero. The equivalent of this instruction in Basic would be:
a%=a%+(a%<<1)
The effect of shifting all the bits in operand two by one place to the left in binary arithmetic is, of course, to double the number. Adding it to operand one has the overall effect of multiplying the value in R0 by three.
The second line is a bit simpler:
MOV R0,R0,LSL #1
This instruction simply replaces the value in R0 by itself, but shifted one place to the left and thus doubled, as in:
a%=a%<<1
So the effect of the two instructions is to multiply the value of R0 by six.
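The arithmetic can be checked with a short Python model of the two instructions. The name `times_six` is borrowed from the listing's label, and the masking with `MASK` is an addition of this sketch, there to imitate a 32-bit register, since Python integers do not overflow.

```python
MASK = 0xFFFFFFFF   # a register holds 32 bits

def times_six(r0):
    """Mirror the two instructions of the subroutine on a 32-bit value."""
    r0 = (r0 + ((r0 << 1) & MASK)) & MASK   # ADD R0,R0,R0,LSL #1  (x3)
    r0 = (r0 << 1) & MASK                   # MOV R0,R0,LSL #1     (x2)
    return r0

answer = times_six(7)
```

With an input of 7, the first step gives 7 + 14 = 21 and the second doubles it to 42. Results too large for 32 bits simply lose their top bits, as they would in a register.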
There are other types of shift:
LSR ('Logical Shift Right'): all the bits are shifted to the right, the highest bit(s) being replaced by zeros.
ASR ('Arithmetic Shift Right'): like LSR except that the highest bit is replaced by whatever was there before (0 or 1). This is to preserve the sign of signed numbers.
In the shifts listed so far, the bit which 'falls out of the end' of the register is moved into the carry flag.
ROR ('ROtate Right'): the bits are shifted to the right, and bit 0 is copied into bit 31.
RRX ('Rotate Right eXtended'): the same as ROR except that the carry flag acts as an extra bit.
The shift may be followed by either an immediate constant, as in listing Assem12, or a register which contains the number of positions to be shifted. RRX is the exception: it always rotates by one place and takes no count.
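The behaviour of each shift on a 32-bit value can be sketched in Python. These functions are illustrative models with made-up names, not part of any library; each returns the shifted value together with the bit that falls out of the end, and they assume a shift count between 1 and 31.

```python
MASK = 0xFFFFFFFF   # a register holds 32 bits

def lsl(x, n):
    """Logical Shift Left: zeros come in at the bottom."""
    carry = (x >> (32 - n)) & 1
    return (x << n) & MASK, carry

def lsr(x, n):
    """Logical Shift Right: zeros come in at the top."""
    carry = (x >> (n - 1)) & 1
    return x >> n, carry

def asr(x, n):
    """Arithmetic Shift Right: copies of the old top bit come in at the top."""
    carry = (x >> (n - 1)) & 1
    sign_fill = (MASK << (32 - n)) & MASK if x & 0x80000000 else 0
    return (x >> n) | sign_fill, carry

def ror(x, n):
    """ROtate Right: bits leaving the bottom reappear at the top."""
    result = ((x >> n) | (x << (32 - n))) & MASK
    return result, (result >> 31) & 1

def rrx(x, carry_in):
    """Rotate Right eXtended: a one-place rotate through the carry flag."""
    return (x >> 1) | (carry_in << 31), x & 1

value = 0x80000001
shifted = {
    "LSL": lsl(value, 1),
    "LSR": lsr(value, 1),
    "ASR": asr(value, 1),
    "ROR": ror(value, 1),
    "RRX": rrx(value, 0),
}
```

Comparing the LSR and ASR results for &80000001 shows the sign being preserved: LSR gives &40000000, whereas ASR gives &C0000000.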
A full list of these instructions can be found in Guttorm Vik's StrongHelp assembly language manual, which is an excellent reference source for this subject.
In a future article, we'll be looking at assembly language in action by examining the source code for the IClear module. This module enables the text in a writable icon to be cleared and replaced by new text by double-clicking on the icon and typing a new character. We'll be looking at an upgraded version of IClear which will be published here for the first time.