Assembly Language - Using BASIC Variables

The easiest way to pass data from a parent Basic program to an assembler routine is to use the 'resident integer variables', A% to H%, since the current values of these variables are poked into ARM registers 0 to 7 when the Basic CALL statement is executed. By the way, it may be that, like me, you use variables to 'name' your registers - you may have somewhere at the start of your program a statement like

current_decrement%=3:counter%=0

and then write

SUB  counter%,counter%,current_decrement%

instead of

SUB  R0,R0,R3

If so, beware of later finding yourself attempting to initialise R0 by writing

counter%=start_value%

and not

A%=start_value%

If you are lucky, start_value% will not be a valid ARM register number. If you are unlucky, the machine code will be assembled and run using the wrong register, and the whole computer will lock up and crash. Has anybody else ever done anything this stupid, or am I unique?
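
As a minimal sketch of the register-passing mechanism described at the start of this section, the following program assembles a two-instruction routine and calls it. The names code%, pass% and add are purely illustrative. USR is an assumption of this sketch rather than something discussed above: it loads A% to H% into R0 to R7 in the same way as CALL, but hands the final value of R0 back to Basic, which makes the result easy to print.

```bbcbasic
REM Sketch only: code%, pass% and the label 'add' are illustrative names
DIM code% 64
FOR pass%=0 TO 2 STEP 2
P%=code%
[OPT pass%
.add
  ADD   R0,R0,R1      ; R0 (loaded from A%) + R1 (loaded from B%)
  MOV   PC,R14        ; return to Basic
]
NEXT pass%
A%=40:B%=2
PRINT USR(add)        : REM should print 42
```

Note that it is A% and B% that are set before the call, not any variables you may be using as register names inside the square brackets.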

Labels in the Basic assembler

It's also worth knowing that 'labels' in the Basic assembler are actually implemented as integer variables, whether they end in '%' or not (which could explain some weird 'Unknown or missing variable' error messages!) - you can label a data area as .input, say, execute a PROCassemble at the start of your program, and then use statements such as

input!0=variable1%
input!4=variable2%
input!8=anothervariable%

to pass data to your machine code routine each time you call it, whence it can be accessed using statements such as

LDR  R0,input
LDR  R1,input+4
        ; Note, not '#4' - 'input+4'
        ; is a Basic statement which
        ; will be evaluated by the
        ; assembler, not an op-code
LDR  R7,input+8

Of course, the updated contents of this shared data block are then available to be read and manipulated from Basic when the machine code exits. This technique only works when code is being assembled 'live' within your program, though. If you prefer to load in pre-assembled blocks of code - whether in order to save time, decrease the size of your source code, or reduce the overheads on the Basic name space table caused by the presence of vast numbers of pseudo-integer variables - you won't have access to the address values of your label names in this manner. (You can do it, by calculating the offset of your data area from the start of the DIMmed block into which your machine code has been loaded - but you have to be ultra-careful.)
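
Putting the pieces of this section together, a complete program might look something like the sketch below. The names code%, pass%, sum and PROCassemble are illustrative, and the fourth word of the data block (used to hand a result back) is an assumption added for the example.

```bbcbasic
REM Sketch only: shares a labelled data block between Basic and machine code
DIM code% 256
PROCassemble
input!0=10
input!4=20
input!8=12
CALL sum
PRINT input!12        : REM should print 42
END

DEF PROCassemble
FOR pass%=0 TO 2 STEP 2
P%=code%
[OPT pass%
.sum
  LDR   R0,input      ; first word of the shared block
  LDR   R1,input+4    ; second word
  LDR   R7,input+8    ; third word
  ADD   R0,R0,R1
  ADD   R0,R0,R7
  STR   R0,input+12   ; leave the total where Basic can read it
  MOV   PC,R14        ; return to Basic
.input
  EQUD  0
  EQUD  0
  EQUD  0
  EQUD  0             ; result word (an addition for this sketch)
]
NEXT pass%
ENDPROC
```

Two passes are needed here because the code refers to .input before the label has been defined.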

If your machine code needs access to strings, remember that you can always use

.stringinput   EQUS STRING$(60,CHR$(0))
              ;reserve 60 zero bytes

to allocate space within the assembler, and

$stringinput="Test data"

to poke CR-terminated strings (a terminating CHR$(13) is written immediately after the last character, so a 60-byte buffer has room for at most 59 characters) into it from the parent Basic program.
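
Here is a hedged sketch of how the machine code might then read such a string. It simply prints each character with OS_WriteC until it reaches the terminating 13. The names code%, pass%, printstring and loop are illustrative only.

```bbcbasic
REM Sketch only: prints the CR-terminated string poked into .stringinput
DIM code% 256
FOR pass%=0 TO 2 STEP 2
P%=code%
[OPT pass%
.printstring
  ADR   R1,stringinput  ; R1 -> start of the buffer
.loop
  LDRB  R0,[R1],#1      ; fetch one character and step on
  CMP   R0,#13          ; reached the terminator?
  MOVEQ PC,R14          ; yes - return to Basic
  SWI   "OS_WriteC"     ; no - print the character in R0
  B     loop
.stringinput
  EQUS  STRING$(60,CHR$(0))
]
NEXT pass%
$stringinput="Test data"
CALL printstring
```

LDRB is used rather than LDR because the characters are read one byte at a time, so the alignment of the buffer doesn't matter.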

And a final handy trick: you can call the *BreakSet command in the Debugger module from Basic to halt the execution of the machine code when it reaches any given label, using for example

OSCLI "BreakSet "+STR$~(label)

to evaluate .label and pass it as the hexadecimal string required by the command line interpreter.

Using Basic user functions from the assembler

One advantage of using the Basic assembler is that you can use functions like FNmessage() below to insert frequently-used bits of code. If a reference to such a function (which should be placed after the main Basic program as normal and contain ARM mnemonics nested within their own square brackets with an OPT statement) is encountered within the assembler, then instead of evaluating the result of the function as usual, the Basic interpreter simply inserts that little section of code and continues assembly from where it left off.

Note that this isn't the same as an actual machine code subroutine which can be jumped to from several places during the execution of the program (but only actually takes up a single block of memory); every time you call such a function you insert a new section of code. On the other hand, you can use this to your advantage by passing different parameters to your function each time you call it. For example, FNmessage simply inserts a call to OS_WriteS, which prints the zero-terminated string immediately following this instruction in memory, and then pokes in whatever string was supplied to the function as a parameter. Whenever the execution of the program reaches such a point, it prints whatever message was inserted and carries on. We shall be using FNmessage in next month's program listing.

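DEF FNmessage(string$)
[OPT pass%
        SWI        "OS_WriteS"  ; print the inline string that follows
        EQUB        10          ; line feed
        EQUB        13          ; carriage return
        EQUS        string$     ; message text, fixed at assembly time
        EQUB        0           ; zero terminator
        ALIGN                   ; pad to a word boundary after the string
]
=0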

The actual result returned by the function is totally ignored by the assembler, so I usually set it to 0. (Given this behaviour, I don't understand why it has to be FNmessage rather than PROCmessage - but PROCmessage in this context isn't permitted.)
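
For illustration, a function of this kind might be used as in the sketch below. The names code%, pass% and greet and the message texts are made up for the example, and the ALIGN after the string is added as a precaution to keep the assembly pointer on a word boundary.

```bbcbasic
REM Sketch only: each FNmessage call inserts its own copy of the code
DIM code% 256
FOR pass%=0 TO 2 STEP 2
P%=code%
[OPT pass%
.greet
  FNmessage("First message")
  FNmessage("Second message")
  MOV   PC,R14
]
NEXT pass%
CALL greet
END

DEF FNmessage(string$)
[OPT pass%
  SWI   "OS_WriteS"
  EQUB  10
  EQUB  13
  EQUS  string$
  EQUB  0
  ALIGN
]
=0
```

The routine at .greet ends up containing two separate OS_WriteS sequences, one after the other, each followed by its own text.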

The parameters you pass to functions are treated as normal Basic variables, and you may use all the normal Basic operations to manipulate them. Here is an extreme (and rather silly) example, FNadd, which adds together the value held in the register specified and the square of the number passed to it, and returns the result in the next register up!

DEF FNadd(reg%,value%)
value%=value%*value%
[OPT pass%
  ADD   (reg%+1),reg%,#value%
]
=0

If R2 holds the value 6 and the variable constant% equals 8, then after executing the piece of code assembled by a call to FNadd(2,constant%), R3 will hold the value (6+8²) =70. Note that you can't use 'R2' as the parameter - the assembler treats '2' and 'R2' as equivalent, but Basic functions don't recognise 'R2' as a valid integer. Note also that parameters are evaluated at the moment the program is being assembled - in other words, don't expect FNadd to calculate the square of constant% if it later changes. The value #64 will have been hard-wired in.
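
The point about assembly-time evaluation can be checked with a sketch along the following lines. The names code%, pass% and test are illustrative, and USR is used here only as a convenient way of getting the final value of R0 back into Basic.

```bbcbasic
REM Sketch only: the parameter is evaluated when the code is assembled
constant%=8
DIM code% 64
FOR pass%=0 TO 2 STEP 2
P%=code%
[OPT pass%
.test
  MOV   R2,#6
  ; the next line assembles ADD R3,R2,#64
  FNadd(2,constant%)
  MOV   R0,R3           ; copy the result to R0 for USR to return
  MOV   PC,R14
]
NEXT pass%
constant%=100           : REM too late - the code is already assembled
PRINT USR(test)         : REM should still print 70
END

DEF FNadd(reg%,value%)
value%=value%*value%
[OPT pass%
  ADD   (reg%+1),reg%,#value%
]
=0
```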

Passing variables to CALL

The expert's way to pass Basic data into machine code - and the only way, if you want to handle whole arrays - is to append a list of variables as parameters to the CALL statement. The first parameter to the CALL statement is always the address of the routine to call (almost invariably the name of the block we DIMmed to hold it), but you can specify a comma-separated list of further parameters, pointers to which will then be made available to the machine code routine. For example:

CALL ourcode%,block%?5,name$,handle%(4),$data,wimp%!20,whole_array%()

There doesn't seem to be any limit to the number of parameters you can pass, other than the maximum length of a tokenised Basic line.

When your machine code routine is first called, in addition to registers 0 to 7 being initialised from variables A% to H%, R10 holds the number (if any) of variables passed to the routine and R9 holds the address of a 'descriptor block' which gives you access to the current values held by those variables - once you know how to decode it! In BBC Basic, you are restricted to passing variables as parameters to CALL. You can't pass numeric or string values directly.

The parameter block

This operates on a 'Last in First out' basis - the entry for the last parameter in the CALL statement always comes first in the block, at the address held in R9 - which is a nuisance since every time you discover you need to add a new parameter, the offsets to all the previous values change. I find the easiest way to remember the order of the parameters is simply to read the CALL line from right to left.

For each parameter there is an eight-byte entry, where the first word is a pointer and the second word is a type code indicating what kind of data is to be found at that point. As a rough guide to remembering these codes: the first hex digit of the code is 1 if the type is a subset of array, otherwise it's 0. The second hex digit is 8 if the type is a subset of string, otherwise it's 0. The third hex digit gives the length in bytes of each entry of this type (plus the length of the string, of course, if relevant).

Type   Example          Address points to
&000   ?x               Byte-aligned byte
&004   !x, x%, x%(n)    Word-aligned word
&005   |x, x, x(n)      Byte-aligned 5-byte real
&080   x$, x$(n)        Byte-aligned block: address word, then length byte
&081   $x               Byte-aligned string, terminated by 13
&104   x%()             Word-aligned pointer to integer array
&105   x()              Word-aligned pointer to real array
&180   x$()             Word-aligned pointer to string array
Note, in particular, that string variables are byte-aligned. This means that the address pointed to by the first word of this parameter's entry is probably not on a word boundary (i.e. an address whose last hex digit is 0, 4, 8 or C). You must therefore not use the LDR instruction to load from this address, or the value apparently returned will be complete rubbish; read it a byte at a time with LDRB instead.
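
One way round the alignment problem, sketched here under the assumption that a single string variable (type &080) is passed, is to build up the address of the string a byte at a time with LDRB. The names code%, pass%, printparam, loop and name$ are illustrative, and a real routine would check R10 and the type code first.

```bbcbasic
REM Sketch only: prints a string variable passed as the sole parameter
DIM code% 256
FOR pass%=0 TO 2 STEP 2
P%=code%
[OPT pass%
.printparam
  LDR   R1,[R9]           ; the block itself is made up of words, so LDR is fine
  LDRB  R2,[R1]           ; R1 is byte-aligned: fetch the address bytes singly
  LDRB  R3,[R1,#1]
  ORR   R2,R2,R3,LSL #8
  LDRB  R3,[R1,#2]
  ORR   R2,R2,R3,LSL #16
  LDRB  R3,[R1,#3]
  ORR   R2,R2,R3,LSL #24  ; R2 = address of the string's characters
  LDRB  R4,[R1,#4]        ; fifth byte = length of the string
.loop
  SUBS  R4,R4,#1          ; any characters left?
  MOVMI PC,R14            ; no - return to Basic
  LDRB  R0,[R2],#1        ; fetch the next character
  SWI   "OS_WriteC"       ; and print it
  B     loop
]
NEXT pass%
name$="Archive"
CALL printparam,name$
```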

Every time you DIM a block from Basic, by the way, the reserved memory will always be word-aligned - and thus so will any data you poke into the start of it using an indirection operator ('!','$','?' or '|').


Source: Archive Magazine 14.4
Publication: Archive Magazine
Contributor: Harriet Bazley