C for Yourself - part 5

David Matthewman starts to look at arrays, pointers and strings in C. Pointers are a tricky subject, and will be explored in several stages over the next few issues.

Kate, a photographer, wants to keep track of the photographs that she takes. She needs a database giving, for each photograph, the exposure, aperture, place, filter and date that the photograph was taken. She starts by writing a program to keep track of the exposure information for a roll of 35mm film, which may have up to 36 exposures.

Obviously this can be done by using a set of variables exposure1, exposure2, exposure3 to exposure36. However, Kate will then find it impossible to refer easily to the nth exposure, where n is a number calculated by the program and not known beforehand. The variable exposuren is recognised by the program as a separate variable, not as the nth exposure variable. What she needs, as programmers in other languages will already have realised, is an array.

Variable lists

An array variable is really a whole list of variables which can be accessed by number. Instead of the variable list above, you have a variable list exposure[1], exposure[2], exposure[3] to exposure[36], with the nth member of the list being accessed as exposure[n].

Each member of the array is used as an individual variable in expressions, loops, print statements and so on.

The type of each element of the array - which governs the type of the array as a whole - is given in the declaration for the array.

Declarations, which were explained in the October issue, are statements which give the C compiler important information about a variable: how much memory to reserve for it, what operations may be performed on it and so on. In this case, Kate decides to keep the exposure information as an array of integers, to save on space and make any calculations quicker.

A negative number will indicate an exposure of the reciprocal of the number; -30 means an exposure of one thirtieth of a second. She declares the array with the following statement:

int exposure[40];

which is one of two important ways of declaring arrays. We will look at the second way later. The square brackets after the name identify it as an array; the number in the brackets gives the array size. Arrays in C are numbered from zero up to the number given in brackets minus one, so the elements here run from exposure[0] to exposure[39]. Incidentally, Kate has noticed that on a typical 36-exposure film, up to 40 exposures can actually be taken, and has sized the array appropriately.

Kate declares a second array, aperture, to take the aperture information for the film. She could declare this as a float array, since aperture sizes take floating point values. She is wary of doing this, worried that her aperture of 1.7 will print out as 1.6999999999 due to lost precision and make her screen display untidy. Since she doesn't envisage herself doing any calculations with the numbers, she decides instead to store each one as two integers representing the integer part and the decimal part.

She can do this all in one array if she declares a two-dimensional array. If a one-dimensional array of the sort we have already encountered is a list, a two-dimensional array is a table, where an element item[5][3] is the third item in the fifth row down (ignoring for the moment the possibility of a row zero and a column zero). Such arrays are declared in a similar manner to before, and Kate declares hers as:

int aperture[40][2];

where aperture[n][0] will be the integer part of the nth aperture and aperture[n][1] the decimal part.

Character arrays

Having set up arrays for the aperture and exposure of each picture on the film, Kate must now do the same for the place that each photo was taken. So far, we have only come across character variables which can hold single characters. Those of you familiar with BASIC programming may well be wondering how C handles string variables.

The answer is that it doesn't, not explicitly. Instead, it uses arrays of characters, which are equivalent.

The business of string handling and manipulation is one which we will cover in a later issue. Suffice to say that, because string variables as such do not exist, operations such as copying a string, comparing strings and 'adding' one string to another are not simple. Rather than write string1 = string2, we must use a function and write strcpy(string1, string2).

Declaring a string is a simple declaration of an array of characters, as in:

char string[25];

In this case it is important to remember that C arrays start from zero, as string[3] will therefore be the fourth character in a given string.

All strings in C are terminated by a zero byte, so are one byte longer than the number of characters in the string anyway. Hence to store a four-letter word a declaration:

char string[5];

is needed. In a string containing the word 'RISC', string[0] will contain 'R', string[1] will contain 'I', string[2] will contain 'S', string[3] will contain 'C', and string[4] will contain zero, commonly written as '\0'.

Kate can declare her array of places as follows:

char place[40][100];

She can, however, see a problem with this method. She has allowed place names of up to 99 characters to allow for complicated descriptions such as 'The view over Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch at sunset looking south', but realises this is overkill for photos such as 'Loco 37401 at Oban'. At one byte per character, her array will take up 4000 bytes per film, most of which will be wasted.

Pointers

What Kate needs is a method of declaring an array without actually saying how big it is in advance.

Since the C compiler will not allocate memory to a variable until it knows how big it is, we need a declaration that tells the compiler about a variable without making it allocate memory for the variable. Unsurprisingly, this can be done easily, but before we see how, let us consider a string variable string1 more closely.

The variable string1 has been declared by the statement:

char string1[10];

What occurs if we try to use the variable string1 without the square brackets? In BASIC it would be treated as a totally separate variable, but in C the name string1 is irrevocably tied to the array string1[]: used on its own in an expression, it gives the address of the first element of the array. In essence, string1 and string1[] are two ways of looking at the same object. Used in this way, string1 acts as a pointer.

We must now be introduced to two unary operators associated with pointers: * and &. * means 'the object pointed to by' and & means 'the pointer to' or 'the address of'. Hence &(string1[0]) is equivalent to string1 and *string1 is equivalent to string1[0].

If you understand that last sentence, then you are most of the way to understanding arrays and pointers in C. If it still seems incomprehensible, don't worry. Pointers are a very difficult area of C to understand, but if you practise using them you will get the hang of them in the end. This is not the last that we will have to say about pointers, and we will continue to treat them with due respect over the coming issues.

The use of the * operator leads to an alternative method of declaring the array string1. It can be declared by the line:

char *string1;

which declares string1 to have type 'pointer to char'. Once it has been given some memory to point to, it can be used in much the same way as a character array. If you remember that a definition:

char char1;

would read 'char1 is a character variable', then you can see that:

char *string1;

would read 'the object pointed to by string1 is a character variable', which is the same as saying that string1 is a pointer to a character variable.

So far we have only declared string1. We have not yet given it a value - a block of memory to point to - so we cannot use it yet. We will find out how to do this next month.

Source:	Acorn User - 149 - December 1994
Publication:	Acorn User
Contributor:	David Matthewman