Pointers and arrays are among the most important concepts in the C programming language. Much of the power of C comes from pointers: they give direct access to parts of data structures, and many established C programming idioms are built on them. As we shall see in this post, pointers and arrays are equivalent to a large extent.
1.0 Pointers
A pointer variable holds the address of another variable. It is said to be “pointing” to that variable. For example,
int *ip;
defines a variable, ip, which is a pointer to an integer.
2.0 Using pointers in expressions
The unary & operator applied on a variable gives the address of the variable. The address of a variable can be assigned to a pointer variable, making the pointer variable point to it. For example,
int x;
int *ptr;
ptr = &x; // ptr points to x
If we assign the address of integer x to integer pointer ptr, ptr points to x. At this point *ptr and x, both evaluate to the value of x. *ptr can be used in any context where an integer could be used. In fact, the definition,
int *ptr;
says that *ptr is an integer. An interesting use case is incrementing or decrementing a variable through a pointer. How do we ensure that the variable, and not the pointer, gets incremented or decremented? Suppose we have a pointer, ptr, which points to an integer and we wish to increment that integer.
++*ptr;   // increments *ptr, the variable pointed to by ptr
*ptr++;   // increments ptr, not the variable
(*ptr)++; // increments *ptr, the variable pointed to by ptr
Unary operators like ++, --, & and * have the same precedence and associate right to left. So, if we scan ++*ptr from the right, we first get *ptr, which is incremented when the ++ operator is applied to it. In the case of *ptr++, the ++ applies to ptr and not to *ptr. The expression yields the integer that ptr pointed to before the increment, and ptr then points to the next integer; the integer itself is left unchanged. To increment the integer with the postfix operator, we put parentheses around *ptr, so that (*ptr)++ increments the integer pointed to by ptr.
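The three forms can be checked with a small experiment. The following is only an illustrative sketch; the function name increment_forms_demo and the two-element array are made up for this example.

```c
#include <assert.h>

// Hypothetical demo: applies each form once and checks what changed.
// Returns the final value of a[0].
int increment_forms_demo (void)
{
    int a [2] = {10, 20};
    int *ptr = a;            // ptr points to a[0]
    int v;

    ++*ptr;                  // a[0] becomes 11; ptr is unchanged
    assert (a [0] == 11 && ptr == &a [0]);

    (*ptr)++;                // a[0] becomes 12; ptr is unchanged
    assert (a [0] == 12 && ptr == &a [0]);

    v = *ptr++;              // v gets the old target; ptr moves to a[1]
    assert (v == 12 && a [0] == 12 && ptr == &a [1]);
    assert (*ptr == 20);

    return a [0];
}
```

Calling increment_forms_demo from main runs the checks; an assertion failure would mean one of the comments is wrong.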
3.0 Pointer initialization and assignment
There are certain operations permitted for pointers. Pointers can be initialized to 0, or the equivalent symbolic constant NULL.
int *ip = NULL;
A pointer variable with value NULL points nowhere. If two pointer variables point to the same base type, they can be assigned to one another. For example,
int i = 7, j = 0;
int *ptr1, *ptr2;
ptr1 = &i; // ptr1 points to i
ptr2 = ptr1; // Now, ptr2 also points to i
printf ("%d\n", *ptr2); // prints 7
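A common use of NULL is as a return value meaning "no result", which the caller tests before dereferencing. The sketch below assumes a hypothetical helper, larger, that is not part of the original text.

```c
#include <assert.h>
#include <stddef.h>

// Return a pointer to the larger of two integers,
// or NULL if either argument is a null pointer.
int *larger (int *a, int *b)
{
    int *result = NULL;                  // points nowhere yet

    if (a != NULL && b != NULL)
        result = (*a >= *b) ? a : b;     // assignment between int pointers
    return result;
}

// Usage sketch
void larger_demo (void)
{
    int i = 7, j = 3;
    int *ptr1 = larger (&i, &j);         // ptr1 points to i
    int *ptr2 = ptr1;                    // ptr2 also points to i

    assert (ptr2 == &i && *ptr2 == 7);
    assert (larger (&i, NULL) == NULL);  // nothing to point to
}
```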
4.0 Arrays
An array is a data structure with elements of the same type stored in consecutive locations. The array elements are identified by the array index. The index of the first array element is 0 and the elements are stored at increasing memory addresses. So an array of 10 integers occupies 10 consecutive integer-sized locations in memory, with element 0 at the lowest address and element 9 at the highest.
We can initialize an array by listing the element values in braces after the assignment symbol in the definition, as below.
int arr [10] = {11, 12, 13, 14, 15, 16, 17, 18, 19, 20};
There is an equivalence of arrays and pointers. In an expression, the array name arr acts as a pointer to the first element of the array, and *arr gives the value of the zeroth element. However, arr is not a variable, and expressions like arr++ or arr-- are illegal. But, if we assign arr to a variable of type pointer to integer, we can increment or decrement that variable so that it points to any element from index 0 to index size - 1. It may also point one position past the last element, but that position must not be dereferenced.
int *ptr;
ptr = arr;
ptr points to arr[0], ptr+1 points to arr[1], ptr+2 points to arr[2], and so on. When we increment a pointer, the actual address stored in the pointer variable is incremented by (increment * sizeof (base type)) bytes. The expressions arr, arr+1, arr+2, etc., do the same. However, unlike arr, we can use expressions like ptr++ or ptr-- in the case of ptr.
Fig. 2: Array and pointer equivalence
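The scaling by sizeof (base type) can be observed by measuring the distance between two int pointers in bytes. This is a rough sketch; byte_distance and scaling_demo are names invented here, and the casts to char pointers are only a way to see unscaled addresses.

```c
#include <assert.h>
#include <stddef.h>

// Number of bytes between two int pointers into the same array.
// Subtracting char pointers avoids the scaling by sizeof (int).
ptrdiff_t byte_distance (const int *from, const int *to)
{
    return (const char *) to - (const char *) from;
}

void scaling_demo (void)
{
    int arr [10] = {11, 12, 13, 14, 15, 16, 17, 18, 19, 20};
    int *ptr = arr;

    assert (ptr + 3 == &arr [3]);        // ptr + 3 points to arr[3]
    assert (*(ptr + 3) == 14);
    assert (*(arr + 3) == 14);           // arr + 3 does the same
    assert (byte_distance (ptr, ptr + 3) == (ptrdiff_t) (3 * sizeof (int)));
}
```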
Furthermore, we can also write the pointer ptr with the array subscript. The expressions, ptr[0], ptr[1], ptr[2], etc. refer to the array elements 0, 1, 2, etc, respectively of the array, arr. This pointer and array equivalence helps in passing arrays to functions. A calling function needs to pass the pointer to the element 0, and the called function can access the array using the pointer.
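As a sketch of passing an array to a function, consider a hypothetical sum_array. The called function receives only a pointer and cannot tell how long the array is, so this example assumes the caller also passes the element count.

```c
#include <assert.h>
#include <stddef.h>

// Sum n integers starting at a. The caller passes the array name,
// which is converted to a pointer to element 0.
long sum_array (const int *a, size_t n)
{
    long total = 0;
    size_t i;

    assert (a != NULL || n == 0);
    for (i = 0; i < n; i++)
        total += a [i];          // a[i] is the same as *(a + i)
    return total;
}
```

With the array arr defined earlier, sum_array (arr, 10) adds all ten elements, and sum_array (arr + 5, 5) adds only the last five, since any pointer into the array can serve as the starting point.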
5.0 Pointer operations
Pointers are different from basic types, so it is important to enumerate the operations that can be done on them. First, we can initialize a pointer to zero (NULL). C guarantees that zero is never a valid data address, so a pointer variable initialized to zero points nowhere. Second, and this is the heart of pointer arithmetic, we can add an integer to a pointer or subtract an integer from it. The change to the address in bytes is automatically scaled by the size of the base type, so that the pointer moves that many elements ahead of or behind its current position. For example, if we add 1 to a pointer to an integer, the address is incremented by the size of an integer and the pointer points to the next integer. Third, if p and q are pointers to elements of the same array, we can apply any of the relational operators, <, <=, >, >=, == and !=, between them. Fourth, if p and q are pointers to the same type, we can assign either of them to the other. And, finally, we can compare a pointer with zero (NULL) for equality or inequality.
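Several of these operations appear together in a routine that reverses an array in place. This is a minimal sketch and not part of the original text; reverse_ints is a name invented for the example.

```c
#include <assert.h>
#include <stddef.h>

// Reverse n integers in place using two pointers into the same array.
void reverse_ints (int *a, size_t n)
{
    int *p, *q;

    assert (a != NULL || n == 0);    // comparison with NULL
    if (n == 0)
        return;
    p = a;                           // pointer assignment
    q = a + n - 1;                   // pointer plus and minus an integer
    while (p < q) {                  // relational comparison, same array
        int tmp = *p;
        *p++ = *q;                   // store, then move p forward
        *q-- = tmp;                  // store, then move q backward
    }
}
```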
6.0 Character arrays and Strings
Character arrays appear very frequently in C programs. We can use the traditional notation for character arrays like,
char arr [] = {'H', 'e', 'l', 'l', 'o', ',',' ', 'W', 'o', 'r', 'l', 'd', '!', '\0'};
Fortunately, this is not necessary. C provides character strings. A string is an array of characters, with the last character of the array being the null character. When the program processes a string and encounters the null character, it knows that the end of the string has been reached. We can write the string as text in double quotes and the compiler automatically adds the null character at the end. So the above array can be written as a string as below.
char arr [] = "Hello, World!";
This is an example of a variable string and the characters of the string can be modified. We can also have a string constant, as below.
char *error_msg = "To err is human, to forgive divine.";
error_msg is a pointer to the string constant, “To err is human, to forgive divine.” The pointer is a variable; it can be changed to point to some other string. But the string is a constant, and should never be changed. If it is changed, the results are undefined. Unfortunately, if we try to change the string through error_msg, the compiler does not detect it, and the program may fail at run time. We can modify the definition of error_msg, applying the const qualifier, so that any attempt to modify the constant string results in a compile-time error.
const char *error_msg = "To err is human, to forgive divine.";
Now the constant string cannot be changed, but what about the pointer error_msg? We can still modify the pointer error_msg and if we do that, we cannot access the original constant string anymore. So we should make it clear to the compiler that both the pointer and the string are constants. We can do that with the definition,
const char *const error_msg = "To err is human, to forgive divine.";
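The effect of each const can be seen by noting which statements the compiler accepts. The sketch below uses made-up names (count_char, const_demo, msg, fixed); the two lines that a compiler would reject are left as comments.

```c
#include <assert.h>
#include <stddef.h>

// Count occurrences of c in s. The const qualifier tells the compiler
// that this function does not modify the string.
size_t count_char (const char *s, char c)
{
    size_t n = 0;

    assert (s != NULL);
    for (; *s != '\0'; s++)      // s itself may change; *s may not
        if (*s == c)
            n++;
    return n;
}

void const_demo (void)
{
    char greeting [] = "Hello, World!";   // modifiable array
    const char *msg = "Out of memory";    // pointer to constant text
    const char *const fixed = "read-only pointer, read-only text";

    greeting [0] = 'J';          // fine: the array can be modified
    msg = greeting;              // fine: msg itself is not const
    // msg [0] = 'h';            // error: *msg is const
    // fixed = greeting;         // error: fixed is a const pointer
    assert (count_char (msg, 'l') == 3);
    assert (count_char (fixed, '-') == 2);
}
```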
As an example, consider a function, string_copy, which copies the source string to the destination, much like the library function strcpy. The code for the string_copy function is given below.
void string_copy (char *dest, const char *src)
{
    while (*dest++ = *src++)
        ;
}
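The loop works because an assignment is itself an expression whose value is the character just copied, so the loop ends right after the terminating null character has been copied. The sketch below repeats the function with that comparison written out and adds a hypothetical caller; the buffer size of 32 is an arbitrary choice for the example. As with strcpy, the destination must be large enough for the text and the null character.

```c
#include <assert.h>
#include <string.h>

// Same loop as above, with the test against '\0' written explicitly.
void string_copy (char *dest, const char *src)
{
    assert (dest != NULL && src != NULL);
    while ((*dest++ = *src++) != '\0')
        ;
}

void string_copy_demo (void)
{
    char buf [32];               // room for the text and the '\0'

    string_copy (buf, "Hello, World!");
    assert (strcmp (buf, "Hello, World!") == 0);
    assert (strlen (buf) == 13);
}
```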
7.0 Array of Pointers
Just as we have arrays of base types like integer, double, etc., we can have an array of pointers.
Fig. 3: An array of 10 pointers to characters
The figure shows an array of ten pointers to characters. Pointers with arrows point to character strings. Pointers with crossed circles are null pointers; they do not point anywhere. Arrays of pointers are common in C programs. As an example, consider a program that prints the name of a month, given the month as an integer. A month with integer value 1 indicates January, 2 indicates February, and so on. The month names are stored as constant strings, and an array of pointers to char maps an integer to the corresponding month name. The program is given below.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

const char * const month_name [] = {"Illegal month value",
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December"};

// map 1..12 to the month name; any other value maps to entry 0
const char *month (int mm)
{
    return month_name [(mm < 1 || mm > 12) ? 0 : mm];
}

int main (void)
{
    int mon;

    // stop at end of input or at the first input that is not a number
    while (scanf ("%d", &mon) == 1)
        printf ("%s\n", month (mon));
    return 0;
}
8.0 Pointer to a pointer
We have seen that arrays and pointers are equivalent, so that we can process an array like
int arr [10];
with a pointer,
int *ip = arr;
Similarly, if we have an array of character pointers, like
char *month_name [12];
we can define an equivalent pointer, like
char **month_ptr = month_name;
month_ptr is a pointer to another pointer. It points to the zeroth location of array month_name, which is itself a pointer and points to the zeroth character of a string.
Fig. 4: Pointer to another pointer
Using the month_ptr, we can process the month_name array as shown in the example below.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main ()
{
    char *month_name [] = {"January", "February", "March", "April",
        "May", "June", "July", "August", "September", "October",
        "November", "December", NULL};
    char **month_ptr = month_name;

    // print a list of months in a year
    while (*month_ptr)
        printf ("%s\n", *month_ptr++);
}
The last pointer in the array month_name is a null pointer. This helps in processing month_name with a pointer, with the null pointer signalling the end of array.
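The same terminating null pointer lets a function walk such an array without being told its length. The sketch below assumes a hypothetical helper, count_strings, written for this example.

```c
#include <assert.h>
#include <stddef.h>

// Count the strings in an array of character pointers
// whose last entry is a null pointer.
size_t count_strings (char **list)
{
    size_t n = 0;

    assert (list != NULL);
    while (*list++ != NULL)      // look at the entry, then step to the next
        n++;
    return n;
}
```

Applied to the month_name array above, count_strings (month_name) would count the twelve month names and stop at the null pointer.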
9.0 Command-line Arguments
The main function of a C program gets two arguments, an integer argc and an array of strings argv. That is, the main function is defined as,
int main (int argc, char *argv [])
argc stands for argument count, the number of arguments in the argument vector, argv. By convention, the name with which the program was invoked is argv[0]. So argc is at least 1. The following program prints the command with which it is invoked and all the arguments passed.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main (int argc, char **argv)
{
    while (argc--)
        printf (argc ? "%s " : "%s\n", *argv++);
}
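Programs often scan argv for a particular option rather than print it. The following sketch assumes a hypothetical helper, find_flag, which main could call with its own argc and argv; the name and the return convention are choices made for this example.

```c
#include <assert.h>
#include <string.h>

// Return the index of the first argument equal to flag, or 0 if there
// is none. Index 0 holds the program name, so 0 can mean "not found".
int find_flag (int argc, char **argv, const char *flag)
{
    int i;

    assert (argv != NULL && flag != NULL);
    for (i = 1; i < argc; i++)
        if (strcmp (argv [i], flag) == 0)
            return i;
    return 0;
}
```

For a program invoked as prog -v file.txt, find_flag (argc, argv, "-v") would return 1.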
10.0 Pointers to functions
A pointer to a function is a variable that points to a function. It is defined as
<return type> (*ptr_to_function) (<type> arg1, <type> arg2, ...);
The return type and arguments must match the actual function return type and arguments. There are two pairs of parentheses, one around the pointer identifier preceded by *, and the other around the arguments. For example, consider the call for creating a thread,
#include <pthread.h>

int pthread_create (pthread_t *thread, const pthread_attr_t *attr,
                    void *(*start_routine) (void *), void *arg);
start_routine is a pointer to the function to be executed by the thread created by the pthread_create call and must match the function prototype,
void *start_routine (void *arg);
As another example, consider the sigaction system call for changing the current signal action of a process.
#include <signal.h>

struct sigaction {
    void (*sa_handler) (int);
    void (*sa_sigaction) (int, siginfo_t *, void *);
    sigset_t sa_mask;
    int sa_flags;
    void (*sa_restorer) (void);
};

int sigaction (int signum, const struct sigaction *act,
               struct sigaction *oldact);
Both sa_handler and sa_sigaction are pointers to functions to be installed as signal handlers. sa_handler must match the function prototype,
void signal_handler_fcn (int signum);
And, sa_sigaction must match the function prototype,
void signal_handler_fn (int signum, siginfo_t *siginfo, void *context);
A pointer to a function can be assigned the function name. For example,
...
void sig_handler (int signum);
...
int main (int argc, char **argv)
{
    ...
    struct sigaction act;

    memset (&act, 0, sizeof (act));

    // set signal handler for following signals
    act.sa_handler = sig_handler;

    if (sigaction (SIGINT, &act, NULL) == -1)
        syserror ("sigaction");
    ...
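The same syntax works for ordinary functions, outside of pthread_create and sigaction. The sketch below is illustrative only: add, multiply, apply, compare_ints and sort_ints are names made up for this example, while qsort is the standard library sorting function, which takes a pointer to a comparison function.

```c
#include <assert.h>
#include <stdlib.h>

int add (int a, int b) { return a + b; }
int multiply (int a, int b) { return a * b; }

// Call whichever function op points to.
int apply (int (*op) (int, int), int a, int b)
{
    assert (op != NULL);
    return op (a, b);            // (*op) (a, b) is equivalent
}

// Comparison function with the prototype that qsort expects.
int compare_ints (const void *x, const void *y)
{
    int a = *(const int *) x;
    int b = *(const int *) y;

    return (a > b) - (a < b);    // negative, zero or positive
}

// Sort n integers in ascending order.
void sort_ints (int *a, size_t n)
{
    qsort (a, n, sizeof a [0], compare_ints);
}
```

Here apply (add, 2, 3) calls add through the pointer, and passing compare_ints to qsort follows the same pattern as assigning sig_handler to sa_handler.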
11.0 Reference
Brian W. Kernighan and Dennis M. Ritchie, "The C Programming Language", Second Edition, Pearson, 1988.