The Point of Programming

Pointers are the best programming concept I’ve learned in my time at university so far.

These little things form the building blocks for a large range of more complex and essential data structures used in contemporary programming, and they do it so nonchalantly, one cannot help but fall slightly in love with how useful they are.

I remember struggling to wrap my head around how this stuff works when I was first introduced to it, and I hope this little piece of writing helps you if you’re in a similar situation.

I will be using C/C++ to demonstrate pointers, as this is the language in which most people are introduced to them.

What is a pointer?

A pointer is a variable whose value is the memory address of another variable.

int i;
int *j;
j = &i;

In the code block above, two variables are declared: i and j.
‘i’ is of type int, which stands for integer, and ‘j’ is of type int *, which stands for integer pointer.

The final line in the code block assigns the address of ‘i’ to ‘j’, which we indicate with ‘&i’.

Now might be a good time to cover the address-of (&) and dereference (*) operators in C/C++:

The address-of operator (sometimes loosely called the reference operator) returns the memory address of a variable.

int i = 4;
printf("Address:%p", (void *)&i);   // %p expects a void *, so cast the address

Output:
Address:0x7ffef0a4df0c

The dereference operator takes in a pointer and returns the value of the variable it points to.

int i = 4;
int *j = &i;
printf("Value: %i", *j);

Output:

Value: 4

If you’re wondering what happens when you try to dereference something that isn’t a pointer, you get this lovely message when trying to compile your program:

error: invalid type argument of unary ‘*’ (have ‘int’)

The (have ‘int’) means you tried to dereference an integer, which isn’t right.

It’s also important to note something here:

int* j;
int * j;
int *j;

All three ways to declare a pointer are valid, but I use the third method because:

int* j, k;              // 'j' is a pointer, 'k' is not.
int *j, *k;            // Both 'j' and 'k' are pointers.

In a declaration, the ‘*’ binds to the variable name, not to the data type, so it has to be repeated for every pointer you declare.

It’s also possible to have a pointer to a pointer which results in something like this:

int i = 4;
int *j = &i;
int **k = &j;               // 'k' points to a pointer, so it needs two stars
printf("ValueP: %i\n", *j);
printf("Value: %i\n", **k);

Output:

ValueP: 4
Value: 4

You could have a pointer to a pointer to a pointer, and so on.
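To make the "and so on" concrete, here is a minimal sketch with three levels of indirection. The names read_through_three and chain_demo are made up for illustration; the point is only that each ‘*’ peels off one level.

```c
/* Hypothetical helper: follows three levels of pointers to reach an int. */
int read_through_three(int ***ppp)
{
    int **pp = *ppp;  /* first dereference: a pointer to a pointer to int */
    int *p = *pp;     /* second dereference: a pointer to int */
    return *p;        /* third dereference: the int itself */
}

/* Hypothetical demo: builds the chain m -> k -> j -> i and reads i through m. */
int chain_demo(void)
{
    int i = 4;
    int *j = &i;
    int **k = &j;
    int ***m = &k;
    return read_through_three(m);
}
```

Writing ‘***m’ directly would do the same thing as the helper; the intermediate variables are only there to show what each step yields.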

With this handy technical diagram, we can follow how referencing and dereferencing work.

It’s also fairly common for developers to dereference a pointer when they need to access or change the variable it points to, and that’s where it gets slightly confusing:

int i = 4;
int *j = &i;
printf("Before: %i\n", i);
*j = 5;
printf("After: %i\n", i);

Output:

Before: 4
After: 5

The ‘*’ means two different things here. In the declaration ‘int *j = &i;’, it marks ‘j’ as a pointer, and the initializer sets ‘j’ itself (not ‘*j’) to ‘&i’, the address of ‘i’. Outside a declaration, ‘*j’ dereferences the pointer, so ‘*j = 5;’ sets ‘i’ equal to ‘5’.
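One way to see the two roles of ‘*’ is to split the declaration from the assignment. This is just a sketch; the function name star_demo is hypothetical.

```c
/* Hypothetical demo: returns the final value of i. */
int star_demo(void)
{
    int i = 4;
    int *j;    /* declaration: '*' marks j as a pointer, nothing is dereferenced */
    j = &i;    /* assigns to the pointer itself: j now holds the address of i */
    *j = 5;    /* dereference: writes 5 into the variable j points to, which is i */
    return i;
}
```

Note that the middle line is ‘j = &i;’ and not ‘*j = &i;’. The second form would try to store an address inside ‘i’, which is not what the one-line declaration does.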

Why are pointers so useful?

Pointers allow direct manipulation of memory addresses. This allows developers to do neat things like:

Pass by reference: By default, arguments are passed to a function by value, so the function works on a copy of whatever you hand it. That is a problem when you want the function to change the caller’s variable, or when the data is large and copying it is wasteful. Passing a pointer gives the function the address of the original, so it can read and modify it without creating a copy.
Dynamic data structures: Pointers allow us to create structures like linked lists and trees that can change size and shape during execution, which affords more flexibility and lets a program use only as much memory as it needs. More on this in a bit.
Low-level programming: Developing operating system components or device drivers requires the developer to control memory addresses directly, and pointers allow for that. This level of control is also what makes fine-grained optimization possible.
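The pass-by-reference point is easiest to see with the classic swap. The sketch below uses two made-up functions, swap_by_value and swap_by_pointer, to contrast working on copies with working through addresses.

```c
/* Receives copies of the caller's values; swapping them changes nothing outside. */
void swap_by_value(int a, int b)
{
    int tmp = a;
    a = b;
    b = tmp;
}

/* Receives addresses; writes through them into the caller's variables. */
void swap_by_pointer(int *a, int *b)
{
    int tmp = *a;
    *a = *b;
    *b = tmp;
}
```

With ‘int x = 1, y = 2;’, calling swap_by_value(x, y) leaves both variables untouched, while swap_by_pointer(&x, &y) leaves ‘x’ as 2 and ‘y’ as 1.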

Let’s talk a bit about those dynamic data structures:

Linked lists: A linked list is a linear data structure in which each element (called a “node”) contains a value and a pointer to the next element in the list. They’re kinda like the snake in Snake (game): just like the snake, a linked list is made of smaller blocks connected sequentially, and it can grow at one end or both. Each node is linked to at most two other nodes (the one before it and the one after it). Linked lists are dynamic, and nodes can be inserted or deleted easily, but reaching a particular element is slower than in an array because you have to follow the pointers one by one from the start.
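Here is a minimal sketch of a singly linked list along those lines. The struct and function names (node, list_push_front, list_get, list_free) are my own choices for illustration, not a standard API. Growing the list at the head takes a pointer to the head pointer, which is the pointer-to-pointer idea from earlier put to work.

```c
#include <stdlib.h>

/* One node: a value plus a pointer to the next node. */
struct node {
    int value;
    struct node *next;  /* NULL marks the end of the list */
};

/* Adds a node at the front. Returns 0 on success, -1 if malloc fails
   (in which case the list is left unchanged). */
int list_push_front(struct node **head, int value)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->value = value;
    n->next = *head;  /* new node points at the old first node */
    *head = n;        /* caller's head now points at the new node */
    return 0;
}

/* Follows next pointers from the head to position index (0-based).
   Returns fallback if the list is shorter than that. */
int list_get(const struct node *head, size_t index, int fallback)
{
    while (head != NULL && index > 0) {
        head = head->next;
        index--;
    }
    return head != NULL ? head->value : fallback;
}

/* Frees every node in the list. */
void list_free(struct node *head)
{
    while (head != NULL) {
        struct node *next = head->next;  /* save before freeing */
        free(head);
        head = next;
    }
}
```

The loop in list_get is the "following multiple pointers" cost in action: reaching element n takes n hops, whereas an array gets there in one step.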

Trees: A tree is a hierarchical data structure in which each node has zero or more child nodes (a node with no children is called a leaf). As the name implies, the data structure resembles an (upside-down) tree when represented pictorially. Trees are often implemented using pointers, with each node containing a value and pointers to its children, which resemble branches. Trees can be used to store and retrieve data efficiently, and are often used in search algorithms and database indexing.
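As a sketch of the "value plus pointers to children" idea, here is one particular kind of tree, a binary search tree, where each node has at most two children and smaller values go left. That choice, and the names tree_node, tree_insert, tree_contains and tree_free, are assumptions made for the example.

```c
#include <stdlib.h>

/* A node with a value and pointers to up to two children. */
struct tree_node {
    int value;
    struct tree_node *left;   /* subtree of smaller values, or NULL */
    struct tree_node *right;  /* subtree of larger values, or NULL */
};

/* Inserts value and returns the root of the resulting tree.
   Duplicates are ignored; if malloc fails the tree is left unchanged. */
struct tree_node *tree_insert(struct tree_node *root, int value)
{
    if (root == NULL) {
        struct tree_node *n = malloc(sizeof *n);
        if (n != NULL) {
            n->value = value;
            n->left = NULL;
            n->right = NULL;
        }
        return n;
    }
    if (value < root->value)
        root->left = tree_insert(root->left, value);
    else if (value > root->value)
        root->right = tree_insert(root->right, value);
    return root;
}

/* Returns 1 if value is in the tree, 0 otherwise. */
int tree_contains(const struct tree_node *root, int value)
{
    while (root != NULL) {
        if (value == root->value)
            return 1;
        root = value < root->value ? root->left : root->right;
    }
    return 0;
}

/* Frees the children first, then the node itself. */
void tree_free(struct tree_node *root)
{
    if (root == NULL)
        return;
    tree_free(root->left);
    tree_free(root->right);
    free(root);
}
```

Each comparison in tree_contains discards one whole subtree, which is where the efficient lookup comes from as long as the tree stays reasonably balanced.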

Graphs: A graph is a data structure that consists of a set of vertices (nodes) and edges connecting the vertices. While linked lists can only be linked sequentially, graphs have no such restriction. A node in the graph can be connected to any other node in the graph, and there’s no limit on how many nodes it can be connected to. Graphs can be implemented using pointers, with each vertex containing a value and pointers to its neighboring vertices. Graphs are often used to represent networks or relationships between data. You can think of the network of friends on Facebook as a good example of an undirected graph. You are your friend’s friend, and your friend is yours, so an edge in this graph represents both relationships at the same time.
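To round things off, here is a rough sketch of an undirected graph where each vertex holds a growable array of pointers to its neighbors. Everything here (vertex, add_neighbor, graph_connect, graph_are_friends) is a hypothetical name picked for the friendship example, and a real implementation might well use an adjacency matrix or a library instead.

```c
#include <stdlib.h>

/* A vertex with a value and a dynamic array of pointers to its neighbors. */
struct vertex {
    int value;
    struct vertex **neighbors;  /* NULL while the vertex has no edges */
    size_t count;               /* number of entries in neighbors */
};

/* Appends `to` to `from`'s neighbor array. Returns 0 on success, -1 on failure. */
static int add_neighbor(struct vertex *from, struct vertex *to)
{
    struct vertex **grown =
        realloc(from->neighbors, (from->count + 1) * sizeof *grown);
    if (grown == NULL)
        return -1;  /* the old array is still valid and unchanged */
    grown[from->count] = to;
    from->neighbors = grown;
    from->count++;
    return 0;
}

/* Undirected edge: each vertex gets a pointer to the other. */
int graph_connect(struct vertex *a, struct vertex *b)
{
    if (add_neighbor(a, b) != 0)
        return -1;
    if (add_neighbor(b, a) != 0) {
        a->count--;  /* undo the first half so the edge stays symmetric */
        return -1;
    }
    return 0;
}

/* Returns 1 if b appears among a's neighbors, 0 otherwise. */
int graph_are_friends(const struct vertex *a, const struct vertex *b)
{
    for (size_t k = 0; k < a->count; k++)
        if (a->neighbors[k] == b)
            return 1;
    return 0;
}
```

Because graph_connect stores a pointer in both directions, asking whether you are your friend’s friend gives the same answer as asking whether your friend is yours. The caller owns each vertex and is expected to free its neighbors array when done.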