Often in a large program, you will separate out code into
multiple files to keep related functions together. Each of
these files can be compiled into object code, but your final
goal is to create a single executable! There needs to be some
way of combining these object files into a single
executable. We call this linking.
Note that even if your program does fit in one file it still
needs to be linked against certain system libraries to operate
correctly. For example, the printf call is kept in a library
which must be combined with your executable to work. So although
you do not explicitly have to worry about linking in this case,
there is most certainly still a linking process happening to
create your executable.
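As a minimal sketch of this, the function below uses printf without
defining it anywhere. The file name hello.c and the function name greet
are assumptions made for illustration. With a typical Unix compiler
driver, cc -c hello.c would produce object code in which the printf call
is still unresolved, and it is the later link step that combines it
with the C library.

```c
#include <stdio.h>

/* <stdio.h> only declares printf; the code for it lives in the C
   library, which the linker combines with our object code. */
int greet(void)
{
    /* printf returns the number of characters it wrote. */
    return printf("Hello, world!\n");
}
```

A main function that calls greet() would complete the program; nothing
in the source mentions linking, yet the executable cannot be produced
without it.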
In the following sections we explain some terms essential to
understanding linking.
Variables and functions all have names in source code
by which we refer to them. One way of thinking of a statement
declaring a variable, such as int a, is that you are telling the
compiler "set aside some memory of sizeof(int), and from now on
when I use the name a it will refer to this allocated memory".
Similarly, a function says "store this code in memory, and when
I call function() jump to and execute this code". In this case,
we call the names a and function symbols, since they are a
symbolic representation of an area of memory.
Symbols help humans to understand programming. You could
say that the primary job of the compilation process is to remove
symbols -- the processor doesn't know what the name a represents;
all it knows is that it has some data at a particular memory
address. The compilation process needs to convert a += 2 to
something like "increment the value in memory at 0xABCDE by 2".
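The idea that a symbol is only a name for a memory location can be
sketched in C itself. This is an illustration rather than what a
compiler literally emits; the helper names increment_by_two and
show_symbols are made up for the example, and the address printed will
differ from run to run and machine to machine.

```c
#include <stdio.h>

int a = 40;             /* the symbol "a" names sizeof(int) bytes of memory */

int function(void)      /* the symbol "function" names the start of this code */
{
    return a;
}

/* Add 2 to whatever int lives at the given address. This is roughly
   how the processor sees "a += 2": an address and a value, no name. */
void increment_by_two(int *address)
{
    *address += 2;
}

void show_symbols(void)
{
    printf("a lives at address %p\n", (void *)&a);
    increment_by_two(&a);   /* same effect as a += 2 */
    printf("function() now returns %d\n", function());
}
```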
In some C programs, you may have seen the terms static and
extern used with variables. These modifiers can affect what we
call the visibility of symbols.
Imagine you have split up your program into two files, but
some functions need to share a variable. You only want one
definition (i.e. memory location) of the
shared variable (otherwise it wouldn't be shared!), but both
files need to reference it.
To enable this, we define the variable in one file, and
then in the other file declare a variable of the same name but
with the prefix extern.
extern stands for external, and to a human means that this
variable is defined somewhere else.
What extern says to a compiler is that it should not allocate
any space in memory for this variable, and should leave the
symbol in the object code, where it will be fixed up later. The
compiler cannot possibly know where the symbol is actually
defined, but the linker does, since it is its job to look at
all object files together and combine them into a single
executable. So the linker will see the symbol left over in the
second file, and say "I've seen that symbol before in file 1,
and I know that it refers to memory location 0x12345". Thus it
can modify the symbol value to be the memory address of the
variable in the first file.
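A minimal sketch of this arrangement follows, with two hypothetical
files shown in one listing for brevity. The names counter.c, user.c,
shared_counter and bump are illustrative assumptions. The listing also
compiles as a single file, since C allows an extern declaration to
appear alongside the definition it refers to.

```c
/* ---- counter.c (hypothetical file 1) ----
   The one definition: memory for the variable is allocated here. */
int shared_counter = 0;

/* ---- user.c (hypothetical file 2) ----
   A declaration only: no memory is allocated, and the compiler
   leaves the symbol for the linker to fix up. */
extern int shared_counter;

/* Code in the second file uses the variable as if it were local. */
void bump(void)
{
    shared_counter++;
}
```

Once the linker has resolved the symbol, bump() and any code in the
first file operate on the same memory location.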
static is almost the opposite of extern. It places
restrictions on the visibility of the symbol it modifies. If you
declare a variable with static, that says to the compiler "don't
leave any symbols for this in the object code for other files to
use". This means that when the linker is linking together object
files it will never see that symbol (and so can't make that
"I've seen this before!" connection).
static is good for separation
and reducing conflicts -- by declaring a variable
static you can reuse the
variable name in other files and not end up with symbol clashes.
We say we are restricting the visibility of
the symbol, because we are not allowing the linker to see it.
Contrast this with a more visible symbol (one not declared with
static) which can be seen by the linker.
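The contrast between the two kinds of symbol can be sketched in a
single file. The names count, next_value and next_id are assumptions
chosen for the example. The text above discusses static variables;
static applied to a function restricts its visibility in the same way.

```c
/* Visible only inside this file: other files may define their own
   "count" without any symbol clash. */
static int count = 0;

/* Also file-local: no other file can call this helper by name. */
static int next_value(void)
{
    return ++count;
}

/* Not static, so the linker can match calls from other files to it. */
int next_id(void)
{
    return next_value();
}
```

Code in other files can reach the counter only through next_id(),
which keeps the file's internal names out of everyone else's way.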
Thus the linking process is really two steps: combining
all object files into one executable file, and then going through
each object file to resolve any symbols.
This usually requires two passes: one to read all the symbol
definitions and take note of unresolved symbols, and a second to
fix up all those unresolved symbols to the right place.
The final executable should end up with no unresolved
symbols; the linker will fail with an error if there are any.
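The two passes can be sketched with a toy model. This is an
illustration under heavy assumptions, not how any real linker is
implemented: the struct layouts, the MAX_DEFS limit and the function
name toy_link are all invented for the example, and real object files
carry far more information than a name and an address.

```c
#include <stddef.h>
#include <string.h>

#define MAX_DEFS 64     /* arbitrary limit for this toy model */

struct symbol {
    const char *name;
    int defined;            /* 1 if this object file defines the symbol */
    unsigned long address;  /* valid once defined or fixed up */
};

struct object_file {
    struct symbol *symbols;
    size_t count;
};

/* Returns the number of symbols left unresolved; a real linker
   would fail with an error if this were not zero. */
size_t toy_link(struct object_file *objs, size_t nobjs)
{
    const struct symbol *defs[MAX_DEFS];
    size_t ndefs = 0;
    size_t unresolved = 0;

    /* Pass 1: read every file and note all symbol definitions. */
    for (size_t i = 0; i < nobjs; i++)
        for (size_t j = 0; j < objs[i].count; j++)
            if (objs[i].symbols[j].defined && ndefs < MAX_DEFS)
                defs[ndefs++] = &objs[i].symbols[j];

    /* Pass 2: fix up each undefined symbol from the noted definitions. */
    for (size_t i = 0; i < nobjs; i++) {
        for (size_t j = 0; j < objs[i].count; j++) {
            struct symbol *sym = &objs[i].symbols[j];
            size_t k;

            if (sym->defined)
                continue;
            for (k = 0; k < ndefs; k++) {
                if (strcmp(defs[k]->name, sym->name) == 0) {
                    sym->address = defs[k]->address;
                    break;
                }
            }
            if (k == ndefs)
                unresolved++;   /* no file defines this symbol */
        }
    }
    return unresolved;
}
```

Given one file that defines a symbol at 0x12345 and a second file that
only references it, toy_link copies the address into the second file's
entry; a reference that no file defines is counted as unresolved.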