2.2: Assembly Language Caveats

Programmers who have learned a higher-level language, such as Java, C/C++, C#, or Ada, have often developed ways of thinking about a program that are inappropriate for low-level languages and systems such as assembly language. This section offers some suggestions to programmers approaching assembly language for the first time.

The first thing to consider is that every instruction implements a single primitive operation. Higher-level languages allow a shorthand in which one statement implies many instructions. For example, the statement B=A+5 implies load operations that ready the variable A and the constant 5 to be sent to the ALU. Next, the ALU must perform an add operation. Finally, an operation to store the result from the ALU back to the variable B must be executed. In assembly the programmer must specify every one of these primitive operations. There are no shortcuts.
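The three steps hidden inside B=A+5 can be sketched in Python as a rough illustration. This is not the instruction set of any particular machine: the dictionary `memory`, the variable `acc` (standing in for the register that feeds the ALU), and the function name are all assumptions made for this example.

```python
# Hypothetical model: memory is a dict of named cells, and `acc` plays the
# role of the single register that feeds the ALU.
memory = {"A": 7, "B": 0}

def run_b_equals_a_plus_5(mem):
    acc = mem["A"]     # load:  bring the value of A into the accumulator
    acc = acc + 5      # add:   the ALU adds the constant 5
    mem["B"] = acc     # store: write the ALU result back to B
    return mem

run_b_equals_a_plus_5(memory)
print(memory["B"])     # prints 12
```

Each line of the function body corresponds to one primitive operation that an assembly programmer would have to write out explicitly.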

The second thing to consider is that, despite what you might have heard about goto statements being bad, there is no way to implement program control such as if statements or loops without using a branch instruction, which is the equivalent of a goto statement. This does not mean that structured programming constructs cannot be used effectively. If a programmer is confused about how to implement structured programming constructs in assembly, there is a chapter in a free book on MIPS assembly programming, written by the author of this monograph, that explains how this can be accomplished.
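As a hedged sketch of how a structured loop reduces to branches, the Python function below sums the integers 1 through n using only a program counter and jumps between labels. The label names (`loop_test`, `loop_body`, `done`) and the function name are hypothetical; the outer `while` merely stands in for the hardware's fetch/execute cycle.

```python
def sum_to_n_with_branches(n):
    """Sum 1..n using only a program counter and branch-style jumps."""
    total, i = 0, 1
    pc = "loop_test"                  # hypothetical label: start at the test
    while pc != "done":               # stands in for the fetch/execute cycle
        if pc == "loop_test":
            # conditional branch: exit the loop once i exceeds n
            pc = "done" if i > n else "loop_body"
        elif pc == "loop_body":
            total += i
            i += 1
            pc = "loop_test"          # unconditional branch back (a goto)
    return total

print(sum_to_n_with_branches(4))      # prints 10
```

The control flow is still a well-structured while loop; it is simply expressed with one conditional branch at the top and one unconditional branch at the bottom.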

The third important point about assembly language is that data has no context. In a higher-level language, the variables A and B, and the number 5, are normally declared as integers. The compiler knows that these are integers and uses that knowledge as the context for interpreting them. The add operation is known to be an integer add, so the compiler generates an instruction for the integer operation and not for a floating point operation. If the declarations were changed to float, the compiler would generate a floating point add instead. The higher-level language knows the types of the variables, and can provide the proper context for interpreting them.

In assembly language, there is no context for any data. Data can be an integer, a Boolean value, a floating point number, ASCII characters, or even program instructions. The assembler has no notion of type; it simply encodes the operation specified, and the CPU executes it. It is possible in assembly to perform meaningless operations, such as adding two program instructions together. Assembly language will gladly let you do meaningless and completely inane things, and will in no way warn you that they are meaningless. The assembler has no context for data, and there is no way to correct this problem because, from the assembler's point of view, there is no problem.
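The lack of context can be illustrated with a short Python sketch that uses the standard `struct` module to view one 32-bit word two ways. The helper names `float_bits` and `bits_as_float` are assumptions for this example, and the single-precision IEEE 754 encoding is chosen only for illustration.

```python
import struct

def float_bits(x):
    """Return the 32-bit pattern that stores x as an IEEE 754 single."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_as_float(bits):
    """Reinterpret a 32-bit pattern as an IEEE 754 single."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

one = float_bits(1.0)
print(hex(one))                          # prints 0x3f800000

# An integer add applied to two words that "really" hold the float 1.0.
# Nothing stops the operation; the result is just not 2.0.
int_sum = (one + one) & 0xFFFFFFFF       # keep the result to 32 bits
print(bits_as_float(int_sum) == 2.0)     # prints False
print(bits_as_float(int_sum) == 2.0**127)  # prints True
```

The integer add succeeds without complaint, but the resulting bit pattern read back as a float is 2 to the power 127 rather than 2.0, because the wrong kind of add was applied to the data.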

When programming in assembly language, it is important that the programmer keep track of the current program context. It is the programmer who knows whether two data elements are integers, and thus whether an integer add operation is appropriate. It is up to the programmer to know whether the values being worked with are addresses or data values, and to perform the proper dereferencing operations. Nothing but the knowledge of the programmer ensures that a program executes the correct operations on the proper data types.
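The difference between an address and the value stored at that address can be sketched as follows. The list `cells` is a hypothetical stand-in for memory, with list indices playing the role of addresses; the particular addresses and values are made up for the example.

```python
# Hypothetical memory: list indices act as addresses.
cells = [0] * 8
cells[3] = 42            # the data value lives at address 3
cells[0] = 3             # cell 0 holds an address (it points at cell 3)

ptr = cells[0]           # load the contents of cell 0: the address 3

wrong = ptr + 1          # adds 1 to the address itself
right = cells[ptr] + 1   # dereferences first, then adds 1 to the value

print(wrong, right)      # prints 4 43
```

Both additions are equally legal as far as the machine is concerned; only the programmer knows that cell 0 holds an address and must be dereferenced before the arithmetic makes sense.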