
9.2: Array Definition and Creation in Assembly

    Most readers of this text will be familiar with the concept of arrays and with using them in a high-level language (HLL). This chapter therefore does not cover their use, but rather how arrays are implemented and how their elements are accessed in assembly. Most HLLs go to great pains to hide these details from the programmer, with good reason. When programmers deal with these details directly, they often make mistakes that seriously affect the correctness of their programs and produce bugs that can be very difficult to locate and fix.

    But even though the details of arrays are hidden in most HLLs, those details affect how HLLs implement array abstractions, and a proper understanding of arrays can help prevent programmers from developing inappropriate metaphors that lead to program issues. Misusing object slicing in C++, or allocating and attempting to use arrays of null objects in Java, are issues that can arise if a programmer does not understand the true nature of an array.

    The following definition of an array will be used in this chapter: an array is a multivalued variable stored in a contiguous area of memory that contains elements that are all the same size. Some programmers will find that this definition does not fit the definition of arrays in the HLL they use. This is a result of the HLL adding layers of abstraction, such as Perl associative arrays (which are really hash tables), Java object arrays, or ArrayList. These HLL arrays always hide some detail behind an abstraction, and knowing what an array actually is can help with understanding how the HLL is manipulating the array.

    The definition of an array becomes apparent when the mechanics of accessing elements in an array are explained. The minimum data needed to define an array consists of a variable which contains the address of the start of the array, the size of each element, and the space to store the elements. For example, an array based at address 0x10010044 and containing five 32-bit integers is shown in Figure 9-2.

    Figure 9-2: Array implementation


    To access any element in the array, the element address is calculated by the following formula, and the element value is loaded from that address.

    elemAddress = basePtr + index * size
    

    where

    • elemAddress is the address of (or pointer to) the element to be used.
    • basePtr is the address of the array variable
    • index is the index for the element (using 0 based arrays)
    • size is the size of each element

    So to load the element at index 0, the elemAddress is just (0x10010044 + (0 * 4)) = 0x10010044, or the basePtr for the array (see footnote 25). Likewise, to load the element at index 2, the elemAddress is (0x10010044 + (2 * 4)) = 0x1001004C.
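
    As a minimal sketch of the formula above, the following C fragment computes an element address two ways: on plain integers, using the example base address 0x10010044, and on a real C array using byte arithmetic instead of the [] operator. The function names elem_address and elem_pointer are illustrative only and are not part of this text's programs.

```c
#include <stddef.h>
#include <stdint.h>

/* elemAddress = basePtr + index * size, computed on plain integers. */
uint32_t elem_address(uint32_t base_ptr, uint32_t index, uint32_t size)
{
    return base_ptr + index * size;
}

/* The same formula applied to a real C array of 32-bit integers:
   step (index * element size) bytes past the base address. */
int32_t *elem_pointer(int32_t *base, size_t index)
{
    unsigned char *bytes = (unsigned char *)base;
    return (int32_t *)(void *)(bytes + index * sizeof(int32_t));
}
```

    Under these assumptions, elem_address(0x10010044, 2, 4) gives 0x1001004C, matching the worked calculation, and elem_pointer(a, 2) is the same pointer as &a[2] for an int32_t array a.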

    Two array examples follow. The first creates an array named grades, which stores 10 elements, each 4 bytes in size and aligned on word boundaries. The second creates an array named id of 10 bytes. Note that no alignment is specified for id, so its bytes can cross word boundaries.

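    .data
    .align 2             # align the next item on a word (2^2 = 4 byte) boundary
    grades: .space 40    # 10 elements * 4 bytes each
    id: .space 10        # 10 elements * 1 byte each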
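
    The effect of the .align 2 directive can be modeled as rounding an address up to the next multiple of 4. The C sketch below is only an illustration of that rounding: the function name align_up and the starting address 0x10010041 used in the note after it are assumptions, not values taken from this text.

```c
#include <stdint.h>

/* Round addr up to the next multiple of 2^n, which is what `.align n`
   does for the next data item. An aligned address is left unchanged. */
uint32_t align_up(uint32_t addr, unsigned n)
{
    uint32_t mask = ((uint32_t)1 << n) - 1u;   /* n = 2 gives mask 0x3 */
    return (addr + mask) & ~mask;
}
```

    If the data segment had reached the hypothetical address 0x10010041, align_up(0x10010041, 2) would place grades at 0x10010044. The id array simply follows the 40 bytes of grades, with no rounding applied.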
    

    To access an element of the array grades, grade 0 is at basePtr, grade 1 is at basePtr + 4, grade 2 is at basePtr + 8, and so on. The following code fragment shows how grade 2 could be accessed in MIPS assembly code:

    addi $t0, $zero, 2  # set element number (index) 2
    sll $t0, $t0, 2     # multiply $t0 by 4 (size) to get the offset
    la $t1, grades      # $t1 is the base of the array (basePtr)
    add $t0, $t0, $t1   # basePtr + (index * size)
    lw $t2, 0($t0)      # load element 2 into $t2
    

    Addressing of arrays is not complicated, but it does require that the programmer keep in mind what is an address versus a value, and know how to calculate an array offset.
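
    The distinction between an address and a value can be made explicit in a short C model of the load sequence. This is a sketch only: load_grade is a hypothetical helper, and each statement is annotated with the MIPS instruction it is meant to mirror.

```c
#include <stdint.h>

/* Model of the sll / add / lw steps for an array of 32-bit words. */
int32_t load_grade(const int32_t *grades, uint32_t index)
{
    uint32_t offset = index << 2;                        /* sll: index * 4  */
    const unsigned char *addr =
        (const unsigned char *)grades + offset;          /* add: an ADDRESS */
    return *(const int32_t *)(const void *)addr;         /* lw:  the VALUE  */
}
```

    Here addr plays the role of $t0 after the add instruction. It holds a location in memory, not a grade, until it is dereferenced in the final step.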

    9.2.1 Allocating arrays in memory

    In some languages, such as Java, arrays can only be allocated on the heap. Others, such as C/C++ or C#, allow arrays of some types to be allocated anywhere in memory. In MIPS assembly, arrays can be allocated in any part of memory. However, remember that arrays allocated in the static data region or on the stack must be fixed size, with the size fixed at assembly time. Only heap-allocated arrays can have their size set at run time.
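
    For readers who know C, the three regions can be sketched as follows. This is an illustrative fragment only. All names are hypothetical, and the stack array is deliberately given a compile-time size to match the fixed-size convention described above.

```c
#include <stddef.h>
#include <stdlib.h>

/* Static data: the size (10 elements) is fixed when the program is built. */
static int static_grades[10];

size_t static_array_bytes(void)
{
    return sizeof static_grades;
}

/* Stack: the array lives in this function's frame and is gone on return. */
int sum_stack_array(void)
{
    int local[10];
    int sum = 0;
    for (int i = 0; i < 10; i++) {
        local[i] = i;
        sum += local[i];
    }
    return sum;
}

/* Heap: the element count is chosen at run time.
   Returns NULL on failure; the caller must free the result. */
int *make_heap_array(size_t count)
{
    return malloc(count * sizeof(int));
}
```

    Only make_heap_array takes its size as a run-time argument. The other two arrays have their sizes written into the program text.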

    To allocate an array in static data, a label is defined to give the base address of the array, and enough space for the array elements is allocated. Note also that the allocation must take into account any alignment considerations (e.g. words must fall on word boundaries). The following code fragment allocates an array of 10 integer words in the data segment.

    .data
        .align 2
        array: .space 40
    

    To allocate an array on the stack, $sp is adjusted to make room on the stack for the array. For the stack there is no equivalent to the .align 2 assembler directive, so the programmer is responsible for making sure any stack memory is properly aligned. The following code fragment allocates an array of 10 integer words on the stack, placed after the saved $ra register.

    addi $sp, $sp, -44   # 4 bytes for $ra + 40 bytes for the array
    sw $ra, 0($sp)       # save $ra at the top of the stack
    # array begins at 4($sp)
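
    The layout of this stack frame can be checked with a little arithmetic, sketched here in C. The constant names and the function stack_elem_address are hypothetical, as is the $sp value used in the note after the code.

```c
#include <stdint.h>

enum {
    RA_BYTES    = 4,                                  /* saved $ra at 0($sp) */
    ELEM_BYTES  = 4,                                  /* one integer word    */
    ELEM_COUNT  = 10,
    FRAME_BYTES = RA_BYTES + ELEM_BYTES * ELEM_COUNT  /* 44, as in the addi  */
};

/* Address of element `index` of the stack array, given the value of $sp
   after the frame has been allocated. The array begins at 4($sp). */
uint32_t stack_elem_address(uint32_t sp, uint32_t index)
{
    return sp + RA_BYTES + index * ELEM_BYTES;
}
```

    With an assumed $sp of 0x7fffefd0, element 0 is at 0x7fffefd4 and element 9 is at 0x7fffeff8, so the last word of the array ends exactly at the top of the 44-byte frame. As long as $sp itself is word aligned, every element is too.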
    


    Finally, to allocate an array on the heap, the number of items to allocate is multiplied by the size of each element to obtain the amount of memory to allocate. A subprogram to do this, called AllocateArray, is shown below.

    Program 9-2: AllocateArray subprogram
    
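    # Subprogram:    AllocateArray
    # Purpose:       To allocate an array of $a0 items, each of size $a1.
    # Author:        Charles Kann
    # Input:         $a0 - the number of items in the array
    #                $a1 - the size of each item
    # Output:        $v0 - Address of the array allocated

    AllocateArray:
        addi $sp, $sp, -4    # make room on the stack for $ra
        sw $ra, 0($sp)       # save the return address

        mul $a0, $a0, $a1    # $a0 = number of items * item size = bytes needed
        li $v0, 9            # syscall service 9 (sbrk): allocate heap memory
        syscall              # on return, $v0 = address of the allocated block

        lw $ra, 0($sp)       # restore the return address
        addi $sp, $sp, 4     # release the stack space
        jr $ra               # return to the caller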
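
    For comparison, a rough C counterpart of AllocateArray is sketched below. It is not part of this text's code. The name allocate_array is hypothetical, malloc stands in for the heap allocation system call, and the overflow check is an addition that the assembly version does not perform.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Allocate room for `count` items of `size` bytes each.
   Returns the base address of the array, or NULL on failure. */
void *allocate_array(size_t count, size_t size)
{
    /* Guard against count * size overflowing (not done in the MIPS code). */
    if (size != 0 && count > SIZE_MAX / size) {
        return NULL;
    }
    return malloc(count * size);   /* bytes needed = count * size */
}
```

    A call such as allocate_array(10, sizeof(int32_t)) requests 40 bytes, the same amount the static and stack examples reserve for 10 integer words. The caller is responsible for freeing the block.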
    

    25 This calculation of the array address will make it apparent to many readers why arrays in many languages are zero based (the first element is 0), rather than the more intuitive concept of arrays being 1 based (the first element is 1). When thought of in terms of array addressing, the first element in the array is at the base address for the array (basePtr + 0), and so the numbering of the elements has more to do with how arrays are implemented than with semantic considerations of what the element numbers mean.
