10.4: Processing a Character Array On the Stack


The next example processes an array of characters that is allocated on the stack. In C and many lower-level languages, a string is often represented as an array of characters, each character being one byte, that is terminated with a null value (a byte containing 0x00). This is often called a null-terminated string. The string “Hello” would be stored in memory as follows, shown here as 32-bit little-endian words with the lowest-addressed byte of each word on the right:

Byte offset in word	+3	+2	+1	+0
Word 0	l	l	e	H
Word 1	(unused)	(unused)	0x00	o
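The layout above can be checked with a short C sketch. The helper names `pack_little_endian` and `dump_string_bytes` are hypothetical and exist only for this illustration. The first packs four consecutive bytes the way a little-endian word load would see them, and the second walks the string one byte at a time, including the terminating 0x00.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: combine four consecutive bytes as a little-endian
   word load would, so the byte at the lowest address is least significant. */
uint32_t pack_little_endian(const unsigned char *bytes) {
    assert(bytes != NULL);
    return (uint32_t)bytes[0]
         | ((uint32_t)bytes[1] << 8)
         | ((uint32_t)bytes[2] << 16)
         | ((uint32_t)bytes[3] << 24);
}

/* Hypothetical helper: print every byte of a null-terminated string,
   including the terminator, and return how many bytes were printed. */
size_t dump_string_bytes(const char *s) {
    size_t i = 0;
    assert(s != NULL);
    for (;;) {
        unsigned char b = (unsigned char)s[i];
        printf("offset %zu: 0x%02X\n", i, (unsigned)b);
        i++;
        if (b == 0) {
            break;  /* the 0x00 byte marks the end of the string */
        }
    }
    return i;
}
```

Calling `dump_string_bytes("Hello")` prints six lines, one per byte, because the terminator occupies a byte of its own. Packing the first four bytes gives 0x6C6C6548, which reads "l l e H" from the most significant byte down, matching the first word of the layout.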

The main program allocates 40 bytes on the stack for the array, and calls the printStringByIndex function to print each character in the string.

.global printStringByIndex 
.global main 

.text 
printStringByIndex: 
    #push stack 
    SUB sp, sp, #12 
    STR lr, [sp, #0] 
    STR r4, [sp, #4] 
    STR r5, [sp, #8] 
    
    # Save Base Array to preserved register 
    MOV r4, r0 
    
    # initialize loop for printing the string 
    # r4 - array base 
    # r5 - loop index 
    
    MOV r5, #0 
    startPrintLoop: 
        MOV r0, #0 
        LDRB r1, [r4, r5] 
        CMP r0, r1 
        BEQ endPrintLoop 
        
        LDR r0, =output 
        MOV r1, r5
        ADD r2, r4, r5 // Calculate the array address 
        LDRB r2, [r2, #0] 
        BL printf 
        
        ADD r5,r5, #1 
        B startPrintLoop 
    endPrintLoop: 
    
    #pop stack 
    LDR lr, [sp, #0] 
    LDR r4, [sp, #4] 
    LDR r5, [sp, #8] 
    ADD sp, sp, #12 
    MOV pc, lr

.data 
    output: .asciz "The value for element [%d] is %c\n"
    
#end printStringByIndex 

# Main procedure to test printStringByIndex 
.text
main: 
    #push stack 
    SUB sp, sp, #44 
    STR lr, [sp, #0] 
    # string is at sp+4 ... sp+43 
    
    # load string 
    LDR r0, =prompt 
    BL printf 
    LDR r0, =format 
    ADD r1, sp, #4 
    BL scanf 
    
    # reload base and call function 
    ADD r0, sp, #4 
    BL printStringByIndex 
    
    #pop stack 
    LDR lr, [sp, #0] 
    ADD sp, sp, #44 
    MOV pc, lr 

.data 
    prompt: .asciz "Enter input string: " 
    format: .asciz "%s"

10.4.1 Comments on program

The first comment on this program concerns how the space for the string is allocated, as shown in the following code. The stack pointer is moved down by 44 bytes: 4 bytes to save lr and 40 bytes for the string. Note that even though this memory is dynamically allocated (i.e., allocated at runtime), the amount of space must be fixed at compile time so that the push and pop of the stack can be implemented. Thus, like static array allocation, the size of a stack allocation must be specified at compile time and cannot be changed.

    #push stack 
    SUB sp, sp, #44 
    STR lr, [sp, #0] 
    # string is at sp+4 ... sp+43
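For comparison, a local array in C behaves the same way: its storage is created on the stack when the function is entered, but its size is a constant written into the source. The following is a minimal sketch, assuming a hypothetical helper `store_on_stack` and a constant `STRING_BYTES` that mirrors the 40 bytes used above. Neither name comes from the assembly program.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define STRING_BYTES 40  /* fixed in the source, like the #44 in SUB sp, sp, #44 */

/* Hypothetical helper: copy input into a stack buffer of fixed size and
   return the length of the string that was actually stored. */
size_t store_on_stack(const char *input) {
    char buffer[STRING_BYTES];  /* lives in this function's stack frame */
    size_t i = 0;
    assert(input != NULL);
    /* Copy at most STRING_BYTES - 1 characters so the terminator always fits. */
    while (i < STRING_BYTES - 1 && input[i] != '\0') {
        buffer[i] = input[i];
        i++;
    }
    buffer[i] = '\0';
    return strlen(buffer);
}
```

Because the buffer cannot grow, this sketch truncates anything longer than 39 characters. The assembly program reads with a plain "%s" format and performs no such check, so it relies on the user entering fewer than 40 characters.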

The next two points are illustrated in the loop that processes the string.

MOV r5, #0 
startPrintLoop: 
    MOV r0, #0 
    LDRB r1, [r4, r5] 
    CMP r0, r1 
    BEQ endPrintLoop 
    
    LDR r0, =output 
    MOV r1, r5 
    ADD r2, r4, r5 // Calculate the array address 
    LDRB r2, [r2, #0] 
    BL printf 
    
    ADD r5,r5, #1 
    B startPrintLoop 
endPrintLoop:

The first point is that this loop reads each character as a byte using the LDRB (load register byte) instruction. Each byte that is loaded corresponds to one character. Because bytes are being loaded, the index is not multiplied by 4 to form the address, as was done for word addressing. This is shown in the code fragment below:

    ADD r2, r4, r5 // Calculate the array address 
    LDRB r2, [r2, #0]
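The difference between byte and word addressing comes down to the element size used to scale the index. The C sketch below makes that scaling explicit. The helper names `element_offset` and `char_address` are hypothetical, and `int32_t` stands in for a 4-byte word.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: byte offset of element `index` in an array whose
   elements are `element_size` bytes each. */
size_t element_offset(size_t index, size_t element_size) {
    return index * element_size;
}

/* Hypothetical helper: address of character `index`. The scale factor is
   sizeof(char), which is 1, so the offset equals the index. This matches
   ADD r2, r4, r5 in the assembly. */
const char *char_address(const char *base, size_t index) {
    assert(base != NULL);
    return base + element_offset(index, sizeof(char));
}
```

With these definitions, element 3 of a character array is 3 bytes past the base, while element 3 of an array of 4-byte words is `element_offset(3, sizeof(int32_t))`, or 12 bytes, past the base.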

The second point is that this loop does not know how long the string is, but it does know that the string ends with a NULL character. Therefore the loop reads each byte (character) until a NULL value is found, and then exits, as shown in the code fragment below:

startPrintLoop: 
    MOV r0, #0 
    LDRB r1, [r4, r5] 
    CMP r0, r1 
    BEQ endPrintLoop
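The same sentinel-controlled loop can be sketched in C, where `base` plays the role of r4 and `index` the role of r5. The function name `print_string_by_index` and its return value (the count of characters printed) are additions for this illustration. The assembly function returns nothing.

```c
#include <assert.h>
#include <stdio.h>

/* C sketch of the assembly loop: stop at the first 0x00 byte rather than
   at a known length. Returns the number of characters printed. */
int print_string_by_index(const char *base) {
    int index = 0;                       /* MOV r5, #0 */
    assert(base != NULL);
    while (base[index] != '\0') {        /* LDRB, CMP, BEQ endPrintLoop */
        printf("The value for element [%d] is %c\n", index, base[index]);
        index++;                         /* ADD r5, r5, #1 */
    }
    return index;
}
```

For the input "Hello" the loop body runs five times and the terminator itself is never printed. An empty string exits immediately, since its first byte is already 0x00.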

Finally, note that the call to printf passes the value of the character to print. This is the same as printing an integer value, but different from printing a string, where a pointer to the string is passed.
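The distinction between passing a value and passing a pointer is visible in C as well. The sketch below uses a hypothetical helper, `format_char_and_string`, that writes into a caller-supplied buffer with the standard `snprintf` function so the result can be inspected. The "|" separator is arbitrary.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: format the first character of s with %c and the
   whole string with %s. Returns the value reported by snprintf. */
int format_char_and_string(char *out, size_t out_size, const char *s) {
    assert(out != NULL && s != NULL && s[0] != '\0');
    /* %c receives s[0], the character value itself (promoted to int).
       %s receives s, the address of the first character. */
    return snprintf(out, out_size, "%c|%s", s[0], s);
}
```

In the assembly program the value case corresponds to loading the byte into r2 with LDRB before calling printf, whereas a "%s" conversion would instead be given the address held in r4.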