Skip to main content
Engineering LibreTexts

5.5: Representing strings

  • Page ID
    40736
  • Related issues sometimes come up with strings. First, remember that C strings are null-terminated. When you allocate space for a string, don’t forget the extra byte at the end.

    Also, the letters and numbers in C strings are encoded in ASCII. The ASCII codes for the digits “0” through “9” are 48 through 57, not 0 through 9. The ASCII code 0 is the NUL character that marks the end of a string. And the ASCII codes 1 through 9 are special characters used in some communication protocols. ASCII code 7 is a bell; on some terminals, printing it makes a sound.

    The ASCII code for the letter “A” is 65; the code for “a” is 97. Here are those codes in binary:

    65 = b0100 0001
    97 = b0110 0001
    

    A careful observer will notice that they differ by a single bit. And this pattern holds for the rest of the letters; the sixth bit (counting from the right) acts as a “case bit”, 0 for upper-case letters and 1 for lower case letters.

    As an exercise, write a function that takes a string and converts from lower-case to upper-case by flipping the sixth bit. As a challenge, you can make a faster version by reading the string 32 or 64 bits at a time, rather than one character at a time. This optimization is made easier if the length of the string is a multiple of 4 or 8 bytes.

    If you read past the end of a string, you are likely to see strange characters. Conversely, if you write a string and then accidentally read it as an int or float, the results will be hard to interpret.

    For example, if you run:

        char array[] = "allen";
        float *p = array;
        printf("%f\n", *p);
    

    You will find that the ASCII representation of the first 8 characters of my name, interpreted as a double-precision floating point number, is 69779713878800585457664.

    • Was this article helpful?