5.2: Characters and Character Single-Index Arrays (Strings)

Masayuki Yano, James Douglass Penn, George Konidaris, & Anthony T Patera
Massachusetts Institute of Technology via MIT OpenCourseWare

Our focus is on numerical methods and hence we will not have too much demand for character and string processing. However, it is good to know the basics, and there are instances — typically related to more sophisticated applications of software system management, “codes that write codes” (or less dramatically, codes some lines of which can be modified from case to case), and also symbolic manipulation — in which character concepts can play an important role. Note that character manipulation and symbolic manipulation are very different: the former does not attribute any mathematical significance to characters; the latter is built upon the former but now adds mathematical rules of engagement. We consider here only character manipulation.

A character variable (an instance of the character data type), say c, must represent a letter, numeral, or other symbol. As always, c must ultimately be stored as 0’s and 1’s. We thus need, as part of the data type definition, an encoding of different characters in terms of 0’s and 1’s. The most common such encoding, the original ASCII code, is a mapping from 7-bit binary numbers (in practice stored in 8-bit words) to the set of letters in the alphabet, numerals, punctuation marks, as well as some special or control characters. (There are now many “extended” ASCII codes which use the eighth bit to include symbols from languages other than English.)
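To make the encoding concrete, the following minimal sketch lists the ASCII codes of a few characters and inverts the mapping. It uses single quotes to mark character data (explained in the next example) and the built-in functions double and char; the variable names are chosen here purely for illustration.

```matlab
% ASCII codes of a few characters (double() returns the numeric code).
code_A = double('A');      % uppercase letter
code_a = double('a');      % lowercase letter
code_0 = double('0');      % the numeral zero, as a character
code_space = double(' ');  % the space character

% char() inverts the mapping: codes 65 through 70 are the letters A to F.
letters = char(65:70);

% Every original ASCII code lies between 0 and 127.
within_ascii_range = all(double('Matlab 3.1416') <= 127);

disp([code_A, code_a, code_0, code_space]);  % the codes 65 97 48 32
disp(letters);                               % ABCDEF
```

Note that the character '0' has code 48, not 0: the code of a numeral is unrelated to its numerical value.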

We can create and assign a character variable as

>> c = '3'
c =
    3
>> c_ascii = int8(c)
c_ascii =
    51
>> c_too = char(c_ascii)
c_too =
    3
>>

In the first statement, we enter the single-character data with quotes, which tells Matlab that 3 is to be interpreted as a character and not a number. We can then obtain the ASCII code for the character 3, which happens to be 51. We can then recreate the character variable c by directly appealing to the ASCII code with the char command. Obviously, quotes are easier than memorizing the ASCII code.

A “string” is simply a single-index array of character elements. We can input a string most easily with the quote feature:

>> pi_approx_str = '3.1416'
pi_approx_str =
    3.1416
>> pi_approx_str(2)
ans =
.
>> pi_approx_str + 1
ans =
    52     47     50     53     50     55
>>

We emphasize that pi_approx_str is not of type double, and if we attempt to (say) add 1 to pi_approx_str we get (effectively) nonsense: Matlab adds 1 to each element of the ASCII translation of our string according to the rules of single-index array addition.
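The arithmetic above can be unpacked step by step. The sketch below, with illustrative variable names, shows the codes that Matlab stores for the string, the result of adding 1 to them, and one case in which arithmetic on characters is actually useful: subtracting the code of '0' from a digit character yields the digit's numerical value.

```matlab
pi_approx_str = '3.1416';

% The codes Matlab actually stores for the string.
codes = double(pi_approx_str);         % [51 46 49 52 49 54]

% Arithmetic on a character array acts on these codes, element by element.
shifted_codes = pi_approx_str + 1;     % [52 47 50 53 50 55]
shifted_str = char(shifted_codes);     % '4/2527', not a meaningful number

% Subtracting the code of '0' maps a digit character to its numeric value.
first_digit = pi_approx_str(1) - '0';  % 3, of type double
```

The result of the addition is of type double, not char, which is why Matlab displays a row of numbers rather than a string.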

We can readily concatenate strings, for example:

>> leader = 'The value is '
leader =
The value is
>> printstatement = [leader,pi_approx_str,' .']
printstatement =
The value is 3.1416 .
>>

However, this is of limited use, since typically we would know an approximation to π not as a string but as a double.

Fortunately, there are some simple conversion functions available in Matlab (and other programming languages as well). The Matlab function num2str will take a floating point number and convert it to the corresponding string of characters; conversely, str2num will take a string (presumably of ASCII codes for numerals) and convert it to the corresponding floating point (double) data type. So for example,

>> pi_approx_double = str2num(pi_approx_str)
pi_approx_double =
    3.1416
>> pi_approx_str_too = num2str(pi_approx_double)
pi_approx_str_too =
    3.1416
>>
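One point worth keeping in mind is that the conversion to a string generally loses digits. The sketch below illustrates this with the built-in value pi. It also uses str2double, which is not discussed in the text above: it is an alternative to str2num that converts a string holding a single number and does not evaluate the string as a Matlab expression. The variable names are illustrative.

```matlab
% By default num2str keeps about four digits after the decimal point.
s_default = num2str(pi);         % '3.1416'

% An optional second argument requests more significant digits.
s_8digits = num2str(pi, 8);      % '3.1415927'

% Converting back recovers only the digits that were kept in the string.
x_back = str2double(s_default);  % 3.1416
err = abs(x_back - pi);          % roughly 7e-6, not zero
```

If the string is to be converted back and used in further calculations, it is therefore safer to request more digits than the default provides.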

This can then be used to create a print statement based on a floating point value (e.g., obtained as part of our numerical calculations):

>> printstatement_too = [leader,num2str(pi_approx_double),' .']
printstatement_too =
The value is 3.1416 .
>>

In actual practice there are higher level printing functions (such as fprintf and sprintf) in Matlab built on the concepts described here. However, the above rather low-level constructs can also serve, for example in developing a title for a figure which must change as (say) the time to which the plot corresponds changes.
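As a brief sketch of both routes, the code below builds the same print statement with sprintf and with concatenation, and then assembles a figure title that changes with time. The time values and the title text are hypothetical, chosen only for illustration; in a plotting script the string would be passed to the title function.

```matlab
leader = 'The value is ';
pi_approx_double = 3.1416;

% sprintf builds the string in one step from a format specification.
printstatement_sprintf = sprintf('The value is %.4f .', pi_approx_double);

% The same string assembled by concatenation, as in the text above.
printstatement_concat = [leader, num2str(pi_approx_double), ' .'];

% A title string that changes with the time t (illustrative values).
t_values = [0, 0.5, 1.0];
for k = 1:length(t_values)
    title_str = ['Solution at t = ', num2str(t_values(k))];
    disp(title_str);   % in a plotting script: title(title_str)
end
```

The format specification %.4f fixes the number of digits after the decimal point, whereas num2str chooses the number of digits itself unless told otherwise.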