
2.5: Detail- ASCII


    ASCII, which stands for “The American Standard Code for Information Interchange,” was introduced by the American National Standards Institute (ANSI) in 1963. It is the most commonly used character code.

    ASCII is a seven-bit code, representing the 33 control characters and 95 printing characters (including space) in Table 2.2. The control characters are used to signal special conditions, as described in Table 2.3.
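The counts in the paragraph above can be checked with a short Python sketch. It assumes the usual convention that the control characters are codes HEX 00 to HEX 1F plus HEX 7F (DEL); the helper name `is_control` is made up for this illustration.

```python
# Sketch: split the 128 seven-bit codes into control and printing characters.
# Assumption: control characters are codes 0x00-0x1F plus 0x7F (DEL);
# everything else, including space (0x20), counts as printing.

def is_control(code: int) -> bool:
    """Return True if a 7-bit code is one of the ASCII control characters."""
    return code < 0x20 or code == 0x7F

all_codes = range(2 ** 7)  # seven bits give 128 distinct codes
control = [c for c in all_codes if is_control(c)]
printing = [c for c in all_codes if not is_control(c)]

print(len(control), len(printing))  # 33 95
```

The two groups add up to 128, which is every value a seven-bit code can take.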

    Control Characters Digits Uppercase Lowercase
    HEX DEC CHR Ctrl HEX DEC CHR HEX DEC CHR HEX DEC CHR
    00 0 NUL ^@ 20 32 SP 40 64 @ 60 96 `
    01 1 SOH ^A 21 33 ! 41 65 A 61 97 a
    02 2 STX ^B 22 34 " 42 66 B 62 98 b
    03 3 ETX ^C 23 35 # 43 67 C 63 99 c
    04 4 EOT ^D 24 36 $ 44 68 D 64 100 d
    05 5 ENQ ^E 25 37 % 45 69 E 65 101 e
    06 6 ACK ^F 26 38 & 46 70 F 66 102 f
    07 7 BEL ^G 27 39 ' 47 71 G 67 103 g
    08 8 BS ^H 28 40 ( 48 72 H 68 104 h
    09 9 HT ^I 29 41 ) 49 73 I 69 105 i
    0A 10 LF ^J 2A 42 * 4A 74 J 6A 106 j
    0B 11 VT ^K 2B 43 + 4B 75 K 6B 107 k
    0C 12 FF ^L 2C 44 , 4C 76 L 6C 108 l
    0D 13 CR ^M 2D 45 - 4D 77 M 6D 109 m
    0E 14 SO ^N 2E 46 . 4E 78 N 6E 110 n
    0F 15 SI ^O 2F 47 / 4F 79 O 6F 111 o
    10 16 DLE ^P 30 48 0 50 80 P 70 112 p
    11 17 DC1 ^Q 31 49 1 51 81 Q 71 113 q
    12 18 DC2 ^R 32 50 2 52 82 R 72 114 r
    13 19 DC3 ^S 33 51 3 53 83 S 73 115 s
    14 20 DC4 ^T 34 52 4 54 84 T 74 116 t
    15 21 NAK ^U 35 53 5 55 85 U 75 117 u
    16 22 SYN ^V 36 54 6 56 86 V 76 118 v
    17 23 ETB ^W 37 55 7 57 87 W 77 119 w
    18 24 CAN ^X 38 56 8 58 88 X 78 120 x
    19 25 EM ^Y 39 57 9 59 89 Y 79 121 y
    1A 26 SUB ^Z 3A 58 : 5A 90 Z 7A 122 z
    1B 27 ESC ^[ 3B 59 ; 5B 91 [ 7B 123 {
    1C 28 FS ^\ 3C 60 < 5C 92 \ 7C 124 |
    1D 29 GS ^] 3D 61 = 5D 93 ] 7D 125 }
    1E 30 RS ^^ 3E 62 > 5E 94 ^ 7E 126 ~
    1F 31 US ^_ 3F 63 ? 5F 95 _ 7F 127 DEL
    Table 2.2: ASCII Character Set
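The columns of Table 2.2 are related by simple arithmetic, which the following Python sketch tries to make visible. It is only an illustration: the function names `table_row` and `caret_notation` are hypothetical, and the rule that the Ctrl label is the character 64 (HEX 40) codes higher is read off the table rather than stated in the text.

```python
# Sketch: reproduce a few relationships visible in Table 2.2.

def table_row(code: int) -> str:
    """Format the HEX and DEC columns the way the table prints them."""
    return f"{code:02X} {code}"

def caret_notation(code: int) -> str:
    """Return the Ctrl-column label (e.g. '^G') for a code in 0x00-0x1F."""
    if not 0x00 <= code <= 0x1F:
        raise ValueError("the Ctrl column covers codes 0x00-0x1F only")
    # Each label is '^' followed by the character 0x40 higher in the table.
    return "^" + chr(code + 0x40)

print(table_row(0x41), chr(0x41))  # 41 65 A
print(caret_notation(0x07))        # ^G
print(chr(ord("A") + 0x20))        # a  (lowercase is uppercase + HEX 20)
print(ord("7") - 0x30)             # 7  (digits start at HEX 30)
```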

    On to 8 Bits

    In an 8-bit context, ASCII characters follow a leading 0, and thus may be thought of as the “bottom half” of a larger code. The 128 characters represented by codes between HEX 80 and HEX FF (sometimes incorrectly called “high ASCII” or “extended ASCII”) have been defined differently in different contexts. On many operating systems they included the accented Western European letters and various additional punctuation marks. On IBM PCs they included line-drawing characters. Macs used (and still use) a different encoding.

    HEX DEC CHR Ctrl Meaning
    00 0 NUL ^@ NULl; blank leader on paper tape; generally ignored
    01 1 SOH ^A Start Of Heading
    02 2 STX ^B Start of TeXt
    03 3 ETX ^C End of TeXt; matches STX
    04 4 EOT ^D End Of Transmission
    05 5 ENQ ^E ENQuiry
    06 6 ACK ^F ACKnowledge; affirmative response to ENQ
    07 7 BEL ^G BELl; audible signal, a bell on early machines
    08 8 BS ^H BackSpace; nondestructive, ignored at left margin
    09 9 HT ^I Horizontal Tab
    0A 10 LF ^J Line Feed; paper up or print head down; new line on Unix
    0B 11 VT ^K Vertical Tab
    0C 12 FF ^L Form Feed; start new page
    0D 13 CR ^M Carriage Return; print head to left margin; new line on Macs
    0E 14 SO ^N Shift Out; start use of alternate character set
    0F 15 SI ^O Shift In; resume use of default character set
    10 16 DLE ^P Data Link Escape; changes meaning of next character
    11 17 DC1 ^Q Device Control 1; if flow control used, XON, OK to send
    12 18 DC2 ^R Device Control 2
    13 19 DC3 ^S Device Control 3; if flow control used, XOFF, stop sending
    14 20 DC4 ^T Device Control 4
    15 21 NAK ^U Negative AcKnowledge; response to ENQ
    16 22 SYN ^V SYNchronous idle
    17 23 ETB ^W End of Transmission Block
    18 24 CAN ^X CANcel; disregard previous block
    19 25 EM ^Y End of Medium
    1A 26 SUB ^Z SUBstitute
    1B 27 ESC ^[ ESCape; changes meaning of next character
    1C 28 FS ^\ File Separator; coarsest scale
    1D 29 GS ^] Group Separator; coarse scale
    1E 30 RS ^^ Record Separator; fine scale
    1F 31 US ^_ Unit Separator; finest scale
    20 32 SP   SPace; usually not considered a control character
    7F 127 DEL   DELete; originally ignored; sometimes destructive backspace
    Table 2.3: ASCII control characters
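The point that codes between HEX 80 and HEX FF have been defined differently in different contexts can be seen by decoding one byte several ways. The Python sketch below is illustrative only. The codec names are Python's own labels, chosen here as stand-ins and not taken from the text: `cp437` for the original IBM PC character set, `mac_roman` for the classic Macintosh encoding, and `latin-1` for ISO-8859-1. The helper names are hypothetical.

```python
# Sketch: the same byte means different things under different 8-bit codes.
# Assumption: these three Python codecs stand in for the IBM PC, classic
# Macintosh, and ISO-8859-1 conventions.
CODECS = ("cp437", "mac_roman", "latin-1")

def is_seven_bit(byte: int) -> bool:
    """True when the leading (eighth) bit is 0, i.e. the byte is plain ASCII."""
    return byte & 0x80 == 0

def decode_everywhere(byte: int) -> dict:
    """Decode one byte under each codec listed in CODECS."""
    return {name: bytes([byte]).decode(name) for name in CODECS}

# A byte in the "bottom half" decodes to the same character everywhere.
print(is_seven_bit(0x41), decode_everywhere(0x41))

# A byte in the "top half" decodes to a different character under each codec.
print(is_seven_bit(0xC9), decode_everywhere(0xC9))
```

Running it shows HEX 41 as the letter A under all three codecs, while HEX C9 comes out as three different characters.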

    Fortunately, people now appreciate the need for interoperability of computer platforms, so more universal standards are coming into favor. The most common code in use for Web pages is ISO-8859-1 (ISO-Latin) which uses the 96 codes between HEX A0 and HEX FF for various accented letters and punctuation of Western European languages, and a few other symbols. The 32 characters between HEX 80 and HEX 9F are reserved as control characters in ISO-8859-1.

    Nature abhors a vacuum. Most people don’t want 32 more control characters (indeed, of the 33 control characters in 7-bit ASCII, only about ten are regularly used in text). Consequently there has been no end of ideas for using HEX 80 to HEX 9F. The most widely used convention is Microsoft’s Windows Code Page 1252 (Latin I) which is the same as ISO-8859-1 (ISO-Latin) except that 27 of the 32 control codes are assigned to printed characters, one of which is HEX 80, the Euro currency character. Not all platforms and operating systems recognize CP-1252, so documents, and in particular Web pages, require special attention.
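The relationship between CP-1252 and ISO-8859-1 described above can be probed with Python's built-in codecs. This is a sketch under the assumption that Python's `cp1252` and `latin-1` codecs faithfully represent the two encodings; the function name `cp1252_assigned` is made up for this example.

```python
# Sketch: compare CP-1252 with ISO-8859-1 in the range HEX 80 to HEX 9F.
# Assumption: Python's "cp1252" codec raises UnicodeDecodeError for the
# positions that CP-1252 leaves unassigned.

def cp1252_assigned() -> list:
    """Return the bytes in 0x80-0x9F that CP-1252 maps to a printed character."""
    assigned = []
    for byte in range(0x80, 0xA0):
        try:
            bytes([byte]).decode("cp1252")
        except UnicodeDecodeError:
            continue  # left unassigned in CP-1252
        assigned.append(byte)
    return assigned

print(len(cp1252_assigned()))                  # 27
print(bytes([0x80]).decode("cp1252"))          # the Euro sign
print(repr(bytes([0x80]).decode("latin-1")))   # '\x80', a control code

# From HEX A0 upward the two encodings agree byte for byte.
same_upper = all(
    bytes([b]).decode("cp1252") == bytes([b]).decode("latin-1")
    for b in range(0xA0, 0x100)
)
print(same_upper)                              # True
```

A document that is really CP-1252 but labeled ISO-8859-1 will therefore display correctly except for bytes in HEX 80 to HEX 9F, which is where the "special attention" is needed.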

    Beyond 8 Bits

    To represent Asian languages, many more characters are needed. There is currently active development of appropriate standards, and it is generally felt that the total number of characters that need to be represented is less than 65,536. This is fortunate because that many different characters could be represented in 16 bits, or 2 bytes. In order to stay within this number, the written versions of some of the Chinese dialects must share symbols that look alike.

    The strongest candidate for a 2-byte standard character code today is known as Unicode.
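The 16-bit arithmetic above is easy to confirm. The Python sketch below also encodes one Chinese character in two bytes. It assumes UTF-16 (big-endian) as the 2-byte representation, which the text does not specify, and the sample character is an arbitrary choice.

```python
# Sketch: 16 bits give 65,536 distinct codes, enough for a 2-byte character.
# Assumptions: UTF-16 big-endian is used as the 2-byte form, and the sample
# character (code HEX 4E2D, a common Chinese character) is arbitrary.

capacity = 2 ** 16
print(capacity)  # 65536

sample = "\u4e2d"
code = ord(sample)
encoded = sample.encode("utf-16-be")

print(f"{code:04X}")          # 4E2D
print(code < capacity)        # True
print(len(encoded))           # 2
print(encoded.hex().upper())  # 4E2D
```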

    This page titled 2.5: Detail- ASCII is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Paul Penfield, Jr. (MIT OpenCourseWare) via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit history is available upon request.