Overview: the different kinds of data stored in a computer
 

 

Next, we will study how characters in text are represented inside a computer.

Prelude to "Representing (English) text"

Remember the importance of context:

  • Information = code (= data) + context
 

 

The context is given (provided) by the data type:

 int x;    // Tells the compiler to use 2's complement code
 float y;  // Tells the compiler to use IEEE 754 code
 char c;   // Tells the compiler to use ASCII code
   

The variable will contain (= store) the code (= data).

This code (= data) is interpreted using the data type (context) to obtain the meaning (= information)
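As a minimal sketch of this idea, the Java program below reads one and the same bit pattern under three different data types. The class and method names (ContextDemo, asChar, asFloat) are made up for this illustration. Note that Java's char is a 16-bit type whose first 128 code values coincide with ASCII, so the example stays within that range.

```java
class ContextDemo {
    // Interpret the bit pattern as a character code.
    static char asChar(int bits) {
        return (char) bits;
    }

    // Interpret the same 32 bits as an IEEE 754 single-precision value.
    static float asFloat(int bits) {
        return Float.intBitsToFloat(bits);
    }

    public static void main(String[] args) {
        int bits = 0b1000001;   // the bit pattern 1000001

        System.out.println(bits);           // read as int:   65
        System.out.println(asChar(bits));   // read as char:  A
        System.out.println(asFloat(bits));  // read as float: a tiny positive number close to 0
    }
}
```

The stored data never changes; only the data type (the context) used to interpret it does.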

The ASCII code
 

Computers today use the ASCII code to represent English text:

  • ASCII = American Standard Code for Information Interchange     

  • ASCII is an international standard code to represent English text

  • There are 128 symbols represented in the ASCII code

The ASCII code table (taken from Wikipedia)

Notice: the number digits (0,1,2,3,4,5,6,7,8,9) have successive code values

Notice: code(A) < code(B) < ... < code(Z) < code(a) < code(b) < ... < code(z)
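Both observations can be checked with a few lines of Java. This is only a sketch; the class name AsciiOrder and the helper digitValue are hypothetical names chosen for the example.

```java
class AsciiOrder {
    // Numeric value of a digit character.
    // This works because '0', '1', ..., '9' have successive code values.
    static int digitValue(char digit) {
        return digit - '0';
    }

    public static void main(String[] args) {
        System.out.println((int) '0');        // 48
        System.out.println((int) '9');        // 57
        System.out.println(digitValue('7'));  // 7
        System.out.println('A' < 'B');        // true
        System.out.println('Z' < 'a');        // true (90 < 97)
    }
}
```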

Application of the fact that code(A) < code(B) < ... < code(Z) < code(a) < code(b) < ... < code(z)

The string compare method compareTo( ) of Java (like the string comparison of many other programming languages) uses the fact that code(A) < code(B) < ... < code(Z) < code(a) < code(b) < ... < code(z)

When you compare 2 strings, the computer program will compare (= subtract) the character codes individually:

 "ABC".compareTo("ABXY"):

   (1) compares (= subtracts) code(A) in "ABC" with code(A) in "ABXY"
   (2) compares (= subtracts) code(B) in "ABC" with code(B) in "ABXY"
   (3) compares (= subtracts) code(C) in "ABC" with code(X) in "ABXY"
   and so on, until one of the relationships <, == or > is determined
  

(1) because code(A) in "ABC" == code(A) in "ABXY", we continue with the comparison

(2) because code(B) in "ABC" == code(B) in "ABXY", we continue with the comparison

(3) because code(C) in "ABC" < code(X) in "ABXY", we determine that: "ABC" < "ABXY"
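The steps above can be sketched in Java as follows. The method compare is a hypothetical helper written for this illustration, not the actual library source; the program prints its result next to that of the real compareTo( ) so the two can be checked against each other.

```java
class CompareDemo {
    // Compare two strings character by character by subtracting their codes.
    // Returns a negative value if s < t, 0 if s == t, a positive value if s > t.
    static int compare(String s, String t) {
        int n = Math.min(s.length(), t.length());
        for (int i = 0; i < n; i++) {
            int diff = s.charAt(i) - t.charAt(i);  // subtract the character codes
            if (diff != 0) {
                return diff;                       // first difference decides
            }
        }
        // All compared characters are equal: the shorter string is the smaller one.
        return s.length() - t.length();
    }

    public static void main(String[] args) {
        System.out.println(compare("ABC", "ABXY"));    // -21 = code(C) - code(X) = 67 - 88
        System.out.println("ABC".compareTo("ABXY"));   // -21
    }
}
```

A negative result means that "ABC" < "ABXY", as determined in step (3).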

Interactive DEMO of ASCII code
 

A tool by Mathias that I found on the web takes a symbol as input and shows its binary ASCII code.

Type in: 012, ABC, abc
What is: code(2) - code(0) ? Is code(A) < code(B) ?
Perform binary arithmetic on the codes to find the answer
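The same experiment can be sketched in Java. The class BinaryAscii and the helper bits are hypothetical names for this example; bits pads each code to 7 binary digits, which is enough for the 128 ASCII symbols.

```java
class BinaryAscii {
    // Binary ASCII code of a character, padded with leading zeros to 7 digits.
    static String bits(char c) {
        return String.format("%7s", Integer.toBinaryString(c)).replace(' ', '0');
    }

    public static void main(String[] args) {
        System.out.println(bits('0'));   // 0110000
        System.out.println(bits('2'));   // 0110010
        System.out.println(bits('A'));   // 1000001
        System.out.println(bits('B'));   // 1000010

        // code(2) - code(0), shown as a binary number
        System.out.println(Integer.toBinaryString('2' - '0'));   // 10
    }
}
```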

Calculating with ASCII codes as binary numbers
 

Important fact that you must keep remembering (especially in CS255):

  • Everything stored inside the computer (memory) is a binary number !

    Therefore:

      • Characters are (also) stored (represented) as (binary) numbers (using the ASCII code)

  • The computer can perform computations on binary numbers !

 

Important consequence:   a program can perform computations (= calculate) with the ASCII codes as if they were "ordinary" (integer) binary numbers !!!
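One common use of this consequence is converting a lowercase letter to uppercase by plain subtraction. The sketch below assumes only that the letters a..z and A..Z are each stored consecutively, as in the ASCII table; CaseShift and toUpper are hypothetical names for this illustration.

```java
class CaseShift {
    // Convert a lowercase ASCII letter to uppercase using arithmetic on its code.
    // Characters outside 'a'..'z' are returned unchanged.
    static char toUpper(char c) {
        if (c >= 'a' && c <= 'z') {
            return (char) (c - ('a' - 'A'));   // 'a' - 'A' is 32 in ASCII
        }
        return c;
    }

    public static void main(String[] args) {
        System.out.println('a' - 'A');      // 32
        System.out.println(toUpper('q'));   // Q
        System.out.println(toUpper('Q'));   // Q
        System.out.println(toUpper('5'));   // 5
    }
}
```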

Java program that demonstrates that characters are stored as (binary) numbers
 

You can calculate (e.g.: + 1) using the ASCII codes (= binary numbers) stored in char type variables:

 char c = 'A';

 c = (char) (c + 1);  // type of c+1 is int
                      // So we need casting
                      // to assign back to c

 System.out.println(c); // What will this statement print ?
   
 

Question:   what character is in the variable c after the increment ?

 

The DEMO program is here: /home/cs255001/demo/ASCII-code/ascii.java