Floating point decimal representation

Floating point decimal representation (for numbers):

 -314.159  = -314.159 × 10⁰ ≡ (-314.159, 0)
           = -31.4159 × 10¹ ≡ (-31.4159, 1)
           = -3.14159 × 10² ≡ (-3.14159, 2)
                               ^^^^^^^^  ^
                               Mantissa  Exponent
   

A floating point decimal representation consists of 2 decimal numbers:

  1. A fixed point decimal number representing the mantissa

  2. An integer (whole) decimal number representing the exponent

    The base for the exponent is 10(10)

The canonical form of the floating point decimal representation
 

The canonical form:

  • The floating point representation (mantissa, exponent) where the absolute value of the mantissa is in the range [1(10), 10(10))

Example:

 -314.159  = -314.159 × 10⁰ ≡ (-314.159, 0)
           = -31.4159 × 10¹ ≡ (-31.4159, 1)
           = -3.14159 × 10² ≡ (-3.14159, 2)

  Canonical form: (-3.14159, 2)
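
The normalization above can be sketched in Python. This is only an illustration, assuming the standard decimal module; the function name canonical_decimal is hypothetical and not part of the course material.

```python
from decimal import Decimal

def canonical_decimal(text: str):
    """Return (mantissa, exponent) with 1 <= |mantissa| < 10.

    Hypothetical helper: the value 0 has no canonical form, so it is rejected.
    """
    d = Decimal(text)
    if d == 0:
        raise ValueError("0 has no canonical form")
    exponent = d.adjusted()          # power of 10 of the leading digit
    mantissa = d.scaleb(-exponent)   # move the decimal point behind the leading digit
    return mantissa, exponent

print(canonical_decimal("-314.159"))   # (Decimal('-3.14159'), 2)
```

Decimal is used instead of float so that the mantissa keeps exactly the digits that were written down.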
   

Floating point binary representation

Floating point binary representation (for numbers):

 -101.01011  = -101.01011 × 10⁰ ≡ (-101.01011, 0)
             = -10.101011 × 10¹ ≡ (-10.101011, 1)
             = -1.0101011 × 10¹⁰ ≡ (-1.0101011, 10)
                                    ^^^^^^^^^^  ^^
                                    Mantissa    Exponent
   

A floating point binary representation consists of 2 binary numbers:

  1. A fixed point binary number representing the mantissa

  2. An integer (whole) binary number representing the exponent

    The base for the exponent is 10(2)   (= 2(10) !!!)

The canonical form of the floating point binary representation
 

The canonical form:

  • The floating point representation (mantissa, exponent) where the absolute value of the mantissa is in the range [1(2), 10(2))

Example:

 -101.01011  = -101.01011 × 10⁰ ≡ (-101.01011, 0)
             = -10.101011 × 10¹ ≡ (-10.101011, 1)
             = -1.0101011 × 10¹⁰ ≡ (-1.0101011, 10)


  Canonical form: (-1.0101011, 10)
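
A minimal Python sketch of the same normalization, assuming the standard math.frexp function (which returns a mantissa in [0.5, 1), so it is rescaled by one position here). The name canonical_binary is hypothetical, and the result is printed in decimal: -101.01011(2) = -5.34375(10), mantissa -1.0101011(2) = -1.3359375(10), exponent 10(2) = 2(10).

```python
import math

def canonical_binary(x: float):
    """Return (mantissa, exponent) with 1 <= |mantissa| < 2 and x == mantissa * 2**exponent."""
    if x == 0:
        raise ValueError("0 has no canonical form")
    m, e = math.frexp(x)     # x == m * 2**e with 0.5 <= |m| < 1
    return m * 2, e - 1      # shift one binary position so that 1 <= |m| < 2

print(canonical_binary(-5.34375))   # (-1.3359375, 2)
```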
   

A note on the canonical form of the floating point binary representation

Notice that:

  • The mantissa of the canonical form of a floating point binary representation always starts with 1.:

        Mantissa:   1.xxxxxxxxx       
      

Examples:

      1.000101
     -1.101111                 
   

(The only exception is the value 0.0, which will be represented by a special code)

The IEEE 754 standard for floating point binary representation
 

What is the IEEE 754 standard

  • The IEEE 754 standard is an international standard that specifies how the mantissa and the exponent of a floating point binary representation must be stored

  • Today's computers store floating point numbers according to this standard


The IEEE 754 format of the single-precision floating point binary representation
 

Codes used for the mantissa and exponent:

    Mantissa:   sign-magnitude binary code
    Exponent:   excess 127 binary code
   

Storage format (uses 32 bits or 4 bytes):

         SEEEEEEEEMMMMMMMMMMMMMMMMMMMMMMM      
  Bit:   01      89                     31

  S = sign of the mantissa (0 = pos, 1 = neg)
  M = mantissa without the leading 1 (23 bits)
  E = exponent (8 bits)
   

Note: the leading 1. in the mantissa is assumed (and omitted) !!!
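
The three fields can be pulled out of a 32-bit word with shifts and masks. The sketch below is an illustration only: split_fields is a hypothetical name, and the word 0xC1200000 is an arbitrary sample value. Note that the shift amounts count bit positions from the right, while the diagram above numbers bits from the left.

```python
def split_fields(word: int):
    """Split a 32-bit IEEE 754 single-precision word into (S, E, M) as integers."""
    sign = (word >> 31) & 0x1        # S: 1 bit
    exponent = (word >> 23) & 0xFF   # E: 8 bits, still in excess 127 code
    fraction = word & 0x7FFFFF       # M: 23 bits, leading "1." not stored
    return sign, exponent, fraction

s, e, m = split_fields(0xC1200000)   # bit pattern of -10.0
print(s, format(e, "08b"), format(m, "023b"))
print("true exponent:", e - 127)     # excess 127: subtract 127 to decode
```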

Example of an IEEE 754 representation (and how to decode it)

Suppose you are given the following IEEE 754 (single precision) representation:

  • 01000000101000000000000000000000       

We can find the decimal representation for this IEEE 754 code as follows:

      01000000101000000000000000000000
  =   0 10000001 01000000000000000000000
      ^ ^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^
      | exponent  mantissa (with leading "1." omitted)
     sign (= positive mantissa)


  Mantissa = +1.01000000000000000000000
  Exponent = 10000001 (= 129 = 127 + 2) = +2(10)

  => Floating point binary representation = +1.01 × 10¹⁰ (binary) = +101.0 (binary)
  => Decimal representation = 5.0(10)
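
The decoding steps can be written as a short Python sketch. It is a simplified illustration, not a full decoder: decode_single is a hypothetical name, and the special codes (exponent field all 0s or all 1s, used for 0.0, denormalized numbers, infinity and NaN) are deliberately not handled.

```python
def decode_single(bits: str) -> float:
    """Decode a 32-character bit string in IEEE 754 single-precision format.

    Assumption: the exponent field is neither 00000000 nor 11111111.
    """
    if len(bits) != 32:
        raise ValueError("expected exactly 32 bits")
    sign = -1.0 if bits[0] == "1" else 1.0
    exponent = int(bits[1:9], 2) - 127        # undo the excess 127 code
    fraction = int(bits[9:], 2) / 2**23       # the 23 stored bits after "1."
    return sign * (1.0 + fraction) * 2.0**exponent

print(decode_single("01000000101000000000000000000000"))   # 5.0
```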

Demo IEEE 754 (single precision) representation
 

Demo:

  • /home/cs255001/demo/asm/1-directives/float.s

    (Use EGTAPI)

 

Notice: you write a float number in decimal notation in the source program file.
The assembler translates it into the IEEE 754 representation and stores the binary representation in memory
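
The same translation can be observed without the assembler. The sketch below assumes Python's standard struct module; float_to_bits is a hypothetical helper name. The bytes are packed in big-endian order purely for display, and the byte order actually used in memory depends on the machine.

```python
import struct

def float_to_bits(x: float) -> str:
    """Return the 32-bit IEEE 754 single-precision pattern of x as a bit string."""
    raw = struct.pack(">f", x)                         # 4 bytes, big-endian
    return format(int.from_bytes(raw, "big"), "032b")

print(float_to_bits(5.0))    # 01000000101000000000000000000000
```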

How does a compiler encode a decimal representation into IEEE 754?

Suppose you are given the decimal number −5.25

How to find the IEEE 754 representation:

 1. Encode -5.25 into fixed point binary representation

      5     -->  101
      0.25  -->  0.01

      -5.25(10) = -101.01(2)

 2. Find the canonical form:  -101.01 = -1.0101 × 10¹⁰ (binary), so mantissa = -1.0101 and exponent = 10(2)

 3. Code the exponent into Excess 127:  

        01111111 + 10 = 10000001      (in decimal: 127 + 2 = 129)

 4. Put the different parts in their places:

       SEEEEEEEEMMMMMMMMMMMMMMMMMMMMMMM
       11000000101010000000000000000000
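
The four steps can be sketched in Python as follows. This is a simplified illustration and not how any particular compiler is implemented: encode_single is a hypothetical name, extra mantissa bits are truncated instead of rounded, and 0.0, denormalized numbers, infinity and NaN are not handled. The struct module is used only to cross-check the result.

```python
import struct

def encode_single(x: float) -> str:
    """Encode a nonzero number as a 32-bit IEEE 754 single-precision bit string."""
    if x == 0:
        raise ValueError("0.0 uses a special code and is not handled here")
    sign = "1" if x < 0 else "0"
    m = abs(x)
    exponent = 0
    # Step 2: canonical form, shift until 1 <= m < 2
    while m >= 2:
        m /= 2
        exponent += 1
    while m < 1:
        m *= 2
        exponent -= 1
    # Step 3: code the exponent in excess 127
    exp_bits = format(exponent + 127, "08b")
    # Step 4: keep 23 bits after the (omitted) leading "1."
    frac_bits = format(int((m - 1) * 2**23), "023b")
    return sign + exp_bits + frac_bits

bits = encode_single(-5.25)
print(bits)   # 11000000101010000000000000000000

# Cross-check against the machine's own encoding
expected = format(struct.unpack(">I", struct.pack(">f", -5.25))[0], "032b")
print(bits == expected)   # True
```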