Data representation

Contents:

  Integer representation
  Fixed-point representation
  Floating-point representation
  IEEE 754 standard sizes
  Example floating-point conversions

Integer representation

We want to represent data in a computer: numbers, text, and all the other data types. But what we have is electrical signals in wires; we need to represent the data with voltage levels on those wires. If you like, we need to encode the data.

One possible basis for building up data representation is to insist that all of the wires will either have a standard amount of electricity flowing through them, or none at all. Two possibilities: On or off.

This is a very small amount of information. It is called a "bit". It's the theoretically smallest amount of information you can have.

We can store more information by using more bits. For example, if we use two bits, we can store four different possibilities:

  1. off, off
  2. off, on
  3. on, off
  4. on, on

These are the four possible states of two bits.

In general, n bits have 2^n possible configurations. This is assuming we know which order they're in; in other words, possibilities 2 and 3 above have to count as different configurations.

With current computers, you frequently find bits organized into groups of eight, which are somewhat humorously called "bytes". This means there are 2^8 possible configurations of the bits in a byte. 2^8 is 256.
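
If you want to see these configurations mechanically, here is a quick Python sketch (just an illustration, not something from the course):

  # Enumerate the possible configurations of n bits.
  from itertools import product

  for config in product(("off", "on"), repeat=2):
      print(config)                                    # the four two-bit states

  print(len(list(product(("off", "on"), repeat=8))))   # 256: a byte's configurations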

We want a representation of numbers with these bits, and we want a representation which is convenient for arithmetic. We want to be able to build arithmetic circuitry, and software, which doesn't have huge addition tables built in. We want the numbers to be in order, in some sense.

In fact there's a fairly straightforward way to represent numbers with bits. Each bit can be on or off; we can call this "0" or "1".

Then if we put a bunch of bits together, like this: 10011
we can treat that as a number in base 2 (binary).

You know that "926" in "base ten" means 9*10^2 + 2*10^1 + 6*10^0

so 10011 in base two is 1*2^4 + 0*2^3 + 0*2^2 + 1*2^1 + 1*2^0

Actually we can omit the zeroes, and the multiplication by one... binary is very easy. So we get
2^4 + 2^1 + 2^0
= 16 + 2 + 1
= 19
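
Here is the same positional arithmetic as a quick Python sketch; int(s, 2) is Python's built-in base-2 conversion, shown as a cross-check:

  # Evaluate a base-two numeral digit by digit.
  s = "10011"
  value = 0
  for digit in s:
      value = value * 2 + int(digit)   # shift the accumulated value left, add the new bit
  print(value)       # 19
  print(int(s, 2))   # 19, using the built-in conversion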

A byte is generally eight bits. What's the largest number we could represent in that byte?

11111111 (base two)
= 1*2^7 + 1*2^6 + 1*2^5 + 1*2^4 + 1*2^3 + 1*2^2 + 1*2^1 + 1*2^0
= 255

All bits zero will be zero, of course.

So an eight-bit number can be anywhere from zero to 255, inclusive. If we want larger numbers, we have to use more bits, i.e. more than one byte.

In general, we can use n bits to represent an integer from 0 to 2^n - 1 inclusive, e.g. 8 bits gives us 0 to 255.

How do we get negative numbers? Two ways:

  1. "sign-and-magnitude": one bit is the "sign bit", other bits are an "n-1"-bit number. Like how we write numbers on paper. You figure out what the number is, ignoring the sign bit; but actually that's not necessarily the number, it's just the number's magnitude (absolute value).
    If the sign bit is 1, the number is negative.
    If the sign bit is 0, the number is positive or zero ("non-negative").
    So we can represent an integer from -2**{n-1}-1 to 2**{n-1}-1 inclusive. (because we have n-1 bits, plugged into the previous formula; and then we can put '-' or not before it.)
  2. "two's-complement": almost always used these days (for integers). It uses n bits to represent an integer from -2n-1 to 2n-1-1 inclusive.
These two formats yield approximately the same number of numbers: half positive, half negative. For example, 16 bits can represent an integer from -32768 to 32767 inclusive, in the two's complement format.

We won't discuss the two's-complement representation further in this course. It's discussed in CSC 258, at least if you take it from me.

In either case, you can see that we have 2^n possibilities for the numbers, and we can choose all non-negative, which gives us 0 to 2^n - 1; or we can choose about half each, positive and negative.
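
For the curious, here is a quick Python sketch of the two ranges (just the arithmetic from the formulas above, for n = 16):

  # Ranges representable by n-bit signed integers.
  n = 16
  print(-(2**(n-1) - 1), "to", 2**(n-1) - 1)   # sign-and-magnitude: -32767 to 32767
  print(-2**(n-1), "to", 2**(n-1) - 1)         # two's complement:   -32768 to 32767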

Fixed-point representation

Suppose we want numbers other than integers, e.g. 2½?

A straightforward representation for non-integer numbers involves adding something like decimal points. But "decimal" means base 10, so we tend to call it a "binary point". There's also a generic term "radix point", which applies equally well to base 2 or to base 10 or to any base.

I claim that we can represent two and a half by 10.1 in binary. That is, 2.5 in base ten is 10.1 in base two.

10.1 (base two)
= 1*2^1 + 0*2^0 + 1*2^-1
= 1*2 + 0*1 + 1*½
= 2 + 0 + ½
= 2½

This is called "fixed point". We can use the same basic format as we did for integers, so long as we agree on how many places there are after the radix point.
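
Here is a minimal Python sketch of that agreement; "frac_bits" (the agreed number of places after the radix point) is just a name I picked:

  # Interpret a string of bits as a fixed-point number.
  def fixed_point_value(bits, frac_bits):
      # Read all the bits as an integer, then slide the binary point
      # left by frac_bits places, i.e. divide by 2**frac_bits.
      return int(bits, 2) / 2**frac_bits

  print(fixed_point_value("101", 1))   # 10.1 in base two -> 2.5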

Floating-point representation

Floating-point numbers involve an exponent of the base, like scientific notation. Example: Avogadro's number can be written as 6.02 x 10^23 in base 10.

But computer floating-point numbers are usually based on base two, of course.
6.02 x 10^23 is approximately (1 and 63/64) x 2^78
or 1.111111 (base two) x 2^1001110 (with the exponent also written in base two)

To represent this in a computer, in (say) 32 bits, we can use something like the IEEE floating-point standard (IEEE 754): one sign bit, then an 8-bit exponent field, then a 23-bit mantissa field.

If the exponent field is all ones (looks like 255), the number is a special value: infinity if the mantissa field is all zeroes, or "NaN" (not a number) otherwise.

If the exponent field is all zeroes, the meaning of the number is +/- 0.{mantissa} x 2^-126

For exponents other than all ones (255) and all zeroes (0), the meaning of the number is:
+/- 1.{mantissa} x 2^({exponent}-127)

Why subtract 127? 8 bits represents a number from 0 to 255 -- subtract 127 to give us the ability to express exponents from -127 to 128.

Note that we save a bit by not representing the '1.'. This means that all floating-point numbers must be "normalized" -- they begin with "1." (a one, then the radix point).

This restriction resembles that in scientific notation in base ten, except that in base ten there are nine possible digits for that single digit to the left of the decimal point, e.g. '6' in Avogadro's number, whereas in base two there is only the one possibility.

The sole digit to the left of the radix point can't be zero (in either base 2 or base 10). If it's zero, you just slide around the radix point as needed, adjusting the exponent value to compensate, which is called "normalizing" the number.

This is why the special rule for an all-zeroes exponent is essential; otherwise we couldn't represent the number zero itself. This rule also gives us a few smaller-magnitude numbers, e.g. 0.00000000000001 x 2^-126, but its chief importance is that it gives us a way to represent zero.

(Why -126 instead of -127 (which would seem more obvious, to me anyway)? Consider trying to represent the number 2^-127 itself. With the exponent being -126, we could represent this as (in effect) 0.1 x 2^-126. If the actual exponent for this special case of an all-zeroes exponent field were -127, this number would be unrepresentable. But this is getting rather far afield and you don't have to know this for this course.)
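
Putting the three cases together, here is a quick Python sketch of a decoder for the 32-bit format; the function name is just a name I picked:

  # Decode a 32-character string of '0'/'1' as IEEE single precision.
  def decode_single(bits):
      sign = -1 if bits[0] == "1" else 1
      exp_field = int(bits[1:9], 2)       # 8-bit exponent field
      mantissa_field = int(bits[9:], 2)   # 23-bit mantissa field
      if exp_field == 255:
          return "infinity or NaN"        # the special values
      if exp_field == 0:                  # +/- 0.{mantissa} x 2^-126
          return sign * (mantissa_field / 2**23) * 2**-126
      # +/- 1.{mantissa} x 2^({exponent}-127)
      return sign * (1 + mantissa_field / 2**23) * 2**(exp_field - 127)

  print(decode_single("0" * 32))   # 0.0, thanks to the all-zeroes-exponent rule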

IEEE 754 standard sizes

IEEE 754 has two standard sizes:

  single precision: 32 bits = 1 sign bit + 8 exponent bits + 23 mantissa bits (exponent offset 127)
  double precision: 64 bits = 1 sign bit + 11 exponent bits + 52 mantissa bits (exponent offset 1023)
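
Python's struct module packs floats in exactly these two sizes, which gives a quick way to confirm the widths:

  import struct
  print(struct.calcsize(">f") * 8)   # 32: single precision
  print(struct.calcsize(">d") * 8)   # 64: double precision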

Example floating-point conversions

Convert 12.4 to IEEE single-precision.

12 (base ten) is 1100 (base two)

.4 is 2/5

In base ten, we'd divide 5 into 2 to get 0.4. You do this with long division.

For base two, we instead divide 5 into 2 in base 2, or if you like divide the base two number 101 into 10 (these being the base-two representations of 5 and 2 respectively), using long division, in base two.

This turns out to be a repeating fraction, with "0110" repeating.
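
If you'd rather not do long division in base two, an equivalent method (easier to code, though not the method described above) is to double the fraction repeatedly and peel off the integer parts:

  # Each doubling shifts the binary point right one place, so the integer
  # part that falls out is the next bit. Fraction keeps the arithmetic exact.
  from fractions import Fraction

  x = Fraction(2, 5)   # 0.4, exactly
  digits = []
  for _ in range(20):
      x *= 2
      bit = int(x)              # 0 or 1
      digits.append(str(bit))
      x -= bit
  print("".join(digits))   # 01100110011001100110 -- "0110" repeating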

so 12.4 = 1100.0110011001100110...
= 1.1000110011001100110... x 2^3
so use 130 for the exponent field (because 130 - 127 == 3 -- see the conversion back the other way, below, if this isn't clear)

Convert 130 to base two by repeated integer division, and take the remainders. (We could have done this for 12 near the beginning of this example, except that it was so easy to see that 1100 is 8+4 is 12, but we do need to have an algorithm, so here it is.)

130 div 2 = 65 remainder 0
65 div 2 = 32 remainder 1
32 div 2 = 16 remainder 0
16 div 2 = 8 remainder 0
8 div 2 = 4 remainder 0
4 div 2 = 2 remainder 0
2 div 2 = 1 remainder 0
1 div 2 = 0 remainder 1
Read the remainders upwards to get the value.
so 130 (base ten) = 10000010 (base two)
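
Here is the repeated-division algorithm as a quick Python sketch (the function name is just a name I picked):

  # Convert a non-negative integer to its base-two representation.
  def to_binary(n):
      remainders = []
      while n > 0:
          remainders.append(str(n % 2))   # the remainder on division by 2
          n //= 2
      # "Read the remainders upwards": reverse the list.
      return "".join(reversed(remainders)) or "0"

  print(to_binary(130))   # 10000010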

so IEEE 754 single-precision representation is:
0 10000010 10001100110011001100110

Remember that we omit the initial '1' from the mantissa field, thus gaining an additional bit of precision.

There are no spaces there in any sense of course, I've just shown the grouping. The computer knows the grouping by counting bits.
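
We can check this result against Python's own single-precision packing; ">f" means big-endian single precision in the standard struct module:

  import struct

  bits = "0" + "10000010" + "10001100110011001100110"   # sign, exponent, mantissa
  print(int(bits, 2).to_bytes(4, "big") == struct.pack(">f", 12.4))   # True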

Now let's take that value, 01000001010001100110011001100110, and interpret it as an IEEE single-precision floating-point number and see what it is.

We separate the bits into the three fields: the first bit is the sign bit, the next 8 are the exponent field, the remaining 23 are the mantissa field. So the sign bit is 0, meaning that the number is not negative. The exponent field is 10000010, or 2^7 + 2^1, which is 128 + 2 or 130. And the mantissa field is 10001100110011001100110. Remember that there is an implied '1' at the beginning there, if you look at how the mantissa field substitutes into the meaning of the number, towards the beginning of this document.

Substituting into our formula, we get 1.10001100110011001100110 x 2^(130-127)

So you see why we used 130 for the exponent field when the actual exponent is supposed to be 3.

The mantissa, then, is
2^0 + 2^-1 + 2^-5 + 2^-6 + 2^-9 + 2^-10 + 2^-13 + 2^-14 + 2^-17 + 2^-18 + 2^-21 + 2^-22

Add these up with your calculator and you'll get 1.5499999523162841796875.

This is times 2^3, which is 8; multiply by 8 and you'll get 12.3999996185302734375.
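
Rather than a calculator, a couple of lines of Python will do the sum, and struct confirms that the hardware agrees (0x41466666 is the hex form of the 32-bit pattern above):

  import struct

  mantissa = sum(2**-k for k in (0, 1, 5, 6, 9, 10, 13, 14, 17, 18, 21, 22))
  print(mantissa * 2**3)                                     # 12.399999618530273
  print(struct.unpack(">f", bytes.fromhex("41466666"))[0])   # 12.399999618530273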

So 01000001010001100110011001100110 isn't actually a representation of 12.4 exactly, but it's as close as we can get, just as how 0.3333333333 in base ten is not equal to a third but it is as close as we can get with ten significant decimal digits.

