# Bits, Bytes, and Beyond

## 1 The Binary Point

In decimal, a number is formed by multiplying each digit by a power of 10 and summing the results. We know which digit is multiplied by ${\displaystyle 10^{0}}$ because it lies immediately to the left of the decimal point. For example:

$${\displaystyle {\begin{aligned}462.15&=4\times 10^{2}+6\times 10^{1}+2\times 10^{0}+1\times 10^{-1}+5\times 10^{-2}\\&=400+60+2+{\frac {1}{10}}+{\frac {5}{100}}\\&=462.15\\\end{aligned}}\,\!}$$

In binary, the digit (bit) multiplied by ${\displaystyle 2^{0}}$ lies immediately to the left of the binary point:

$${\displaystyle {\begin{aligned}101.01&=1\times 2^{2}+0\times 2^{1}+1\times 2^{0}+0\times 2^{-1}+1\times 2^{-2}\\&=4+0+1+{\frac {0}{2}}+{\frac {1}{4}}\\&=5.25\ ({\rm {in\ decimal}})\\\end{aligned}}\,\!}$$
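The positional expansion above can be checked with a few lines of Python. This is only a minimal sketch: the function name `binary_to_decimal` is made up for illustration, and it assumes the input is a well-formed string of 0's and 1's with at most one binary point.

```python
def binary_to_decimal(bits: str) -> float:
    """Evaluate a binary string with an optional binary point, e.g. '101.01'."""
    whole, _, frac = bits.partition(".")
    value = 0.0
    # Bits left of the point carry weights 2^0, 2^1, ... counting leftward.
    for power, bit in enumerate(reversed(whole)):
        value += int(bit) * 2 ** power
    # Bits right of the point carry weights 2^-1, 2^-2, ... counting rightward.
    for power, bit in enumerate(frac, start=1):
        value += int(bit) * 2 ** -power
    return value


print(binary_to_decimal("101.01"))  # 5.25
```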

When programming for FPGAs, we often need to know the specifics of a number, namely how many bits it is made up of and where the binary point lies. Simulink displays this information using the format n_m, where n is the bit width of the number and m is how many bits lie to the right of the binary point. For example, 101.01 would be a 5_2 number. A data type is called fixed-point when the binary point is fixed in one location.
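As a rough illustration of the n_m notation, the sketch below derives the label for a binary string and converts a raw unsigned bit pattern to its value given m fractional bits. The helper names `fixed_point_label` and `fixed_to_float` are hypothetical and are not part of Simulink; they only mirror the convention described above.

```python
def fixed_point_label(bits: str) -> str:
    """Return the n_m label for a binary string such as '101.01'."""
    whole, _, frac = bits.partition(".")
    n = len(whole) + len(frac)  # total bit width
    m = len(frac)               # bits to the right of the binary point
    return f"{n}_{m}"


def fixed_to_float(raw: int, m: int) -> float:
    """Interpret an unsigned raw integer as having m fractional bits."""
    # Moving the binary point m places to the left divides by 2^m.
    return raw / 2 ** m


print(fixed_point_label("101.01"))  # 5_2
print(fixed_to_float(0b10101, 2))   # 5.25
```

The second function reflects how fixed-point hardware typically works: the bits are stored as a plain integer, and the position of the binary point is bookkeeping applied when the value is interpreted.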

## 2 Unsigned vs. Signed Representations

In mathematics, we represent a negative number by adding a ’-’ character to the front of it. A computer deals only with 0’s and 1’s, so it must use another method. One solution is to simply use the most-significant bit to represent the sign, a method known as sign-magnitude. This method has problems: the number zero is represented twice (+0 and -0 have different representations), and adders and subtracters require extra circuitry to handle negative numbers.

A better solution is a method known as two’s complement. This is what most computers use. To negate a number in this system, you invert the number bitwise (change 1’s to 0’s and vice versa) and then add 1 to the result. The consequence of these rules is that the "higher half" of the unsigned numbers wraps around and becomes the negative numbers. With 3 bits, we have:

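| bit pattern      | 000 | 001 | 010 | 011 | 100 | 101 | 110 | 111 |
|------------------|-----|-----|-----|-----|-----|-----|-----|-----|
| unsigned         | 0   | 1   | 2   | 3   | 4   | 5   | 6   | 7   |
| sign-magnitude   | 0   | 1   | 2   | 3   | -0  | -1  | -2  | -3  |
| two’s complement | 0   | 1   | 2   | 3   | -4  | -3  | -2  | -1  |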
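The three rows of the 3-bit table can be reproduced with a short Python sketch. The function names are illustrative only. Note that Python integers have no negative zero, so the sign-magnitude pattern 100 prints as 0 here rather than -0.

```python
def as_sign_magnitude(pattern: int, width: int) -> int:
    """Top bit is the sign; the remaining bits are the magnitude."""
    magnitude = pattern & ((1 << (width - 1)) - 1)
    negative = (pattern >> (width - 1)) & 1
    return -magnitude if negative else magnitude


def as_twos_complement(pattern: int, width: int) -> int:
    """Patterns with the top bit set wrap around to the negative range."""
    if pattern & (1 << (width - 1)):
        return pattern - (1 << width)
    return pattern


WIDTH = 3
for pattern in range(1 << WIDTH):
    print(
        format(pattern, f"0{WIDTH}b"),
        pattern,  # unsigned interpretation
        as_sign_magnitude(pattern, WIDTH),
        as_twos_complement(pattern, WIDTH),
    )
```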

Note that the two’s complement system only represents zero once. What’s more, adding and subtracting two’s complement numbers the normal way (as if they were unsigned) produces the correct result. In Simulink, only unsigned and two’s complement numbers are supported.
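The invert-and-add-one rule, and the claim that ordinary unsigned addition gives the right answer, can be tried out with the sketch below. It assumes a 3-bit word and uses hypothetical helper names. Results are only correct when the true answer fits in the representable range (-4 to 3 for 3 bits); otherwise the sum wraps around.

```python
WIDTH = 3
MASK = (1 << WIDTH) - 1  # keeps results within WIDTH bits


def negate(pattern: int) -> int:
    """Two's complement negation: invert every bit, then add 1."""
    return (~pattern + 1) & MASK


def add(a: int, b: int) -> int:
    """Plain unsigned addition, discarding any carry out of the top bit."""
    return (a + b) & MASK


def to_signed(pattern: int) -> int:
    """Read a WIDTH-bit pattern as a two's complement value."""
    return pattern - (1 << WIDTH) if pattern & (1 << (WIDTH - 1)) else pattern


minus_three = negate(0b011)
print(format(minus_three, "03b"))  # 101

# 2 + (-3): add the bit patterns as if they were unsigned.
total = add(0b010, minus_three)
print(format(total, "03b"), to_signed(total))  # 111 -1

# Subtraction is addition of the negation: 1 - 3.
difference = add(0b001, negate(0b011))
print(to_signed(difference))  # -2
```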