Converting fixed point to floating point
In Part 1 of this series, I talked about three different options for converting from a fixed-point representation to another fixed-point representation with a bigger bit depth.
This happens occasionally. The simplest case is when you send a 16-bit signal to a 24-bit DAC. Another good example is when you send a 16-bit LPCM signal to a 24- or 32-bit fixed-point digital signal processor.
However, these days it’s more likely that the incoming fixed-point signal (incoming signals are almost always in a fixed-point representation) is converted to floating point for signal processing. (I covered the differences between fixed- and floating-point representations in another posting.)
If you’re converting from fixed point to floating point, you divide the sample’s value by 2^(nBits-1). In other words, if you’re converting a 5-bit signal to floating point, you divide each sample’s value by 2^4, as shown below.
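As a minimal sketch of that scaling in Python (the function name `fixed_to_float` and the range check are illustrative assumptions, not part of any particular library), the conversion could look like this:

```python
def fixed_to_float(sample: int, n_bits: int) -> float:
    """Scale a signed two's complement sample to floating point.

    Hypothetical helper for illustration: divides by 2^(n_bits - 1).
    """
    full_scale = 1 << (n_bits - 1)  # 2^(n_bits - 1), e.g. 16 for 5 bits
    if not -full_scale <= sample <= full_scale - 1:
        raise ValueError(f"{sample} does not fit in {n_bits} signed bits")
    return sample / full_scale


# A few 5-bit sample values, from most negative to most positive.
five_bit = [fixed_to_float(s, 5) for s in (-16, -8, 0, 8, 15)]
print(five_bit)  # [-1.0, -0.5, 0.0, 0.5, 0.9375]
```

The samples are assumed to already be interpreted as signed integers; unpacking raw two's complement bytes is a separate step.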

The reason for this is that there are 2^(nBits-1) quantisation levels for the negative portions of the signal. The positive-going portions have one fewer level due to the two’s complement representation (the 00000 had to come from somewhere…).
So, you want the most-negative value to correspond to -1.0000 in the floating-point world, and then everything else looks after itself.
Of course, this means that you will never hit +1.0. You’ll have a maximum signal level of 1 – 1/2^(nBits-1), which is very close. Close enough.
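To see how close "very close" is, here is a small Python sketch (the helper name `float_range` is made up for this example) that computes the smallest and largest floating-point values you can get from a given fixed-point bit depth:

```python
def float_range(n_bits: int) -> tuple[float, float]:
    """Return (min, max) of an n_bits two's complement signal after scaling.

    Hypothetical helper: the most negative code is -2^(n_bits - 1) and the
    most positive code is 2^(n_bits - 1) - 1, both divided by 2^(n_bits - 1).
    """
    full_scale = 2 ** (n_bits - 1)
    return (-full_scale / full_scale, (full_scale - 1) / full_scale)


for n_bits in (5, 16, 24):
    lowest, highest = float_range(n_bits)
    # The gap between the maximum and +1.0 is exactly 1 / 2^(n_bits - 1).
    print(n_bits, lowest, highest, 1.0 - highest)
```

For the 5-bit case this gives a maximum of 0.9375, which is 1 – 1/16. The gap to +1.0 halves with every extra bit.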
The nice thing about doing this conversion is that by entering into a floating-point world, you immediately gain resolution to attenuate and headroom to increase the gain of the signal – which is exactly what we do when we start processing things.
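A rough illustration of both points, sketched in Python under some simplifying assumptions: the fixed-point side is modelled with plain integer shifts on a 5-bit sample, and the floating-point side uses Python's 64-bit floats (a real DSP may well use 32-bit floats instead).

```python
x_fixed = 15            # largest positive 5-bit sample
x_float = x_fixed / 16  # the same sample in floating point: 0.9375

# Attenuate by a factor of 8, then bring the level back up.
attenuated_fixed = x_fixed >> 3          # integer maths throws away low bits
restored_fixed = attenuated_fixed << 3   # 8, not the original 15

attenuated_float = x_float / 8           # 0.1171875, nothing lost
restored_float = attenuated_float * 8    # back to 0.9375

# Boost by a factor of 4: no clipping at 1.0 inside the floating-point world.
boosted_float = x_float * 4              # 3.75

print(restored_fixed, restored_float, boosted_float)  # 8 0.9375 3.75
```

The boosted value still has to be brought back below full scale before it goes out to a fixed-point destination.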
Of course, this also means that, when you’re done processing, you’ll need to feed the signal out to a fixed-point world again (for example, to a DAC or to an S/PDIF output). That conversion is the topic of Part 4.