A quick reference on how to extract words from a bit stream, handy for quickly pointing out how a given piece of information should be read.

Bit streams

When extracting a word from a bit stream, we need to know the following:

  • Bit order (MSB first / LSB first)
  • MSB / LSB position (Example: bit 3; offset 1)
  • Bit width (Example: 10 bit)
  • Which byte offset is next (offset increases / offset decreases)

The following examples show the four ways a 10-bit word could be stored.

[Figure: MSB first, offset increases.]

[Figure: MSB first, offset decreases.]

[Figure: LSB first, offset increases.]

[Figure: LSB first, offset decreases.]

[Figure: Legend.]
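As a rough sketch, the four parameters listed above are enough to drive a generic extractor. The function name `extract_word` and its arguments are hypothetical. The bit layout is an assumption, since the text does not spell it out: bit 7 is taken to be the most significant bit of each byte, an MSB-first word walks from its start bit toward bit 0 and continues at bit 7 of the next byte, and an LSB-first word walks toward bit 7 and continues at bit 0 of the next byte.

```python
def extract_word(data, offset, bit, width, msb_first=True, offset_increases=True):
    """Extract an unsigned `width`-bit word from `data` (a bytes-like object).

    `offset` and `bit` locate the first bit of the word: its MSB when
    `msb_first` is True, otherwise its LSB. Assumed convention: bit 7 is
    the most significant bit of each byte.
    """
    step = 1 if offset_increases else -1
    value = 0
    for i in range(width):
        if not 0 <= offset < len(data):
            raise IndexError("word runs past the end of the buffer")
        bit_value = (data[offset] >> bit) & 1
        if msb_first:
            # Bits arrive most significant first: shift in from the right.
            value = (value << 1) | bit_value
            bit -= 1
            if bit < 0:          # byte exhausted, continue at bit 7 of the next one
                bit = 7
                offset += step
        else:
            # Bits arrive least significant first: place each at position i.
            value |= bit_value << i
            bit += 1
            if bit > 7:          # byte exhausted, continue at bit 0 of the next one
                bit = 0
                offset += step
    return value


# The same 10-bit word, 0x155, starting at bit 3 of offset 1 in each layout.
print(hex(extract_word(bytes([0x00, 0x05, 0x54]), 1, 3, 10, True, True)))    # 0x155
print(hex(extract_word(bytes([0x54, 0x05]), 1, 3, 10, True, False)))         # 0x155
print(hex(extract_word(bytes([0x00, 0xA8, 0x0A]), 1, 3, 10, False, True)))   # 0x155
print(hex(extract_word(bytes([0x0A, 0xA8]), 1, 3, 10, False, False)))        # 0x155
```

The result is always returned as an unsigned value; turning it into a signed integer is a separate step.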

Alignment & Padding

Frequently, the words extracted from a bit stream don’t fit neatly into the standard integer types, whose widths are commonly multiples of 8 bits. To convert between the two, we need to decide the following:

  • Alignment (left / right)
  • Padding (pad with ones, zeroes or sign-extend)

The most common options are right alignment with sign extension for signed types and right alignment with zero padding for unsigned types, since these preserve the original value. I have seen the other options get some use in niche cases, though.

In the following examples, two 10-bit signed integers, 341 (0x155) and -342 (0x2AA), are placed in a 16-bit signed integer using different alignments and padding methods, showing how the stored value may or may not change when read as a 16-bit integer.

[Figure: Right alignment with sign extension.]

[Figure: Right alignment with zero padding.]

[Figure: Right alignment with one padding.]

[Figure: Left alignment with zero padding.]

[Figure: Left alignment with one padding.]

[Figure: Legend.]
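The five combinations can also be worked through in a few lines of code. This is a minimal sketch, not a definitive implementation: the helper name `widen` and its `align` / `pad` arguments are made up for illustration. It places a raw 10-bit pattern in a 16-bit container and then reads the container as a signed two's complement integer.

```python
def widen(raw, width=10, target=16, align="right", pad="sign"):
    """Place a `width`-bit pattern in a `target`-bit container and
    return the container read as a signed two's complement integer.

    align: "right" or "left"; pad: "sign", "zeroes" or "ones".
    (Hypothetical helper; the argument names are illustrative.)
    """
    raw &= (1 << width) - 1
    extra = target - width                      # number of padding bits
    if align == "right":
        if pad == "sign":
            fill_ones = bool(raw >> (width - 1))  # copy the word's sign bit
        else:
            fill_ones = pad == "ones"
        fill = ((1 << extra) - 1) << width if fill_ones else 0
        container = fill | raw
    elif align == "left":
        if pad == "sign":
            raise ValueError("sign extension only applies to right alignment")
        fill = (1 << extra) - 1 if pad == "ones" else 0
        container = (raw << extra) | fill
    else:
        raise ValueError("align must be 'right' or 'left'")
    # Reinterpret the container as a signed integer.
    if container >> (target - 1):
        container -= 1 << target
    return container


for align, pad in [("right", "sign"), ("right", "zeroes"), ("right", "ones"),
                   ("left", "zeroes"), ("left", "ones")]:
    print(align, pad, widen(0x155, align=align, pad=pad), widen(0x2AA, align=align, pad=pad))
# right sign 341 -342
# right zeroes 341 682
# right ones -683 -342
# left zeroes 21824 -21888
# left ones 21887 -21825
```

Only right alignment with sign extension returns both 341 and -342 unchanged. Left alignment with zero padding scales both values by 64, which is harmless as long as the reader knows to divide it back out.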

I’ve only seen left alignment in the wild a couple of times, namely in conversion results from the 10-bit ADC on PIC microcontrollers, so I consider it necessary to document those odd cases too.
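For the left-aligned, zero-padded case, getting the original value back is a single shift. The sketch below assumes the result arrives as two bytes, a high byte and a low byte, with the 10-bit value occupying the top bits of the 16-bit pair. The function name and the `high` / `low` parameters are hypothetical, and the exact register layout should be checked against the datasheet of the part in question.

```python
def left_aligned_to_value(high, low, width=10):
    """Recover an unsigned `width`-bit value from a left-aligned,
    zero-padded 16-bit result split into two bytes (assumed layout)."""
    raw16 = ((high & 0xFF) << 8) | (low & 0xFF)
    return raw16 >> (16 - width)   # drop the padding bits on the right


print(hex(left_aligned_to_value(0x55, 0x40)))  # 0x155
```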

These tables are also available in PDF, Open Document, and as a single web page.

Why?

This page exists because somehow at work we managed to waste an hour trying to explain to each other how words from a CAN bus frame ought to be interpreted. For future reference, I can now show this page and ask people to point out which of the four bit order options is needed.

And since padding / extension is common and also causes issues, I added those just in case.