how to binary fan out

2

I want to fan out binary data like in the following examples:

8 bit to 16 bit: 
     0 0 0 1   0 0 0 1  (0x11)    becomes 
    0000 0011 0000 0011 (0x0303)

8 bit to 32 bit: 
       0    1    0    0    0    1    1    0 (0x46)       becomes 
    0000 1111 0000 0000 0000 1111 1111 0000 (0x0F000FF0)

16 to 32 bit: 
     0 0 0 1   0 0 1 1   0 1 1 1   1 1 1 1  (0x137F)     becomes 
    0000 0011 0000 1111 0011 1111 1111 1111 (0x030F3FFF)

Since this question is tagged with bit-manipulation, I am looking for a compact solution without any loops or additional variables needed. The code should also be platform independent, so inline assembly is not what I want either.

This may be a duplicate question, but I have no idea how to search for it. Most certainly there is a proper name for what I want to do, but I just don't know it.

EDIT: To clarify some of your comments, here is a list of methods the solution should not contain:

  • assembly code (platform dependent)
  • 'complex' computation like loops (that's why I tagged this question with bit-fiddling)
  • additional runtime variables (e.g. helper constants, look up tables, ...)

Since you asked for it, here are some further details on my use case: I'm working on embedded platforms (microcontrollers) and want to mask registers. This specific case is about IO registers, where I have 16 pads for each port and several port registers to configure the pads. Some of these configuration registers use 2 bits per pad, some 4 bits. So if I want to mask individual pads when accessing the registers I need to mask those registers accordingly.

Let's say I have two configuration registers and want to access pads 4 and 8:

uint32_t configreg_2perpad;       // the first configuration register
uint32_t configreg_4perpad_low;   // lower half of the second configuration register
uint32_t configreg_4perpad_high;  // higher half of the second configuration register

uint8_t pad4 = 3;                 // we start counting at 0, of course
uint8_t pad8 = 7;
uint16_t pad_mask = (1 << pad4) |
                    (1 << pad8);  // this is now 0x0088

So what I am looking for now is a bit-wise manipulation of pad_mask so I get according masks for the registers, which would be

uint32_t configreg_2perpad_mask = 0x0000C0C0;
uint32_t configreg_4perpad_low_mask = 0xF000F000;
uint32_t configreg_4perpad_high_mask = 0x00000000;

Obviously, since memory (both RAM and flash) is a very scarce resource on µCs, I would prefer a solution without the need for further static variables like look up tables (waste of RAM) or static functions (waste of flash).

c
bit-manipulation
asked on Stack Overflow Sep 13, 2018 by Thargon • edited Sep 14, 2018 by Thargon

2 Answers

2

Fanning 8 bits to 16 can be done in a few lines of code:

uint16_t n = 0x11;
n = (n | (n << 4)) & 0x0f0f;
n = (n | (n << 2)) & 0x3333;
n = (n | (n << 1)) & 0x5555;
n = (n | (n << 1));
printf("0x%04x\n", n);  // prints 0x0303

Here's how it works. Start with the 8 bits in a 16-bit variable:

0000 0000 abcd efgh     // letters a to h represent the bits of the 8-bit number

Shift left by 4 bits and OR: n = n | (n << 4)

0000 abcd ???? efgh     // bits with a '?' are garbage we don't want

Mask off the garbage: n = n & 0x0f0f

0000 abcd 0000 efgh

Shift left by 2 bits and OR: n = n | (n << 2)

00ab ??cd 00ef ??gh

Mask off the garbage: n = n & 0x3333

00ab 00cd 00ef 00gh

Shift left by 1 bit and OR: n = n | (n << 1)

0a?b 0c?d 0e?f 0g?h

Mask off the garbage: n = n & 0x5555

0a0b 0c0d 0e0f 0g0h

Now all that's left is to duplicate the bits: n = n | (n << 1)

aabb ccdd eeff gghh

This can also be done with a lookup table. The table itself is exactly 16 bytes. Declaring it as static const allows the compiler to put the table in ROM. The savings in code size (fewer shifts and masking operations) typically makes up for the ROM used by the table. And the lookup method should be faster (2 lookups, 2 shifts, 1 mask, 1 OR) instead of (4 shifts, 3 masks, 4 ORs)

static const uint8_t table[16] = {
    0x00, 0x03, 0x0c, 0x0f, 0x30, 0x33, 0x3c, 0x3f,
    0xc0, 0xc3, 0xcc, 0xcf, 0xf0, 0xf3, 0xfc, 0xff
};

uint16_t n = 0x11;
n = (table[n >> 4] << 8) | table[n & 0xf];
printf("0x%04x\n", n);  // prints 0x0303

Fanning from 8 bits to 32 is about the same as 8-to-16, but it requires more duplication at the end:

uint32_t n = 0x46;
n = (n | (n << 12)) & 0x000f000f;
n = (n | (n <<  6)) & 0x03030303;
n = (n | (n <<  3)) & 0x11111111;
n = (n | (n << 1));
n = (n | (n << 2));
printf("0x%08x\n", n);  // prints 0x0f000ff0

At alternative method of duplicating the bits (suggested by @wim) in the comments is to replace

n = (n | (n << 1));
n = (n | (n << 2));

with

n = (n << 4) - n;

Fanning from 16 bits to 32 requires an extra shift-and-mask step:

uint32_t n = 0x137f;
n = (n | (n << 8)) & 0x00ff00ff;
n = (n | (n << 4)) & 0x0f0f0f0f;
n = (n | (n << 2)) & 0x33333333;
n = (n | (n << 1)) & 0x55555555;
n = (n | (n << 1));
printf("0x%08x\n", n);  // prints 0x030f3fff
answered on Stack Overflow Sep 14, 2018 by user3386109 • edited Sep 14, 2018 by user3386109
1

Assuming 8-bits per byte. 2's complement arch.

This is what I started with:

uint32_t Eight_To_Sixteen(const uint8_t source)
{

    uint32_t result = (source & 1);
    result |= ((source & 2) << 1);        
    result |= ((source & 4) << 2);
    result |= ((source & 8) << 3);
    result |= ((source & 0x10) << 4);
    result |= ((source & 0x20) << 5);
    result |= ((source & 0x40) << 6);
    result |= ((source & 0x80) << 7);
    return (result | (result << 1));
}

The above isn't bad. Notice how it folds the result over with itself to duplicate all the bits.

And then I started to compare with a 2-bit table lookup approach:

uint32_t Eight_To_Sixteen(const uint8_t source)
{
    static const int table[] = { 0, 3, 0xc, 0xf };
    uint32_t result = table[(source & 0x03)];
    result |= (table[(source & 0x0c)>>2]) << 4;
    result |= (table[(source & 0x30) >> 4]) << 8;
    result |= (table[(source & 0xc0) >> 6]) << 12;
    return result;
}

Still benchmarking, but the latter is generating far less code. You could expand it to have a larger lookup table and support wider inputs.

answered on Stack Overflow Sep 13, 2018 by selbie • edited Sep 14, 2018 by selbie

User contributions licensed under CC BY-SA 3.0