How to convert an array of char to array of int

0

Is there a way to convert an array of char to an array of int32 without iterating over every member?

I need to convert a huge amount of data, so I'm looking for something faster than:

char ac[1000000];
int32_t ai[1000000];
for(int i=0;i<1000000;i++) {
    ai[i]=ac[i];
}

Notes:

  • I'm not asking how to convert '3' to 0x03; I mean converting 0x03 (one byte) to 0x00000003 (four bytes).
  • Portability is not a concern (the platform is Linux on AMD64).
  • The loop above is too slow.
  • I'm looking for a library that uses SSE instructions or similar. I need to feed a math function that works with int32 numbers, and my original data is 8 bits wide, so I have to convert it. Obviously I can't just cast, because this is a memory area, not a single value.
c
asked on Stack Overflow Jul 7, 2017 by Mquinteiro • edited Jul 7, 2017 by Mquinteiro

3 Answers

3

Is there a way to convert an array of char to an array of int without iterating over every member?

No.

To explain (I assumed it would be obvious): an int has a different size than a char, so no block copy can help you. One way or another, you have to touch each element.

There might be solutions parallelizing this, e.g. by partitioning the array and using threads to handle the parts. But you would still have to convert each and every element.
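
The partitioning idea could be sketched with OpenMP. This is only an illustration under assumptions: convert_parallel is a made-up name, and the code needs a compiler with OpenMP enabled (for example GCC with -fopenmp). Without that flag the pragma is ignored and the loop simply runs serially.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: widen n chars into n int32_t values.
 * With OpenMP enabled, the index range is split into contiguous
 * slices and each thread converts one slice. */
void convert_parallel(const char *src, int32_t *dst, size_t n)
{
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < (long)n; i++) {
        dst[i] = src[i];   /* each element is still converted individually */
    }
}
```

Whether this is actually faster depends on the machine; a loop this simple may be limited by memory bandwidth rather than by the CPU, so it is worth measuring.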


Regarding your edit:

convert 0x03 (one byte) to 0x0003 (two bytes)
[...] not portable is not a problem (platform linux AMD64)

There seems to be another misconception: int on Linux x86_64 has four bytes, not two. If you really need two bytes per input value, you should use int16_t.


And yet another remark: SIMD instructions (like in SSE2) won't change this fundamentally either. Their unpack and widening operations can convert several elements per instruction, but every element is still converted. Beyond that, the only "optimization" I can see is parallelizing. There's no way around having to touch each element.
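
For what it's worth, x86 does have instructions that widen packed bytes. Below is a minimal sketch, assuming signed 8-bit input and a GCC- or Clang-style compiler; widen_i8_to_i32 is a name made up for illustration. It uses only SSE2, which every x86-64 CPU supports, and falls back to the plain loop on other targets. SSE4.1's _mm_cvtepi8_epi32 would do the sign extension more directly.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: sign-extend n signed bytes into n int32_t values. */
void widen_i8_to_i32(const signed char *src, int32_t *dst, size_t n)
{
    size_t i = 0;
#if defined(__SSE2__)
    for (; i + 16 <= n; i += 16) {
        __m128i v    = _mm_loadu_si128((const __m128i *)(src + i));
        /* 0xFF in every byte lane where the input byte is negative */
        __m128i sign = _mm_cmpgt_epi8(_mm_setzero_si128(), v);
        /* Interleave value and sign bytes: 8-bit -> 16-bit lanes */
        __m128i lo16 = _mm_unpacklo_epi8(v, sign);
        __m128i hi16 = _mm_unpackhi_epi8(v, sign);
        /* 0xFFFF in every 16-bit lane that is negative */
        __m128i slo  = _mm_srai_epi16(lo16, 15);
        __m128i shi  = _mm_srai_epi16(hi16, 15);
        /* Interleave again: 16-bit -> 32-bit lanes, four ints per store */
        _mm_storeu_si128((__m128i *)(dst + i),      _mm_unpacklo_epi16(lo16, slo));
        _mm_storeu_si128((__m128i *)(dst + i + 4),  _mm_unpackhi_epi16(lo16, slo));
        _mm_storeu_si128((__m128i *)(dst + i + 8),  _mm_unpacklo_epi16(hi16, shi));
        _mm_storeu_si128((__m128i *)(dst + i + 12), _mm_unpackhi_epi16(hi16, shi));
    }
#endif
    /* Scalar tail, and the whole job on targets without SSE2 */
    for (; i < n; i++) {
        dst[i] = src[i];
    }
}
```

Every element is still converted, just 16 per loop iteration. An optimizing compiler may already vectorize the plain loop in a similar way, so measure before assuming the hand-written version wins.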

answered on Stack Overflow Jul 7, 2017 by (unknown user) • edited Jul 7, 2017 by (unknown user)
1

Not sure if this will be faster, you'll have to measure (it also depends on sizeof(int) == 4 and on a little-endian machine such as x86-64):

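// note: untested; assumes sizeof(int) == 4 and a little-endian target
#include <string.h>

char ac[1000000];
int ai[1000000];
memset(ai, 0, sizeof ai); // zero every int first; this should be very fast
unsigned char *d = (unsigned char *)ai; // on little-endian, the low byte of each int comes first
for (int i = 0; i < 1000000; i++) {
    *d = (unsigned char)ac[i]; // zero-extends: values end up in 0..255, unlike ai[i]=ac[i] with signed char
    d += 4; // go to the low byte of the next int
}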
answered on Stack Overflow Jul 7, 2017 by Noel Lopes
1

Are you sure you need ints?

If not, you could do

char ac[1000000];
uint8_t *ai = (uint8_t*)ac;

If the reason you want them to be ints is that a function takes an int as an argument and you need to pass some of the array's values to it, then there is no problem with this method, as they will be implicitly converted to int.

I think that converting them to ints just makes you use more memory than you actually need.
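
As a small illustration of that implicit conversion (twice and twice_of_element are hypothetical names, not part of the question):

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical consumer that expects an int. */
int twice(int x)
{
    return 2 * x;
}

/* Reads one element of the char array and passes it on as an int. */
int twice_of_element(const char *ac, size_t idx)
{
    /* View the same memory as unsigned bytes; nothing is copied. */
    const uint8_t *bytes = (const uint8_t *)ac;
    /* The uint8_t is promoted to int at the call, giving a value in 0..255. */
    return twice(bytes[idx]);
}
```

Note that going through uint8_t treats every byte as unsigned, whereas a plain char may be signed on the target, so the two views differ for bytes of 0x80 and above.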

EDIT:

If you actually DO need an int array, here is a workaround that does not store more than it needs.

You can write a helper function that extracts the char value, converted to an int, from the reinterpreted array, so that it behaves like an int array.

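// Byte-order test: returns true on a big-endian machine
bool endianness() {
    const int one = 1;
    return *reinterpret_cast<const unsigned char*>(&one) == 0;
}

// Get the char value at position idx, converted to int, from the int array
int getVal(const int *i, int idx) {
    int iidx = idx / sizeof(int);
    int rem = idx % sizeof(int);
    // On big-endian machines the first byte in memory is the most significant one
    if(endianness()) rem = sizeof(int) - rem - 1;
    // Shift as unsigned so the top byte is extracted without sign problems
    return (static_cast<unsigned>(i[iidx]) >> 8*rem) & 0xff;
}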

To use it, cast the char array to an int pointer and pass it to the helper, like so:

char ac[1000000];
int *ai = (int*)ac;
cout << getVal(ai, 0);

This prints the value of the first element converted to an int, and works regardless of the machine's byte order.

answered on Stack Overflow Jul 7, 2017 by Garmekain • edited Jul 7, 2017 by Garmekain

User contributions licensed under CC BY-SA 3.0