I can think of some nasty inefficient ways to accomplish this task, but I'm wondering what the best way is.
For example, I want to copy 10 bytes starting at the 3rd bit of a byte and write them to a destination pointer as usual.
Is there a better way than to copy one shifted byte at a time?
Thanks
Unaligned memory accesses occur when you try to read N bytes of data starting from an address that is not evenly divisible by N (i.e. addr % N != 0). For example, reading 4 bytes of data from address 0x10004 is fine, but reading 4 bytes of data from address 0x10005 would be an unaligned memory access.
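The divisibility rule above can be checked directly in C. This is only a minimal sketch: the helper name `is_aligned` is made up for illustration, and it assumes that converting a pointer to `uintptr_t` yields the numeric address, which holds on common flat-memory platforms but is implementation-defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if addr is a multiple of n, else 0.
 * Assumes the uintptr_t value of a pointer is its numeric address. */
static int is_aligned(const void *addr, size_t n)
{
    assert(n != 0);                      /* alignment of 0 is meaningless */
    return ((uintptr_t)addr % n) == 0;
}
```

With the addresses from the example, `is_aligned` would report 0x10004 as aligned for a 4-byte read and 0x10005 as unaligned.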
General Byte Alignment Rules: Structures between 5 and 8 bytes of data should be padded so that the total structure is 8 bytes. Structures between 9 and 16 bytes of data should be padded so that the total structure is 16 bytes. Structures greater than 16 bytes should be padded to a 16-byte boundary.
First, we must think of main memory as a contiguous block of consecutive memory locations. Each location contains a fixed number of bits. The data that these bits represent can be accessed through the location's address. Thus an address denotes the smallest unit of memory that can be manipulated directly.
In summary, if functions such as memset() and memcpy() are used to access Device memory, the pointers passed to them must be aligned addresses.
The general approach is to read the source buffer as efficiently as possible, and shift it as required on the way to writing the destination buffer.
You don't have to do byte operations. You can always get the source reads long-aligned for the bulk of the operation by handling up to three bytes at the beginning separately, and by handling the end the same way, since you shouldn't attempt to read past the stated source buffer length.
From the values read, you shift as required to get the bit alignment desired and assemble finished bytes for writing to the destination. You can also do the same optimization of writes to the widest aligned word size you can.
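The shift-and-assemble step described above could be sketched in C roughly as follows. This is a sketch under stated assumptions, not a tuned implementation: the names `copy_from_bit_offset`, `load_be32` and `store_be32` are made up for illustration, bits are assumed to be numbered MSB-first within each byte, and the source must have one extra readable byte when the offset is non-zero. It moves four bytes per iteration instead of one, but it does not first align the source pointer as suggested above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assemble/store a 32-bit value in big-endian byte order (hypothetical helpers). */
static uint32_t load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

static void store_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/*
 * Copy n bytes (8*n bits) starting bit_offset bits into src[0].
 * Assumption: bit 0 is the most significant bit of a byte.
 * Precondition: when bit_offset != 0, src[0..n] (n + 1 bytes) is readable.
 */
static void copy_from_bit_offset(uint8_t *dst, const uint8_t *src,
                                 unsigned bit_offset, size_t n)
{
    size_t i = 0;

    assert(bit_offset < 8);
    if (bit_offset == 0) {               /* byte-aligned: plain copy */
        memcpy(dst, src, n);
        return;
    }

    /* Bulk: build a 40-bit window from 5 source bytes, emit 4 output bytes. */
    while (n - i >= 4) {
        uint64_t window = ((uint64_t)load_be32(src + i) << 8) | src[i + 4];
        store_be32(dst + i, (uint32_t)(window >> (8 - bit_offset)));
        i += 4;
    }

    /* Tail: remaining 0..3 bytes, one shifted byte at a time. */
    for (; i < n; i++) {
        dst[i] = (uint8_t)((src[i] << bit_offset) |
                           (src[i + 1] >> (8 - bit_offset)));
    }
}
```

For the 10-byte case in the question, a call such as `copy_from_bit_offset(dst, src, 3, 10)` would read 11 source bytes. Whether "the 3rd bit" means an offset of 2 or 3 depends on how the bits are counted, so that value is a guess.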
If you dig around in the source of a compression tool or library that makes extensive use of variable-width tokens (zlib, MPEG, TIFF, and JPEG all leap to mind), you will likely find sample code that treats an input or output buffer as a stream of bits, which should give you some implementation ideas to think about.
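To give a flavour of the "buffer as a stream of bits" idea, here is a minimal sketch of a bit reader in C. It is not taken from any of the libraries mentioned: the type `bit_reader` and the functions `br_init` and `br_read` are hypothetical names, and the MSB-first bit order is an assumption (some formats number bits from the least significant end instead).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical MSB-first bit reader over a byte buffer. */
typedef struct {
    const uint8_t *data;   /* source buffer */
    size_t size;           /* buffer length in bytes */
    size_t pos;            /* index of the next byte to load */
    uint64_t acc;          /* accumulator; low `count` bits are unread */
    unsigned count;        /* number of unread bits held in acc */
} bit_reader;

static void br_init(bit_reader *br, const uint8_t *data, size_t size)
{
    br->data = data;
    br->size = size;
    br->pos = 0;
    br->acc = 0;
    br->count = 0;
}

/* Read nbits (1..32) into *out. Returns 0 on success, -1 if the
 * buffer does not contain enough remaining bits. */
static int br_read(bit_reader *br, unsigned nbits, uint32_t *out)
{
    assert(nbits >= 1 && nbits <= 32);

    /* Refill a byte at a time until enough bits are available. */
    while (br->count < nbits && br->pos < br->size) {
        br->acc = (br->acc << 8) | br->data[br->pos++];
        br->count += 8;
    }
    if (br->count < nbits)
        return -1;

    /* Take the top nbits of the unread bits and mask off older bits. */
    br->count -= nbits;
    *out = (uint32_t)((br->acc >> br->count) & ((UINT64_C(1) << nbits) - 1));
    return 0;
}
```

A copy from an arbitrary bit position could then be written as one `br_read` to skip the leading bits followed by repeated 8-bit or 32-bit reads, at the cost of some overhead compared with a dedicated shifting loop.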