There are convenient traits, Reader and Writer, in the std::old_io module for reading and writing integer values in various endiannesses. But that module is declared obsolete, so I'm trying to figure out other ways to do that.
One way is to read the bytes and construct the result with bit arithmetic. Is there another way in the standard library? For example, to read a u64 from a &[u8] in which it is stored in big-endian order. What I would do in C is memcpy 8 bytes from a uint8_t array to a uint64_t value and then perform something like htons to swap bytes if necessary.
It is easy to convert an integer into a byte array or slice that can then be written to a file stream, using bit arithmetic as you mention. I am posting this so that people understand that the bit-shifting approach (which I use below and the original poster already mentioned) actually optimizes down to a single instruction, on x86_64 at least. That is exactly the same as the memcpy operation the original poster describes.
For example, take a look at this code:
// little endian
#[inline]
fn u16tou8ale(v: u16) -> [u8; 2] {
    [
        v as u8,
        (v >> 8) as u8,
    ]
}

// little endian
#[inline]
fn u32tou8ale(v: u32) -> [u8; 4] {
    [
        v as u8,
        (v >> 8) as u8,
        (v >> 16) as u8,
        (v >> 24) as u8,
    ]
}

// big endian
#[inline]
fn u32tou8abe(v: u32) -> [u8; 4] {
    [
        (v >> 24) as u8,
        (v >> 16) as u8,
        (v >> 8) as u8,
        v as u8,
    ]
}

fn main() {
    println!("{:?}", u32tou8ale(0x12345678)); // [120, 86, 52, 18]
    println!("{:?}", u32tou8abe(0x12345678)); // [18, 52, 86, 120]
}
With optimizations enabled, the function u32tou8ale, for example, turns into a single instruction that the CPU executes. That instruction creates the [u8; 4] array on the stack; even the big-endian version u32tou8abe is a single instruction to create the [u8; 4]. This is possible because of the optimizer. You may say that is only because the argument is a compile-time constant, but if you experiment you will find that, given a u32 value the compiler cannot know ahead of time, it still produces the array on the stack in essentially a single instruction by doing a memory copy. For example:
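fn main() {
    // Take the address of `main` itself as a raw pointer to a u32.
    let p = main as *const u32;
    // For demonstration only: this reads four bytes of machine code,
    // a value the compiler cannot know at compile time.
    unsafe {
        println!("{:?}", u32tou8ale(p.read_unaligned()));
    }
}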
This reads a value from the memory location referenced by the symbol main, which is our function. The compiler cannot know this value, so it issues a move instruction that reads the value onto the stack and then treats that value as a [u8; 4].
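A less fragile way to repeat this experiment, offered here as a sketch rather than what was used above, is to pass the value through std::hint::black_box, which asks the optimizer (on a best-effort basis) to treat the value as unknown. The helper names to_le_manual and to_be_manual are hypothetical stand-ins for u32tou8ale and u32tou8abe; the comparison against the standard library's to_le_bytes and to_be_bytes only checks that the hand-written shifts produce the same bytes.

```rust
use std::hint::black_box;

// Hypothetical stand-in for u32tou8ale: little-endian byte order.
#[inline]
fn to_le_manual(v: u32) -> [u8; 4] {
    [v as u8, (v >> 8) as u8, (v >> 16) as u8, (v >> 24) as u8]
}

// Hypothetical stand-in for u32tou8abe: big-endian byte order.
#[inline]
fn to_be_manual(v: u32) -> [u8; 4] {
    [(v >> 24) as u8, (v >> 16) as u8, (v >> 8) as u8, v as u8]
}

fn main() {
    // black_box discourages the compiler from constant-folding the value.
    let v: u32 = black_box(0x1234_5678);

    // The manual shifts agree with the standard library conversions.
    assert_eq!(to_le_manual(v), v.to_le_bytes());
    assert_eq!(to_be_manual(v), v.to_be_bytes());

    println!("{:?} {:?}", to_le_manual(v), to_be_manual(v));
}
```

Inspecting the optimized assembly of a build like this is one way to check the single-instruction claim on your own target.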
As for portability, simply always be explicit about the byte order in which you read and write the value, and everything will work out fine. For example, if you use u32tou8ale then you get little-endian byte order no matter what architecture you target, and if you write the equivalent read function and explicitly read in big-endian byte order, you can be sure that you will read in that ordering.
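The read direction asked about in the question (a big-endian u64 from a &[u8]) could be sketched as follows. The function names read_u64_be and read_u64_be_shifts are hypothetical; the first relies on the standard library's u64::from_be_bytes, the second spells out the same result with shifts. Both return None when fewer than 8 bytes are available, which is a design choice of this sketch and not something the thread prescribes.

```rust
// Hypothetical helper: reads a big-endian u64 from the first 8 bytes of `buf`.
fn read_u64_be(buf: &[u8]) -> Option<u64> {
    let mut bytes = [0u8; 8];
    // `get(..8)` yields None if the slice is too short.
    bytes.copy_from_slice(buf.get(..8)?);
    Some(u64::from_be_bytes(bytes))
}

// The same read written with explicit bit arithmetic:
// the first byte ends up in the most significant position.
fn read_u64_be_shifts(buf: &[u8]) -> Option<u64> {
    let bytes = buf.get(..8)?;
    Some(bytes.iter().fold(0u64, |acc, &b| (acc << 8) | u64::from(b)))
}

fn main() {
    // Eight big-endian bytes followed by one trailing byte that is ignored.
    let data: [u8; 9] = [0x00, 0x00, 0x00, 0x00, 0x12, 0x34, 0x56, 0x78, 0xff];

    assert_eq!(read_u64_be(&data), Some(0x1234_5678));
    assert_eq!(read_u64_be_shifts(&data), read_u64_be(&data));

    // Too few bytes: both variants report None.
    assert_eq!(read_u64_be(&data[..3]), None);
    assert_eq!(read_u64_be_shifts(&data[..3]), None);
}
```

Because the byte order is named in the function itself, the result is the same on little-endian and big-endian targets.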
I hope this helps anyone who comes here looking to convert integers to and from bytes!