I have the following problem: I have a have a data structure that is parsed from a buffer and contains some references into this buffer, so the parsing function looks something like
fn parse_bar<'a>(buf: &'a [u8]) -> Bar<'a>
So far, so good. However, to avoid certain lifetime issues I'd like to put the data structure and the underlying buffer into a struct as follows:
struct BarWithBuf<'a> {bar: Bar<'a>, buf: Box<[u8]>}
// not even sure if these lifetime annotations here make sense,
// but it won't compile unless I add some lifetime to Bar
However, now I don't know how to actually construct a BarWithBuf
value.
fn make_bar_with_buf<'a>(buf: Box<[u8]>) -> BarWithBuf<'a> {
let my_bar = parse_bar(&*buf);
BarWithBuf {buf: buf, bar: my_bar}
}
doesn't work, since buf
is moved in the construction of the BarWithBuf value, but we borrowed it for parsing.
I feel like it should be possible to do something along the lines of
fn make_bar_with_buf<'a>(buf: Box<[u8]>) -> BarWithBuf<'a> {
let mut bwb = BarWithBuf {buf: buf};
bwb.bar = parse_bar(&*bwb.buf);
bwb
}
to avoid moving the buffer after parsing the Bar
, but I can't do that because the whole BarWithBuf
struct has to be initalised in one go.
Now I suspect that I could use unsafe
code to partially construct the struct, but I'd rather not do that.
What would be the best way to solve this problem? Do I need unsafe code? If I do, would it be safe do to this here? Or am I completely on the wrong track here and there is a better way to tie a data structure and its underlying buffer together?
PySpark PySpark StructType & StructField classes are used to programmatically specify the schema to the DataFrame and creating complex columns like nested struct, array and map columns. StructType is a collection of StructField’s that defines column name, column data type, boolean to specify if the field can be nullable or not and metadata.
The steps we have to follow are these: Iterate through the schema of the nested Struct and make the changes we want Create a JSON version of the root level field, in our case groups, and name it for example groups_json and drop groups Then convert the groups_json field to groups again using the modified schema we created in step 1.
So, if the field wasn’t nested we could easily just cast it to string. but since it’s nested this doesn’t work. The following command works only for root-level fields, so it could work if we wanted to convert the whole groups field, or move programs at the root level After a lot of research and many different tries.
After a lot of research and many different tries. I realized that if we want to change the type, edit, rename, add or remove a nested field we need to modify the schema. The steps we have to follow are these: Iterate through the schema of the nested Struct and make the changes we want
I think you're right in that it's not possible to do this without unsafe code. I would consider the following two options:
Change the reference in Bar
to an index. The contents of the box won't be protected by a borrow, so the index might become invalid if you're not careful. However, an index might convey the meaning of the reference in a clearer way.
Move Box<[u8]>
into Bar
, and add a function buf() -> &[u8]
to the implementation of Bar
; instead of references, store indices in Bar
. Now Bar
is the owner of the buffer, so it can control its modification and keep the indices valid (thereby avoiding the problem of option #1).
As per DK's suggestion below, store indices in BarWithBuf
(or in a helper struct BarInternal
) and add a function fn bar(&self) -> Bar
to the implementation of BarWithBuf
, which constructs a Bar
on-the-fly.
Which of these options is the most appropriate one depends on the actual problem context. I agree that some form of "member-by-member construction" of structs would be immensely helpful in Rust.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With