
How do I use Serde to deserialize structs with references from a reader?

I have these structs:

#[derive(Debug, Serialize, Deserialize)]
pub struct GGConf<'a> {
    #[serde(alias = "ssh")]
    #[serde(rename = "ssh")]
    #[serde(default)]
    #[serde(borrow)]
    pub ssh_config: Option<SSHConfig<'a>>,
}

#[derive(Debug, Serialize, Deserialize)]
pub struct SSHConfig<'a> {
    #[serde(alias = "privateKey")]
    #[serde(rename = "privateKey")]
    private_key: &'a str,

    username: &'a str,
}

Deserialization happens when I read from a YAML file:

let mut config: GGConf = serde_yaml::from_reader(file)?;

On compiling, I get an error:

error: implementation of `conf::_IMPL_DESERIALIZE_FOR_GGConf::_serde::Deserialize` is not general enough
   --> src/conf.rs:50:34
    |
50  |           let mut config: GGConf = serde_yaml::from_reader(file)?;
    |                                    ^^^^^^^^^^^^^^^^^^^^^^^ implementation of `conf::_IMPL_DESERIALIZE_FOR_GGConf::_serde::Deserialize` is not general enough
    |
   ::: /home/ninan/.cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.98/src/de/mod.rs:524:1
    |
524 | / pub trait Deserialize<'de>: Sized {
525 | |     /// Deserialize this value from the given Serde deserializer.
526 | |     ///
527 | |     /// See the [Implementing `Deserialize`][impl-deserialize] section of the
...   |
562 | |     }
563 | | }
    | |_- trait `conf::_IMPL_DESERIALIZE_FOR_GGConf::_serde::Deserialize` defined here
    |
    = note: `conf::GGConf<'_>` must implement `conf::_IMPL_DESERIALIZE_FOR_GGConf::_serde::Deserialize<'0>`, for any lifetime `'0`...
    = note: ...but `conf::GGConf<'_>` actually implements `conf::_IMPL_DESERIALIZE_FOR_GGConf::_serde::Deserialize<'1>`, for some specific lifetime `'1`

I vaguely understand that Serde deserialization has its own lifetime, 'de, and that the compiler is confusing it with the lifetime I specified? Please correct me if I'm wrong.

How do I correctly deserialize the YAML into both structs? Is there something I am missing or have misunderstood?

I looked at How do I resolve "implementation of serde::Deserialize is not general enough" with actix-web's Json type?, but I cannot use an owned type. I need it to be a borrowed type.

I will try and write a playground example for this.

asked Mar 22 '20 15:03 by leoOrion


2 Answers

This is impossible with from_reader; you must use owned data instead of references.

Here's a minimal example:

use serde::Deserialize; // 1.0.104

#[derive(Debug, Deserialize)]
pub struct SshConfig<'a> {
    username: &'a str,
}

fn example(file: impl std::io::Read) {
    serde_yaml::from_reader::<_, SshConfig>(file);
}

error: implementation of `_IMPL_DESERIALIZE_FOR_SshConfig::_serde::Deserialize` is not general enough
   --> src/lib.rs:9:5
    |
9   |       serde_yaml::from_reader::<_, SshConfig>(file);
    |       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ implementation of `_IMPL_DESERIALIZE_FOR_SshConfig::_serde::Deserialize` is not general enough
    | 
   ::: /playground/.cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.104/src/de/mod.rs:531:1
    |
531 | / pub trait Deserialize<'de>: Sized {
532 | |     /// Deserialize this value from the given Serde deserializer.
533 | |     ///
534 | |     /// See the [Implementing `Deserialize`][impl-deserialize] section of the
...   |
569 | |     }
570 | | }
    | |_- trait `_IMPL_DESERIALIZE_FOR_SshConfig::_serde::Deserialize` defined here
    |
    = note: `SshConfig<'_>` must implement `_IMPL_DESERIALIZE_FOR_SshConfig::_serde::Deserialize<'0>`, for any lifetime `'0`...
    = note: ...but `SshConfig<'_>` actually implements `_IMPL_DESERIALIZE_FOR_SshConfig::_serde::Deserialize<'1>`, for some specific lifetime `'1`

If you look at the definition of serde_yaml::from_reader, you'll see that it's limited to only deserializing owned data:

pub fn from_reader<R, T>(rdr: R) -> Result<T>
where
    R: Read,
    T: DeserializeOwned,
//     ^^^^^^^^^^^^^^^^ 

The same is true for serde_json::from_reader and probably any equivalent function.
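To make the "not general enough" wording less mysterious: serde defines DeserializeOwned as a trait bounded by for<'de> Deserialize<'de>, i.e. the type must be deserializable for every lifetime, not just one. Here is a std-only sketch of that idea. The names Parse, ParseOwned, Username, from_reader_like and from_str_like are toy stand-ins I made up for illustration; they are not serde APIs.

```rust
use std::io::{self, Read};

// Toy stand-in for serde's Deserialize<'de> (hypothetical, heavily simplified).
trait Parse<'a>: Sized {
    fn parse(input: &'a str) -> Self;
}

// Mirrors the shape of DeserializeOwned: must implement Parse<'a> for EVERY 'a.
trait ParseOwned: for<'a> Parse<'a> {}
impl<T> ParseOwned for T where T: for<'a> Parse<'a> {}

// An owned type implements Parse<'a> for any lifetime, so it is ParseOwned.
impl<'a> Parse<'a> for String {
    fn parse(input: &'a str) -> Self {
        input.trim().to_owned()
    }
}

// A borrowing type implements Parse<'a> only for the one lifetime it borrows.
struct Username<'a>(&'a str);

impl<'a> Parse<'a> for Username<'a> {
    fn parse(input: &'a str) -> Self {
        Username(input.trim())
    }
}

// Shaped like from_reader: the buffer is local, so T must not borrow from it.
fn from_reader_like<T: ParseOwned>(mut reader: impl Read) -> io::Result<T> {
    let mut buf = String::new();
    reader.read_to_string(&mut buf)?;
    Ok(T::parse(&buf))
}

// Shaped like serde_json::from_str: T may borrow from the caller's input.
fn from_str_like<'a, T: Parse<'a>>(input: &'a str) -> T {
    T::parse(input)
}

fn main() -> io::Result<()> {
    let owned: String = from_reader_like(&b" alice \n"[..])?;
    println!("{}", owned);

    let text = String::from(" bob \n");
    let borrowed: Username = from_str_like(&text);
    println!("{}", borrowed.0);

    // Asking from_reader_like for a Username should be rejected by the
    // compiler, much like the error in the question:
    // let bad: Username = from_reader_like(&b" carol \n"[..])?;
    Ok(())
}
```

Username only implements Parse for the specific lifetime of the string it points into, which is the "some specific lifetime" half of the error message; the bound on from_reader_like asks for "any lifetime".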

You can only deserialize a type containing references when there's data to reference. Something implementing the Read trait only guarantees that it can copy some bytes into a user-provided buffer. Since the from_reader function doesn't accept that buffer as an argument, any buffer would be destroyed at the exit of from_reader, invalidating the references.

See also:

  • How do I resolve "implementation of serde::Deserialize is not general enough" with actix-web's Json type?

If you must use references (and in many cases this isn't true), you will need to:

  1. read from the reader yourself into a buffer
  2. use from_str instead of from_reader
  3. keep the buffer around as long as the deserialized data

answered Oct 22 '22 18:10 by Shepmaster


from_reader takes a stream of data from somewhere (anything that implements the Read trait). It doesn't store the data, so nothing owns it, and therefore your struct cannot hold a reference to it. In other words, from_reader consumes a transient stream, so the deserialized values need somewhere of their own to store the data.

An additional complication is that serde_yaml (at least as of version 0.8.11) does not support zero-copy deserialization:
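To see why the owner matters, here is a small std-only sketch with no serde involved. The helper first_line_from_reader is hypothetical; the point is only that a reader copies bytes into a buffer, and a reference can be handed back to the caller only if the caller owns that buffer.

```rust
use std::io::{self, Read};

// Hypothetical helper, not a serde function. Because the caller supplies
// `buf`, the returned slice can borrow from it and outlive this call.
fn first_line_from_reader<'a>(
    mut reader: impl Read,
    buf: &'a mut String,
) -> io::Result<&'a str> {
    // The reader only copies its bytes into the buffer we give it.
    reader.read_to_string(buf)?;
    // The result points into `buf`, which the caller keeps alive.
    Ok(buf.lines().next().unwrap_or(""))
}

fn main() -> io::Result<()> {
    let input: &[u8] = b"username: alice\nport: 22\n";
    let mut buf = String::new();
    let line = first_line_from_reader(input, &mut buf)?;
    println!("{}", line);
    Ok(())
}
```

If the String were created inside the function instead, it would be dropped on return and there would be nothing left for the &str to point at. That is the situation from_reader is in.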

https://docs.rs/serde_yaml/0.8.11/serde_yaml/fn.from_str.html

pub fn from_str<T>(s: &str) -> Result<T> where
    T: DeserializeOwned,

...

YAML currently does not support zero-copy deserialization.

Compare this to, say, serde_json, which does:

https://docs.rs/serde_json/1.0.50/serde_json/de/fn.from_str.html

pub fn from_str<'a, T>(s: &'a str) -> Result<T> where
    T: Deserialize<'a>,

So, at least with something like serde_json, you could call from_str on an owned buffer, which allows you to use references in your struct (but this will not currently work for serde_yaml):

// Written with rustc 1.42.0 and
// [dependencies]
// serde = "1.0.105"
// serde_derive = "1.0.105"
// serde_json = "1.0.50"

use std::io::Read;
use serde_derive::Deserialize;

#[derive(Debug, Deserialize)]
pub struct SshConfig<'a> {
    username: &'a str,
}

fn main() {
    // Open file handle
    let mut file = std::fs::File::open("example.json").unwrap();

    // Read the data into a String, which stores (and thus owns) the data
    let mut strbuf = String::new();
    file.read_to_string(&mut strbuf).unwrap();

    // Deserialize into the struct, which holds references into `strbuf`
    let result: SshConfig = serde_json::from_str(&strbuf).unwrap();
    println!("{:?}", result.username);

    // Note that `result` is only valid as long as `strbuf` exists,
    // i.e., if `strbuf` goes out of scope or is moved into another function, we get an error. For example, the following would cause an error:
    // std::mem::drop(strbuf); // Function which moves strbuf rather than taking a reference
    // println!("{:?}", result.username); // Error
}

Depending on exactly what your concerns are, this might be less efficient than storing a String in your struct. For example, if example.json is 1 MB and you only extract a single field, the above code keeps the entire 1 MB string in memory just to make a few bytes of text accessible.
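If that trade-off matters, one option is to copy just the field you need into an owned String and then let the buffer go. A minimal std-only sketch, assuming the borrowed value has already been produced somehow; BorrowedConfig, OwnedConfig and to_owned_config are names I made up for illustration.

```rust
// Borrowed view: only valid while the source buffer is alive.
struct BorrowedConfig<'a> {
    username: &'a str,
}

// Owned copy: independent of the buffer it was extracted from.
struct OwnedConfig {
    username: String,
}

impl BorrowedConfig<'_> {
    // Copies the borrowed field into a freshly allocated String.
    fn to_owned_config(&self) -> OwnedConfig {
        OwnedConfig {
            username: self.username.to_owned(),
        }
    }
}

fn main() {
    // Stands in for a large file that was read into memory.
    let buffer = String::from("alice, followed by lots of unrelated data");
    let borrowed = BorrowedConfig { username: &buffer[..5] };

    let owned = borrowed.to_owned_config();
    // The big buffer can be freed now; `borrowed` must not be used after this.
    drop(buffer);

    println!("{}", owned.username);
}
```

This costs one small allocation per field, but the memory for the whole input is released as soon as the buffer is dropped.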

answered Oct 22 '22 20:10 by dbr