I have to deserialize JSON blobs where in some places the absence of an entire object is encoded as an object with the same structure but all of its fields set to default values (empty strings and zeroes).
extern crate serde_json; // 1.0.27
#[macro_use] extern crate serde_derive; // 1.0.78
extern crate serde; // 1.0.78
#[derive(Debug, Deserialize)]
struct Test<T> {
    text: T,
    number: i32,
}
#[derive(Debug, Deserialize)]
struct Outer {
    test: Option<Test<String>>,
}
#[derive(Debug, Deserialize)]
enum Foo { Bar, Baz }
#[derive(Debug, Deserialize)]
struct Outer2 {
    test: Option<Test<Foo>>,
}
fn main() {
    println!("{:?}", serde_json::from_str::<Outer>(r#"{ "test": { "text": "abc", "number": 42 } }"#).unwrap());
    // good: Outer { test: Some(Test { text: "abc", number: 42 }) }
    println!("{:?}", serde_json::from_str::<Outer>(r#"{ "test": null }"#).unwrap());
    // good: Outer { test: None }
    println!("{:?}", serde_json::from_str::<Outer>(r#"{ "test": { "text": "", "number": 0 } }"#).unwrap());
    // bad: Outer { test: Some(Test { text: "", number: 0 }) }
    // should be: Outer { test: None }
    println!("{:?}", serde_json::from_str::<Outer2>(r#"{ "test": { "text": "Bar", "number": 42 } }"#).unwrap());
    // good: Outer2 { test: Some(Test { text: Bar, number: 42 }) }
    println!("{:?}", serde_json::from_str::<Outer2>(r#"{ "test": { "text": "", "number": 0 } }"#).unwrap());
    // bad: error
    // should be: Outer2 { test: None }
}
I would handle this after deserialization, but as you can see, that approach is not possible for enum values: no variant matches the empty string, so the deserialization fails entirely.
How can I teach this to serde?
There are two things that need to be solved here: replacing Some(value) with None if value is all defaults, and handling the empty string case for Foo.
The first thing is easy. The Deserialize implementation for Option unconditionally deserializes the field as Some if the input isn't null, so you need a custom deserialization function that replaces Some(value) with None if the value is equal to some sentinel, like the default (this is the answer proposed by Issac, but implemented correctly here):
use serde::{Deserialize, Deserializer};

fn none_if_all_default<'de, T, D>(deserializer: D) -> Result<Option<T>, D::Error>
where
    T: Deserialize<'de> + Default + Eq,
    D: Deserializer<'de>,
{
    // Deserialize as a normal Option first, then collapse an all-default value to None.
    Option::deserialize(deserializer).map(|opt| match opt {
        Some(value) if value == T::default() => None,
        opt => opt,
    })
}
#[derive(Deserialize)]
struct Outer<T: Eq + Default> {
    #[serde(deserialize_with = "none_if_all_default")]
    #[serde(bound(deserialize = "T: Deserialize<'de>"))]
    test: Option<Test<T>>,
}
This solves the first half of your problem, with Option<Test<String>>. It will work for any deserializable type that is Eq + Default, which means Test itself also has to derive Default, PartialEq and Eq for the comparison to compile.
The enum case is much trickier; the problem you're facing is that Foo simply won't deserialize from a string other than "Bar" or "Baz". I don't really see a good solution for this other than adding a third "dead" variant to the enum:
#[derive(PartialEq, Eq, Deserialize)]
enum Foo {
    Bar,
    Baz,
    #[serde(rename = "")]
    Absent,
}
impl Default for Foo {
    fn default() -> Self { Self::Absent }
}
From a data-modeling point of view, this problem exists because the deserializer has to account for the possibility that you'll get JSON like this:
{ "test": { "text": "", "number": 42 } }
In this case, clearly Outer { test: None } is not the correct result, but the deserializer still needs a value to store in Foo, or else it has to return a deserialization error.
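The interaction between the Absent variant and the all-defaults check can be tried out without serde. The following is only a minimal sketch: the types mirror the ones above minus the Deserialize derives, and none_if_default is a hypothetical helper that performs the same comparison none_if_all_default applies after deserializing.

```rust
// Serde-free sketch: same shapes as the answer's types, without the Deserialize derives.
#[derive(Debug, Default, PartialEq, Eq)]
struct Test<T> {
    text: T,
    number: i32,
}

#[derive(Debug, PartialEq, Eq)]
enum Foo {
    Bar,
    Baz,
    Absent, // stands in for the variant renamed to "" in the serde version
}

impl Default for Foo {
    fn default() -> Self {
        Foo::Absent
    }
}

// Hypothetical helper (not part of serde): drop a value that equals its type's default.
fn none_if_default<T: Default + PartialEq>(opt: Option<T>) -> Option<T> {
    opt.filter(|value| *value != T::default())
}

fn main() {
    // All fields default: collapses to None, for String and for Foo alike.
    println!("{:?}", none_if_default(Some(Test { text: String::new(), number: 0 })));
    println!("{:?}", none_if_default(Some(Test { text: Foo::Absent, number: 0 })));
    // Only some fields default: the value is kept, Absent included.
    println!("{:?}", none_if_default(Some(Test { text: Foo::Absent, number: 42 })));
    println!("{:?}", none_if_default(Some(Test { text: Foo::Bar, number: 0 })));
    println!("{:?}", none_if_default(Some(Test { text: Foo::Baz, number: 7 })));
}
```

The third case is the one the Absent variant exists for: the object is not all defaults, so it stays Some, and Foo needs some value to hold for the empty string.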
If you want "" to be valid text only if number is 0, you could do something significantly more elaborate, and probably overkill for your needs, compared to just using Absent. You'd need to use an untagged enum, which can store either a "valid" Test or an "all empty" Test, and then create a version of your struct that only deserializes from default values:
use std::marker::PhantomData;
use serde::de::Error; // brings Error::custom into scope
use serde::{Deserialize, Deserializer};

struct MustBeDefault<T> {
    marker: PhantomData<T>,
}
impl<'de, T> Deserialize<'de> for MustBeDefault<T>
where
    T: Deserialize<'de> + Eq + Default,
{
    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
    where
        D: Deserializer<'de>,
    {
        // Accept the input only if it deserializes to T's default value.
        if T::deserialize(deserializer)? == T::default() {
            Ok(MustBeDefault { marker: PhantomData })
        } else {
            Err(D::Error::custom("value must be default"))
        }
    }
}
// All fields need to be generic in order to use this solution.
// Like I said, this is radically overkill.
#[derive(Deserialize)]
struct Test<T, U> {
    text: T,
    number: U,
}
// Untagged variants are tried in order, so the all-default shape goes first.
#[derive(Deserialize)]
#[serde(untagged)]
enum MaybeDefaultedTest<T> {
    AllDefault(Test<EmptyString, MustBeDefault<i32>>),
    Normal(Test<T, i32>),
}
// `EmptyString` is a type that only deserializes from empty strings;
// its implementation is left as an exercise to the reader.
// You'll also need to convert from MaybeDefaultedTest<T> to Option<T>;
// this is also left as an exercise to the reader.
It is now possible to write MaybeDefaultedTest<Foo>, which will deserialize from things like {"text": "", "number": 0} or {"text": "Baz", "number": 10} or {"text": "Baz", "number": 0}, but will fail to deserialize from {"text": "", "number": 10}.
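The conversion left as an exercise above might look roughly like the following. This is a sketch rather than a definitive implementation: it leaves serde out entirely, reduces EmptyString and MustBeDefault to plain marker types, and the method name into_option is made up for illustration.

```rust
use std::marker::PhantomData;

// Serde-free stand-ins for the answer's types; only the shapes matter here.
#[derive(Debug, PartialEq)]
struct EmptyString; // assumed to be produced only from ""

#[derive(Debug, PartialEq)]
struct MustBeDefault<T> {
    marker: PhantomData<T>,
}

#[derive(Debug, PartialEq)]
struct Test<T, U> {
    text: T,
    number: U,
}

#[derive(Debug, PartialEq)]
enum Foo {
    Bar,
    Baz,
}

#[derive(Debug, PartialEq)]
enum MaybeDefaultedTest<T> {
    AllDefault(Test<EmptyString, MustBeDefault<i32>>),
    Normal(Test<T, i32>),
}

impl<T> MaybeDefaultedTest<T> {
    // Hypothetical name: an all-default object becomes None, anything else is kept.
    fn into_option(self) -> Option<Test<T, i32>> {
        match self {
            MaybeDefaultedTest::AllDefault(_) => None,
            MaybeDefaultedTest::Normal(test) => Some(test),
        }
    }
}

fn main() {
    let absent: MaybeDefaultedTest<Foo> = MaybeDefaultedTest::AllDefault(Test {
        text: EmptyString,
        number: MustBeDefault { marker: PhantomData },
    });
    println!("{:?}", absent.into_option());

    let present = MaybeDefaultedTest::Normal(Test { text: Foo::Baz, number: 10 });
    println!("{:?}", present.into_option());

    let zero = MaybeDefaultedTest::Normal(Test { text: Foo::Bar, number: 0 });
    println!("{:?}", zero.into_option());
}
```

In the serde version, the same match could run inside a deserialize_with function so that the outer struct still exposes a plain Option field.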
Again, for the third time, this solution is probably radically overkill (especially if your real-world use case involves more than 2 fields in the Test struct), and so unless you have very intense data modeling requirements, you should go with adding an Absent variant to Foo.