
What is the idiomatic way to encode an iterator with serde_json?

I'm trying to drain() a vec in Rust and encode the results as a JSON string. What's the best, idiomatic way to do this?

#![feature(custom_derive, plugin)]
#![plugin(serde_macros)]

extern crate serde;
extern crate serde_json;

#[derive(Serialize, Deserialize, Debug)]
struct Point {
    x: i32,
    y: i32,
}

impl Point {
    pub fn new(x: i32, y: i32) -> Point {
        Point {
            x: x,
            y: y
        }
    }
}

fn main() {
    let mut points = vec![Point::new(1,2), Point::new(-2,-1), Point::new(0, 0)];
    let mut drain = points.drain(..);

    println!("{}", serde_json::to_string(&drain).unwrap());
}
mikeycgto asked Dec 21 '15 15:12



1 Answer

Draining iterators are an interesting beast. They allow you to chunk out a part of a collection, taking ownership of some but not necessarily all of the items in the collection. They also allow you to do this in a reasonably efficient manner. For example, a vector could move the trailing data en masse with a single memcpy.
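As a minimal sketch of the "some but not necessarily all" point, using only the standard library: draining a sub-range moves those items out and leaves the rest in the vector. The helper name take_range is made up for this illustration.

```rust
use std::ops::Range;

// Hypothetical helper: moves the items in `range` out of `v` and returns them.
// The items after the range are shifted down to close the gap once the
// `Drain` iterator is dropped.
fn take_range(v: &mut Vec<String>, range: Range<usize>) -> Vec<String> {
    v.drain(range).collect()
}

fn main() {
    let mut names = vec![
        "a".to_string(),
        "b".to_string(),
        "c".to_string(),
        "d".to_string(),
    ];
    let taken = take_range(&mut names, 1..3);

    println!("taken: {:?}", taken);     // taken: ["b", "c"]
    println!("remaining: {:?}", names); // remaining: ["a", "d"]
}
```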

However, serde doesn't natively support serializing iterators (for a good reason, keep reading). You can look at the implementors of the Serialize trait to see the types of things it supports.

You'd have to implement this yourself:

use serde::{Deserialize, Serialize}; // 1.0.101
use std::{cell::RefCell, vec};

struct DrainIteratorAdapter<'a, T>(RefCell<vec::Drain<'a, T>>);

impl<'a, T: 'a> serde::Serialize for DrainIteratorAdapter<'a, T>
where
    T: serde::Serialize,
{
    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
    where
        S: serde::Serializer,
    {
        serializer.collect_seq(self.0.borrow_mut().by_ref())
    }
}

fn main() {
    let mut points = vec![Point::new(1, 2), Point::new(-2, -1), Point::new(0, 0)];
    let adapter = DrainIteratorAdapter(RefCell::new(points.drain(..)));

    println!("{}", serde_json::to_string(&adapter).unwrap());
}

The hard part is that serialization is not supposed to have any side effects, which is why Serialize::serialize takes &self. This is a very reasonable decision. However, whenever you call next on an iterator, you have to mutate it in order to update its state. To reconcile these two mismatched concepts, we have to use interior mutability, in the form of something like a RefCell.
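The mismatch can be seen without serde at all. The following is a sketch only: SharedIter and next_item are hypothetical names standing in for the adapter and for a method that, like serialize, receives only &self.

```rust
use std::cell::RefCell;

// Hypothetical wrapper, standing in for the adapter above.
struct SharedIter<I>(RefCell<I>);

impl<I: Iterator> SharedIter<I> {
    // Takes `&self`, yet still advances the iterator: `borrow_mut` hands out
    // a mutable borrow that is checked at run time instead of compile time.
    fn next_item(&self) -> Option<I::Item> {
        self.0.borrow_mut().next()
    }
}

fn main() {
    // Note that `shared` is not declared `mut`.
    let shared = SharedIter(RefCell::new(vec![10, 20].into_iter()));

    println!("{:?}", shared.next_item()); // Some(10)
    println!("{:?}", shared.next_item()); // Some(20)
    println!("{:?}", shared.next_item()); // None
}
```

Without the RefCell, next_item would need &mut self, which is exactly what a method with a &self receiver cannot provide.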

Beyond that, it's just a matter of implementing the serde::Serialize trait. Since we own neither serde::Serialize nor vec::Drain, we have to create a newtype to place the implementation on.

We can generalize this solution to apply to any iterator. This happens to make it read a bit nicer, in my opinion:

use serde::{Deserialize, Serialize}; // 1.0.101
use std::cell::RefCell;

struct IteratorAdapter<I>(RefCell<I>);

impl<I> IteratorAdapter<I> {
    fn new(iterator: I) -> Self {
        Self(RefCell::new(iterator))
    }
}

impl<I> serde::Serialize for IteratorAdapter<I>
where
    I: Iterator,
    I::Item: serde::Serialize,
{
    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
    where
        S: serde::Serializer,
    {
        serializer.collect_seq(self.0.borrow_mut().by_ref())
    }
}

What's the downside to this solution? Serializing the same value twice produces different results! If we simply serialize and print the value twice, we get:

[{"x":1,"y":2},{"x":-2,"y":-1},{"x":0,"y":0}]
[]

This is because iterators are transient beasts: once they have yielded a value, it's gone! This is a nice trap waiting for you to fall into it.
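The same trap can be reproduced with the standard library alone. In this sketch, read_all is a hypothetical stand-in for serialize: it reads whatever the iterator still has to offer through a shared reference.

```rust
use std::cell::RefCell;

// Hypothetical stand-in for `serialize`: collects everything the iterator
// can still yield, mutating it through the `RefCell`.
fn read_all<I: Iterator>(source: &RefCell<I>) -> Vec<I::Item> {
    source.borrow_mut().by_ref().collect()
}

fn main() {
    let mut points = vec![(1, 2), (-2, -1), (0, 0)];
    let source = RefCell::new(points.drain(..));

    // The first pass sees every item; the second finds the iterator exhausted.
    println!("{:?}", read_all(&source)); // [(1, 2), (-2, -1), (0, 0)]
    println!("{:?}", read_all(&source)); // []
}
```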


In your example, none of this really makes sense. You have access to the entire Vec, so you might as well serialize it (or a slice of it) at that point. Additionally, there's no reason (right now) to drain the entire collection. That would be equivalent to just calling into_iter.
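To make the last point concrete, here is a small standard-library sketch (the function names via_drain and via_into_iter are invented for the comparison). Both routes yield the same items in the same order; the only difference is that drain leaves an empty but still usable Vec behind, while into_iter consumes it.

```rust
// Hypothetical helper: empties `v` through a full-range drain.
fn via_drain(v: &mut Vec<i32>) -> Vec<i32> {
    v.drain(..).collect()
}

// Hypothetical helper: consumes `v` outright.
fn via_into_iter(v: Vec<i32>) -> Vec<i32> {
    v.into_iter().collect()
}

fn main() {
    let mut a = vec![1, -2, 0];
    let b = a.clone();

    let drained = via_drain(&mut a);
    let moved = via_into_iter(b);

    println!("{}", drained == moved); // true
    println!("{}", a.is_empty());     // true: `a` still exists, but is empty
}
```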

Shepmaster answered Oct 20 '22 01:10
