I've been given the task of polishing the interface of a codec library. We're using C++17, and I can only use the standard library (i.e. no Boost). Currently, there's a Decoder
class that looks roughly like this:
class Decoder : public Codec {
public:
struct Result {
vector<uint8_t>::const_iterator new_buffer_begin;
optional<Metadata> metadata;
optional<Packet> packet;
};
Result decode(vector<uint8_t>::const_iterator buffer_begin,
vector<uint8_t>::const_iterator buffer_end);
private:
// irrelevant details
};
The caller instantiates a Decoder
, then feeds a stream of data to the decoder by
Reading a chunk of data from a file (but there could be other sources in the future), and appending it to a vector<uint8_t>
.
Calling the decode
function, passing the iterators for their vector.
If the returned Result
's new_buffer_begin
is identical to the buffer_begin
that was passed to decode
, that means there wasn't enough data in the buffer to decode anything, and the caller should go back to step 1. Otherwise, the caller consumes the Metadata
or Packet
object that was decoded, and goes back to step 2, using new_buffer_begin
for the next pass.
The things I dislike about this interface and need help improving:
Using vector<uint8_t>::const_iterator
seems overly specific. Is there a more generic approach that doesn't force the caller to use vector
? I was considering just using C-style interface; a uint8_t *
and a length. Is there a C++ alternative that's fairly generic?
If there was enough data to decode something, only metadata
or packet
will have a value. I think std::variant
or 2 callbacks (one for each type) would make this code more self-documenting. I'm not sure which is more idiomatic though. What are the pros and cons of each, and is there an even better approach?
I agree that mandating vector
is inappropriate, and applaud your attempts to make the interface more useful.
If decode
expects a contiguous sequence of uint8_t
, the tried-and-tested (and most flexible) solution is just to take a const uint8_t*
and a std::size_t
(or alternatively two pointers, but pointer and length is more idiomatic).
From C++20 you can do this with one argument of type std::span<const uint8_t>
. Or going back to pointers, if you really want to use modern library tools for the sake of it, you can confuse people with std::experimental::observer_ptr
.
You may also consider making decode
a template that accepts any iterator pair, and (if contiguity is needed) mandates, even if only by documentation, that the iterators reflect a contiguous sequence. But making everything a template isn't always what you want, and it isn't always useful.
In addition to @Justin's valid suggestion of spans:
std::byte
instead of uint8_t
, so:
Result decode(std::span<const std::byte> buffer);
or if you're in C++17, use the span implementation from the C++ Guidelines Support library:
#include <gsl/span>
// etc.
Result decode(gsl::span<const std::byte> buffer);
If you want to support decoding from containers other than raw memory, use arbitrary iterators (in C++17 and earlier) or possibly ranges (in C++20). The iterator version:
template <typename InputIt>
Result decode(InputIt start, InputIt end) { /* etc. */ }
It's fishy that a Decoder
inherits from a Codec
rather than the other way around.
std::variant
to express the fact you have either a Packet or Metadata; you could also "combine" alternatives if instead of callbacks you use variants' std::visit
.C++20 will have std::span
, which does what you want:
Result decode(std::span<uint8_t const> buffer);
std::span<T>
is semantically equivalent to a T* buffer, size_t size
.
In C++17, there are some implementations of a span
type which are equivalent to std::span
, such as the GSL's gsl::span
. See What is a "span" and when should I use one? .
If you can't use any external libraries, consider writing your own span
type, else
uint8_t const* buffer_begin, uint8_t const* buffer_end
can work.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With