Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Constructing and sending binary data over network

I am creating a command-line client for minecraft. There is a full spec on the protocol that can be found here: http://mc.kev009.com/Protocol. To answer your question beforehand, yes I am a bit of a C++ noob.

I have various issues in implementing this protocol, of which each critical.

  1. The protocol says that all types are big-endian. I have no idea how I should check whether my data is little-endian and if yes how to convert to big-endian.
  2. The string datatype is a bit weird one. It's a modified UTF-8 string which is preceded by a short containing the string length. I have no idea how I should pack this into a simple char[] array nor how to convert my simple strings into modified UTF-8 ones.
  3. Even if I knew how to convert my data to big-endian and create modified UTF-8 strings I still don't know how to pack this up into a char[] array and send this as a package. All I have done before is simple HTTP networking which is plain ASCII.

Explanations, links, related function names and short snippets much appreciated!

EDIT

1 and 3 is answered now. 1 is answered below by user470379. 3 is answered by this AWESOME thread that explains what I want to do very well: http://cboard.cprogramming.com/networking-device-communication/68196-sending-non-char*-data.html I'm not sure about the modified UTF-8 yet though.

like image 959
orlp Avatar asked Dec 21 '22 16:12

orlp


2 Answers

A traditional approach is to define a C++ message structure for each protocol message and implement serialization and deserialization functions for it. For example Login Request can be represented like this:

#include <string>
#include <stdint.h>

struct LoginRequest
{
    int32_t protocol_version;
    std::string username;
    std::string password;
    int64_t map_seed;
    int8_t dimension;
};

Now serialization functions are required. First it needs serialization functions for integers and strings, since these are the types of members in LoginRequest.

Integer serialization functions need to do conversions to and from big-endian representation. Since members of the message are copied to and from the buffer, the reversal of the byte order can be done while copying:

#include <boost/detail/endian.hpp>
#include <algorithm>

#ifdef BOOST_LITTLE_ENDIAN

    inline void xcopy(void* dst, void const* src, size_t n)
    {
        char const* csrc = static_cast<char const*>(src);
        std::reverse_copy(csrc, csrc + n, static_cast<char*>(dst));
    }

#elif defined(BOOST_BIG_ENDIAN)

    inline void xcopy(void* dst, void const* src, size_t n)
    {
        char const* csrc = static_cast<char const*>(src);
        std::copy(csrc, csrc + n, static_cast<char*>(dst));
    }

#endif

// serialize an integer in big-endian format
// returns one past the last written byte, or >buf_end if would overflow
template<class T>
typename boost::enable_if<boost::is_integral<T>, char*>::type serialize(T val, char* buf_beg, char* buf_end)
{
    char* p = buf_beg + sizeof(T);
    if(p <= buf_end)
        xcopy(buf_beg, &val, sizeof(T));
    return p;
}

// deserialize an integer from big-endian format
// returns one past the last written byte, or >buf_end if would underflow (incomplete message)
template<class T>
typename boost::enable_if<boost::is_integral<T>, char const*>::type deserialize(T& val, char const* buf_beg, char const* buf_end)
{
    char const* p = buf_beg + sizeof(T);
    if(p <= buf_end)
        xcopy(&val, buf_beg, sizeof(T));
    return p;
}

And for strings (handling modified UTF-8 the same way as asciiz strings):

// serialize a UTF-8 string
// returns one past the last written byte, or >buf_end if would overflow
char* serialize(std::string const& val, char* buf_beg, char* buf_end)
{
    int16_t len = val.size();
    buf_beg = serialize(len, buf_beg, buf_end);
    char* p = buf_beg + len;
    if(p <= buf_end)
        memcpy(buf_beg, val.data(), len);
    return p;
}

// deserialize a UTF-8 string
// returns one past the last written byte, or >buf_end if would underflow (incomplete message)
char const* deserialize(std::string& val, char const* buf_beg, char const* buf_end)
{
    int16_t len;
    buf_beg = deserialize(len, buf_beg, buf_end);
    if(buf_beg > buf_end)
        return buf_beg; // incomplete message
    char const* p = buf_beg + len;
    if(p <= buf_end)
        val.assign(buf_beg, p);
    return p;
}

And a couple of helper functors:

struct Serializer
{
    template<class T>
    char* operator()(T const& val, char* buf_beg, char* buf_end)
    {
        return serialize(val, buf_beg, buf_end);
    }
};

struct Deserializer
{
    template<class T>
    char const* operator()(T& val, char const* buf_beg, char const* buf_end)
    {
        return deserialize(val, buf_beg, buf_end);
    }
};

Now using these primitive functions we can readily serialize and deserialize LoginRequest message:

template<class Iterator, class Functor>
Iterator do_io(LoginRequest& msg, Iterator buf_beg, Iterator buf_end, Functor f)
{
    buf_beg = f(msg.protocol_version, buf_beg, buf_end);
    buf_beg = f(msg.username, buf_beg, buf_end);
    buf_beg = f(msg.password, buf_beg, buf_end);
    buf_beg = f(msg.map_seed, buf_beg, buf_end);
    buf_beg = f(msg.dimension, buf_beg, buf_end);
    return buf_beg;
}

char* serialize(LoginRequest const& msg, char* buf_beg, char* buf_end)
{
    return do_io(const_cast<LoginRequest&>(msg), buf_beg, buf_end, Serializer());
}

char const* deserialize(LoginRequest& msg, char const* buf_beg, char const* buf_end)
{
    return do_io(msg, buf_beg, buf_end, Deserializer());
}

Using the helper functors above and representing input/output buffers as char iterator ranges only one function template is required to do both serialization and deserialization of the message.

And putting all together, usage:

int main()
{
    char buf[0x100];
    char* buf_beg = buf;
    char* buf_end = buf + sizeof buf;

    LoginRequest msg;

    char* msg_end_1 = serialize(msg, buf, buf_end);
    if(msg_end_1 > buf_end)
        ; // more buffer space required to serialize the message

    char const* msg_end_2 = deserialize(msg, buf_beg, buf_end);
    if(msg_end_2 > buf_end)
        ; // incomplete message, more data required
}
like image 61
Maxim Egorushkin Avatar answered Jan 13 '23 07:01

Maxim Egorushkin


For #1, you'll need to use ntohs and friends. Use the *s (short) versions for 16-bit integers, and the *l (long) versions for 32-bit integers. The hton* (host to network) will convert outgoing data to big-endian independently of the endianness of the platform you're on, and ntoh* (network to host) will convert incoming data back (again, independent of platform endianness)

like image 26
user470379 Avatar answered Jan 13 '23 08:01

user470379