
C# BinaryWriter length prefix - UTF7 encoding

Tags:

c#

I've got a project using memory mapped files to let two apps share data with each other. The producer app is written in C#, the consumer app talks plain old C. Both use VS2010.

MSDN says the "BinaryWriter.Write Method(String)" prepends the data with a UTF-7 encoded unsigned integer, and then writes the payload. This is exactly where I'm stuck. If I write a string which is 256 characters in length, the debugger of the C app shows me this byte sequence: 0x80 0x2 <256 times the payload char>. What's the best way to convert the length prefix to something that I can safely use in the consumer app?

Producer app:

using System;
using System.IO;
using System.IO.MemoryMappedFiles;
using System.Threading;
using System.Text;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        using (MemoryMappedFile mmf_read = MemoryMappedFile.CreateNew("mappedview", 4096))
        {
            using (MemoryMappedViewStream stream = mmf_read.CreateViewStream())
            {
                string str;
                BinaryWriter writer = new BinaryWriter(stream);

                str = string.Join("", Enumerable.Repeat("x", 256));

                writer.Write(str);
            }
        }
    }
}

Consumer app:

#include <windows.h>
#include <stdio.h>
#include <conio.h>
#include <tchar.h>
#pragma comment(lib, "user32.lib")

#define BUF_SIZE 4096
TCHAR szName[]=TEXT("mappedview");   // must match the name passed to MemoryMappedFile.CreateNew


int _tmain()
{
    HANDLE hMapFile;
    LPCSTR pBuf;

    hMapFile = OpenFileMapping(
               FILE_MAP_ALL_ACCESS,         // read/write access
               FALSE,                       // do not inherit the name
               szName);                     // name of mapping object

    if (hMapFile == NULL)
    {
        _tprintf(TEXT("Could not open file mapping object (%d).\n"),
         GetLastError());
        return 1;
    }

    pBuf = (LPCSTR) MapViewOfFile(hMapFile,     // handle to map object
           FILE_MAP_ALL_ACCESS,             // read/write permission
           0,
           0,
           BUF_SIZE);

    if (pBuf == NULL)
    {
        _tprintf(TEXT("Could not map view of file (%d).\n"),
                GetLastError());

        CloseHandle(hMapFile);
        return 1;
    }

    printf("Proc1: %s\n\n", pBuf);              // print mapped data

    UnmapViewOfFile(pBuf);

    CloseHandle(hMapFile);

    return 0;
}

br, Chris

user2286339 asked Apr 16 '13 12:04


2 Answers

Despite what the Microsoft documentation says,

  1. The prefix number written is in fact an LEB128 encoded count.
  2. This is a byte count, not a character count.

The Wikipedia article on LEB128 gives you decoding code, but I would consider using my own scheme instead. You could convert the string to UTF-8 yourself with Encoding.UTF8.GetBytes() and write those bytes to the MMF, prefixed with the byte count as a plain fixed-size unsigned short (which limits the payload to 65535 bytes). That way you have complete control over the layout.
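On the consumer side, that fixed-prefix scheme could be read roughly as follows. This is only a sketch under assumptions: it supposes the producer writes the count with BinaryWriter.Write(ushort) (which emits two bytes, low byte first) followed by the raw UTF-8 bytes, and the function name read_u16_prefixed is made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: reads a 2-byte little-endian length prefix.
   Returns a pointer to the payload and stores its byte count in *out_len,
   or returns NULL if the buffer is too short for the prefix or the payload. */
const unsigned char *read_u16_prefixed(const unsigned char *buf,
                                       size_t buf_len,
                                       uint16_t *out_len)
{
    uint16_t len;

    if (buf_len < 2)
        return NULL;                      /* not even a complete prefix */

    len = (uint16_t)(buf[0] | (buf[1] << 8));

    if ((size_t)len > buf_len - 2)
        return NULL;                      /* payload would overrun the view */

    *out_len = len;
    return buf + 2;                       /* payload is NOT NUL-terminated */
}
```

Because the payload carries no terminating NUL, print it with a bounded format such as printf("%.*s", (int)len, payload) rather than a plain %s.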

Matthew Watson answered Sep 18 '22 02:09


The MSDN documentation on BinaryWriter.Write states that it “first writes the length of the string as a UTF-7 encoded unsigned integer”, but that is wrong. UTF-7 is a string encoding; you cannot encode integers with it. What the documentation means (and what the code does) is that the length is written using a variable-length 7-bit encoding, sometimes known as LEB128. In your specific case, the data bytes 80 02 mean the following:

    1000 0000  0000 0010
    Nbbb bbbb  Eaaa aaaa

  • N set to one means this is not the final byte
  • E set to zero means this is the final byte
  • aaaaaaa and bbbbbbb are the real data; the first byte (b) holds the low seven bits and the second byte (a) the next seven, so the result is:

    00000100000000
    aaaaaaabbbbbbb

I.e. 100000000 in binary, which is 256 in decimal.
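The decoding described above can be sketched in C for the consumer side. This is a minimal illustration, not the framework's own code: the function name decode_7bit_length is made up, and the five-byte cap is an assumption based on the length being a 32-bit value (5 × 7 = 35 bits is enough to hold it).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decodes a 7-bit variable-length prefix.
   Each byte contributes its low 7 bits, least significant group first;
   a set high bit means another prefix byte follows.
   Returns the number of prefix bytes consumed, or 0 if the prefix is
   truncated or longer than 5 bytes. */
size_t decode_7bit_length(const unsigned char *buf,
                          size_t buf_len,
                          uint32_t *out_len)
{
    uint32_t value = 0;
    size_t i;

    for (i = 0; i < buf_len && i < 5; i++) {
        value |= (uint32_t)(buf[i] & 0x7F) << (7 * i);
        if ((buf[i] & 0x80) == 0) {       /* high bit clear: final byte */
            *out_len = value;
            return i + 1;
        }
    }
    return 0;                             /* ran out of bytes, or malformed */
}
```

With the bytes from the question (0x80 0x02), the function consumes two bytes and yields a length of 256. The payload then starts at buf + 2, and since the decoded value is a byte count with no terminating NUL, it should be printed with a bounded format such as "%.*s".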

Mormegil answered Sep 18 '22 02:09