Get Encoding of BinaryReader/Writer?

The .NET BinaryReader/BinaryWriter classes can be constructed with an Encoding to use for string-related operations.
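For reference, here is a minimal sketch of what that looks like. The class and method names (`EncodingRoundTrip`, `RoundTrip`) are made up for illustration; only the `BinaryWriter`/`BinaryReader` constructors and the `Write(string)`/`ReadString()` calls come from the framework.

```csharp
using System.IO;
using System.Text;

// Hypothetical helper, only here to show the constructors in use.
public static class EncodingRoundTrip
{
    // Writes a length-prefixed string with the given encoding and reads it back.
    public static string RoundTrip(string text, Encoding encoding)
    {
        using (var stream = new MemoryStream())
        {
            // leaveOpen: true keeps the MemoryStream usable after the writer is disposed.
            using (var writer = new BinaryWriter(stream, encoding, leaveOpen: true))
            {
                writer.Write(text);
            }

            stream.Position = 0;

            // The reader must be given the same encoding; it cannot discover it from the stream.
            using (var reader = new BinaryReader(stream, encoding, leaveOpen: true))
            {
                return reader.ReadString();
            }
        }
    }
}
```

Both sides have to agree on the encoding up front, which is why being able to ask the reader or writer for it would be convenient.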

I am implementing custom string formats as extension methods, and I would like them to respect the Encoding that was specified when the BinaryReader/Writer was instantiated.

There does not seem to be a way to retrieve the Encoding from the reader/writer, not even when inheriting from these classes. The only option I see is to inherit from them and recreate all of their constructors in order to intercept the encoding that is passed in. I looked into the .NET source code: the encoding is only used to instantiate a Decoder (in the case of BinaryReader), and I can't access that one either.

Am I running into a shortcoming of these classes here? Can I hack into them with reflection?

Ray asked Apr 03 '15 16:04


2 Answers

Looking at the source code for BinaryReader, I see the constructor is defined as follows:

    public BinaryReader(Stream input, Encoding encoding, bool leaveOpen) {
        if (input==null) {
            throw new ArgumentNullException("input");
        }
        if (encoding==null) {
            throw new ArgumentNullException("encoding");
        }
        if (!input.CanRead)
            throw new ArgumentException(Environment.GetResourceString("Argument_StreamNotReadable"));
        Contract.EndContractBlock();
        m_stream = input;
        m_decoder = encoding.GetDecoder();
        m_maxCharsSize = encoding.GetMaxCharCount(MaxCharBytesSize);
        int minBufferSize = encoding.GetMaxByteCount(1);  // max bytes per one char
        if (minBufferSize < 16) 
            minBufferSize = 16;
        m_buffer = new byte[minBufferSize];
        // m_charBuffer and m_charBytes will be left null.

        // For Encodings that always use 2 bytes per char (or more), 
        // special case them here to make Read() & Peek() faster.
        m_2BytesPerChar = encoding is UnicodeEncoding;
        // check if BinaryReader is based on MemoryStream, and keep this for it's life
        // we cannot use "as" operator, since derived classes are not allowed
        m_isMemoryStream = (m_stream.GetType() == typeof(MemoryStream));
        m_leaveOpen = leaveOpen;

        Contract.Assert(m_decoder!=null, "[BinaryReader.ctor]m_decoder!=null");
    }

So it looks like the encoding itself isn't actually retained anywhere. The class just stores a decoder that is derived from the encoding. m_decoder is defined as follows in the class:

    private Decoder  m_decoder;

You can't access the private variable. Doing a search for that variable in the rest of the class shows it's used in a few places internally, but never returned, so I don't think you can access it anywhere in your derived class without doing some kind of crazy reflection/disassembly thing. It would have to be defined as protected for you to access it. Sorry.

Edit:

There is almost certainly a better way to solve your problem than using reflection to access the private m_decoder variable. And even if you did, it might not get you the encoding, as you noted in the comments. However, if you want to do it anyway, see this StackOverflow answer on how to access private members with reflection.

Joshua Carmody answered Nov 16 '22 02:11


If subclassing and intercepting the Encoding in the constructors is even remotely feasible in your scenario, I'd prefer it over potentially unstable reflection hacks.
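A rough sketch of that subclassing approach could look like the following. `EncodingAwareBinaryReader`, its `Encoding` property, and the `ReadInt32PrefixedString` format are all hypothetical names invented for this example; the only framework pieces are the three public `BinaryReader` constructors being mirrored.

```csharp
using System.IO;
using System.Text;

// Hypothetical subclass: the name and the Encoding property are illustrative,
// not part of the .NET API.
public class EncodingAwareBinaryReader : BinaryReader
{
    // The encoding that was passed to the constructor.
    public Encoding Encoding { get; }

    // BinaryReader(Stream) defaults to UTF-8, so mirror that default here.
    public EncodingAwareBinaryReader(Stream input)
        : this(input, new UTF8Encoding(), false) { }

    public EncodingAwareBinaryReader(Stream input, Encoding encoding)
        : this(input, encoding, false) { }

    public EncodingAwareBinaryReader(Stream input, Encoding encoding, bool leaveOpen)
        : base(input, encoding, leaveOpen)
    {
        // The base constructor has already rejected a null encoding at this point.
        Encoding = encoding;
    }
}

public static class EncodingAwareBinaryReaderExtensions
{
    // Hypothetical custom string format: a 4-byte length prefix (in bytes)
    // followed by the string encoded with the reader's encoding.
    public static string ReadInt32PrefixedString(this EncodingAwareBinaryReader reader)
    {
        int byteCount = reader.ReadInt32();
        byte[] bytes = reader.ReadBytes(byteCount);
        return reader.Encoding.GetString(bytes);
    }
}
```

The drawback is that the extension methods only work on the subclass, so every place that creates a reader has to create this one instead of a plain BinaryReader. A BinaryWriter counterpart would follow the same pattern.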

However, if you must go the reflection route for some reason, here are some pointers I found from the BinaryReader source code you referenced:

  • The Decoder class itself apparently does not hold any reference to an Encoding, but:
  • The Decoder instance is created by calling encoding.GetDecoder() (line 65),
  • which returns an instance of an internal class DefaultDecoder,
  • which does hold the Encoding in m_encoding.
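Following that chain with reflection might look roughly like the sketch below. Treat it as fragile: `m_decoder` and `m_encoding` are the private field names seen in the source discussed here, while `_decoder` and `_encoding` are assumed alternatives for other runtime versions. None of these names are guaranteed, and the concrete decoder type may differ per encoding, so the helper (a hypothetical `BinaryReaderReflection.TryGetEncoding`) returns null whenever a field is not found.

```csharp
using System;
using System.IO;
using System.Reflection;
using System.Text;

// Hypothetical helper; relies on private implementation details that can change.
public static class BinaryReaderReflection
{
    private const BindingFlags DeclaredInstanceFields =
        BindingFlags.Instance | BindingFlags.Public |
        BindingFlags.NonPublic | BindingFlags.DeclaredOnly;

    // Returns the Encoding behind the reader's decoder, or null if the
    // expected private fields are not present on this runtime.
    public static Encoding TryGetEncoding(BinaryReader reader)
    {
        if (reader == null) throw new ArgumentNullException("reader");

        // Step 1: the private Decoder field of BinaryReader (names are assumptions).
        object decoder = GetFieldValue(reader, "m_decoder", "_decoder");
        if (decoder == null) return null;

        // Step 2: the Encoding field of the concrete decoder class (names are assumptions).
        return GetFieldValue(decoder, "m_encoding", "_encoding") as Encoding;
    }

    private static object GetFieldValue(object target, params string[] candidateNames)
    {
        // Private fields declared on a base class are not returned when asking
        // a derived type, so walk up the inheritance chain one type at a time.
        for (Type type = target.GetType(); type != null; type = type.BaseType)
        {
            foreach (string name in candidateNames)
            {
                FieldInfo field = type.GetField(name, DeclaredInstanceFields);
                if (field != null) return field.GetValue(target);
            }
        }
        return null;
    }
}
```

Callers should always handle the null case, for example by falling back to an explicitly supplied encoding, since a framework update can rename or remove these fields without notice.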
nodots answered Nov 16 '22 00:11