Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Configuring the .NET WCF UTF-8 deserializer to modify/discard non-shortest form chars instead of throwing an exception?

We have a SOAP web service hosted via WCF.

One of the clients we receive data from occasionally encodes UTF-8 using non-shortest form (See http://www.unicode.org/versions/corrigendum1.html for a bit of info on this).

It's not easy to modify the client because these non-shortest form characters are not being encoded by our code.

Instead we'd like to edit the WCF service to discard these characters, replace them with other placeholder char, or even accept the non-shortest form characters. Any of these would be acceptable for our use case, although the former options would be preferred since they lessen any security risk.

Looking at the stack trace:

System.ServiceModel.Dispatcher.NetDispatcherFaultException: The formatter threw an exception while trying to deserialize the message: There was an error while trying to deserialize parameter http://www.mydomain.com/:mytype. The InnerException message was 'There was an error deserializing the object of type MyNamespace.MyType`1[[System.String, mscorlib, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c461934e089]]. '????' contains invalid UTF8 bytes.'.  Please see InnerException for more details. ---> System.Runtime.Serialization.SerializationException: There was an error deserializing the object of type MyNamespace.MyType`1[[System.String, mscorlib, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c461934e089]]. '????' contains invalid UTF8 bytes. ---> System.Xml.XmlException: '????' contains invalid UTF8 bytes. ---> System.Text.DecoderFallbackException: Unable to translate bytes [C0] at index 0 from specified code page to Unicode.
   at System.Text.DecoderExceptionFallbackBuffer.Throw(Byte[] bytesUnknown, Int32 index)
   at System.Text.DecoderExceptionFallbackBuffer.Fallback(Byte[] bytesUnknown, Int32 index)
   at System.Text.DecoderFallbackBuffer.InternalFallback(Byte[] bytes, Byte* pBytes, Char*& chars)
   at System.Text.UTF8Encoding.FallbackInvalidByteSequence(Byte*& pSrc, Int32 ch, DecoderFallbackBuffer fallback, Char*& pTarget)
   at System.Text.UTF8Encoding.GetChars(Byte* bytes, Int32 byteCount, Char* chars, Int32 charCount, DecoderNLS baseDecoder)
   at System.Text.UTF8Encoding.GetChars(Byte[] bytes, Int32 byteIndex, Int32 byteCount, Char[] chars, Int32 charIndex)
   at System.Xml.XmlConverter.ToChars(Byte[] buffer, Int32 offset, Int32 count, Char[] chars, Int32 charOffset)
   --- End of inner exception stack trace ---
   at System.Xml.XmlConverter.ToChars(Byte[] buffer, Int32 offset, Int32 count, Char[] chars, Int32 charOffset)
   at System.Xml.XmlBufferReader.GetChars(Int32 offset, Int32 length, Char[] chars)
   at System.Xml.ValueHandle.GetCharsText()
   at System.Xml.ValueHandle.GetString()
   at System.Xml.XmlBaseReader.get_Value()
   at System.Xml.XmlDictionaryReader.ReadContentAsString(Int32 maxStringContentLength)
   at System.Xml.XmlBaseReader.ReadContentAsString()
   at System.Xml.XmlBaseReader.ReadElementContentAsString()
   at System.Runtime.Serialization.XmlReaderDelegator.ReadElementContentAsString()
   at ReadOrbliteCompatibleArrayOfstringFromXml(XmlReaderDelegator , XmlObjectSerializerReadContext , XmlDictionaryString , XmlDictionaryString , CollectionDataContract )
   at System.Runtime.Serialization.CollectionDataContract.ReadXmlValue(XmlReaderDelegator xmlReader, XmlObjectSerializerReadContext context)
   at System.Runtime.Serialization.XmlObjectSerializerReadContext.InternalDeserialize(XmlReaderDelegator reader, String name, String ns, DataContract& dataContract)
   at System.Runtime.Serialization.XmlObjectSerializerReadContext.InternalDeserialize(XmlReaderDelegator xmlReader, Type declaredType, DataContract dataContract, String name, String ns)
   at System.Runtime.Serialization.DataContractSerializer.InternalReadObject(XmlReaderDelegator xmlReader, Boolean verifyObjectName)
   at System.Runtime.Serialization.XmlObjectSerializer.ReadObjectHandleExceptions(XmlReaderDelegator reader, Boolean verifyObjectName)
   --- End of inner exception stack trace ---
   at System.Runtime.Serialization.XmlObjectSerializer.ReadObjectHandleExceptions(XmlReaderDelegator reader, Boolean verifyObjectName)
   at System.ServiceModel.Dispatcher.DataContractSerializerOperationFormatter.DeserializeParameterPart(XmlDictionaryReader reader, PartInfo part, Boolean isRequest)
   --- End of inner exception stack trace ---
   at System.ServiceModel.Dispatcher.DataContractSerializerOperationFormatter.DeserializeParameterPart(XmlDictionaryReader reader, PartInfo part, Boolean isRequest)
   at System.ServiceModel.Dispatcher.DataContractSerializerOperationFormatter.DeserializeParameters(XmlDictionaryReader reader, PartInfo[] parts, Object[] parameters, Boolean isRequest)
   at System.ServiceModel.Dispatcher.DataContractSerializerOperationFormatter.DeserializeBody(XmlDictionaryReader reader, MessageVersion version, String action, MessageDescription messageDescription, Object[] parameters, Boolean isRequest)
   at System.ServiceModel.Dispatcher.OperationFormatter.DeserializeBodyContents(Message message, Object[] parameters, Boolean isRequest)
   at System.ServiceModel.Dispatcher.OperationFormatter.DeserializeRequest(Message message, Object[] parameters)
   at System.ServiceModel.Dispatcher.DispatchOperationRuntime.DeserializeInputs(MessageRpc& rpc)
   at System.ServiceModel.Dispatcher.DispatchOperationRuntime.InvokeBegin(MessageRpc& rpc)
   at System.ServiceModel.Dispatcher.ImmutableDispatchRuntime.ProcessMessage5(MessageRpc& rpc)
   at System.ServiceModel.Dispatcher.ImmutableDispatchRuntime.ProcessMessage4(MessageRpc& rpc)
   at System.ServiceModel.Dispatcher.MessageRpc.Process(Boolean isOperationContextSet)

It looks like what we want is to switch the decoder fallback from System.Text.DecoderExceptionFallback to System.Text.DecoderReplacementFallback.

Can somebody point me to the correct place and way to override the default exception fallback with a replacement fallback?

like image 708
Lawrence Johnston Avatar asked Nov 24 '10 18:11

Lawrence Johnston


1 Answers

One possibility to explore is creating a custom encoder. Take a look at http://msdn.microsoft.com/en-us/library/ms751486.aspx


Added by Lawrence Johnston:

This is the meat of the code I ended up using. My MessageEncoder subclass just wraps a MessageEncoder instance and passes all calls except the ones to ReadMessage through to the wrapped class.

public override Message ReadMessage(ArraySegment<Byte> buffer, BufferManager bufferManager, String contentType) {
    // Convert buffer to stream and pass to overload.
    byte[] msgContents = new byte[buffer.Count];
    Array.Copy(buffer.Array, buffer.Offset, msgContents, 0, msgContents.Length);
    bufferManager.ReturnBuffer(buffer.Array);

    MemoryStream stream = new MemoryStream(msgContents);
    return ReadMessage(stream, int.MaxValue);
}

public override Message ReadMessage(Stream stream, int maxSizeOfHeaders, string contentType) {
    // The only new functionality in this class is to pass the stream through a 
    // StreamReader with the encoding set to *not* throw on invalid bytes. 
    XmlReader reader = XmlReader.Create(new StreamReader(stream, new UTF8Encoding(false, false)));
    return Message.CreateMessage(reader, maxSizeOfHeaders, MessageVersion);
}
like image 87
Eugene Osovetsky Avatar answered Oct 20 '22 17:10

Eugene Osovetsky