Putting Message in Websphere MQ via C# has different data length than manually putting the same message

Tags: c#, xml, jms, ibm-mq

MQMessage queueMessage = new MQMessage();
queueMessage.WriteString(strInputMsg);
queueMessage.Format = MQC.MQFMT_STRING;
MQPutMessageOptions queuePutMessageOptions = new MQPutMessageOptions();
Queue.Put(queueMessage, queuePutMessageOptions);

Using C#, with the above code, when I input the message into the queue, the data length of the message is 3600.

When I manually put the message onto the queue by right-clicking the queue and selecting the Put Test Message option, the data length of the message is 1799.

I am really confused about why this is the case. The message in both cases is an XML string with a declaration. In Notepad++, there are 1811 characters including the declaration. When I view the message in the debugger before I put it onto the queue, it has been converted into XML without any line feeds or carriage returns.

I created the xml string using:

// Requires: using System.IO; using System.Text; using System.Xml; using System.Xml.Serialization;
// Converts a message object into an XML string by serializing it
public string GetMessage(MyMessage messageInstance)
{
    // Serialize the request
    XmlSerializer xsr = new XmlSerializer(typeof(MyMessage));
    MemoryStream memoryStream = new MemoryStream();
    XmlTextWriter xmlTextWriter = new XmlTextWriter(memoryStream, Encoding.UTF8);
    xsr.Serialize(xmlTextWriter, messageInstance);

    memoryStream = (MemoryStream)xmlTextWriter.BaseStream;
    string XmlizedString = new UTF8Encoding().GetString(memoryStream.ToArray());

    // Encode the xml
    Encoding utf = Encoding.UTF8;
    byte[] utfBytes = utf.GetBytes(XmlizedString);

    // Load the document (XmlResolver is set to null to ignore DTD)
    XmlDocument xmlDoc = new XmlDocument();
    xmlDoc.XmlResolver = null;
    xmlDoc.LoadXml(utf.GetString(utfBytes));
    return utf.GetString(utfBytes);
}

Am I missing anything in my C# implementation that is adding extra characters?

Thanks.

InfoLearner asked Aug 11 '11 17:08


1 Answer

As @Matten suggests, one problem could be character encoding.

The default value for the CharacterSet property is 1200 (UNICODE) and WriteString converts to the code page specified by CharacterSet.

Code page 1200 is UTF-16 little-endian, so you are likely to get two bytes per character. It is certainly possible that "Put Test Message" uses some other encoding that needs only one byte per character for common characters.

Assuming that the 3600 and 1799 lengths are counted in bytes, they could represent 1800 UTF-16LE characters and 1799 UTF-8 characters (or 1799 ASCII characters or 1799 EBCDIC characters...).
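The arithmetic above can be checked without a queue manager. The following is a minimal sketch, assuming an all-ASCII, 1800-character stand-in for the XML (the real message is not shown in the question). It uses only System.Text.Encoding, where Encoding.Unicode is .NET's UTF-16LE encoding (code page 1200); the class and method names are made up for illustration.

```csharp
using System;
using System.Text;

public static class EncodingLengths
{
    // Number of bytes 'text' occupies in the given encoding (no byte order mark is counted).
    public static int ByteLength(string text, Encoding encoding)
    {
        return encoding.GetByteCount(text);
    }

    public static void Main()
    {
        // Hypothetical stand-in for the XML message: 1800 ASCII characters.
        string sample = new string('x', 1800);

        Console.WriteLine(ByteLength(sample, Encoding.Unicode)); // UTF-16LE: two bytes per character here
        Console.WriteLine(ByteLength(sample, Encoding.UTF8));    // UTF-8: one byte per ASCII character
        Console.WriteLine(ByteLength(sample, Encoding.ASCII));   // ASCII: one byte per character
    }
}
```

Running the same comparison on the actual XML string would show whether the 3600 figure is simply twice the character count.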

That still leaves us with a one character difference in length. Perhaps WriteString includes a terminating NULL character in the string written?

Are you sure you trust the count Notepad++ gives you? If Put Test Message placed 1799 characters into a message there were probably 1799 characters in the data you supplied to it.
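One further possibility, offered as a guess rather than a diagnosis: the serialization code in the question writes through an XmlTextWriter created with Encoding.UTF8, which normally emits a byte order mark, and UTF8Encoding.GetString keeps that mark as the character U+FEFF. That would add exactly one invisible character to the string. The sketch below reproduces the effect with plain byte arrays; the xml literal and the helper names are hypothetical.

```csharp
using System;
using System.Linq;
using System.Text;

public static class BomCheck
{
    // Simulates a stream written with Encoding.UTF8: the preamble (BOM) followed by the text.
    public static byte[] EncodeWithPreamble(string text)
    {
        return Encoding.UTF8.GetPreamble().Concat(Encoding.UTF8.GetBytes(text)).ToArray();
    }

    // Removes a single leading U+FEFF, if present.
    public static string StripBom(string text)
    {
        return text.Length > 0 && text[0] == '\uFEFF' ? text.Substring(1) : text;
    }

    public static void Main()
    {
        string xml = "<?xml version=\"1.0\" encoding=\"utf-8\"?><a/>"; // hypothetical payload
        string decoded = new UTF8Encoding().GetString(EncodeWithPreamble(xml));

        Console.WriteLine(decoded.Length - xml.Length); // number of extra characters after decoding
        Console.WriteLine(StripBom(decoded) == xml);    // whether stripping the mark restores the text
    }
}
```

Checking whether the first character of the string passed to WriteString is U+FEFF would confirm or rule this out. If it is, constructing the writer with new UTF8Encoding(false) is the usual way to suppress the mark.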

Edit: Assuming the encoding theory is correct, you could shorten the message by using a different encoding. How short an encoding would make a particular message would depend on the actual contents of the string.

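For example, you could use a single-byte, ASCII-based code page to get one byte per character.

MQMessage queueMessage = new MQMessage();
queueMessage.CharacterSet = 437;  // Code page 437 (IBM PC), a single-byte superset of ASCII
queueMessage.WriteString(strInputMsg);  // Write after setting CharacterSet so the conversion uses it

That would shorten your message to 1800 bytes if all the characters in your xml string had a representation in that code page.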

An alternative would be to use UTF-8 encoding.

MQMessage queueMessage = new MQMessage();
queueMessage.CharacterSet = 1208;  // Set code page to UTF-8
queueMessage.WriteString(strInputMsg);  // Write after setting CharacterSet

Using UTF-8 has the advantage that (unlike ASCII) all characters have a representation (for certain values of 'all'). The disadvantage is that some characters require two, three or even four bytes to represent them. The most common characters are encoded in one byte, then the next most common characters are encoded in two bytes and so on.

In the best case a UTF-8 encoding would also give you 1800 bytes. In the worst case it would give you 7200 bytes but that seems very unlikely unless you are using something like Klingon!
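The variable width of UTF-8 is easy to see with a few sample characters. This is a small sketch using only System.Text.Encoding; the characters were picked for illustration and are not taken from the question's message.

```csharp
using System;
using System.Text;

public static class Utf8Widths
{
    // Bytes needed to encode 'text' as UTF-8 (no byte order mark is counted).
    public static int Utf8Bytes(string text)
    {
        return Encoding.UTF8.GetByteCount(text);
    }

    public static void Main()
    {
        Console.WriteLine(Utf8Bytes("A"));          // U+0041, Latin capital letter A
        Console.WriteLine(Utf8Bytes("\u00E9"));     // U+00E9, e with acute accent
        Console.WriteLine(Utf8Bytes("\u20AC"));     // U+20AC, euro sign
        Console.WriteLine(Utf8Bytes("\U0001F600")); // U+1F600, outside the Basic Multilingual Plane
    }
}
```

For a typical XML document made up mostly of ASCII markup, the UTF-8 length should stay close to the character count.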

Frank Boyne answered Sep 19 '22 17:09