
JSON.Net DeserializeObject Text Encoding

Tags:

c#

json.net

When I try to deserialize an object from a file, Turkish characters such as "ğ" are converted to question marks.

So I tried this:

JsonConvert.DeserializeObject<List<MyClass>>(json, new JsonSerializerSettings() 
{ Culture = new System.Globalization.CultureInfo("tr-TR")  });

But it didn't work. Is there any way to change the character encoding in Json.Net?

paroxit asked Apr 09 '14 21:04



2 Answers

If you download the JSON with a WebClient, make sure you set its encoding to UTF-8 when you create it.

new WebClient() { Encoding = Encoding.UTF8 }
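
As a minimal sketch of where that initializer might sit, the class below wraps it in a small download helper. The names JsonDownloader, CreateUtf8Client and DownloadJson, as well as the url parameter, are hypothetical and not part of the original answer; the returned string would then be passed to JsonConvert.DeserializeObject as in the question.

```csharp
using System;
using System.Net;
using System.Text;

// Hypothetical helper, for illustration only.
public static class JsonDownloader
{
    // Creates a WebClient whose Encoding property is set to UTF-8,
    // so downloaded text is decoded as UTF-8.
    public static WebClient CreateUtf8Client()
    {
        return new WebClient { Encoding = Encoding.UTF8 };
    }

    // Downloads the response body at `url` (an assumed endpoint) as a string.
    public static string DownloadJson(string url)
    {
        using (WebClient client = CreateUtf8Client())
        {
            return client.DownloadString(url);
        }
    }
}
```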
Wille Esteche answered Sep 25 '22 18:09



In theory, you have a charset encoding/decoding problem.

The cause: the content you are trying to read was encoded with a charset such as ISO-8859-1 or ISO-8859-15, and you are trying to read (decode) it directly as UTF-8. Of course that won't work, because UTF-8 won't miraculously recognize your special characters (Ä, Ü, Ö, and so on). A UTF-8 decoder does not guess the character encoding.
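
To make the cause concrete, here is a minimal sketch that encodes a string as ISO-8859-1 bytes and then decodes those bytes with a charset chosen by the caller. The class CharsetDemo and its method names are hypothetical, and ISO-8859-1 is only an assumed example of a source charset.

```csharp
using System;
using System.Text;

// Hypothetical demo class, for illustration only.
public static class CharsetDemo
{
    // Encodes text into ISO-8859-1 bytes, the way the source content
    // is assumed to have been stored.
    public static byte[] EncodeLatin1(string text)
    {
        return Encoding.GetEncoding("iso-8859-1").GetBytes(text);
    }

    // Decodes raw bytes using the charset named by the caller.
    public static string DecodeAs(byte[] raw, string charsetName)
    {
        return Encoding.GetEncoding(charsetName).GetString(raw);
    }
}
```

Calling CharsetDemo.EncodeLatin1("Björn") stores "ö" as the single byte 0xF6. Decoding those bytes with "iso-8859-1" restores the text, while decoding them with "utf-8" puts the replacement character U+FFFD where "ö" was, because 0xF6 on its own is not valid UTF-8.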

Solution:

1- (Re)encode your content (e.g. "Björn Nilsson") with its corresponding charset (ISO-8859-1/ISO-8859-15) into a byte array.

2- Convert those bytes to UTF-8 and decode them.

Here is a helper class as an example:

using System;
using System.Text;

namespace csharp.util.charset
{
    public class SysUtil
    {
        /// <summary>
        /// Convert a string from one charset to another charset
        /// </summary>
        /// <param name="strText">source string</param>
        /// <param name="strSrcEncoding">original encoding name</param>
        /// <param name="strDestEncoding">dest encoding name</param>
        /// <returns>the text after re-encoding from the source to the destination charset</returns>
        public static String StringEncodingConvert(String strText, String strSrcEncoding, String strDestEncoding)
        {
            Encoding srcEnc = Encoding.GetEncoding(strSrcEncoding);
            Encoding destEnc = Encoding.GetEncoding(strDestEncoding);
            // Step 1: encode the text into bytes using the source charset.
            byte[] bData = srcEnc.GetBytes(strText);
            // Step 2: transcode those bytes to the destination charset and decode them.
            byte[] bResult = Encoding.Convert(srcEnc, destEnc, bData);
            // Note: a .NET string is always UTF-16 in memory, so this round trip returns
            // the input unchanged whenever every character exists in both charsets.
            // Bytes coming from a file or stream must be decoded with the right Encoding
            // at the point where they are read.
            return destEnc.GetString(bResult);
        }
    }
}

Usage:

In your (JSON, XML, or other) serializer/deserializer classes, just convert your content like this:

String content = "Björn Nilsson";
String converted = SysUtil.StringEncodingConvert(content, "ISO-8859-1", "UTF-8");

You could try making these calls inside your deserializer (assuming the charset names you pass in match the actual content):

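using System;
using System.Collections.Generic;
using Newtonsoft.Json;
using csharp.util.charset;

// ISerializerFactory and ISerializer<T> are assumed to be defined elsewhere in your project.
public class JsonNetSerializerFactory : ISerializerFactory
{
    public ISerializer<T> Create<T>()
    {
        return new JsonNetSerializer<T>();
    }

    public class JsonNetSerializer<T> : ISerializer<T>
    {
        // Converts the input between charsets, then deserializes a single object.
        public T Deserialize(string input, String fromCharset, String toCharset)
        {
            String changedString = SysUtil.StringEncodingConvert(input, fromCharset, toCharset);
            return JsonConvert.DeserializeObject<T>(changedString);
        }

        // Converts the input between charsets, then deserializes a list of objects.
        public IList<T> DeserializeList(string input, String fromCharset, String toCharset)
        {
            String changedString = SysUtil.StringEncodingConvert(input, fromCharset, toCharset);
            return JsonConvert.DeserializeObject<IList<T>>(changedString);
        }
    }
}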
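
Since the question reads the JSON from a file, the decode step can also happen at the moment the file is read. The sketch below is one possible way to do that, not part of the original answer: JsonFileReader and ReadJsonText are hypothetical names, and the charset name passed in is an assumption about how the file was saved.

```csharp
using System.IO;
using System.Text;

// Hypothetical helper, for illustration only.
public static class JsonFileReader
{
    // Reads the file at `path`, decoding its bytes with the named charset
    // (for example "iso-8859-1" or "utf-8").
    public static string ReadJsonText(string path, string charsetName)
    {
        Encoding encoding = Encoding.GetEncoding(charsetName);
        return File.ReadAllText(path, encoding);
    }
}
```

The returned string can then be passed to JsonConvert.DeserializeObject<List<MyClass>>(json) as in the question. For Turkish text the legacy charset is more likely "iso-8859-9" or "windows-1254" than ISO-8859-1; as far as I know, on .NET Core and later those names only resolve after the System.Text.Encoding.CodePages provider has been registered.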
csharpwinphonexaml answered Sep 22 '22 18:09
