Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Got error "invalid character 'ï' looking for beginning of value” from json.Unmarshal

Tags:

json

go

I use a Golang HTTP request to get json output as follow. The web service I am trying to access is Micrsoft Translator https://msdn.microsoft.com/en-us/library/dn876735.aspx

//Data struct of TransformTextResponse
type TransformTextResponse struct {
    ErrorCondition   int    `json:"ec"`       // A positive number representing an error condition
    ErrorDescriptive string `json:"em"`       // A descriptive error message
    Sentence         string `json:"sentence"` // transformed text
}


//some code ....
body, err := ioutil.ReadAll(response.Body)
defer response.Body.Close()
if err != nil {
    return "", tracerr.Wrap(err)
}

transTransform = TransformTextResponse{}
err = json.Unmarshal(body, &transTransform)
if err != nil {
   return "", tracerr.Wrap(err)
}

I got an error from invalid character 'ï' looking for beginning of value

So, I try to print the body as string fmt.Println(string(body)), it show:

{"ec":0,"em":"OK","sentence":"This is too strange i just want to go home soon"}

It seems the data doesn't have any problem, so I tried to create the same value by jason.Marshal

transTransform := TransformTextResponse{}
transTransform.ErrorCondition = 0
transTransform.ErrorDescriptive = "OK"
transTransform.Sentence = "This is too strange i just want to go home soon"
jbody, _ := json.Marshal(transTransform)

I found the original data might have problem, so I try to compare two data in []byte format.

Data from response.Body:

[239 187 191 123 34 101 99 34 58 48 44 34 101 109 34 58 34 79 75 34 44 34 115 101 110 116 101 110 99 101 34 58 34 84 104 105 115 32 105 115 32 116 111 111 32 115 116 114 97 110 103 101 32 105 32 106 117 115 116 32 119 97 110 116 32 116 111 32 103 111 32 104 111 109 101 32 115 111 111 110 34 125]

Data from json.Marshal

[123 34 101 99 34 58 48 44 34 101 109 34 58 34 79 75 34 44 34 115 101 110 116 101 110 99 101 34 58 34 84 104 105 115 32 105 115 32 116 111 111 32 115 116 114 97 110 103 101 32 105 32 106 117 115 116 32 119 97 110 116 32 116 111 32 103 111 32 104 111 109 101 32 115 111 111 110 34 125]

Any idea how I parse this response.Body and Unmarshal it into data structure?

like image 534
Evan Lin Avatar asked Jul 14 '15 05:07

Evan Lin


1 Answers

The server is sending you a UTF-8 text string with a Byte Order Mark (BOM). The BOM identifies that the text is UTF-8 encoded, but it should be removed before decoding.

This can be done with the following line (using package "bytes"):

body = bytes.TrimPrefix(body, []byte("\xef\xbb\xbf")) // Or []byte{239, 187, 191}

PS. The error referring to ï is because the UTF-8 BOM interpreted as an ISO-8859-1 string will produce the characters .

like image 104
ANisus Avatar answered Nov 05 '22 19:11

ANisus