
Track.getSimilar: An invalid XML character (Unicode: 0x3) was found in the element…

I am using the Last.fm API.

I have a list of songs (tracks) with their artists, and for every song I want to retrieve the songs that are similar to it. The method Track.getSimilar(Artist, track, key) works perfectly, but when the artist or track name is in Arabic, I get the following exception:

    [Fatal Error] :2583:13: An invalid XML character (Unicode: 0x3) was found in the element content of the document.
    Exception in thread "main" de.umass.lastfm.CallException: org.xml.sax.SAXParseException: An invalid XML character (Unicode: 0x3) was found in the element content of the document.
        at de.umass.lastfm.Caller.call(Caller.java:268)
        at de.umass.lastfm.Caller.call(Caller.java:189)
        at de.umass.lastfm.Track.getSimilar(Track.java:369)

How can I solve this problem?

Thank you in advance

FRIDI Mourad asked Feb 13 '23 06:02



2 Answers

Unicode code point 0x3 is a control character. It is not a normal character in any script or writing system, so its presence is clearly an error, possibly in the database itself. It could be the result of a failed encoding conversion, a faulty character-to-byte conversion, or corruption during a database write.

XML 1.0 does not allow this control character, not even as a character reference (&#x3;). Therefore your XML is not well-formed and cannot be processed with XML tools. Instead, you need to remove the erroneous character with string processing or a similar method before the document is parsed.

At the same time, you can check for all other characters that are illegal in XML. XML does not allow code points from the Unicode surrogate blocks [0xD800 - 0xDFFF], the non-characters 0xFFFE and 0xFFFF, or characters below 0x20 (control characters) except 0x9 [tab], 0xA [LF] and 0xD [CR]. This is formally stated here: http://www.w3.org/TR/2000/REC-xml-20001006#NT-Char
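
The character ranges above can be turned into a small filter. What follows is only a minimal sketch: the class name XmlCharFilter and its methods are made up for illustration and are not part of the Last.fm bindings. It also assumes you can get hold of the raw text before it reaches the XML parser, which may not be possible without changing how the response is fetched.

```java
public final class XmlCharFilter {
    private XmlCharFilter() {}

    /** Returns true if the code point matches the XML 1.0 Char production. */
    public static boolean isValidXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Returns a copy of the input with every code point that XML 1.0 forbids removed. */
    public static String stripInvalidXmlChars(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); ) {
            // Work on code points so that valid surrogate pairs are kept intact.
            int cp = input.codePointAt(i);
            if (isValidXmlChar(cp)) {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Hypothetical sample: an ETX character (0x3) embedded in otherwise valid text.
        String dirty = "Hello\u0003World";
        System.out.println(stripInvalidXmlChars(dirty)); // prints HelloWorld
    }
}
```

Arabic letters fall inside the allowed range 0x20 to 0xD7FF, so they pass through unchanged; only the stray control character is dropped.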

jasso answered Feb 15 '23 19:02



0x3 is the ASCII control code ETX (end of text). Some old programs might use it as a carriage return or something similar, so it can end up in your data when text from such a source is pasted into a text field.
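
If you suspect that pasted text carried such a character into your own data, a quick check is to list the positions of the control characters in a string. This is only a sketch; ControlCharFinder and findControlChars are hypothetical names, and the sample string is invented.

```java
import java.util.ArrayList;
import java.util.List;

public final class ControlCharFinder {
    private ControlCharFinder() {}

    /** Returns the indexes of all characters below 0x20 other than tab, LF and CR. */
    public static List<Integer> findControlChars(String s) {
        List<Integer> positions = new ArrayList<>();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x20 && c != '\t' && c != '\n' && c != '\r') {
                positions.add(i);
            }
        }
        return positions;
    }

    public static void main(String[] args) {
        // Hypothetical track title with an ETX character (0x3) at index 4.
        String title = "Song\u0003Name";
        for (int pos : findControlChars(title)) {
            System.out.printf("index %d: 0x%X%n", pos, (int) title.charAt(pos));
        }
    }
}
```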

Noumenon answered Feb 15 '23 19:02
