 

How to convert a string with “ (ISO-8859-1) characters to normal (UTF-8) characters?

<li>Jain R.K. and Iyengar S.R.K., “Advanced Engineering Mathematicsâ€, Narosa Publications,</li>

I have a lot of raw HTML strings in a database. All of the text contains these weird characters. How can I convert it to normal text before saving it back to the database?

$final = '<li>Jain R.K. and Iyengar S.R.K., “Advanced Engineering Mathematicsâ€, Narosa Publications,</li>';
$final = utf8_encode($final);

$final = htmlspecialchars_decode($final);

$final = html_entity_decode($final, ENT_QUOTES, "UTF-8");

$final = utf8_decode($final);

echo $final;

I tried the code above. It displays correctly in the web browser, but the same weird characters are still saved in the database.

The charset of the database is UTF-8.

muthukrishnan asked Dec 31 '17 19:12



3 Answers

“ is "Mojibake" for “. You could try to avoid the non-ASCII quotes, but that would only delay getting back into trouble.
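To make the mechanism concrete, here is a minimal sketch that reproduces the garbling and then reverses it. It assumes the mbstring extension is available and that the wrong interpretation was Windows-1252 (what browsers and MySQL usually mean by "latin1"); neither assumption comes from the question itself.

```php
<?php
// U+201C (left double quotation mark) is the three bytes E2 80 9C in UTF-8.
$correct = "\u{201C}";

// Simulate the bug: read those UTF-8 bytes as Windows-1252, re-encode as UTF-8.
$garbled = mb_convert_encoding($correct, 'UTF-8', 'Windows-1252');

// Reverse it: encoding the garbled text as Windows-1252 yields the
// original UTF-8 bytes again.
$repaired = mb_convert_encoding($garbled, 'Windows-1252', 'UTF-8');

echo bin2hex($correct), PHP_EOL;                 // e2809c
echo $garbled, PHP_EOL;                          // “
echo var_export($repaired === $correct, true), PHP_EOL; // true
```

This round trip only works while all three garbled characters are still present. The closing quote in the question shows up as a bare `â€`, which suggests its third byte (0x9D, unassigned in Windows-1252) was already lost, so that one cannot be recovered this way.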

You need to use utf8mb4 in your tables and connections. See this for the likely causes of Mojibake.
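As a rough sketch of the connection side, assuming PDO with the MySQL driver: the `charset=utf8mb4` DSN parameter asks the server to use utf8mb4 for the connection. The helper name, host, database name, and credentials below are placeholders, not details from the question.

```php
<?php
// Hypothetical helper: builds a PDO MySQL DSN that requests utf8mb4
// for the connection.
function build_mysql_dsn(string $host, string $dbname): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=utf8mb4', $host, $dbname);
}

$dsn = build_mysql_dsn('localhost', 'example_db');
echo $dsn, PHP_EOL; // mysql:host=localhost;dbname=example_db;charset=utf8mb4

// Connecting is left commented out because it needs a real server
// and real credentials:
// $pdo = new PDO($dsn, 'example_user', 'example_password', [
//     PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
// ]);
```

Setting the connection charset keeps new writes from being garbled; it does not repair rows that were already stored incorrectly.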

Rick James answered Oct 18 '22 07:10


$final = '<li>Jain R.K. and Iyengar S.R.K., “Advanced Engineering Mathematicsâ€, Narosa Publications,</li>';

$final = str_replace("Â", "", $final);
$final = str_replace("’", "'", $final);
$final = str_replace("“", '"', $final);
$final = str_replace('–', '-', $final);
$final = str_replace('â€', '"', $final);

For past data, I replaced the weird characters with UTF-8 characters.

For future data, I set the charset to utf8 in PHP, in the HTML, and on the database connections.
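The same kind of cleanup can be sketched with a single lookup table and `strtr()`, which tries the longest keys first, so the result does not depend on the order of the replacements. This is a variation, not the code above: it restores the typographic characters instead of substituting ASCII ones, it assumes the mbstring extension, and the rule that a bare `â€` was a closing quote is a guess based on the sample string. The function name `mojibake_map` is made up for this example.

```php
<?php
// Hypothetical helper: maps each character's Windows-1252 mojibake
// back to the character itself.
function mojibake_map(array $chars): array
{
    $map = [];
    foreach ($chars as $ch) {
        $map[mb_convert_encoding($ch, 'UTF-8', 'Windows-1252')] = $ch;
    }
    return $map;
}

// Left/right single quotes, left double quote, en dash.
$map = mojibake_map(["\u{2018}", "\u{2019}", "\u{201C}", "\u{2013}"]);

// Assumption: a bare "â€" is a right double quote whose third byte was lost.
$map["\u{00E2}\u{20AC}"] = "\u{201D}";

// Sample input, written with escapes: the garbled opening quote,
// the title, then the truncated closing quote.
$text  = "\u{00E2}\u{20AC}\u{0153}Advanced Engineering Mathematics\u{00E2}\u{20AC}";
$fixed = strtr($text, $map);

echo $fixed, PHP_EOL;
```

Because `strtr()` matches the three-character key for the opening quote before the two-character `â€` key, the opening quote is not mistaken for a truncated closing quote.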

muthukrishnan answered Oct 18 '22 05:10


It is safer to use the ftfy tool to fix the text: https://ftfy.readthedocs.io/en/latest/

mahnunchik answered Oct 18 '22 05:10