A diamond (or square) with a question mark in the middle is not an emoticon, but a "replacement character." It is displayed whenever a character is not recognized in a document or webpage.
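To remove the black diamond with white question mark symbol, two functions are useful. The SUBSTITUTE(<text>, <old text>, <new text>) function substitutes <new text> for <old text> in <text>. The UNICHAR(<numeric value>) function returns the Unicode character referenced by the given <numeric value>. The replacement character is code point 65533 (U+FFFD), so a formula such as SUBSTITUTE(A1, UNICHAR(65533), "") removes it, where A1 is a placeholder for your own text.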
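The same removal can be sketched in PHP, the language used in the rest of this discussion. This is a minimal sketch with a hypothetical helper name, stripReplacementChar. It only helps when U+FFFD is already stored in the text itself; when the symbol appears because the browser decodes the bytes with the wrong charset, the string contains no U+FFFD to remove and the encoding has to be fixed instead.

```php
<?php
// Hypothetical helper: remove every U+FFFD (code point 65533) from a UTF-8 string.
function stripReplacementChar(string $text): string
{
    // "\u{FFFD}" is PHP 7+ syntax for the UTF-8 bytes EF BF BD.
    return str_replace("\u{FFFD}", '', $text);
}

echo stripReplacementChar("caf\u{FFFD} au lait"), "\n"; // prints: caf au lait
```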
If you see that character (� U+FFFD "REPLACEMENT CHARACTER"), it usually means that the text itself is encoded in some single-byte encoding but is being interpreted as one of the Unicode encodings (UTF-8 or UTF-16).
If it were the other way around it would (usually) look something like this: ä.
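Both directions of the mix-up can be reproduced in a few lines. The sketch below assumes the mbstring extension is available; the function names are hypothetical and exist only for illustration.

```php
<?php
// Returns true when $bytes is well-formed UTF-8.
function isValidUtf8(string $bytes): bool
{
    return mb_check_encoding($bytes, 'UTF-8');
}

// Simulates a reader that wrongly treats UTF-8 bytes as ISO-8859-1:
// each byte is taken as one Latin-1 character and re-encoded as UTF-8 for display.
function misreadUtf8AsLatin1(string $utf8): string
{
    return mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');
}

$latin1 = "\xE4";     // "ä" as a single ISO-8859-1 byte
$utf8   = "\xC3\xA4"; // "ä" as two UTF-8 bytes

// A lone 0xE4 byte is not valid UTF-8, so a UTF-8 decoder shows U+FFFD for it.
var_dump(isValidUtf8($latin1)); // bool(false)

// The opposite mistake yields two characters, "Ã" (C3 83) and "¤" (C2 A4).
echo bin2hex(misreadUtf8AsLatin1($utf8)), "\n"; // c383c2a4
```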
Probably the original encoding is ISO-8859-1, also known as Latin-1. You can check this without having to change your script: Browsers give you the option to re-interpret a page in a different encoding -- in Firefox use "View" -> "Character Encoding".
To make the browser use the correct encoding, add an HTTP header like this:
header("Content-Type: text/html; charset=ISO-8859-1");
or put the encoding in a meta tag:
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
Alternatively, you could try to read from the database in another encoding (UTF-8, preferably) or convert the text with iconv().
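A conversion with iconv() might look like the following sketch. It assumes the stored bytes really are ISO-8859-1 and that the iconv extension is enabled; latin1ToUtf8 is a hypothetical wrapper name.

```php
<?php
// Hypothetical wrapper: convert ISO-8859-1 bytes to UTF-8.
function latin1ToUtf8(string $latin1): string
{
    // iconv(from, to, string) returns false on failure.
    $converted = iconv('ISO-8859-1', 'UTF-8', $latin1);
    if ($converted === false) {
        throw new RuntimeException('iconv conversion failed');
    }
    return $converted;
}

echo bin2hex(latin1ToUtf8("\xE4")), "\n"; // c3a4, the UTF-8 encoding of "ä"
```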
I also faced this � issue. So far I have run into three cases where it happened:
substr()
I was using substr() on a UTF-8 string, which cut through multi-byte UTF-8 characters, so the truncated characters could not be displayed correctly. Use mb_substr($utfstring, 0, 10, 'utf-8'); instead.
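The difference between the byte-based and the character-based cut can be seen in a short sketch (assuming the mbstring extension is loaded; the variable names are only illustrative):

```php
<?php
$word = "\u{E4}\u{F6}\u{FC}"; // "äöü": three characters, six bytes in UTF-8

$byteCut = substr($word, 0, 3);             // three bytes: "ä" plus half of "ö"
$charCut = mb_substr($word, 0, 2, 'UTF-8'); // two whole characters: "äö"

var_dump(mb_check_encoding($byteCut, 'UTF-8')); // bool(false): the dangling byte renders as U+FFFD
var_dump(mb_check_encoding($charCut, 'UTF-8')); // bool(true)
```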
htmlspecialchars()
Another problem was using htmlspecialchars() on a UTF-8 string. The fix is to use: htmlspecialchars($utfstring, ENT_QUOTES, 'UTF-8');
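A small wrapper keeps the encoding argument from being forgotten. This is a sketch with a hypothetical helper name, escapeHtml. The ENT_SUBSTITUTE flag is an addition that is not part of the fix above: with it, invalid byte sequences become U+FFFD instead of making htmlspecialchars() return an empty string.

```php
<?php
// Hypothetical helper: escape text for HTML output, treating the input as UTF-8.
function escapeHtml(string $text): string
{
    // ENT_QUOTES escapes both single and double quotes.
    // ENT_SUBSTITUTE replaces invalid UTF-8 sequences with U+FFFD
    // rather than returning an empty string for the whole input.
    return htmlspecialchars($text, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

echo escapeHtml("<b>\u{E4}</b>"), "\n"; // prints: &lt;b&gt;ä&lt;/b&gt;
```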
preg_replace()
Lastly, I found out that preg_replace() can lead to problems with UTF-8. The code $string = preg_replace('/[^A-Za-z0-9ÄäÜüÖöß]/', ' ', $string); for example, transformed the UTF-8 string "F(×)=2×-3" into "F � 2� ". The fix is to use mb_ereg_replace() instead.
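The effect can be reproduced with the sketch below. It assumes the source file is saved as UTF-8 and that mbstring was built with its regex functions; the variable names are illustrative.

```php
<?php
// Make the mb_ereg_* functions treat patterns and subjects as UTF-8.
mb_regex_encoding('UTF-8');

$input = "F(\u{D7})=2\u{D7}-3"; // "F(×)=2×-3"

// Byte-oriented: without the u modifier the class is matched byte by byte,
// so the two bytes of "×" are split and only one of them is replaced.
$broken = preg_replace('/[^A-Za-z0-9ÄäÜüÖöß]/', ' ', $input);

// Multibyte-aware: whole characters are matched and replaced.
// Note that mb_ereg_replace() takes the pattern without delimiters.
$fixed = mb_ereg_replace('[^A-Za-z0-9ÄäÜüÖöß]', ' ', $input);

var_dump(mb_check_encoding($broken, 'UTF-8')); // bool(false)
var_dump(mb_check_encoding($fixed, 'UTF-8'));  // bool(true)
```

Adding the u modifier to the original pattern ('/[^A-Za-z0-9ÄäÜüÖöß]/u') is another option, since it makes preg_replace() treat both pattern and subject as UTF-8.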
I hope this additional information will help to get rid of such problems.
This is a charset issue. As such, it can have gone wrong on many different levels, but most likely the strings in your database are UTF-8 encoded and you are presenting them as ISO-8859-1, or the other way around.
The proper way to fix this problem is to get your character sets straight. The simplest strategy, since you're using PHP, is to use iso-8859-1 throughout your application. To do this, you must ensure, among other things, that the response declares charset=iso-8859-1 (for example through header) and that the accept-charset attribute on your <form> elements agrees with it.
If you already have data in your database, you should be aware that it is probably messed up already. If you are not yet in production, just wipe it all and start over. Otherwise you'll have to do some data cleanup.
When a web server serves a file (an HTML document), it sends some information that isn't presented directly in the browser. These are known as HTTP headers. One such header is the Content-Type header, which specifies the MIME type of the file (e.g. text/html) as well as the encoding (aka charset).
While most web servers will send a Content-Type header with charset info, it's optional. If it isn't present, the browser will instead interpret any meta tags with http-equiv="Content-Type". It's important to realise that the meta tag is only interpreted if the web server doesn't send the header. In practice this means that it's only used if the page is saved to disk and then opened from there.
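One way to keep the header and the meta tag from disagreeing is to generate both from a single value. This is a minimal sketch; contentTypeHeader and contentTypeMeta are hypothetical helper names, and header() has to be called before any output is sent.

```php
<?php
// Hypothetical helpers: derive both declarations from one charset value.
function contentTypeHeader(string $charset): string
{
    return "Content-Type: text/html; charset={$charset}";
}

function contentTypeMeta(string $charset): string
{
    $safe = htmlspecialchars($charset, ENT_QUOTES, 'UTF-8');
    return "<meta http-equiv=\"Content-Type\" content=\"text/html; charset={$safe}\">";
}

$charset = 'iso-8859-1';

// The HTTP header is what the browser uses when the page comes from a server.
if (!headers_sent()) {
    header(contentTypeHeader($charset));
}

// The meta tag is the fallback, e.g. for a copy saved to disk; echo it inside <head>.
$metaTag = contentTypeMeta($charset);
```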
This page has a very good explanation of these things.
As mentioned in earlier answers, this happens because your text was written to the database in iso-8859-1 or some other non-UTF-8 encoding.
So you just need to convert the data to UTF-8 before outputting it. Note that utf8_encode() assumes its input is ISO-8859-1.
$text = "string from database";
$text = utf8_encode($text);
echo $text;
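utf8_encode() is deprecated as of PHP 8.2, so newer code may prefer mb_convert_encoding(). The sketch below assumes the stored data is ISO-8859-1 and that mbstring is available. The helper name ensureUtf8 is hypothetical, and the validity check is only a heuristic to avoid double-encoding text that is already UTF-8.

```php
<?php
// Hypothetical helper: return $text as UTF-8, converting only when needed.
function ensureUtf8(string $text, string $assumedSource = 'ISO-8859-1'): string
{
    // Already well-formed UTF-8: leave it alone to avoid double-encoding.
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text;
    }
    // Otherwise treat the bytes as the assumed legacy encoding.
    return mb_convert_encoding($text, 'UTF-8', $assumedSource);
}

$text = ensureUtf8("\xE4");  // Latin-1 "ä" becomes UTF-8 "ä"
echo bin2hex($text), "\n";   // c3a4
```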