
Error in encoding mysql -> How can I reconvert it to something else?

I started a website some time ago with the wrong charset in both the DB and the site: the HTML was set to ISO..., the DB to Latin..., and the pages were saved as Western Latin... In short, a big mess.

The site is in French, so I wrote a function that replaced every accented character such as "é" with its HTML entity "&eacute;", which solved the issue temporarily.

I have since learned a lot more about programming: my files are now saved as Unicode UTF-8, the HTML is declared as UTF-8, and my MySQL table columns are set to utf8_encoding...

I tried to change the "&eacute;" entities back to "é", but I get the usual charset issues: (?) or weird characters like "â", both in MySQL and when the page is displayed.

I need a way to update my SQL data through a function that cleans the strings so that everything finally goes back to normal. At the moment my function looks like this, but it doesn't work:

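function stripAcc3($value){

    $ent = array(
        '&agrave;' => 'à',
        '&acirc;'  => 'â',
        '&ugrave;' => 'ù',
        '&ucirc;'  => 'û',
        '&eacute;' => 'é',
        '&egrave;' => 'è',
        '&ecirc;'  => 'ê',
        '&ccedil;' => 'ç',
        '&Ccedil;' => 'Ç',
        "&icirc;"  => 'î',
        "&Iuml;"   => 'ï',
        "&ouml;"   => 'ö',
        "&ocirc;"  => 'ô',
        "&euml;"   => 'ë',
        "&uuml;"   => 'ü',
        "&Auml;"   => 'ä',
        "&euro;"   => '€',
        "&prime;"  => "'",
        "é"        => "é"
    );

    return strtr($value, $ent);
}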

Any help welcome. Thanks in advance. If you need code, please tell me which part.

UPDATE

If you want the bounty points, I need detailed instructions on how to do it. Thanks.

denislexic asked May 10 '11 13:05



1 Answer

Try using the following function instead; it should handle the issues you described:

function makeStringUTF8($data)
{
    if (is_string($data) === true)
    {
        // has html entities?
        if (strpos($data, '&') !== false)
        {
            // if so, revert back to normal
            $data = html_entity_decode($data, ENT_QUOTES, 'UTF-8');
        }

        // make sure it's UTF-8
        if (function_exists('iconv') === true)
        {
            return @iconv('UTF-8', 'UTF-8//IGNORE', $data);
        }

        else if (function_exists('mb_convert_encoding') === true)
        {
            return mb_convert_encoding($data, 'UTF-8', 'UTF-8');
        }

        return utf8_encode(utf8_decode($data));
    }

    else if (is_array($data) === true)
    {
        $result = array();

        foreach ($data as $key => $value)
        {
            $result[makeStringUTF8($key)] = makeStringUTF8($value);
        }

        return $result;
    }

    return $data;
}
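
As a quick check of the entity-decoding step, here is a minimal sketch using a made-up sample string (`$stored` is hypothetical, not taken from the question's data). It relies only on `html_entity_decode` and `mb_check_encoding`, and assumes the `mbstring` extension is loaded:

```php
<?php
// Hypothetical sample of what the old accent-replacement function left in the database.
$stored = 'Caf&eacute; cr&egrave;me &agrave; 2&euro;';

// Same call as in makeStringUTF8(): decode named entities into UTF-8 characters.
$decoded = html_entity_decode($stored, ENT_QUOTES, 'UTF-8');

echo $decoded, PHP_EOL;                          // Café crème à 2€
var_dump(mb_check_encoding($decoded, 'UTF-8'));  // bool(true)
```

If the decoded text still shows up garbled in the browser or in MySQL, the connection or page charset is the more likely culprit, not the decoding itself.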

Regarding the specific instructions of how to use this, I suggest the following:

  1. export the contents of your old Latin database (I hope you still have it) as an SQL/CSV dump *
  2. run the above function on the file contents and save the result to another file
  3. import the file generated in the previous step into the UTF-8 aware schema / database

* Example:

file_put_contents('utf8.sql', makeStringUTF8(file_get_contents('latin.sql')));
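
Loading a whole dump with `file_get_contents` can exhaust memory on a large export, and a dump taken from a Latin database may contain raw ISO-8859-1 bytes that a UTF-8 to UTF-8 pass with `//IGNORE` would simply drop. The sketch below is one possible variant under those assumptions: it streams the file line by line, converts from a source encoding you supply, then decodes entities. The function name `convertDumpToUtf8`, its parameters, and the default encoding are hypothetical, and it assumes the `mbstring` extension is available. It uses `ENT_NOQUOTES` so that `&quot;` and `&#039;` are left alone inside quoted SQL strings.

```php
<?php
/**
 * Hypothetical helper: stream a dump file, convert it to UTF-8 and decode
 * HTML entities line by line. $sourceEncoding is an assumption about how the
 * dump was written (e.g. 'ISO-8859-1' for a Latin-1 export).
 */
function convertDumpToUtf8(string $inPath, string $outPath, string $sourceEncoding = 'ISO-8859-1'): int
{
    $in = fopen($inPath, 'rb');
    $out = fopen($outPath, 'wb');
    if ($in === false || $out === false) {
        throw new RuntimeException('Could not open the input or output file.');
    }

    $lines = 0;
    while (($line = fgets($in)) !== false) {
        // Re-encode raw bytes first so accented characters survive.
        $line = mb_convert_encoding($line, 'UTF-8', $sourceEncoding);
        // Then turn entities such as &eacute; into real characters.
        // ENT_NOQUOTES leaves &quot; and &#039; untouched.
        $line = html_entity_decode($line, ENT_NOQUOTES, 'UTF-8');
        fwrite($out, $line);
        $lines++;
    }

    fclose($in);
    fclose($out);

    return $lines; // number of lines written
}

// Usage with the file names from the example above:
// convertDumpToUtf8('latin.sql', 'utf8.sql');
```

If the dump is already pure ASCII plus entities, passing 'UTF-8' as the source encoding should leave the bytes unchanged and only the entity decoding will have an effect.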

This should do it; if it doesn't, let me know.

Alix Axel answered Nov 15 '22 03:11
