
How to deal with accents and strange characters in a database?

I'm trying to save Spanish words with accents in my database, but it won't work. I have already tried:

1) Changing the collation of the tables and columns to utf8_spanish_ci and utf8_unicode_ci.

2) Adding a meta tag to the page head:

<meta http-equiv="Content-type" content="text/html; charset=utf-8" />

3) Adding

header("Content-Type: text/html;charset=utf-8");

in a PHP tag.

This works on an XAMPP server on my laptop, but when I upload the database to a login monster server, the accents are not saved properly.

Edit: this is the connection I'm using:

    private function Connect()
    {
        //$this->settings = parse_ini_file("settings.ini.php");
        try 
        {
            # Read settings from INI file, set UTF8
            $this->pdo = new PDO('mysql:host=localhost;dbname=xxxxx;charset=utf8', 'xxxxx', 'xxxxxx', array(PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES utf8"));

            # We can now log any exceptions on Fatal error. 
            $this->pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

            # Disable emulation of prepared statements, use REAL prepared statements instead.
            $this->pdo->setAttribute(PDO::ATTR_EMULATE_PREPARES, false);

            # Connection succeeded, set the boolean to true.
            $this->bConnected = true;
        }
        catch (PDOException $e) 
        {
            # Write into log
            echo $this->ExceptionLog($e->getMessage());
            die();
        }
    }

Edit:

I can't save accents; they show up as strange characters, e.g. á is displayed as Ã¡.

2one2 asked Oct 19 '15 17:10



1 Answer

Collation only affects how text is sorted and compared; it has no effect on the character set of the stored data.

I would recommend this configuration:

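  1. Set the character set once, for the whole database, so you don't have to set it on each table separately. The character set is inherited as a default from the database by its tables and from the tables by their columns. These defaults apply when a table or column is created, so existing tables keep their character set until they are altered. Use utf8 as the character set (on MySQL 5.5.3 or later, utf8mb4 is the safer choice, because MySQL's utf8 stores at most three bytes per character and cannot hold characters outside the Basic Multilingual Plane, such as emoji).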

  2. Set the character set for the DB connection. Execute these queries after you connect to the database:

    SET CHARACTER SET 'utf8';
    SET NAMES 'utf8';
    
  3. Set the character set for the page, using an HTTP header and/or an HTML meta tag. One of these is enough. Use utf-8 as the charset.

This should be enough.

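If you want proper sorting of Spanish strings, set the collation for the whole database. utf8_spanish_ci should work (ci means case-insensitive); it treats ñ as a separate letter between n and o. Without a language-aware collation, accented characters may not sort where Spanish readers expect. With a binary collation, for instance, they sort after all unaccented letters.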
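To make the sorting difference concrete, here is a small Python sketch (not MySQL itself, only an analogy). Plain `sorted()` compares code points, which behaves roughly like a binary collation. The helper `strip_accents` is a hypothetical name introduced here to approximate an accent- and case-insensitive comparison. It does not reproduce the special handling of ñ in utf8_spanish_ci.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Remove combining accent marks (hypothetical helper, for illustration only)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


words = ["zapato", "árbol", "mesa", "Ágil"]

# Code-point order: accented letters have higher code points than a-z,
# so they end up after every unaccented word.
binary_order = sorted(words)

# Rough accent- and case-insensitive order: compare the unaccented,
# case-folded form of each word instead of its raw code points.
insensitive_order = sorted(words, key=lambda w: strip_accents(w).casefold())

print(binary_order)
print(insensitive_order)
```

In the first list the accented words come last; in the second they sort next to the other words starting with "a".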

Note: it's possible that the data you already have in a table is broken, because your character set configuration was wrong when it was inserted. Check it with a DB client first to rule this out. If it's broken, re-insert your data with the right character set configuration.
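If you export a few values to check them, a heuristic like the following Python sketch can flag text that looks like UTF-8 that was read as latin1. The function names are hypothetical, and the sketch assumes the mix-up was exactly UTF-8 read as latin1 (ISO-8859-1). It is only a heuristic: a value that legitimately contains a sequence such as "Ã¡" would be flagged as well.

```python
def undo_double_encoding(text: str) -> str:
    """Reverse a UTF-8-read-as-latin1 mix-up; return text unchanged if it does not apply."""
    try:
        # Turn each character back into the single byte it came from,
        # then decode those bytes as the UTF-8 they originally were.
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not representable in latin1, or not valid UTF-8 afterwards:
        # the text was most likely stored correctly.
        return text


def looks_broken(text: str) -> bool:
    """True if the text changes when the mix-up is reversed (heuristic only)."""
    return undo_double_encoding(text) != text


broken = "canci\u00c3\u00b3n"   # what a broken "canción" looks like in a DB client
print(looks_broken(broken))             # flagged as broken
print(undo_double_encoding(broken))     # the text that should be re-inserted
print(looks_broken("canci\u00f3n"))     # a correctly stored value is left alone
```

Values flagged this way are the ones worth re-inserting once the connection character set is correct.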

How character sets work in a database

  • objects have a character set attribute, which is either set explicitly or inherited as a default (server > database > table > column), so the simplest option is to set it for the whole database

  • the client connection also has a character set attribute, which tells the database in which encoding you're sending the data

If the character sets of the client connection and the target object differ, the data you send to the database is automatically converted from the connection's character set to the object's character set.

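So if, for example, your data is in utf8 but the client connection is set to latin1, the database will corrupt the data: it treats every byte of the UTF-8 input as a separate latin1 character and converts each one on its own. The two bytes of á are then stored as the two characters Ã¡.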
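The byte-level effect can be reproduced outside the database. The following Python sketch is an illustration of the mechanism, not of MySQL's internals: it encodes á as UTF-8, misreads the bytes as latin1, and re-encodes the result as UTF-8, which is what a utf8 column receiving data over a latin1 connection ends up holding.

```python
original = "á"

# What the application actually sends: two bytes of UTF-8.
utf8_bytes = original.encode("utf-8")

# What the server believes it received when the connection is latin1:
# each byte is taken as one separate character.
misread = utf8_bytes.decode("latin-1")

# What gets written to a utf8 column: the two misread characters,
# each converted to UTF-8, so four bytes instead of two.
stored = misread.encode("utf-8")

print(utf8_bytes)   # b'\xc3\xa1'
print(misread)      # Ã¡
print(stored)       # b'\xc3\x83\xc2\xa1'
```

Reading the column back over a utf8 connection then returns Ã¡, which matches the symptom described in the question.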

David Ferenczy Rogožan answered Oct 06 '22 08:10
