
Remove HTML entities from a database

Due to errors made by my predecessors, a (MySQL) database I would like to use contains a lot of HTML entities (e.g. &euro; instead of €).

As the database should contain raw data (a database shouldn't have anything to do with HTML), I want to remove them from the DB and store the data as proper UTF-8; the collation is already UTF-8.

What would be a good way to fix this? The only thing I can think of is to write a PHP script that fetches all the data, runs it through html_entity_decode() and writes it back. It's doable since it's a one-time operation and the DB is only about 100 MB, but it's still less than optimal.

Any ideas?

dtech asked Oct 23 '22 17:10


2 Answers

Since no one could provide a satisfying SQL-only solution, I solved it with a script similar to this one. Note that it only works if every table you run it on has a primary key, but that will usually be the case.

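<?php
// Specify which columns need to be de-entitized
$affected = array(
    'table1' => array('column1', 'column2'),
    'table2' => array('column1', 'column2'),
);

// Make database connection; utf8mb4 so decoded characters are sent as UTF-8
$db = new PDO("mysql:dbname=yourdb;host=yourhost;charset=utf8mb4", "user", "pass");
// Throw exceptions on errors so a failed table is never committed
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

foreach ($affected as $table => $columns) {
    // Start a transaction for each table (only effective on transactional engines such as InnoDB)
    $db->beginTransaction();

    // Find the table's primary key column (assumes a single-column key). PHP 5.4 syntax!
    $pk = $db->query("SHOW INDEX FROM " . $table . " WHERE Key_name = 'PRIMARY'")
             ->fetch(PDO::FETCH_ASSOC)['Column_name'];

    foreach ($columns as $column) {
        // Construct a prepared statement for this column
        $ps = $db->prepare("UPDATE " . $table . " SET " . $column . " = ? WHERE " . $pk . " = ?");

        // Go through all rows; FETCH_NUM yields exactly two values, matching the two placeholders
        $rows = $db->query("SELECT " . $column . ", " . $pk . " FROM " . $table, PDO::FETCH_NUM);
        foreach ($rows as $row) {
            if ($row[0] === null) {
                continue;  // Leave NULL values untouched
            }
            $row[0] = html_entity_decode($row[0], ENT_QUOTES | ENT_HTML5, 'UTF-8');  // Actual processing
            $ps->execute($row);
        }
    }

    // Everything went well for this table, commit
    $db->commit();
}
?>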
dtech answered Oct 27 '22 11:10


I think you need to create a MySQL procedure (with a SELECT loop and an UPDATE that uses REPLACE), along these lines:
REPLACE(TextString, '&apos;', '''');
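For a handful of known entities, the REPLACE idea may not even need a procedure or a loop. The following is only a rough sketch: table1 and column1 are the placeholder names from the script in the other answer, and the list of entities is an assumption about what the data contains. Each named entity needs its own statement, and numeric entities such as &#8364; are not handled at all, which is why this does not amount to a full html_entity_decode() replacement.

```sql
-- Sketch only: table1 / column1 are placeholder names, adjust to your schema.
-- Each UPDATE touches only the rows that still contain the entity.
UPDATE table1 SET column1 = REPLACE(column1, '&euro;', '€') WHERE column1 LIKE '%&euro;%';
UPDATE table1 SET column1 = REPLACE(column1, '&apos;', '''') WHERE column1 LIKE '%&apos;%';
UPDATE table1 SET column1 = REPLACE(column1, '&quot;', '"') WHERE column1 LIKE '%&quot;%';
UPDATE table1 SET column1 = REPLACE(column1, '&lt;', '<') WHERE column1 LIKE '%&lt;%';
UPDATE table1 SET column1 = REPLACE(column1, '&gt;', '>') WHERE column1 LIKE '%&gt;%';
-- Decode &amp; last, so that text such as "&amp;lt;" ends up as "&lt;" rather than "<".
UPDATE table1 SET column1 = REPLACE(column1, '&amp;', '&') WHERE column1 LIKE '%&amp;%';
```

Running the statements inside a transaction (or on a copy of the table first) makes it easy to check the result before keeping it.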

Dezigo answered Oct 27 '22 10:10