Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Efficient way to look up value based on a key in php [duplicate]

With a list of around 100,000 key/value pairs (both string, mostly around 5-20 characters each) I am looking for a way to efficiently find the value for a given key.

This needs to be done in a php website. I am familiar with hash tables in java (which is probally what I would do if working in java) but am new to php.

I am looking for tips on how I should store this list (in a text file or in a database?) and search this list.

The list would have to be updated occasionally but I am mostly interested in look up time.

like image 887
user552007 Avatar asked Dec 07 '22 00:12

user552007


1 Answers

You could do it as a straight PHP array, but Sqlite is going to be your best bet for speed and convenience if it is available.

PHP array

Just store everything in a php file like this:

<?php
return array(
    'key1'=>'value1',
    'key2'=>'value2',
    // snip
    'key100000'=>'value100000',
);

Then you can access it like this:

<?php
$s = microtime(true); // gets the start time for benchmarking

$data = require('data.php');
echo $data['key2'];

var_dump(microtime(true)-$s); // dumps the execution time

Not the most efficient thing in the world, but it's going to work. It takes 0.1 seconds on my machine.

Sqlite

PHP should come with sqlite enabled, which will work great for this kind of thing.

This script will create a database for you from start to finish with similar characteristics to the dataset you describe in the question:

<?php
// this will *create* data.sqlite if it does not exist. Make sure "/data" 
// is writable and *not* publicly accessible.
// the ATTR_ERRMODE bit at the end is useful as it forces PDO to throw an
// exception when you make a mistake, rather than internally storing an
// error code and waiting for you to retrieve it.
$pdo = new PDO('sqlite:'.dirname(__FILE__).'/data/data.sqlite', null, null, array(PDO::ATTR_ERRMODE=>PDO::ERRMODE_EXCEPTION));

// create the table if you need to
$pdo->exec("CREATE TABLE stuff(id TEXT PRIMARY KEY, value TEXT)");

// insert the data
$stmt = $pdo->prepare('INSERT INTO stuff(id, value) VALUES(:id, :value)');
$id = null;
$value = null;

// this binds the variables by reference so you can re-use the prepared statement
$stmt->bindParam(':id', $id);
$stmt->bindParam(':value', $value);

// insert some data (in this case it's just dummy data)
for ($i=0; $i<100000; $i++) {
    $id = $i;
    $value = 'value'.$i;
    $stmt->execute();
}

And then to use the values:

<?php
$s = microtime(true);

$pdo = new PDO('sqlite:'.dirname(__FILE__).'/data/data.sqlite', null, null, array(PDO::ATTR_ERRMODE=>PDO::ERRMODE_EXCEPTION));

$stmt = $pdo->prepare("SELECT * FROM stuff WHERE id=:id");
$stmt->bindValue(':id', 5);
$stmt->execute();

$value = $stmt->fetchColumn(1);

var_dump($value);

// the number of seconds it took to do the lookup
var_dump(microtime(true)-$s);

This one is waaaay faster. 0.0009 seconds on my machine.

MySQL

You could also use MySQL for this instead of Sqlite, but if it's just one table with the characteristics you describe, it's probably going to be overkill. The above Sqlite example will work fine using MySQL if you have a MySQL server available to you. Just change the line that instantiates PDO to this:

$pdo = new PDO('mysql:host=your.host;dbname=your_db', 'user', 'password', array(PDO::ATTR_ERRMODE=>PDO::ERRMODE_EXCEPTION));

The queries in the sqlite example should all work fine with MySQL, but please note that I haven't tested this.

Let's get a bit crazy: Filesystem madness

Not that the Sqlite solution is slow (0.0009 seconds!), but this about four times faster on my machine. Also, Sqlite may not be available, setting up MySQL might be out of the question, etc.

In this case, you can also use the file system:

<?php
$s = microtime(true); // more hack benchmarking

class FileCache
{
    protected $basePath;

    public function __construct($basePath)
    {
        $this->basePath = $basePath;
    }

    public function add($key, $value)
    {
        $path = $this->getPath($key);
        file_put_contents($path, $value);
    }

    public function get($key)
    {
        $path = $this->getPath($key);
        return file_get_contents($path);
    }

    public function getPath($key)
    {
        $split = 3;

        $key = md5($key);
        if (!is_writable($this->basePath)) {
            throw new Exception("Base path '{$this->basePath}' was not writable");
        }
        $path = array();
        for ($i=0; $i<$split; $i++) {
            $path[] = $key[$i];
        }
        $dir = $this->basePath.'/'.implode('/', $path);
        if (!file_exists($dir)) {
            mkdir($dir, 0777, true);
        }
        return $dir.'/'.substr($key, $split);
    }
}

$fc = new FileCache('/tmp/foo');

/*
// use this crap for generating a test example. it's slow to create though.
for ($i=0;$i<100000;$i++) {
    $fc->add('key'.$i, 'value'.$i);
}
//*/

echo $fc->get('key1', 'value1');

var_dump(microtime(true)-$s);

This one takes 0.0002 seconds for a lookup on my machine. This also has the benefit of being reasonably constant regardless of the cache size.

like image 76
Shabbyrobe Avatar answered Dec 10 '22 11:12

Shabbyrobe