Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Link two arrays and use them like MySQL DB

Tags:

php

Im trying to help a mate and shes sticking to arrays so no database use this time :( The countys page shows all the countys and when one is clicked it is taken to a different page displaying the walks for that county.

$countys = array();
$countys[101] = array(
                "name" => "Armagh",
                "img" => "css/images/map.jpg",
                "largeimg" => "css/images/banmap.jpg"
            );

$countys[102] = array(
                "name" => "Antrim",
                "img" => "css/images/map.jpg",
                "largeimg" => "css/images/banmap.jpg"
            );

$walks = array();
$walks[1] = array(
    "name" => "Portadown Walk",
    "county" => "Armagh",
    "img" => "css/images/map.jpg",
    "location" => "Portadown", 
    "largeimg" => "css/images/banmap.jpg"
);

$walks[2] = array(
    "name" => "Antrim Walk",
    "county" => "Antrim",
    "img" => "css/images/map.jpg",
    "location" => "Causeway"
);

Would a multidimensional array work ok or maybe a for/while loop to check if the $walk['county'] is equal to the county?

like image 934
Pierce McGeough Avatar asked Dec 04 '22 11:12

Pierce McGeough


1 Answers

The tables

If you want to use an array as a database, fairly obviously the best way to model a table is using a 2D array:

$counties = array();
$countiesKey = 0;

// add a row
$counties[++$countiesKey] = array(
    "name"     => "Armagh",
    "img"      => "css/images/map.jpg",
    "largeimg" => "css/images/banmap.jpg"
);
// and another...
$counties[++$countiesKey] = array(
    "name"     => "Antrim",
    "img"      => "css/images/map.jpg",
    "largeimg" => "css/images/banmap.jpg"
);

This is roughly equivalent to the following table definition (for the sake of simplicity we'll use MySQL for the comparison and assume all string fields to be VARCHAR(1024)):

CREATE TABLE counties (
    id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(1024),
    img VARCHAR(1024),
    largeimg VARCHAR(1024)
);

Unique indexes

So we're using the array index as our primary key. But in order to search the database based on any "column" other than the primary key it would require an O(n) operation: we need to iterate the entire table and inspect the relevant value of every row. This is where indexes come into play. What if we wanted to add an index on our county name as well? Well, we can use a separate associative array:

$countiesNameIndex = array();
$countiesNameIndex['Armagh'] = 1;
$countiesNameIndex['Antrim'] = 2;

Associative arrays are implemented as a hash table, so accessing an element by key is roughly O(1). This allows us to drastically speed up access to rows when we are search by county name:

$search = 'Antrim';
$result = array();
if (isset($countiesNameIndex[$search])) {
    $result[$countiesNameIndex[$search]] = $counties[$countiesNameIndex[$search]];
}
return $result;

This index can be maintained dynamically as rows are added and removed:

// Insert a row
$row = array( /* row data */ );
if (isset($countiesNameIndex[$row['name']])) {
    // insert fails, duplicate value in column with unique index
}
$counties[++$countiesKey] = $row;
$countiesNameIndex[$row['name']] = $countiesKey;

// Delete a row
$idOfRowToDelete = 2;
if (isset($counties[$idOfRowToDelete])) {
    unset(
        $countiesNameIndex[$counties[$idOfRowToDelete]['name']], 
        $counties[$idOfRowToDelete]
    );
}

This indexing approach will drastically speed up data access as the data set gets larger.


Non-clustered indexes

Let's just quickly look at how we would implement a non-unique index that does not carry information about order of the rows it references - the implementation is very similar. This would be a little slower than our unique index, but noticeably faster than iterating the entire data set:

$countiesImgIndex = array();

// Insert a row
// INSERT INTO counties ( ... ) VALUES ( ... )
$row = array( /* row data */ );
if (!isset($countiesImgIndex[$row['img']])) {
    $countiesImgIndex[$row['img']] = array();
}
$counties[++$countiesKey] = $row;
$countiesImgIndex[$row['img']][] = $countiesKey;

// Search using the index
// SELECT * FROM counties WHERE img = 'css/images/map.jpg'
$search = 'css/images/map.jpg';
$result = array();
if (isset($countiesImgIndex[$search])) {
    foreach ($countiesImgIndex[$search] as $rowId) {
        $result[$rowId] = $counties[$rowId];
    }
}
return $result;

// Delete a row
// DELETE FROM counties WHERE id = 2
$idOfRowToDelete = 2;
if (isset($counties[$idOfRowToDelete])) {
    $key = array_search($idOfRowToDelete, $countiesImgIndex[$counties[$idOfRowToDelete]['img']]);
    if ($key !== false) {
        array_splice($countiesImgIndex[$counties[$idOfRowToDelete]['img']], $key, 1);
    }
    unset($counties[$idOfRowToDelete]);
}

Using multiple indexes

We can even use these indexes to perform more complex operations - consider how to implement the SQL query

SELECT *
FROM counties
WHERE name = 'Antrim'
  AND img  = 'css/images/map.jpg'

Firstly we look at the most specific index (the unique index):

$result = array();

$nameSearch = 'Antrim';
$imgSearch = 'css/images/map.jpg';

if (!isset($countiesNameIndex[$nameSearch])) {
    return $result;
}

Next we check if that row matches the other condition:

if ($counties[$countiesNameIndex[$nameSearch]]['img'] === $imgSearch) {
    $result[$countiesNameIndex[$nameSearch]]
        = $counties[$countiesNameIndex[$nameSearch]];
}

return $result;

You can see that in this case, we only needed to use 1 index, because one of the columns being queried has a unique index. This means that we can go straight to the only row that matters and check that it matches the condition. Now lets imagine that we have an index on another, non unique column - largeImg. This operation is a little more complex, but we can take a shortcut by using array_intersect():

$result = array();

$imgSearch = 'css/images/map.jpg';
$largeImgSearch = 'css/images/banmap.jpg';

if (!isset($countiesImgIndex[$imgSearch], $countiesLargeImgIndex[$largeImgSearch])) {
    return $result;
}

return array_intersect(
    $counties[$countiesImgIndex[$imgSearch]], 
    $counties[$countiesLargeImgIndex[$largeImgSearch]]
);

Foreign keys and joining tables

But what about when we start wanting to join with another table? Well again, it's much like we'd do it in SQL. Let's imagine we have the the following SQL table definition:

CREATE TABLE walks (
    id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(1024),
    location VARCHAR(1024),
    county INT
);

Obviously we start with another array, and insert some rows:

$walks = array();
$walksKey = 0;

$walks[++$walksKey] = array(
    "name" => "Portadown Walk",
    "county" => 1,
    "location" => "Portadown", 
);
$walks[++$walksKey] = array(
    "name" => "Antrim Walk",
    "county" => 2,
    "location" => "Causeway"
);

Pretty obvious what's going on there, the county column references the ID of the row in the $counties table. Incidentally, the reason we use a counter to track the ID instead of using $arr[] = assignment syntax is two-fold: firstly, it ensures that IDs will always be constant when rows are deleted from the table, and secondly it makes it easier (less computationally expensive) to extract the ID of the last inserted row - which would help when creating a complex table structure with foreign keys, as you see here.

Now let's look at relating this data together. Imagine we ran this SQL query:

SELECT c.*, w.*
FROM walks w
JOIN counties c ON w.county = c.id
LIMIT 0, 10

This can be implemented as follows:

$result = array();
$i = 0;
foreach ($walks as $walkId => $walksRow) {
    $result[$walkId] = array_merge($counties[$walksRow['county']], $walksRow);
    if (++$i == 10) {
        break;
    }
}
return $result;

Now you may have spotted an issue with this: both tables contain a column called name. The above code would return the value of name from the walks table for each row. You can easily adjust this behaviour, but exactly how this would be implemented would depend on the result you want.


Ordering result sets

PHP provides a function that does most of the work for you here - array_multisort(). The most important point to note is that the order should be applied after you have extracted the result rows, to minimise the number of required operations.

SELECT c.*, w.*
FROM walks w
JOIN counties c ON w.county = c.id
ORDER BY w.location ASC
// Collect the result set in $result as above

$location = array();

foreach ($result as $row) {
    $location[] = $row['name'];
}
array_multisort($location, SORT_ASC, $result);

return $result;

Hopefully the above examples should begin to demonstrate some of the logic that can be used to implement some of the features on an RDBMS using PHP arrays. Certain fairly simple optimisations can be made that will keep these operations relatively inexpensive even as the data set grows.

like image 101
DaveRandom Avatar answered Dec 09 '22 15:12

DaveRandom