How big can a php array be before memory concerns should be raised?

Our app currently works like this:

class myClass{

    private $names = array();

    function getNames($ids = array()){
         $lookup = array();

         foreach($ids as $id)
             if (!isset($this->names[$id]))
                $lookup[] = $id;

         if(!empty($lookup)){
              $result;//query database for names where id in $lookup
                      // now contains associative array of id => name pairs
              // array_replace() preserves the id keys; array_merge() would renumber integer keys
              $this->names = array_replace($this->names, $result);
         }

         $result = array();
         foreach($ids as $id)
             $result[$id] = $this->names[$id];

         return $result;
    }
}

This works fine, except that it can still (and often does) result in many queries (400+ in this instance).

So, I am thinking of simply querying the database and populating the $this->names array with every name from the database.

But at how many database entries should I start worrying about memory when doing this? (The database column is varchar(100).)

asked Aug 29 '12 04:08 by Hailwood



1 Answer

How much memory do you have? And how many concurrent users does your service generally support during peak access times? These are pertinent pieces of information. Without them any answer is useless. Generally, this is a question easily solved by load testing. Then, find the bottlenecks and optimize. Until then, just make it work (within reason).

But ...

If you really want an idea of what you're looking at ...

If we assume you aren't storing multibyte characters, 400 names * 100 chars (assuming every name maxes out your column limit) comes to 40,000 bytes, or ~40 KB of raw string data. Seems way too insignificant to worry about, doesn't it?
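If you would rather measure than estimate, here is a minimal sketch that fills an array with distinct 100-byte strings and reports the difference in memory_get_usage(). The function name measureNamesMemory and the entry counts are made up for illustration, and it assumes PHP 7 or later with the default Zend memory manager; the exact figures depend on your PHP version and build.

```php
<?php
// Rough measurement: how much memory does PHP use for $count names of $length bytes each?
// measureNamesMemory() is a hypothetical helper written for this illustration.
function measureNamesMemory(int $count, int $length = 100): int
{
    $before = memory_get_usage();

    $names = array();
    for ($id = 1; $id <= $count; $id++) {
        // Build a distinct string per entry so the entries cannot share one string value.
        $names[$id] = str_pad((string) $id, $length, 'x');
    }

    $used = memory_get_usage() - $before;
    unset($names);

    return $used;
}

foreach (array(400, 4000, 40000) as $count) {
    $bytes = measureNamesMemory($count);
    printf("%6d names: raw %8d bytes, measured %8d bytes\n", $count, $count * 100, $bytes);
}
```

The measured number comes out higher than the raw count * 100 figure, because each string and each array slot carries its own bookkeeping on top of the character data.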

Obviously you'll get other overhead from PHP to hold the data structure itself. Could you store things more efficiently using a data structure like SplFixedArray instead of a plain array? Probably -- but then you lose the highly optimized array_* functions that you'd otherwise have available to manipulate the list.
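To see whether SplFixedArray actually buys you anything on your setup, you could compare the two directly. This is only a sketch: memoryFor is a hypothetical helper, the count of 4000 is arbitrary, and it assumes PHP 7 or later. Note that SplFixedArray only accepts integer indexes from 0 to size - 1, so it only fits if your ids are dense or you map them to positions yourself.

```php
<?php
// Compare memory for a plain array and an SplFixedArray holding the same names.
// memoryFor() is a hypothetical helper written for this illustration.
function memoryFor(callable $build): int
{
    $before = memory_get_usage();
    $data = $build();                 // keep the structure alive while measuring
    $used = memory_get_usage() - $before;
    unset($data);

    return $used;
}

$count = 4000;

$plain = memoryFor(function () use ($count) {
    $names = array();
    for ($i = 0; $i < $count; $i++) {
        $names[$i] = str_pad((string) $i, 100, 'x');
    }
    return $names;
});

$fixed = memoryFor(function () use ($count) {
    $names = new SplFixedArray($count);
    for ($i = 0; $i < $count; $i++) {
        $names[$i] = str_pad((string) $i, 100, 'x');
    }
    return $names;
});

printf("plain array: %d bytes, SplFixedArray: %d bytes\n", $plain, $fixed);
```

How large the gap is depends heavily on the PHP version, so treat the output as a measurement of your own environment and not as a general rule.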

Will the user be using every one of the entries you're planning to buffer in memory? If you have to have them for your application it doesn't really matter how big they are, does it? It's not a good idea to keep lots of information you don't need in memory "just because." One thing you definitely don't want to do is query the database for 4000 records on every page load. At the very least you'd need to put those types of transactions into a memory store like memcached or use APC.
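As a rough sketch of that last point, the full id => name map can be loaded once and shared between requests. This example assumes the APCu extension (the user-cache successor to APC) and simply skips caching when it is not enabled; loadAllNames, the cache key, the TTL and the stand-in query are all made up for illustration.

```php
<?php
// Load the full id => name map once and reuse it across requests when APCu is available.
// loadAllNames(), the 'all_names' key and the 300 second TTL are assumptions for this sketch.
function loadAllNames(callable $fetchAll, string $key = 'all_names', int $ttl = 300): array
{
    $useApcu = function_exists('apcu_enabled') && apcu_enabled();

    if ($useApcu) {
        $cached = apcu_fetch($key, $found);
        if ($found) {
            return $cached;
        }
    }

    // Cache miss (or no cache): run the database query supplied by the caller.
    $names = $fetchAll();

    if ($useApcu) {
        apcu_store($key, $names, $ttl);
    }

    return $names;
}

// Stand-in for the real "select id, name" query; replace with your own database call.
$fetchAll = function () {
    return array(1 => 'Alice', 2 => 'Bob');
};

$names = loadAllNames($fetchAll);
```

The same shape works with memcached: swap the apcu_* calls for the get and set calls of your memcached client.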

This question -- like most questions in computer science -- is simply a constrained maximization problem. It can't be solved correctly unless you know the variables at your disposal.

answered Oct 12 '22 23:10 by rdlowrey