
PHP creating a multidimensional array of message threads from a multidimensional array (IMAP)

Tags: php, email, imap

My question is the following:

Below you'll see a data structure of message ids, followed by the final data structure I want, in which each id is replaced by the message details returned by imap_fetch_overview. The message ids come from imap_thread. The problem is that the email details are not being placed at the position of the corresponding message id.

Here is my data structure:

[5] => Array
    (
        [0] => 5
        [1] => 9
    )

[10] => Array
    (
        [0] => 10
        [1] => 11
    )

What I'd like to have is:

[5] => Array
    (
        [0] => messageDetails for id 5
        [1] => messageDetails for id 9
    )

[10] => Array
    (
        [0] => messageDetails for id 10
        [1] => messageDetails for id 11
    )

Here is the code I have thus far:

$emails = imap_fetch_overview($imap, implode(',',$ids));

// root is the array index position of the threads message, such as 5 or 10
foreach($threads as $root => $messages){

    // id is the id being given to us from `imap_thread`
    foreach($message as $key => $id){

      foreach($emails as $index => $email){

         if($id === $email->msgno){
             $threads[$root][$key] = $email;
             break;
          }
      }
    }
 }

Here is a printout from one of the $emails:

    [0] => stdClass Object
    (
        [subject] => Cloud Storage Dump
        [from] => Josh Doe
        [to] => [email protected]
        [date] => Mon, 21 Jan 2013 23:18:00 -0500
        [message_id] => <[email protected]>
        [size] => 2559
        [uid] => 5
        [msgno] => 5
        [recent] => 0
        [flagged] => 0
        [answered] => 1
        [deleted] => 0
        [seen] => 0
        [draft] => 0
        [udate] => 1358828308
    )

Notice that the msgno is 5, which correlates to the $id, so technically the data should be populating the final data structure.

Also, this seems like an inefficient way to handle this.

Please let me know if you need any additional clarification.

UPDATED CODE

This code is a combination of code I found in the PHP API docs and some fixes of my own. What I think is still problematic is the $root.

$addedEmails = array();
$rootValues = array(); // message id => root id of the thread it belongs to
$root = 0;             // 0 means no thread is currently open
$thread = imap_thread($imap);
foreach ($thread as $i => $messageId) {
    list($sequence, $type) = explode('.', $i);
    // if type is not num or messageId is 0 or (start of a new thread and no next) or is already set
    if ($type != 'num' || $messageId == 0 || ($root == 0 && $thread[$sequence.'.next'] == 0) || isset($rootValues[$messageId])) {
        // ignore it
        continue;
    }

    if (in_array($messageId, $addedEmails)) {
        continue;
    }
    array_push($addedEmails, $messageId);

    // if this is the start of a new thread
    if ($root == 0) {
        // set root
        $root = $messageId;
    }

    // at this point this will be part of a thread
    // let's remember the root for this email
    $rootValues[$messageId] = $root;

    // if there is no next
    if ($thread[$sequence.'.next'] == 0) {
        // reset root
        $root = 0;
    }
}

// group the message ids by the root of their thread
$ids = array();
$threads = array();
foreach ($rootValues as $id => $root) {
    if (!array_key_exists($root, $threads)) {
        $threads[$root] = array();
    }
    if (!in_array($id, $threads[$root])) {
        $threads[$root][] = $id;
        $ids[] = $id;
    }
}
$emails = imap_fetch_overview($imap, implode(',', array_keys($rootValues)));

// map msgno => position in $emails
$keys = array();
foreach ($emails as $k => $email) {
    $keys[$email->msgno] = $k;
}

$threads = array_map(function($thread) use($emails, $keys)
{
    // Iterate emails in these threads
    return array_map(function($msgno) use($emails, $keys)
    {
        // Swap the msgno with the email details
        return $emails[$keys[$msgno]];

    }, $thread);
}, $threads);

Asked by somejkuser, Apr 27 '13 04:04


2 Answers

Remember that in PHP, whatever function you use will ultimately be executed as some sort of loop. There are, however, some steps you can take to make this more efficient, and they differ between PHP 5.5 and PHP 5.3/5.4.

PHP 5.3/5.4 way

The most efficient way of doing this is to split the work into two separate steps. In the first step, you build a map from each email's msgno to its key in the $emails list.

$keys = array();
foreach($emails as $k => $email)
{
    $keys[$email->msgno] = $k;
}

In the second step, you iterate over all values in the multidimensional $threads array and replace each one with the corresponding email details:

// Iterate threads
$threads = array_map(function($thread) use($emails, $keys)
{
    // Iterate emails in these threads
    return array_map(function($msgno) use($emails, $keys)
    {
        // Swap the msgno with the email details
        return $emails[$keys[$msgno]];

    }, $thread);

}, $threads);

Proof of concept: http://pastebin.com/rp5QFN4J
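
To try the two steps without an IMAP connection, here is a minimal sketch that runs them against made-up data. The makeEmail() helper and the subjects are assumptions for illustration only; real overview objects from imap_fetch_overview carry many more fields.

```php
<?php
// Hypothetical stand-in for one imap_fetch_overview() result object.
// Only msgno and subject are filled in, and the values are made up.
function makeEmail($msgno, $subject)
{
    $email = new stdClass();
    $email->msgno = $msgno;
    $email->subject = $subject;
    return $email;
}

$emails = array(
    makeEmail(5, 'Cloud Storage Dump'),
    makeEmail(9, 'Re: Cloud Storage Dump'),
    makeEmail(10, 'Another thread'),
    makeEmail(11, 'Re: Another thread'),
);

// Thread structure as in the question: root id => list of message ids.
$threads = array(
    5  => array(5, 9),
    10 => array(10, 11),
);

// Step 1: map msgno => position in $emails.
$keys = array();
foreach ($emails as $k => $email) {
    $keys[$email->msgno] = $k;
}

// Step 2: replace every message id with its overview object.
// array_map() keeps the keys of a single input array, so the
// root ids 5 and 10 survive as keys of the result.
$threads = array_map(function ($thread) use ($emails, $keys) {
    return array_map(function ($msgno) use ($emails, $keys) {
        return $emails[$keys[$msgno]];
    }, $thread);
}, $threads);

echo $threads[5][1]->subject, "\n";  // Re: Cloud Storage Dump
echo $threads[10][0]->msgno, "\n";   // 10
```

Each message id costs one array lookup here, instead of a scan over all of $emails as in the nested foreach version.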

Explanation of the use keyword in anonymous functions:

To use variables defined in the parent scope, you can import them into the closure's scope with the use keyword. It was introduced in PHP 5.3 and is described in the PHP manual under anonymous functions; the original RFC is at https://wiki.php.net/rfc/closures#userland_perspective
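
As a small illustration of how use imports variables, the sketch below contrasts importing by value with importing by reference. The variable names are made up for the example.

```php
<?php
$prefix = 'msg-';

// $prefix is copied into the closure at the moment the closure is created.
$label = function ($msgno) use ($prefix) {
    return $prefix . $msgno;
};

$prefix = 'changed-';   // does not affect the copy held by $label
echo $label(5), "\n";   // msg-5

// Importing by reference (&) lets the closure modify the outer variable.
$count = 0;
$counter = function () use (&$count) {
    $count++;
};
$counter();
$counter();
echo $count, "\n";      // 2
```

In the array_map() calls above, $emails and $keys are imported by value, which is fine because the closures only read them.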

PHP 5.5

One of the new features in this version is generators, which have a significantly smaller memory footprint and can therefore be more efficient.

Explanation of the yield keyword in generators:

The heart of a generator function is the yield keyword. In its simplest form, a yield statement looks much like a return statement, except that instead of stopping execution of the function and returning, yield instead provides a value to the code looping over the generator and pauses execution of the generator function.
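
The pause-and-resume behaviour can be seen in a minimal sketch like the following (requires PHP 5.5 or later). The messageNumbers() function is a hypothetical helper written only for this illustration.

```php
<?php
// Hypothetical helper: yields message numbers one at a time.
function messageNumbers(array $ids)
{
    foreach ($ids as $id) {
        echo "yielding $id\n";
        yield $id; // hand $id to the consumer and pause here
    }
}

$seen = array();
foreach (messageNumbers(array(5, 9)) as $id) {
    echo "received $id\n";
    $seen[] = $id;
}
// Output:
// yielding 5
// received 5
// yielding 9
// received 9
```

The interleaved output shows that the generator body only advances when the consuming loop asks for the next value.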

1st step:

function generateKeyMap($emails)
{
    foreach($emails as $k => $email)
    {
        // Yielding key => value pair to result set
        yield $email->msgno => $k;
    }
}
$keys = iterator_to_array(generateKeyMap($emails));

2nd step:

function updateThreads($emails, $threads, $keys)
{
    foreach($threads as $root => $thread)
    {
        $array = array();

        // Create a set of detailed emails
        foreach($thread as $msgno)
        {
            $array[] = $emails[$keys[$msgno]];
        }

        // Yielding root => array so the thread keys (5, 10, ...) are preserved
        yield $root => $array;
    }
}
$threads = iterator_to_array(updateThreads($emails, $threads, $keys));

A few words about the values returned by generators:

A generator function returns a Generator object, which implements the Iterator interface, so iterator_to_array() is needed to convert it into exactly the same array structure your code is expecting. The conversion is optional: you could iterate over the generator directly, which could be even more efficient, but the code that follows the generator function would have to be updated accordingly.
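
As a rough sketch of what skipping iterator_to_array() could look like, the consuming code can loop over the generator with foreach. The threadsWithDetails() function, the $emailsByMsgno lookup, and the data are assumptions made up for this example (PHP 5.5 or later).

```php
<?php
// Hypothetical generator: yields root id => list of overview objects.
function threadsWithDetails(array $threads, array $emailsByMsgno)
{
    foreach ($threads as $root => $thread) {
        $details = array();
        foreach ($thread as $msgno) {
            $details[] = $emailsByMsgno[$msgno];
        }
        yield $root => $details;
    }
}

// Made-up overview objects, indexed by msgno for direct lookup.
$emailsByMsgno = array(
    5 => (object) array('msgno' => 5, 'subject' => 'Cloud Storage Dump'),
    9 => (object) array('msgno' => 9, 'subject' => 'Re: Cloud Storage Dump'),
);
$threads = array(5 => array(5, 9));

// Consume one thread at a time; no full result array is ever built.
$lines = array();
foreach (threadsWithDetails($threads, $emailsByMsgno) as $root => $messages) {
    foreach ($messages as $message) {
        $lines[] = $root . ': ' . $message->subject;
    }
}
echo implode("\n", $lines), "\n";
```

Only one thread's details are held in memory at a time, which is where the memory saving of generators would come from.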

Proof of concept: http://pastebin.com/9Z4pftBH

Testing Performance:

I generated a list of 7000 threads with 5 messages each and tested the performance of each method (average of 5 runs):

Method             Time         Memory used
-------------------------------------------
3x foreach()       2.8 s        5.2 MB
PHP 5.3/5.4 way    0.061 s      2.7 MB
PHP 5.5 way        0.036 s      2.7 MB

Results on your machine or server may differ, but the comparison shows that the two-step method is roughly 45 to 77 times faster than using 3 foreach loops.

Test script: http://pastebin.com/M40hf0x7

Answered by WooDzu, Nov 18 '22 00:11


What structure do you get when you print_r the $emails array? Maybe the line below would do it?

 $threads[$root][$key] = $emails[$key];

Answered by georgec20001, Nov 18 '22 00:11