I'm building a web application using Laravel 5 that creates links to the web application that when visited show a form for a student's progress report. These links are sent by the web application to an email of a contact at the institution the student attends in order for the recipient to complete the progress report accessed by the link in the email.
The problem I face is when creating and sending links. I have some code which works fine with a couple of hundred students, however in real world use the application would be potentially creating and sending 3000+ or so links at one time. The code I've written simply can't handle such a large number in a timely manner and the application crashes. Strangely enough though I don't receive any timeout error via laravel (I'll need to double check the php logs).
Although I'm more than welcome to other suggestions, I believe the answer to the problem is utilising queues. I have already used queues when sending email (see code), but I would like to work some other parts of the code into queues, but I'm a bit unsure how to do this!
Brief database schema
Student
hasMany Link
Student
hasMany InstitutionContact
(limited to two by my application)
Link
hasMany InstitutionContact
(limited to two by my application)
Email
manyToMany Link
What I am trying to accomplish
Get all the Student
's that require a new Link
Create a Link
for each Student
Assign the Student
's current InstitutionContact
s to the Link
's InstitutionContact
(A Student
's institution contact can change, so I link the InstitutionContact
to the link if needed to resend.
Loop through all the newly created Links
in order to group them together by shared InstitutionContact
s - this is so an email is not sent per link (thus possibly sending many emails with one link to the same address), rather links should be grouped together by the same email/contact and sent together where applicable
Loop through all the Link
s grouped by email/contact and:
Link
's info (url, student name etc) to the designated InstitutionContact
's email addressEmail
to the databaseEmail
created in the former step to the Link
(s) that were sent in it (so the application can be used to search what link was sent in which email)So the main challenge I'm facing is performing the aforementioned task with a large dataset. I have already considered creating and sending a Link
one by one via a queue, however this wouldn't allow me to group all the Link
s together by contact/email. As the task wouldn't be performed regularly I would be open to consider performing the task as it is with an increase in memory and time for the process, however I didn't have much success when attempting this using set_time_limit(0);
and ini_set('memory_limit','1056M');
before sending any links.
Any help would be really appreciated, thank you if you read this far!
Code
app\Http\Controllers\LinkController.php
public function storeAndSendMass(Request $request)
{
$this->validate($request, [
'student_id' => 'required|array',
'subject' => 'required|max:255',
'body' => 'required|max:5000',
]);
$studentIds = $request->get('student_id');
$subject = $request->get('subject');
$body = $request->get('body');
$students = $this->student
->with('institutionContacts')
->whereIn('id', $studentIds)
->where('is_active', 1)
->get();
// create link, see Link.php below for method
$newLinks = $this->link->createActiveLink($students);
// send link to student's contact(s), see LinkEmailer.php below for method
$this->linkEmailer->send($newLinks, ['subject' => $subject, 'body' => $body], 'mass');
// return
return response()->json([
'message' => 'Creating and sending links'
]);
}
app\Models\Link.php
public function createActiveLink($students)
{
$links = [];
foreach ($students as $student) {
$newLink = $this->create([
'token' => $student->id, // automatically hashed
'status' => 'active',
'sacb_refno' => $student->sacb_refno,
'course_title' => $student->course_title,
'university_id' => $student->university_id,
'student_id' => $student->id,
'institution_id' => $student->institution_id,
'course_id' => $student->course_id,
]);
$studentContacts = $student->institutionContacts;
if ($studentContacts) {
foreach ($studentContacts as $studentContact) {
$newLink->contacts()->create([
'type' => $studentContact->pivot->type,
'institution_contact_id' => $studentContact->pivot->institution_contact_id
]);
$newLink->save();
}
}
$links[] = $newLink->load('student');
}
return $links;
}
app\Emails\LinkEmailer.php
namespace App\Emails;
use App\Emails\EmailComposer;
class LinkEmailer
{
protected $emailComposer;
public function __construct(EmailComposer $emailComposer)
{
$this->emailComposer = $emailComposer;
}
public function send($links, $emailDetails, $emailType)
{
$contactsAndLinks = $this->arrangeContactsToLinks($links);
foreach ($contactsAndLinks as $linksAndContact) {
$emailData = array_merge($linksAndContact, $emailDetails);
// send/queue email
\Mail::queue('emails/queued/reports', $emailData, function ($message) use ($emailData) {
$message
->to($emailData['email'], $emailData['formal_name'])
->subject($emailData['subject']);
});
// compose email message, returns text of the email
$emailMessage = $this->emailComposer->composeMessage($emailData);
// // create Email
$email = \App\Models\Email::create([
'to' => $emailData['email'],
'from' => '[email protected]',
'subject' => $emailData['subject'],
'body' => $emailMessage,
'type' => $emailType,
'user' => $_SERVER['REMOTE_USER']
]);
foreach ($linksAndContact['links'] as $link) {
$link->emails()->attach($email->id);
}
}
}
// group links by contact
public function arrangeContactsToLinks($links)
{
$contactsForLinks = [];
$assigned = false;
$match = false;
foreach ($links as $link) { // 1, n
if ($link->contacts) {
foreach ($link->contacts as $contact) { // 1, 2
if ($contactsForLinks) {
$assigned = false;
foreach ($contactsForLinks as $key => $contactLink) { // n
// assign links to existing email in array
if ($contactLink['email'] === $contact->institutionContact->email) {
$match = false;
// check link hasn't already been included
foreach ($contactsForLinks[$key]['links'] as $assignedLink) {
if ($assignedLink === $link) {
$match = true;
}
}
// if there was no match add to list of links
if (!$match) {
$contactsForLinks[$key]['links'][] = $link->load('student');
$assigned = true;
break;
}
}
}
if (!$assigned) {
$contactsForLinks[] = [
'email' => $contact->institutionContact->email,
'formal_name' => $contact->institutionContact->formal_name,
'requires_id' => $contact->institutionContact->institution->requires_id,
'requires_course_title' => $contact->institutionContact->institution->requires_course_title,
'links' => [$link->load('student')],
];
}
} else {
$contactsForLinks[] = [
'email' => $contact->institutionContact->email,
'formal_name' => $contact->institutionContact->formal_name,
'requires_id' => $contact->institutionContact->institution->requires_id,
'requires_course_title' => $contact->institutionContact->institution->requires_course_title,
'links' => [$link->load('student')],
];
}
}
}
}
return $contactsForLinks;
}
}
Edit 1
I've got this working now with set_time_limit(0);
and ini_set('memory_limit','1056M');
it took 8 mins to do 3000 students.
Edit 2
I'm running Laravel Framework version 5.1.6 (LTS), MySQL for DB.
Edit 3
Appreciate all the answers so far, thank you all. I am thinking that I may work the link
creation process into a queue which will have a related entity in the database called something like Batch
and when that Batch
of links has finished being created then group all the Link
s from that Batch
and send them.
I could use the approach that @denis-mysenko suggested by having a sent_at
field in the Link
s table and have a scheduled process to check for Link
s that haven't been sent and then send them. However, using the aforementioned approach I can send the Batch
of Link
s when they've all finished being created, whereas with the sent_at
approach with a scheduled process looking for Link
s that haven't been sent it could potentially send some links when all the links haven't been created yet.
If you've tested your code with a small amount of data and it succeeds without crashing, it's clear that the problem is (as you said) the quite high number of records you're dealing with. Why don't you process your collection with the chunk method?
Accordingly to the Laravel docs:
If you need to process thousands of Eloquent records, use the chunk command. The chunk method will retrieve a "chunk" of Eloquent models, feeding them to a given Closure for processing. Using the chunk method will conserve memory when working with large result sets
Anyway, I think the use of a queue it's required in this kind of scenarios. I believe that working on a large set of data on a HTTP request should be absolutely avoided due to the high risk of request timeout. The queued process doesn't have the limit of execution time.
Why don't you use the queue AND the chunk method on your collection together? This will make you able to:
The Laravel docs covers all you need: Eloquent - Retrieving multiple models (see the "Chunking results" chapter for going deeper on how to save memory when dealing with a large data set) and Queues for creating jobs and detach some parts of your software that should not run under your webserver, avoiding the risk of timeouts
I would propose to change the architecture. I think it's unnecessarily overcomplicated.
Controller could like like:
public function storeAndSendMass(Request $request, LinkEmailer $linkEmailer)
{
$this->validate($request, [
'student_id' => 'required|array',
'subject' => 'required|max:255',
'body' => 'required|max:5000',
]);
$students = $this->student
->with('institutionContacts')
->whereIn('id', $request->input('student_id'))
->where('is_active', 1)
->get();
// Don't use Link.php method at all
foreach ($students as $student) {
$student->links()->create([
'token' => $student->id, // automatically hashed
'status' => 'active',
'sent_at' => null,
'email_body' => $request->input('body'),
'email_subject' => $request->input('subject')
]);
}
return response()->json([
'message' => 'Creating and sending links'
]);
}
Why keep so many fields in Link model that already exist in Student model and are accessible via student() relationship? You could just keep the status and the token (I assume it's part of the link?), as well as 'sent_at' timestamp. If links are usually sent only once, it's reasonable to keep the email body and subject there as well.
If student updates his institution contacts, fresh data will be used at the time of email composition because you are not linking links to institution contacts explicitly.
Then, I would create a command (let's say newLinkNotifier) that would run, for instance, every 10 minutes and that would look for links that haven't been sent yet ($links->whereNull('sent_at')
), group them by email ($link->student->institutionContacts
) and email content ($link->email_body, $link->email_subject
) and create queued email jobs. Then a queue worker would send those emails (or you could set queue to 'async' to send them right away from the command).
Since this command will run async, it doesn't really matter if it takes 5 minutes to finish. But in real life, it would take less than a minute for thousands and thousands of objects.
How to do the grouping? I would probably just rely on MySQL (if you are using it), it will do the job faster than PHP. Since all 3 fields are accessible from SQL tables (two directly, another from JOIN) - it's actually a pretty simple task.
In the end your emailers send() method will become as trivial as:
public function send()
{
$links = Link::whereNull('sent_at')->get();
// In this example, we will group in PHP, not in MySQL
$grouped = $links->groupBy(function ($item, $key) {
return implode('.', [$item->email, $item->email_body, $item->email_subject]);
});
$grouped->toArray();
foreach ($grouped as $group) {
// We know all items in inside $group array got the same
// email_body, email, email_subject values anyway!
Mail::queue('emails/queued/reports', $group[0]->email_body, function ($message) use ($group) {
$message
->to($group[0]->email)
->subject($group[0]->email_subject);
});
}
}
This is not perfect yet, and I haven't tested this code - I wrote it right here, but it shows the proposed concept.
If you don't plan to get millions of entries - this is probably good enough. Otherwise, you could move link creation into a separate async job as well.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With