I am building an email sending service (Lambda, Node.js) that sends emails to a list of addresses at the same time.
Because each address must receive different email content, I cannot send multiple emails in one request; I have to send them one by one.
_array = ... // list of address + html content, > 10000 records

for (let i = 0; i < _array.length; i++) {
  const params = {
    Destination: { /* required */
      ToAddresses: [_array[i].email]
    },
    Message: { /* required */
      Body: { /* required */
        Html: {
          Data: _array[i].html,
          Charset: 'UTF-8'
        }
      },
      Subject: { /* required */
        Data: 'Your order detail', /* required */
        Charset: 'UTF-8'
      }
    },
    Source: "[email protected]"
  };
  let send = await ses.sendEmail(params).promise();
}
Currently I don't have that much data to test with, but I tested with 100-200 emails: it works and takes 15-40 seconds to complete. Mathematically, 10,000 emails would need more than 25 minutes, so this approach is not scalable given the Lambda timeout limit of 15 minutes.
Any better approach or suggestion is appreciated.
Edit:
The solution from @Thales Minussi is awesome; I implemented it and it's working. I marked it as the answer, but I still welcome any best practices that address this issue. Share, learn, and code happily.
NOT RECOMMENDED: What you could do is parallelise the sendEmail calls. You could create a big array of Promises and then use await Promise.all(yourPromisesArray). Node.js would do its best to optimise the process based on the number of cores available on the machine your Lambda function runs on, so to get the most out of it you'd need to set your Lambda's RAM to 3GB (the machine is directly proportional to the amount of RAM: the more RAM you set, the better the machine your code runs on). But this is still faulty: we're talking about 10,000 e-mails now, but what if this number grows to 100,000? 1,000,000? It's a solution that doesn't scale as demand grows, so it's not enough. Another thing is that if something goes wrong (for example, one Promise fails), it's really hard to recover.
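For illustration only, a minimal sketch of that parallelised (but discouraged) approach, reusing the _array and ses client from the question:

// NOT RECOMMENDED at scale: fire all sendEmail calls and wait for them together.
// Runs inside an async handler; _array and ses are the same as in the question.
const promises = _array.map((item) =>
  ses.sendEmail({
    Destination: { ToAddresses: [item.email] },
    Message: {
      Body: { Html: { Data: item.html, Charset: 'UTF-8' } },
      Subject: { Data: 'Your order detail', Charset: 'UTF-8' }
    },
    Source: "[email protected]"
  }).promise()
);

// If any single promise rejects, Promise.all rejects and partial progress is hard to recover.
await Promise.all(promises);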
RECOMMENDED: What I suggest instead is that you use SQS to decouple the functions that create the e-mail bodies from the function that actually sends them. Long story short, rather than invoking await ses.sendEmail(params).promise() as you're doing above, you'd put that message in an SQS queue instead (respecting the 256KB limit per message) and subscribe another Lambda to the SQS queue. Every Lambda invocation can read a batch of up to 10 messages from SQS (and every message can contain many e-mails), which speeds up the process quite significantly, especially because, by default, your Lambda functions scale out to meet demand.
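A minimal sketch of both sides, assuming the queue URL comes from a QUEUE_URL environment variable and that a hypothetical loadRecipients() helper returns the address/content records; grouping 10 e-mails per message is just illustrative:

// Producer Lambda: enqueue e-mail jobs in SQS instead of calling SES directly.
const AWS = require('aws-sdk');
const sqs = new AWS.SQS();

exports.producer = async () => {
  const records = await loadRecipients(); // hypothetical helper: address + content metadata
  for (let i = 0; i < records.length; i += 10) {
    await sqs.sendMessage({
      QueueUrl: process.env.QUEUE_URL,                        // assumed environment variable
      MessageBody: JSON.stringify(records.slice(i, i + 10))   // keep each message under 256KB
    }).promise();
  }
};

// Consumer Lambda: subscribed to the SQS queue, receives a batch of up to 10 messages.
const ses = new AWS.SES();

exports.consumer = async (event) => {
  for (const record of event.Records) {
    for (const item of JSON.parse(record.body)) {
      await ses.sendEmail({
        Destination: { ToAddresses: [item.email] },
        Message: {
          Body: { Html: { Data: item.html, Charset: 'UTF-8' } },
          Subject: { Data: 'Your order detail', Charset: 'UTF-8' }
        },
        Source: "[email protected]"
      }).promise();
    }
  }
};

In practice the producer and consumer would be deployed as two separate functions; they are shown together here only for brevity.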
Let's run some simple math: if you send 100 messages to SQS and every message contains 10 e-mails, then in the best scenario 10 Lambdas would spin up, each consuming a batch of 10 messages. Since every message contains 10 e-mails, each Lambda would process 100 e-mails, so you'd be processing 1,000 e-mails in the blink of an eye!
It's important to note that not every Lambda will pick up a batch of 10 every time; they may pick up smaller batches, so more than 10 Lambda functions could spin up simultaneously. But I think you get the idea of parallel processing.
EDIT: Since e-mails can have heavy payloads (images, long strings, etc.), I suggest you only send the relevant information to the SQS queue to keep the payload size down. If you need images or some pre-defined templates to be processed, just send their respective locations in S3 (or whatever other storage you may use), so the Lambda that is actually responsible for sending the e-mails is the one that fetches that information and adds it to the body. Essentially, your message to SQS should contain only metadata, keeping the payload as lightweight as possible and letting you use the 256KB limit to your advantage. Something like this should be enough to get you off the ground:
{
  "to": "[email protected]",
  "images": ["s3://bucket/image.jpg", "s3://bucket/image2.jpg"],
  "template": "s3://bucket/template.html"
}
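And a sketch of how the sending Lambda might resolve such a metadata message; parseS3Uri is a hypothetical helper that splits an s3:// URI into bucket and key:

// Resolve the metadata message above: fetch the template from S3, then send via SES.
const AWS = require('aws-sdk');
const s3 = new AWS.S3();
const ses = new AWS.SES();

async function sendFromMetadata(message) {
  // parseS3Uri is a hypothetical helper: "s3://bucket/key" -> { Bucket: "bucket", Key: "key" }
  const { Bucket, Key } = parseS3Uri(message.template);
  const object = await s3.getObject({ Bucket, Key }).promise();
  const html = object.Body.toString('utf-8'); // image locations could be injected into the template here

  await ses.sendEmail({
    Destination: { ToAddresses: [message.to] },
    Message: {
      Body: { Html: { Data: html, Charset: 'UTF-8' } },
      Subject: { Data: 'Your order detail', Charset: 'UTF-8' }
    },
    Source: "[email protected]"
  }).promise();
}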
Architectural Flow
Too late, but here's the architecture I came up with. A CloudWatch rule triggers at a specific time and invokes a Lambda. That Lambda gets the count of all the emails to be sent at that particular time, splits it into chunks of 10, and for each chunk publishes a message to the SNS topic "send_email_topic", which in turn invokes a Lambda that sends the emails. Each Lambda invoked from SNS receives the chunk size (10 in our case) and an offset value. The invoked Lambda hits the DB, gets the rows of emails to be sent using the chunk size as the limit and the offset (the offset prevents different Lambdas from sending the same email), and sends them. Each invoked Lambda will send up to 10 emails.
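A rough sketch of the two Lambdas, assuming a SEND_EMAIL_TOPIC_ARN environment variable and hypothetical countPendingEmails()/queryEmails() DB helpers:

// Scheduler Lambda (triggered by the CloudWatch rule): publish one SNS message per chunk of 10.
const AWS = require('aws-sdk');
const sns = new AWS.SNS();

exports.scheduler = async () => {
  const total = await countPendingEmails(); // hypothetical DB helper
  const chunkSize = 10;
  for (let offset = 0; offset < total; offset += chunkSize) {
    await sns.publish({
      TopicArn: process.env.SEND_EMAIL_TOPIC_ARN, // the "send_email_topic"
      Message: JSON.stringify({ limit: chunkSize, offset })
    }).promise();
  }
};

// Sender Lambda (invoked via the SNS topic): fetch its slice of rows and send up to 10 emails.
const ses = new AWS.SES();

exports.sender = async (event) => {
  const { limit, offset } = JSON.parse(event.Records[0].Sns.Message);
  const rows = await queryEmails(limit, offset); // hypothetical: SELECT ... LIMIT ? OFFSET ?
  for (const row of rows) {
    await ses.sendEmail({
      Destination: { ToAddresses: [row.email] },
      Message: {
        Body: { Html: { Data: row.html, Charset: 'UTF-8' } },
        Subject: { Data: 'Your order detail', Charset: 'UTF-8' }
      },
      Source: "[email protected]"
    }).promise();
  }
};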
Failure
In case of failed emails, the email ID is sent to the queue. SQS, on receiving the message, publishes to SNS on a different topic, "retry_email_send", which again invokes a Lambda, but a slightly different one: it receives the ID of the email to be retried, fetches that single email from the DB, and retries sending it.
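A sketch of that retry Lambda, assuming a hypothetical getEmailById() DB helper and an illustrative message shape:

// Retry Lambda (invoked via the "retry_email_send" topic): re-send a single failed email by its ID.
const AWS = require('aws-sdk');
const ses = new AWS.SES();

exports.retry = async (event) => {
  const { emailId } = JSON.parse(event.Records[0].Sns.Message); // message shape is an assumption
  const row = await getEmailById(emailId); // hypothetical DB lookup
  await ses.sendEmail({
    Destination: { ToAddresses: [row.email] },
    Message: {
      Body: { Html: { Data: row.html, Charset: 'UTF-8' } },
      Subject: { Data: 'Your order detail', Charset: 'UTF-8' }
    },
    Source: "[email protected]"
  }).promise();
};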